Introduction
Vignettes are hypothetical scenarios meant to prompt respondents and elicit reactions without exposing them to actual situations. As a branch of survey methodology, vignette studies have been used widely in political science, human-computer interaction, and the social sciences. Hypothetical scenarios are particularly useful for probing perspectives on emerging hypotheses or for testing hypotheses about how individuals react to specific scenarios.
In this paper, we motivate and present the methodology of vignette studies. The primary contribution of this paper is a novel vignette study method, SMART vignettes, which adapts the sequential multiple assignment randomized trial (SMART) design developed by Murphy (2004) to online survey designs. While this method is not a clinical trial, it borrows the critical element of sequential randomization from that methodology. We use clinical decision making with AI as a primary motivating example to demonstrate the utility of the method in allowing inferences on physician reactions toward AI-embedded clinical decision making systems. We argue that the resulting method is well suited to probing decision making contexts that unfold over discrete time points. Our proposed vignette approach leverages two new design characteristics, namely sequential randomization and adaptive allocation. While conventional vignette studies have been widely deployed across fields, these design features offer three advantages distinct from traditional vignettes: (1) valid causal inferences on the conditional distributions of the primary outcome of interest, given previous outcomes and other contextual factors; (2) balanced allocations across groups; and (3) a greater degree of interactivity for the survey respondent.
Proposed method
Background
Vignettes are narrative stimuli describing hypothetical situations and are intended to elicit reactions from survey respondents. A vignette study may feature a number of key factors of importance, each of which may vary by a fixed number of levels. In a full factorial design, a vignette population is considered to be the complete set of vignettes consisting of all possible combinations of levels for each factor. A single vignette is thus a specific combination of levels from each factor. As the size of the vignette population grows exponentially with each factor, randomization can be used to assign a respondent a single vignette rather than the whole vignette population (Atzmüller and Steiner, 2010). For example, a survey respondent can be randomly assigned to either one of two levels, agreement or no agreement (Table 1).
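As a concrete illustration, a full factorial vignette population and a baseline random assignment can be sketched in a few lines of Python. The factor names and levels here are hypothetical, for illustration only:

```python
import itertools
import random

# Hypothetical factors and levels, for illustration only.
factors = {
    "agreement": ["agreement", "no agreement"],
    "patient_risk": ["low", "high"],
}

# Full factorial design: the vignette population is every combination
# of one level per factor -- here 2 x 2 = 4 vignettes.
vignette_population = [
    dict(zip(factors, combo)) for combo in itertools.product(*factors.values())
]

# Conventional design: assign one vignette uniformly at random at baseline.
assigned = random.choice(vignette_population)
```

Each additional two-level factor doubles the size of the vignette population, which is why a single randomly assigned vignette, rather than the whole population, is presented to each respondent.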
Example vignette for testing whether level of information impacts response.
In conventional vignette designs, vignette assignment occurs at baseline, prior to survey distribution. The vignette is presented in its entirety to the respondent, followed by the question set. With such a design, inferences can be made about the joint distribution of responses. For example, Figure 1 shows a traditional vignette study with two factors.

Setup of a traditional vignette study. Randomization is performed by sampling from the vignette population with equal probability at baseline. The path outlined in bold represents one possible vignette that is assigned to the participant. The vignette is presented to the survey respondent followed by a set of questions.
An online SMART vignette survey
We propose a vignette survey study design that is meant to be delivered online and in a dynamic fashion. The novelty of the method is twofold: (1) sequential randomization and (2) adaptive allocation.

Setup of study with sequential randomization. Randomization occurs at multiple time points throughout the study. Respondents are randomly assigned to the first subvignette that characterizes one factor of the vignette then are prompted with a set of questions; the process repeats for other factors. The full vignette is delivered in a sequential fashion, that is as a sequence of subvignettes. The path outlined in bold shows one possible vignette comprising first (solid red line) and second (dotted magenta line) subvignettes.
Sequential randomization
In the clinical trial setting, the sequential multiple assignment randomized trial (SMART) design randomly assigns a patient to an initial line of treatment and then re-randomizes the patient at a later stage based on the patient's intermediate responses to the treatment (Thall et al., 2000). When applied to surveys, SMART vignettes can be viewed as adaptive surveys in the framework of sequential randomization (Montgomery and Cutler, 2013). Instead of presenting a full vignette with a pre-determined combination of characteristics at once, we present one portion of the vignette, reflecting a particular characteristic, at a time. Hence multiple randomization time points are needed, one for each characteristic of the vignette (i.e. subvignette). At each time point, the respondent is randomized to a single subvignette, which is followed by a set of questions. After the response, the participant is re-randomized to the next subvignette.
Sequential randomization is especially useful when the order of outcomes carries significance, or when subsequent outcomes depend upon the previous outcome. The repeated randomization allows valid causal inferences concerning the relative effectiveness of the intervention (or narrative stimuli) options (Rubin, 1974).
Adaptive allocation
While randomization enables causal inference, it does not guarantee balance across the subgroups defined by the vignette population. Complete randomization often results in imbalance. To address this issue, Efron (1971) introduced the biased coin design as an adaptive allocation strategy: when group sizes are unequal, an incoming respondent is assigned to the under-represented group with a fixed probability greater than 1/2, and a fair coin is used when the groups are balanced.
The overall role of an adaptive randomization strategy in this setting is to encourage sufficient cell counts for all vignette and response combinations. Biased coin methods promote balance by encouraging allocations to the under-represented group. Various extensions of the biased coin design have since been proposed and can be used, such as the urn design (Wei, 1978) and the generalized biased coin design proposed by Smith (1984), a large class of designs covering the biased coin design as a special case (Atkinson, 2014).
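A minimal sketch of Efron's biased coin in Python (the group labels and the conventional choice p = 2/3 are illustrative assumptions):

```python
import random

def efron_biased_coin(n_a: int, n_b: int, p: float = 2 / 3) -> str:
    """Allocate the next respondent to group "A" or "B".

    Efron's biased coin: flip a fair coin when the groups are balanced;
    otherwise favor the under-represented group with probability p > 1/2.
    """
    if n_a == n_b:
        return "A" if random.random() < 0.5 else "B"
    under, over = ("A", "B") if n_a < n_b else ("B", "A")
    return under if random.random() < p else over
```

Compared with complete randomization, each allocation nudges the group sizes back toward balance while remaining random, which preserves the basis for causal inference.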
Advantages
SMART vignettes allow the researcher to answer questions that are distinct from those that can be answered via traditional vignettes. Our proposed approach is suitable for a research problem for which it is helpful to infer the conditional distribution of the primary outcome, given all other contextual variables. Traditional vignettes with randomization allow for valid causal inference on the joint distribution of multiple outcomes conditioned on the factors (i.e. the vignette characteristics), but not on the conditional distribution of an outcome given earlier outcomes.
Moreover, conventional designs can burden the participant by administering a full set of vignettes. In studies with many factors, only a sample of the set is presented to reduce response burden, but doing so comes at the cost of imbalance. SMART vignettes administer a sample of the full set without sacrificing balance between groups. In addition, while interactivity is not unique to the SMART design, its sequential, web-based delivery allows for interactivity between the survey and the participant.
In the next section, we describe the setting of a health care scenario as a motivating example in which the conditional expectation (distribution) of the primary outcome (i.e. decision making) is of interest, instead of the joint distribution of outcomes. This type of outcome may be more salient in settings featuring a decision making process in which a provider or patient receives information across different time points.
Case example: Application to AI and healthcare
We present a case example of an online survey for physicians, which implements the SMART vignette design to probe physicians' reactions in decision making scenarios featuring an AI-embedded clinical system. This example serves to demonstrate the utility of the proposed method in allowing salient inferences about stakeholders' reactions toward AI-embedded clinical decision making systems. In this example, a SMART vignette design was used to randomize hypothetical scenarios to gain a better understanding of the causal influences on physician attitudes, given emerging evidence that a range of factors, including physician understanding, transparency of AI, and trust in AI, among others, may play a role in influencing clinical decisions.
Figure 3 depicts the flow of an online survey by Kim et al. (2024) utilizing the SMART vignette design. This survey for physicians is currently in recruitment. The survey features three randomization time points, one for each characteristic (or factor) of interest: patient risk, information on the AI, and certainty of the AI output.

Setup of our SMART vignette study. There are three factors in the vignette: patient risk, information on AI, and certainty of AI output.
A unique advantage of the sequential delivery of vignettes is that it affords the researcher the ability to infer the conditional distribution of the outcome, which in the example of our physician survey is the clinical decision given prior outcomes and variables. This is distinct from the type of inference afforded by traditional vignettes, which is restricted to the joint distribution of the outcomes rather than the conditional distribution. The second feature, adaptive allocation, serves to encourage balanced and sufficient cell counts for inference.
For example, inferences can be drawn on the distribution of the clinical decision conditional on the previously presented subvignettes and responses.
Implementation
The general steps for implementing the SMART survey are as follows:
1. Randomly allocate the first subvignette.
2. Observe the response to the first subvignette.
3. Adaptively allocate the second subvignette.
4. Observe the response to the second subvignette.
5. Adaptively allocate the third subvignette.
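The steps above can be sketched as a loop over stages, where allocation after the first stage conditions on the participant's history. The function names, the binary coding of subvignettes and responses, and the simulated respondent are illustrative assumptions, not the authors' implementation:

```python
import random
from collections import defaultdict

def allocate(history_counts, history, p=2 / 3):
    """Adaptively allocate subvignette 0 or 1 given a participant's history.

    Compare, among prior participants with the same history, how many went
    on to each candidate subvignette; favor the rarer one with probability p
    (Efron's biased coin).
    """
    n0 = history_counts[history + (0,)]
    n1 = history_counts[history + (1,)]
    if n0 == n1:
        return random.randint(0, 1)
    under = 0 if n0 < n1 else 1
    return under if random.random() < p else 1 - under

def run_participant(history_counts, n_stages=3, respond=lambda h: random.randint(0, 1)):
    """One pass through the survey; `respond` stands in for the participant."""
    history = ()
    for stage in range(n_stages):
        if stage == 0:
            sub = random.randint(0, 1)               # step 1: fair coin
        else:
            sub = allocate(history_counts, history)  # steps 3 and 5: adaptive
        history += (sub,)
        history_counts[history] += 1
        history += (respond(history),)               # steps 2 and 4: observe
    return history

history_counts = defaultdict(int)
for _ in range(100):
    run_participant(history_counts)
```

Because every count is keyed by a complete history of assignments and responses, a participant who drops out mid-survey simply never matches the histories used for later allocations.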
Dropouts, if any, do not impact adaptive allocation, because each allocation requires counting the number of prior participants with the same assignments and responses (referred to as their history). If a given participant drops out and is therefore missing a response, that participant has no complete history and is simply excluded from the counts used for subsequent allocations.
For the example online survey, a web application was developed and is hosted by Stanford University under the domain
Simulation study
We simulated multiple hypothetical studies using Python to compare two designs: (1) SMART and (2) a conventional study using baseline randomization. We assumed each hypothetical vignette study to have two levels for each factor, and we varied the number of factors (up to four) across studies. Participant responses to any given subvignette were assumed to be binary and were generated by sampling from a Bernoulli distribution, whose probability was the inverse logit of a linear combination of prior responses and subvignette allocations. We evaluated the two study designs on the imbalance of their groups.
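Under these assumptions, response generation reduces to a Bernoulli draw with an inverse-logit probability. A sketch, where the weights and bias are free parameters set illustratively rather than taken from the paper:

```python
import math
import random

def inverse_logit(x: float) -> float:
    """Map a real-valued linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_response(allocations, prior_responses, weights=None, bias=0.0):
    """Binary response: Bernoulli with probability equal to the inverse
    logit of a linear combination of the history (illustrative weights)."""
    history = list(allocations) + list(prior_responses)
    if weights is None:
        weights = [0.5] * len(history)  # illustrative default
    z = bias + sum(w * h for w, h in zip(weights, history))
    return 1 if random.random() < inverse_logit(z) else 0
```

Varying the weights and bias changes how strongly earlier subvignettes and responses drive later responses, which is what the simulation's parameter sweep explores.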
Conventional implementation
To simulate the conventional approach to vignette studies, we generated all possible vignettes (i.e. all
SMART implementation
At the first randomization point of the SMART study, participants were assigned to a subvignette at random, with a probability of 0.5. For each subsequent time point, adaptive randomization was used to allocate the subvignettes. This means that all prior responses and allocations (i.e. history) were compared across subvignettes (Figure 4). In the case of unequal allocation across two subvignettes, Efron's biased coin was used to allocate the current participant (Efron, 1971). The biased coin favors the subvignette with fewer participants with a pre-specified probability greater than 0.5.

Example of adaptive allocation. Subvignette allocation realizations
Simulation results
We adjusted the parameters for the weights and biases of the logit function, as well as the probability of Efron's biased coin. We also defined a relative loss in order to quantitatively compare the imbalance between groups for each method (refer to the Supplemental Material). For each combination of parameters, the relative loss was calculated under both SMART and conventional study designs at each randomization time point. The conventional design involves randomization only at baseline, whereas the SMART design involves multiple points of randomization. Figure 5 shows the relative loss as a function of the number of respondents.
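The paper's relative loss is defined in the Supplemental Material; as an illustrative stand-in only (not the authors' exact definition), a simple imbalance measure over cell counts could look like:

```python
def imbalance(cell_counts):
    """Mean squared deviation of cell counts from a perfectly balanced
    allocation: zero when all cells are equal, growing with imbalance.
    NOT the paper's relative loss (defined in the Supplemental Material);
    this is an illustrative substitute only.
    """
    n = sum(cell_counts)
    k = len(cell_counts)
    target = n / k
    return sum((c - target) ** 2 for c in cell_counts) / k
```

Any measure of this kind can be tracked at each randomization time point to compare designs, as the relative loss is in Figure 5.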

Relative loss as a function of number of respondents
The thick line indicates the average relative loss across all iterations, and the shaded region indicates relative loss within one standard deviation. Different line styles denote studies varying in the number of factors (2-4).
Discussion
In this paper we presented a novel method, SMART vignettes, as a means to generate stakeholder perspectives and address a critical gap in the current AI healthcare field. A unique aspect of our proposed design is the sequential delivery of questions that is adaptive by nature. As demonstrated by the relative loss in our simulations, our proposed approach has an advantage over the conventional design.
Strengths of our proposed approach include its potential utility in generating data for decision making scenarios in discrete time settings, and in particular scenarios that are of high risk or are in need of testing before deployment. In the example of AI-embedded clinical care, hypothetical scenarios help probe physician perspectives and generate an understanding of when and under what conditions physicians might accept or ignore AI-augmented output in advance of deployment.
This paper has a few limitations. One is that the novel design features we presented are not yet widely available in existing survey platforms (e.g. REDCap and Qualtrics) and therefore have not yet been widely tested outside simulations. While a physician survey is currently in progress, making this type of SMART design widely deployable to any research team and domain scientists who wish to construct their own SMART vignette survey is an area of future work. Much like fixed-design vignette surveys, respondent fatigue may still pose a challenge. However, we argue that the dynamic nature of the survey allows for balanced statistical analysis and a greater degree of interactivity, which may be favorable for some survey respondents. Future studies comparing traditional vignettes with SMART vignettes can evaluate how the methodology impacts the perceived burden experienced by participants as well as dropout rates.
Studies utilizing SMART vignettes stand to generate data on stakeholder reactions when informed by mixed-methods qualitative work, such as semi-structured interviews, focus groups, and/or surveys to articulate questions of interest and hypotheses. Once such questions are articulated, the researcher can probe further using scenario based testing. We posit that this method holds promise in advancing knowledge about how patients and physicians may respond to novel intelligent systems embedded in their health care.
Supplemental Material
Supplemental material for "A novel experimental vignette methodology: SMART vignettes" by Jane Paik Kim and Hyun-Joon Yang, Methodological Innovations (sj-pdf-1-mio-10.1177_20597991241240081).
