Abstract
Decisions based on average measures of cost-effectiveness may lead to incorrect treatment recommendations for specific subsets of the population. 1 This is because a treatment that is cost-effective for one type of patient may not be so for others. This type of heterogeneity can be ascribed to both individual and contextual-level factors, 2 which, if taken into account by the decision maker, would support a more efficient allocation of resources.
Although the concept of subgroup analysis has a long history,3,4 its adoption has been met with caution in some areas of health care decision making, possibly due to concerns related to low statistical power and multiple statistical testing.5–7 Many authors have indicated their preference for using an average measure of treatment effect4,8,9 and for findings from subgroup analyses being considered exploratory in nature.8,10 More recently, there have been arguments against the use of strict inferential rules in assessing the effects of interventions for health care decision making.11–14 This emphasizes the importance of considering evidence on subgroups of patients when making probabilistic statements about the (cost-) effectiveness of a given treatment strategy.2,15 Several authors have made important contributions in this area in recent years. Phelps 16 introduced the idea that heterogeneity in cost-effectiveness can be explained by different factors (baseline risk, treatment efficacy, costs, and patient preferences), Coyle and others 15 proposed methods to quantify the potential health gains facilitated by making different decisions for different subgroups (stratified analysis), and Basu and Meltzer 17 extended this concept to decisions at the individual level (expected value of individualized care [EVIC]). All of these contributions have focused on the value of understanding the reasons for variability (i.e., translating variability into heterogeneity explainable by observable characteristics), but none has fully addressed how variability and uncertainty interact. 18 Nor have the implications for the relative priority of different types of research or the most appropriate level of discrimination (stratification) in differential access to care been fully explored.
Several national health care agencies19,20 responsible for issuing recommendations about the adoption of new medical technologies support the use of subgroup analyses when making decisions about new health technologies. For example, the methods guidance for technology appraisal issued by the National Institute for Health and Care Excellence (NICE) for England and Wales 19 states that “for many technologies, the capacity to benefit from treatment will differ for patients with differing characteristics. This should be explored as part of the reference-case analysis by the provision of estimates of clinical and cost-effectiveness separately for each relevant subgroup of patients.” Similarly, the technical guidance of the Canadian Agency for Drugs and Technologies in Health recommends “stratified analysis of smaller, more homogeneous subgroups, where appropriate, if there is variability (heterogeneity) in the target population.” 20 However, no specific guidance is offered for how to explore and reflect heterogeneity when conducting subgroup cost-effectiveness analyses to inform decisions.
Given the current policy debate and government agenda relating to the personalization of health and social care services in the United Kingdom 21 and elsewhere,22–24 a framework to support and guide decision making in different groups of patients is rapidly becoming a key policy need. This article presents a conceptual framework to explore heterogeneity between patients, consistent with the objective of maximizing population health subject to the resources available to a health care system. The framework builds on earlier work in this area,15,17 adding the following elements: First, it introduces the efficiency frontier for subgroup analysis, an analytical tool that can be used to guide the choice of the optimal subgroup definition. Second, it characterizes 2 dimensions of the value of understanding heterogeneity: 1) the expected health gained because of stratified decisions and 2) the additional value of further data collection aimed at resolving subgroup-related uncertainty. The proposed framework is tested using a policy-relevant analysis as a simple extension of current cost-effectiveness methods. The article ends with a final section discussing the strengths and weaknesses of this work.
Subgroup Cost-Effectiveness Analysis Under Current Information
Net Benefits for Subgroup Cost-Effectiveness Analysis
Classical decision rules in cost-effectiveness analysis (CEA) 25 state that, under current information, the optimal strategy among the j = 1, …, J alternatives is the one that solves

j* = arg max_j E_θ[NB(j, θ)],   (1)

where NB(j, θ) = λ·e_j(θ) − c_j(θ) is the net benefit of strategy j, e_j and c_j are its expected health effects and costs, λ is the cost-effectiveness threshold, and θ is the vector of model parameters. That is, the optimal strategy is the one with the greatest expected net benefit (NB). If the total (present and future) patient population expected to benefit from the intervention is defined as

P = Σ_{t=0}^{T} I_t (1 + d)^{−t},   (2)

where I_t is the incidence of eligible patients in period t, T is the decision time horizon, and d is the discount rate, then the population expected NB of the optimal strategy is P · max_j E_θ[NB(j, θ)].
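As an illustration, the decision rule and the discounted patient population described above can be sketched in a few lines of Python. The strategy names, QALYs, and costs below are hypothetical; only the incidence (59 756 per year), the 10-year horizon, and the 3.5% discount rate follow the case study reported later in the article.

```python
# Sketch of the classical decision rule and the discounted population.
# All strategy-level numbers are illustrative, not taken from the article.

LAMBDA = 20_000  # cost-effectiveness threshold (pounds per QALY)

# Expected QALYs and costs per strategy (hypothetical values)
strategies = {
    "conservative": {"qalys": 7.90, "cost": 9_500},
    "invasive":     {"qalys": 8.05, "cost": 13_200},
}

def net_benefit(qalys, cost, lam=LAMBDA):
    """Monetary net benefit: lambda * QALYs - cost."""
    return lam * qalys - cost

# Decision rule: pick the strategy with the greatest expected net benefit
best = max(strategies, key=lambda j: net_benefit(**strategies[j]))

def population(incidence, years, rate):
    """Discounted patient population over the decision horizon.

    Summing over t = 0..years (i.e., including the current cohort)
    reproduces the case study's figure of 556 723 patients.
    """
    return sum(incidence / (1 + rate) ** t for t in range(years + 1))

pop = population(incidence=59_756, years=10, rate=0.035)
```

With these invented inputs the conservative strategy has the higher expected net benefit (148 500 vs. 147 800), so it would be selected under current information.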
Coyle and others 15 showed that by considering heterogeneity in treatment effect between patient subgroups within this framework (i.e., due to the presence of observed treatment effect modifiers), different recommendations could be made for different subgroups. This results in a greater expected NB compared with decisions based on the average across the patient population as a whole. Later, Basu and Meltzer 17 proposed EVIC, a metric that represents the additional value, in terms of NB, of making decisions at the level of the individual (patient) compared with that of the average population. EVIC can also be estimated for individual parameter(s), indicating the value of categorizing the population based on a particular (set of) parameter(s) to make individualized decisions about health care interventions.
Using a slightly different notation from that used by Coyle and others, 15 the total incremental net benefit (TINB) (with strategy j = 2 compared against j = 1) can be written as

TINB = P · (E_θ[NB(2, θ)] − E_θ[NB(1, θ)]),

where P is the patient population defined in equation 2. Now let the population be partitioned into s = 1, …, S mutually exclusive subgroups, with w_s the proportion of patients in subgroup s and θ_s the parameters conditional on membership of that subgroup, and consider the INB gained from reflecting heterogeneity in decisions—what Coyle and others 15 termed stratified analysis. The estimation of the total net benefits under stratification is

Σ_{s=1}^{S} w_s · max_j E_θ[NB(j, θ_s)],   (7)

which is the weighted sum of the maximum NBs for each subgroup, and the gain from stratification is

Δ = Σ_{s=1}^{S} w_s · max_j E_θ[NB(j, θ_s)] − max_j E_θ[NB(j, θ)],   (8)

which corresponds to the difference between the weighted sum across subgroups (equation 7) and the maximum of the average NBs.
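The weighted-sum calculation and the gain from stratified decisions can be sketched as follows. The subgroup weights and net benefit values are invented for illustration; only the two-strategy structure mirrors the text.

```python
# Hypothetical sketch: expected net benefit (NB) of stratified decisions
# versus a single decision for the whole population.

# Expected NB per strategy within each subgroup (illustrative numbers),
# with w = share of the population in the subgroup.
subgroups = [
    {"w": 0.3, "nb": {"conservative": 148_000, "invasive": 151_000}},  # e.g. diabetics
    {"w": 0.7, "nb": {"conservative": 149_000, "invasive": 147_500}},  # e.g. non-diabetics
]

# Weighted sum of the per-subgroup maximum expected NB (stratified decisions)
nb_stratified = sum(s["w"] * max(s["nb"].values()) for s in subgroups)

# Population-average decision: the best single strategy for everyone
nb_average = max(
    sum(s["w"] * s["nb"][j] for s in subgroups)
    for j in ("conservative", "invasive")
)

# Gain from stratification: stratified minus population-average NB
delta = nb_stratified - nb_average
```

Here the invasive strategy is best only for the first subgroup, so stratifying gains 900 units of net benefit over the single population-wide decision.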
Definition of Subgroups and Sources of Heterogeneity
Cost-effectiveness analysis needs to assess heterogeneity in a wider set of parameters than those typically considered in clinical studies.2,26 The analysis of clinical trials generally focuses on inferences about treatment effects for the patient population defined by the study's inclusion criteria. Interest in heterogeneity is generally confined to treatment effect moderators. 10 In addition, there may be clinical interest in heterogeneity in the underlying (or baseline) risk of adverse clinical events associated with a disease. 27 This can lead to subgroup differences in the absolute benefit conferred by a treatment offering a common proportionate risk reduction across the entire patient population. There may also be situations in which baseline risk is correlated with the relative treatment effect. 28 These sources of heterogeneity relating to the intervention and the disease are also important in CEA. In addition, resource use (and hence costs) may systematically vary between individuals based on their characteristics and the geographical location of their treatment. 29 Finally, heterogeneity in individual preferences (and hence benefit) is increasingly recognized as another key source of heterogeneity in cost-effectiveness.3,26,30
There are, however, some potential constraints on subgroup analysis for CEA. One is the need to consider the costs of implementing subgroup-specific guidance in the health system. These could include the costs of acquiring relevant characteristics of individual patients and the costs of monitoring whether clinicians’ practice is consistent with the guidance for patients with those characteristics. There are also potential ethical and equity constraints when conducting subgroup CEA for decision making. For example, NICE considers unethical the use of age as a source of heterogeneity in its decisions unless it directly affects the efficacy of an intervention. 19 These constraints should be made explicit when subgroups are being defined.
Thus, the first issue decision makers must address in this context is the choice between subgroup specifications (that is, alternative ways of partitioning the patient population into mutually exclusive subgroups). Consider an evaluation of the cost-effectiveness of 2 alternative treatments for non-ST elevation acute coronary syndrome with several observed baseline characteristics (e.g., diabetes status, smoking) available to define alternative subgroup specifications.
The goal is to identify relevant subgroups and associated specifications that produce the highest expected NB, resolving the following maximization problem:

max_{g ∈ G} Σ_{s=1}^{S_g} w_s^g · max_j E_θ[NB(j, θ_s^g)],   (9)

where G is the set of candidate specifications, S_g is the number of subgroups in specification g, and w_s^g and θ_s^g are the corresponding subgroup weights and conditional parameters. Equation 9 suggests that the specification with the greatest total expected NBs should be preferred, given current information. Plotting the expected NB of each specification against its number of subgroups would lead to an efficiency frontier similar to that depicted in Figure 1, which represents the range of possible expected NBs achievable using alternative specifications for each given number of subgroups.

Efficiency frontier for subgroup analysis. Note: The dotted line joins the potential best specifications for each number of subgroups.
The dotted line represents the frontier showing the set of the most efficient specifications for each number of subgroups. Figure 1 introduces 2 further important elements: first, the notion that there may be instances in which consideration of subgroups does not add any further societal benefit (in terms of expected NB) compared with what is achieved by providing a given treatment to the whole patient population. This would happen if the same treatment decision is appropriate for all subgroups (A). Second, in other cases, further exploration of subgroups offers an additional societal benefit, even given current information. This could be exclusively explained by the effect of a different specification for the same number of subgroups (B) or because additional numbers of subgroups have been taken into account (C).
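A minimal sketch of how the frontier in Figure 1 could be traced: for each number of subgroups, keep the specification with the highest expected NB. The specification names and NB values below are hypothetical.

```python
# Illustrative construction of the efficiency frontier for subgroup
# analysis. All specifications and NB values are invented.

specs = [
    # (name, number of subgroups, total expected NB)
    ("no subgroups",       1, 148_700),
    ("diabetes",           2, 149_600),
    ("smoking",            2, 148_900),
    ("diabetes x smoking", 4, 149_750),
]

# For each level of disaggregation, retain the best specification
frontier = {}
for name, k, nb in specs:
    if k not in frontier or nb > frontier[k][1]:
        frontier[k] = (name, nb)

# The frontier joins the best specification at each number of subgroups
for k in sorted(frontier):
    name, nb = frontier[k]
    print(k, name, nb)
```

With these values the "smoking" specification is dominated at 2 subgroups, so it lies below the frontier, as in case B of Figure 1.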
Decision Uncertainty and the Value of Additional Research
The framework has so far not considered the role of uncertainty in CEA, which can be classified as structural and parameter uncertainty.18,32 Although parameter uncertainty is the primary focus here, the same method of analysis can be applied if structural uncertainties are quantified or parameterized. 33 Parameter uncertainty reflects imperfect (imprecise) knowledge about the true mean value of a (set of) parameter(s) in the model. The existence of parameter uncertainty implies the possibility of making a wrong decision about which intervention is expected to be cost-effective (on average) for a target population or subpopulation of patients. Therefore, additional evidence is valuable because it can inform future decisions that will benefit future patients. Value of information (VoI) methods can be used to quantify the expected health that might be gained if the uncertainty surrounding decisions about the coverage or reimbursement of new health technologies were resolved.14,34–36 This quantification requires an estimate of the value of making decisions once this uncertainty has been resolved (i.e., with perfect information or a sample size such that the probability of making the wrong decision is expected to be zero). In this circumstance, the decision maker would be able to select the intervention that maximizes NBs at the true value of the vector of parameters θ, that is, max_j NB(j, θ). Since the true value of θ is unknown, only the expected value of this quantity can be estimated by averaging the maximum NBs over the joint distribution of θ:

E_θ[max_j NB(j, θ)].   (11)

The expected value of perfect information (EVPI),

EVPI = E_θ[max_j NB(j, θ)] − max_j E_θ[NB(j, θ)],   (12)

is the difference between the NBs derived from a decision made with perfect information (equation 11) and the NBs derived from decisions under current information (equation 1). This represents the expected gain (in NBs), for a single patient, from collecting further information and resolving existing uncertainty. Notice that the EVPI for the patient population can be derived by multiplying equation 12 by the patient population defined in equation 2.
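The per-patient EVPI calculation can be sketched directly from probabilistic sensitivity analysis output. The four draws per strategy below are invented, not RITA-3 results.

```python
# Minimal EVPI sketch from probabilistic sensitivity analysis (PSA) draws.
# Net benefit per draw per strategy; all values are illustrative.

nb_draws = {
    "conservative": [100, 120,  90, 110],
    "invasive":     [ 95, 130,  85, 125],
}

n = len(nb_draws["conservative"])

# max_j E[NB_j]: value of deciding under current information
current = max(sum(d) / n for d in nb_draws.values())

# E[max_j NB_j]: value of deciding with perfect information, draw by draw
perfect = sum(
    max(nb_draws[j][i] for j in nb_draws) for i in range(n)
) / n

evpi = perfect - current  # per-patient EVPI
```

Note the order of operations: averaging before maximizing gives the current-information value, while maximizing within each draw before averaging gives the perfect-information value; the EVPI is never negative.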
Claxton 14 showed that the EVPI represents an upper bound on the value of any new research proposal aimed at resolving the current level of uncertainty. As long as obtaining new information costs less than the population EVPI, there may be a positive expected payoff from further research. 37 A positive expected payoff is thus a necessary (though not sufficient) condition for commissioning research; otherwise, investing in further research does not represent a good use of available resources.
If mutually exclusive subgroups are considered, different decisions can be made for different subgroups. Thus, under current information, the decision maker will need to choose for each subgroup s the strategy

j*_s = arg max_j E_θ[NB(j, θ_s)],

with the expected value of the decision for subgroup s given by max_j E_θ[NB(j, θ_s)] and the EVPI for the subgroup given by

EVPI_s = E_θ[max_j NB(j, θ_s)] − max_j E_θ[NB(j, θ_s)].

The overall per-patient EVPI for a given specification is the weighted sum across subgroups,

EVPI = Σ_{s=1}^{S} w_s · EVPI_s.   (16)

The population EVPI can be estimated by multiplying equation 16 by the future population of patients expected to benefit from the new information, which for subgroup s is P_s = Σ_{t=0}^{T} I_{s,t} (1 + d)^{−t}, where I_{s,t} is the incidence of patients belonging to subgroup s in period t. The result represents the maximum amount of resources that the health system should be willing to pay for further research given a particular cost-effectiveness threshold.
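A sketch of the subgroup-weighted EVPI and its population scaling follows. The subgroup weights and per-patient EVPI values are invented; the incidence, horizon, and discount rate follow the case-study assumptions.

```python
# Sketch of the subgroup-weighted EVPI and the population EVPI.
# Subgroup weights and per-patient EVPIs are illustrative.

def population(incidence, years, rate=0.035):
    """Discounted patient population, t = 0..years."""
    return sum(incidence / (1 + rate) ** t for t in range(years + 1))

subgroup_evpi = [  # (population weight, per-patient EVPI in net QALYs)
    (0.3, 0.025),  # e.g. diabetics
    (0.7, 0.015),  # e.g. non-diabetics
]

# Weighted per-patient EVPI for the chosen specification
evpi_per_patient = sum(w * e for w, e in subgroup_evpi)

# Population EVPI: the upper bound on what further research is worth
pop_evpi = evpi_per_patient * population(incidence=59_756, years=10)
```

The resulting population EVPI (roughly 10 000 net QALYs with these invented inputs) would be compared against the expected cost, in health terms, of the proposed research.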
This framework establishes a direct link between current decision uncertainty and the value of future research. Rather than considering uncertainty as a constraint for decision making, VoI highlights that its resolution is of value as a potential source of health gain.14,34,39 This article extends this concept to encompass the value of resolving those aspects of variability (as opposed to uncertainty) that can be understood as heterogeneity and coins the terms static value of heterogeneity and dynamic value of heterogeneity.
Both of these concepts can be illustrated graphically. Figure 2a illustrates the concept of EVPI. Here, for the whole population, the empty diamond marker represents the expected NB, expressed in health terms (net health benefit [NHB]) under current information, while the solid diamond marker indicates the maximum expected NHB achievable under perfect information, for one particular threshold value. The difference between these 2 quantities corresponds to the population-average EVPI.

Value of information, static and dynamic value of heterogeneity. (a) Value of information concept. The empty diamond represents the maximum expected net benefits (max E_θ NB), and the filled diamond shows the expected maximum net benefits (E_θ max NB). The difference between them is the expected value of perfect information (EVPI). (b) The distance A represents the EVPI for the average population. B is the EVPI for the population considering different decisions for 2 subgroups. C represents the static value of heterogeneity. D is the dynamic value of heterogeneity. In this case, where A = B, acknowledging heterogeneity does not change the expected value of further research. (c) The distances A and B represent the EVPI for the average and 2-subgroup case, respectively. In this case, B is greater than A. Distance C is equal to zero; hence, there is no static value of heterogeneity. D is greater than zero, representing a positive dynamic value of heterogeneity. (d) The distances A and B represent the EVPI for the average and 2-subgroup case, respectively. In this case, A is greater than B. The static value of heterogeneity (C) and the dynamic value of heterogeneity (D) are both positive; however, D is smaller than C.
Figure 2b shows the case with 2 subgroups when there is value in identifying and reflecting heterogeneity using existing evidence, over and above the value associated with undertaking further research. Here, the total expected NHB with current information (represented by the empty markers) is greater when subgroups are considered. The difference between the total NHB of a decision for the entire population and the total NHB when considering subgroups (represented by the vertical distance C) is the static value of heterogeneity. A formal expression for this concept is equation 8. Notice that the scenario depicted in Figure 2b indicates that, even if additional data could be collected through new research to resolve any decision uncertainty for the subgroups, the expected NHB to be gained would be similar to what could be derived from the population-average case. This is indicated by the fact that the vertical distance B is equal to the vertical distance A. In this scenario, a policy maker might be interested in making different decisions for different subgroups, according to the evidence available, and investing in further research would still be worthwhile if this were aimed at resolving uncertainties not associated with heterogeneity.
Figure 2c shows a further scenario. Here, the same decision would be made for 2 subgroups under current information, which would yield the same total expected NHB as for the whole population. However, the estimate under perfect information obtained when considering subgroups (represented by the solid square marker) is greater than the NHB under perfect information derived from considering the population as a whole (represented by the solid diamond marker). The difference between these 2 quantities (indicated by the vertical distance D) is the dynamic value of heterogeneity. This value corresponds to the additional (population) health that is expected to be achieved when a sufficiently large sample is collected to resolve current uncertainty for a given stratification. This captures 2 sources of value: 1) the value of resolving uncertainty in the estimates of conditional parameters (parameter estimates conditional on the subgroup category determined by a particular specification) and 2) the value of estimating conditional parameters if uncertainty in their estimation could be resolved. The first of these values refers solely to uncertainty, that is, the difference between the expected maximum NHB and the maximum expected NHB (i.e., EVPI). The second is the value of heterogeneity with perfect information about mean parameter values.
Finally, Figure 2d illustrates a situation in which there are both static and dynamic values of heterogeneity (C > 0 and D > 0). However, a particular feature of this example is that the EVPI when considering subgroups (distance B) is less than the average EVPI (distance A). This would occur when the specification used to define subgroups is informative about heterogeneity. In this situation, the effect is not only observed under perfect information but also under current information (positive static value). Hence, the difference between current and perfect information is lower than the average.
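The four quantities behind Figures 2a to 2d can all be computed from the same PSA output. The toy example below uses 2 equal-sized subgroups, 2 strategies, and 2 invented draws per combination.

```python
# Toy computation of the static and dynamic value of heterogeneity from
# PSA draws, following Figure 2. All numbers are invented.

draws = {  # draws[subgroup][strategy] -> NB per PSA draw
    "s1": {"cons": [100, 120], "inv": [118, 106]},
    "s2": {"cons": [104, 96],  "inv": [90, 102]},
}
w = {"s1": 0.5, "s2": 0.5}  # subgroup population weights
n = 2                       # number of PSA draws

def mean(xs):
    return sum(xs) / len(xs)

# Whole-population NB per strategy: average the subgroups draw by draw
pooled = {
    j: [sum(w[s] * draws[s][j][i] for s in draws) for i in range(n)]
    for j in ("cons", "inv")
}

current_avg = max(mean(pooled[j]) for j in pooled)
perfect_avg = mean([max(pooled[j][i] for j in pooled) for i in range(n)])

current_sub = sum(
    w[s] * max(mean(draws[s][j]) for j in draws[s]) for s in draws
)
perfect_sub = sum(
    w[s] * mean([max(draws[s][j][i] for j in draws[s]) for i in range(n)])
    for s in draws
)

static_voh = current_sub - current_avg   # value of stratifying now
dynamic_voh = perfect_sub - perfect_avg  # value of stratifying after research
```

With these invented draws both values are positive (1 and 5 units, respectively), corresponding to the scenario in Figure 2d rather than 2b or 2c.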
Presentation of Subgroup Analysis and Choice of the Optimal Number of Subgroups: The Role of the Cost Function
By considering the static and dynamic dimensions of heterogeneity simultaneously, the trend in expected NHBs, as a function of the number of subgroups, can be shown graphically for both current and perfect information. Several alternative specifications are available for each number of subgroups. These range from no subgroups (indicated by the vertical bar on the left in Figure 3) to decisions at the individual level, where the treatment is chosen according to the comparison between the observed outcome and its counterfactual (indicated by the vertical bar on the right in Figure 3). It is, therefore, possible to plot the most efficient specifications, that is, those with the highest NHB for each number of subgroups. In addition, those specifications that have lower NHB under current information but are expected to produce higher NHB with perfect information might also be included in the graph.

Representation of the gain in expected net health benefit with current and perfect information when heterogeneity is considered. Note: The continuous line shows the theoretical efficiency frontier for subgroup analysis. The dashed line illustrates the potential health gains if decisions with perfect information were made at each level of disaggregation. The two curves do not converge completely at the individual level because the counterfactual will never be measured without uncertainty. The transaction cost function shows 2 kink points delimiting a segment where the additional costs are higher than the additional gains; in this case, the optimal number of subgroups is found at the lower of those 2 points.
In principle, if there is no residual unexplained heterogeneity (i.e., there is complete knowledge of the individual characteristics that determine variability), the decision maker has the best possible information to allocate resources efficiently (the right end of Figure 3). At the other extreme, the maximum value of exploring heterogeneity is observed when no subgroups have been taken into account. In terms of uncertainty, it is not necessarily the case that the maximum decision uncertainty is at the average population level. Indeed, increasing the number of subgroups could increase or decrease uncertainty, depending on how informative the chosen specification is and on the reduction in sample size as the number of subgroups increases. It would generally be expected, however, that if informative specifications are used to explain heterogeneity, then decision uncertainty should decrease as more heterogeneity is revealed. This uncertainty will, however, never be completely resolved because the true value of the individual treatment effect can never be measured, as the counterfactual can never be observed.
The optimal number of subgroups would therefore tend toward the individual level if stratification were costless. In general, however, the implementation, monitoring, and effective enforcement of a guideline will tend to become more costly as finer stratification is made. Therefore, the optimal level of stratification depends on the marginal costs and benefits associated with finer stratification. This might be expressed in terms of the ratio between the incremental net benefits and the additional transaction costs of 2 adjacent levels of disaggregation (e.g., 1 and 2 subgroups). If the ratio is lower than 1, then the next relevant comparison is levels 1 and 3. If the ratio is greater than 1, then 2 subgroups are better than 1, and the next comparison should be 2 against 3.
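The stepwise comparison described above can be sketched as follows, with hypothetical NB and transaction-cost figures (both expressed in money terms so the ratio is well defined).

```python
# Hypothetical sketch of the stepwise rule: adopt a finer stratification
# only while its incremental NB exceeds its incremental transaction cost.

levels = [
    # (number of subgroups, total NB in pounds, transaction cost in pounds)
    (1, 148_700_000,         0),
    (2, 149_600_000,   400_000),
    (4, 149_750_000,   900_000),
    (8, 149_760_000, 2_500_000),
]

best = levels[0]
for cand in levels[1:]:
    inc_nb = cand[1] - best[1]
    inc_cost = cand[2] - best[2]
    if inc_cost == 0 or inc_nb / inc_cost > 1:
        best = cand  # the finer stratification pays for itself

optimal_subgroups = best[0]
```

With these invented figures, moving from 1 to 2 subgroups yields a ratio of 2.25 and is adopted, while 4 and 8 subgroups fail the test against the current best, so the optimal level is 2.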
The key qualitative conclusion, however, is that individualized care is not necessarily optimal. The most appropriate level of stratification will depend on context, such as the nature of the health care system (e.g., ease of monitoring and enforcement possibilities available to the decision maker), the nature of the characteristics that can be used to stratify (that must be easily observable and not easily manipulated), and the type of incentives faced by patients and their clinicians (e.g., third-party payment combined with fee for service tends to increase the incentives for moral hazard and the opportunity costs associated with it). 40
Case Study: Subgroup Analysis of the Cost-Effectiveness of an Invasive Treatment for Acute Coronary Syndrome
Background and Methods
The applicability of the framework and methods described so far is demonstrated using a case study. The efficiency frontier for subgroups, the EVIC, and the static and dynamic value of heterogeneity were estimated for a set of relevant specifications. The example is a CEA that used data from the multicenter trial RITA-3, which compared an intensive versus a conservative strategy for the management of patients with non–ST-elevation acute coronary syndrome.
41
Briefly, the study used estimates derived from the individual patient data (n = 1810) to populate a decision model of lifetime costs and quality-adjusted life-years (QALYs) for each strategy. The model was used to estimate the individual NHB of the invasive and conservative interventions using individual participant data from the randomized controlled trial. Between-individual variation could be characterized because each parameter of the model was estimated conditional on the profile of each individual. The mean costs and QALYs were averaged across individual estimates to calculate the ICERs. For subgroup analysis, the mean values were obtained as the average across those patients belonging to a particular category (e.g., diabetics). This provided the information needed to estimate the EVIC, which can be calculated as the difference between the average of the maximum individual NHB and the maximum of the average NHB.17,43

Parameter uncertainty was propagated through the model using probabilistic sensitivity analysis, which entailed running 1000 random draws from the (set of) parameter(s) characterizing each patient in the data set. This corresponds to the uncertainty surrounding the effect of the covariates on the estimation of the parameters of interest (e.g., transition probabilities, costs, or quality-of-life weights). This generated 4 matrices of 1000 by 1810 (1 810 000 realizations each): 2 for the expected costs and 2 for the expected QALYs of the invasive and conservative strategies, respectively. The model was implemented in Microsoft Excel 2007, and macros were written in Visual Basic (Microsoft Corporation, Redmond, WA). These matrices, each in a different Excel sheet, provided the data needed to implement the analytical framework. The uncertainty relating to the overall mean results was estimated by averaging each of the 1000 iterations across the 1810 individuals, producing a single vector of 1000 iterations.
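On a toy scale (3 draws by 4 patients instead of 1000 by 1810), the bookkeeping described above, including the EVIC calculation, looks like this. All NB values are invented.

```python
# Toy version of the case-study bookkeeping: draw-by-patient NB matrices,
# averaged across patients for population results and across draws for
# individual-level results. Dimensions and values are illustrative.

nb = {  # nb[strategy][draw][patient]
    "cons": [[10, 12, 8, 11], [9, 13, 7, 12], [11, 11, 9, 10]],
    "inv":  [[12, 9, 10, 9],  [13, 8, 11, 8], [10, 10, 12, 9]],
}
n_draws, n_pat = 3, 4

def mean(xs):
    return sum(xs) / len(xs)

# Population result: average each draw across patients -> vector of draws,
# then take the best strategy on average (max of the average NHB)
pop_nb = {j: [mean(nb[j][d]) for d in range(n_draws)] for j in nb}
max_avg = max(mean(pop_nb[j]) for j in nb)

# Individual result: expected NB per patient, best strategy per patient,
# then average (average of the maximum individual NHB)
avg_max = mean([
    max(mean([nb[j][d][p] for d in range(n_draws)]) for j in nb)
    for p in range(n_pat)
])

evic = avg_max - max_avg  # expected value of individualized care
```

The EVIC is the gap between tailoring the decision to each patient and applying the single population-best strategy to everyone; with these invented numbers it equals 7/6 NB units.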
Subgroup CEA considered all covariates used in the regression equations of the original analysis. 44 The potential for each covariate to inform different decisions on cost-effectiveness grounds was also assessed. A logit model was developed to examine the effect of each covariate on the probability that the new strategy is cost-effective, for a given individual, at a cost-effectiveness threshold of £20 000 per QALY. All covariates were significantly associated with a greater probability of the invasive strategy being cost-effective (Table 1).
Average Marginal Effects of 9 Covariates on the Probability That the Invasive Strategy Is the Most Cost-Effective at a Threshold of £20 000 per QALY
Note: Results are based on a multivariable logit model.
Six covariates were selected based on clinical plausibility, feasibility of implementation, ethical constraints, and the probability of being informative about cost-effectiveness according to the analysis presented in Table 1. Sex and age were excluded because decisions that differentiate reimbursement based on those characteristics are likely to be subject to ethical criticism. The numerical variable pulse was excluded because there is no consensus about how to categorize it, and it would be very difficult to implement alternative decisions based on an arbitrary cutoff. A further subgroup specification was defined based on a baseline risk score, as used in the original analysis. 44 The score was estimated from the trial data to predict the primary outcome and was also used by the original cost-effectiveness study to explore heterogeneity in subgroups.
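The article screens covariates with a multivariable logit model. As a rough illustration only, the same idea of checking whether a covariate is informative about cost-effectiveness can be conveyed by a one-covariate difference in proportions on made-up data; this is not the article's method, just a simplified proxy.

```python
# Crude, hypothetical proxy for screening covariates: compare the share of
# patients for whom the invasive strategy is expected to be cost-effective
# with and without each binary trait. Data are invented.

patients = [
    # (diabetes, smoker, invasive cost-effective?)
    (1, 0, True), (1, 1, True), (0, 1, True), (0, 0, False),
    (0, 0, False), (1, 0, True), (0, 1, False), (0, 0, False),
]

def share(rows):
    """Share of rows where the invasive strategy is cost-effective."""
    return sum(r[2] for r in rows) / len(rows) if rows else 0.0

effects = {}
for idx, name in ((0, "diabetes"), (1, "smoking")):
    with_trait = [r for r in patients if r[idx] == 1]
    without = [r for r in patients if r[idx] == 0]
    effects[name] = share(with_trait) - share(without)
```

A large difference flags the covariate as a candidate for defining subgroups; a proper analysis would, as in the article, adjust for the other covariates simultaneously.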
Parameter uncertainty for subgroups was analyzed using the same approach described for individuals, but separately for each specific subgroup. These estimates provided the basis to calculate the static and dynamic values of heterogeneity. All of the results shown in this case study were calculated for a threshold (λ) value of £20 000/QALY gained and expressed for an estimated population of 556 723 patients. This is based on an annual incidence of 59 756 patients, a time horizon of 10 y, and a discount rate of 3.5% per year. 45
Results
Table 2 reports the results of this analysis. As in the original study, the invasive strategy was found not to be cost-effective, on average, at λ = £20 000/QALY (ICER = £21 960/QALY). The expected NHBs yielded by the most cost-effective strategy (the conservative strategy) are 4 397 388 net-QALYs. If further research were undertaken to resolve the current uncertainty, the expected NHBs would be 4 408 143 net-QALYs. The EVPI is, therefore, 10 755 net-QALYs (4 408 143 minus 4 397 388). The total population EVIC was estimated at 14 349 net-QALYs. This corresponds to the difference between the expected maximum individual NHBs (4 411 737 net-QALYs) and the maximum expected NHBs (4 397 388 net-QALYs). The individualized analysis also indicated that the new strategy should be implemented in 591 patients (32.65%) in the sample if the cost-effectiveness decision rule were applied to each patient.
NHBs under Current and Perfect Information, and EVPI for the Specifications on the Efficiency Frontier for Subgroup Analysis
Note: The estimates assume a cost-effectiveness threshold of £20 000/QALY gained. NHB is expressed as QALYs net of costs. DM = diabetes mellitus; EVPI = expected value of perfect information; LBBB = left bundle branch block; NHB = net health benefit; PMI = previous myocardial infarction; QALY = quality-adjusted life-year. The all-covariates specification combines DM, LBBB, PMI, smoking, ST-segment depression, and severe angina.
To identify those patients, subgroup analysis was conducted for 6 binary specifications: diabetes mellitus (DM), previous myocardial infarction, left bundle branch block, smoking, ST-segment depression in the electrocardiogram, and severe angina. These were combined to explore specifications with 2, 4, 8, 16, and 64 subgroups. The expected NHB under current and perfect information and the EVPI of those specifications with the highest NHB at each level of disaggregation (the specifications on the efficiency frontier) are reported in Table 2. Using all 6 covariates to characterize the subgroup specification yielded 49 observed subgroups, since the remaining 15 of the 64 possible combinations were not represented in the sample. This analysis produced the highest expected NHBs (4 408 359 net-QALYs), which corresponds to a static value of heterogeneity of 10 971 net-QALYs, accounting for 76.5% of the total EVIC (10 971/14 349). By way of comparison, subgroup analysis based on patients' baseline risk score was conducted in a similar way to the original cost-effectiveness study. 44 Five subgroups were defined (4 quartiles, with the upper quartile divided into 2 octiles). The expected NHBs were 4 407 074 net-QALYs, less than what is obtained by using a guideline that combines all covariates.
Figure 4 shows the efficiency frontier (the maximum expected NHB among the specifications at each level of disaggregation) and the expected maximum NHB (expected net health with perfect information) for the same specifications. These results are consistent with the hypothesis that the efficiency frontier for subgroup analysis shows diminishing marginal returns in terms of NHB; that is, as additional levels of stratification are assessed, the marginal gains between adjacent levels (e.g., 3 versus 2 or 5 versus 4 subgroups) become smaller. The figure also shows that the EVPI tends to decrease with higher levels of disaggregation. An additional element presented in Figure 4 is the only subgroup specification (DM and smoking) in which less NHB is obtained under current information but more health can be expected with perfect information. In this case, the decision about adoption or rejection is not affected (it is still guided by the specification on the efficiency frontier); however, if further research is planned, this additional specification might be taken into account, since greater health is expected once the uncertainty conditional on that specification is resolved.

Expected net health benefits under current and perfect information for different levels of disaggregation. Note: Expected net health benefits (NHBs) are expressed in quality-adjusted life-years (net of costs). Empty diamonds represent the expected NHB achieved with current information, only for the specification that presented the maximum net health for each level of disaggregation. The dotted line across those points illustrates the efficiency frontier for subgroup analysis estimated from the data for 2, 4, 8, 16, and 49 subgroups. Filled diamonds represent the expected NHB that might be achieved for its corresponding specification on the efficiency frontier if the parameter uncertainty was resolved (perfect information). The empty and filled squares correspond to the expected NHB under current and perfect information for the specification defined by diabetes and smoking (dm&smoking), which was the only case in which, despite not being on the efficiency frontier, greater net health can be expected with perfect information. The gray circles (projected on the secondary axis) are the expected value of perfect information, which is the difference between current and perfect information.
The transaction costs of implementing guidelines for the different levels of disaggregation reported here were not estimated directly. However, because all of these covariates are part of routine clinical assessment, their implementation as part of a guideline is not expected to be associated with high transaction costs. This is not the case for the baseline risk score examined here, which has not been clinically validated and, therefore, would be difficult to implement in practice. A guideline based on the analysis that produced the highest NHB is presented in Figure 5. This combines all 6 covariates considered in this analysis. It is based on the 49 subgroups provided by the study, which were grouped when they led to the same decision. The diagrams illustrate that a potentially complex scenario can be simplified into a manageable clinical guideline.

Guideline for the specification that combines 6 covariates based on the results of the case study.
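The grouping step that turns many subgroups into a compact guideline is mechanical: subgroups that share an optimal decision collapse into one rule. An illustrative sketch (the covariates, levels, and decision rule here are hypothetical, not the study's actual results):

```python
from collections import defaultdict
from itertools import product

# Hypothetical covariates and an invented decision rule for illustration:
# treat when the patient has diabetes or smokes.
covariates = {"dm": ["yes", "no"], "smoking": ["yes", "no"]}
subgroups = list(product(*covariates.values()))
optimal = {s: ("treat" if "yes" in s else "no treat") for s in subgroups}

# Collapse subgroups that lead to the same decision into one guideline rule.
guideline = defaultdict(list)
for subgroup, decision in optimal.items():
    guideline[decision].append(subgroup)
```

With more covariates the same collapse applies; the number of rules is bounded by the number of distinct decisions, not by the number of subgroups.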
Discussion
The article contributes to frameworks and analyses to inform decisions regarding where health care resources should be invested: providing early access to new technologies, ensuring the findings of existing (or commissioned) research are (or will be) implemented, conducting research to provide additional evidence about particular sources of uncertainty in some (or all) subgroups, or conducting research that can lead to a better understanding of variability in effects. This type of research may be very different from the type of evaluative or comparative effectiveness research that commonly reduces uncertainty only about estimates of treatment effect. For example, it might include diagnostic procedures and technologies, pharmacogenetics, analysis of observational data, and treatment selection as well as novel trial designs that can reveal something of the joint distribution of effects. 40
Importantly, the framework also informs policy makers about the assessments that need to be made when considering finer stratification of access to treatment or promoting individualized care and patient choice. The key implication is that individualized care is not necessarily optimal, and the most appropriate level of stratification will depend on context.
Ethical and equity constraints have also been mentioned in this article, in the context of their relevance when defining subgroups. Although recommendations based on cost-effectiveness subgroup analysis are supported by strong ethical principles (e.g., the fair use of limited resources across different beneficiaries of the same health care system), it should be acknowledged that ethical and equity criteria reflected in social values other than efficiency should also be considered. It seems reasonable that these criteria should be defined in advance, in methods guidelines or in the scoping phase of technology assessment.
The article's more specific contributions are, first, to identify the best potential specifications for resource allocation decisions by choosing those that maximize NHBs at each level of disaggregation. The set of these specifications across levels of disaggregation represents the efficiency frontier for subgroup analysis, which provides a tool for decisions about the adoption of technologies in different subgroups. Second, the value of heterogeneity has been conceptualized as a bidimensional concept: the value of making different recommendations across subgroups (static value) and the value of potential future research conducted to resolve parameter uncertainty at different levels of heterogeneity (dynamic value). Third, these concepts are applied to a case study as an extension of the classical methods used for cost-effectiveness analysis, which makes the framework feasible for wide implementation.
Static value has been previously presented in the literature15,17 and represents the health gained due to understanding heterogeneity (i.e., observable characteristics that explain differences between subgroups). 18 Basu and Meltzer 17 have previously presented this static value in a framework in which decisions can be made either with or without cost internalization. Although our work has been focused on the estimation of static value with cost internalization (i.e., decisions take into account not only benefits but also opportunity costs), it is expected that the static value will be lower without cost internalization, as presented by Basu and Meltzer. 17
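Under cost internalization, the static value can be computed from PSA output as the difference between the expected NHB when the adoption decision is allowed to differ by subgroup and the expected NHB when one common decision is applied to the whole population. A hedged sketch with hypothetical numbers:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical NHB draws (QALYs, net of costs) for 2 treatments in
# 2 subgroups under current information.
s1 = rng.normal([0.10, 0.15], 0.05, size=(5_000, 2))
s2 = rng.normal([0.12, 0.08], 0.05, size=(5_000, 2))
w1, w2 = 0.6, 0.4  # subgroup prevalence

# One common decision: maximize the population-average expected NHB.
nhb_common = (w1 * s1.mean(axis=0) + w2 * s2.mean(axis=0)).max()

# Stratified decisions: maximize expected NHB within each subgroup.
nhb_stratified = w1 * s1.mean(axis=0).max() + w2 * s2.mean(axis=0).max()

static_value = nhb_stratified - nhb_common  # non-negative by construction
```

Here the static value is strictly positive because the optimal treatment differs between the two hypothetical subgroups; when every subgroup favors the same treatment, the two quantities coincide and the static value is zero.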
In addition, EVIC for specific parameter(s) (EVICθi) has been proposed as an informative metric to implement a subgroup-based policy. 43 The advantage of the EVICθi approach is that it can provide an estimate of the static value for a set of several parameters. The methods presented in this article add information to EVICθi, because they provide detailed information about the decisions that should be made in specific subgroups. Furthermore, this method provides a feasible approach to estimate the static value and the parameter uncertainty simultaneously, which is another important complement to the EVIC framework.
The dynamic value, on the other hand, is the additional value of resolving second-order uncertainty in the future when 2 adjacent levels of disaggregation are compared. This value might be associated with an increase or a decrease in EVPI. Because the value of estimating conditional parameters is also realized under current information, the difference between the expected maximum NHB and the maximum expected NHB (i.e., EVPI) is anticipated to be smaller than at the population-average level (or at the previous level of disaggregation). This was the case for the results shown in the case study. In contrast, if a particular specification contains limited information or the amount of data (sample size) available to examine its effect is too small, EVPI can increase. Of course, in the limit, as more sources of variability are observed, the value of additional evidence will fall. Indeed, if all sources of variability could be observed, there would be no uncertainty and no value of information.
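The change in EVPI between adjacent levels of disaggregation can be read off the same simulations. A sketch with hypothetical numbers (note that, as discussed above, the sign of the change is not fixed a priori):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical PSA draws of NHB (QALYs) for 2 treatments in 2 subgroups;
# row i of each array corresponds to the same parameter draw.
s1 = rng.normal([0.10, 0.15], 0.05, size=(10_000, 2))
s2 = rng.normal([0.12, 0.08], 0.05, size=(10_000, 2))
w1, w2 = 0.6, 0.4  # subgroup prevalence

def evpi(nhb):
    # expected maximum NHB minus maximum expected NHB
    return nhb.max(axis=1).mean() - nhb.mean(axis=0).max()

# Level 1 (no stratification): one decision for the weighted population.
evpi_pooled = evpi(w1 * s1 + w2 * s2)

# Level 2 (stratified): decisions, and perfect information, per subgroup.
evpi_stratified = w1 * evpi(s1) + w2 * evpi(s2)

# Dynamic comparison between adjacent levels (may be positive or negative).
delta_evpi = evpi_stratified - evpi_pooled
```

Both EVPI estimates are non-negative individually, but their difference can go either way, which mirrors the point that finer disaggregation may raise or lower the value of further research.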
Two issues that have not been addressed here are the value of the relevant metrics for specific parameters and the value of sampling information. Both correspond to different dimensions of the research space, and future extensions of this work might focus on these elements. The expected value of perfect information for parameters (EVPPI) provides information about the value of resolving the uncertainty in the effect of a given parameter θi on the net health outcome, shedding light on which parameters should be the focus of future research. A related issue to clarify is that there is also uncertainty surrounding the categorization of patients into alternative subgroups (e.g., moderators of treatment effects). For example, we may be certain that a given patient has a genetic polymorphism (complete information), but there is uncertainty about the effect of this characteristic on the patient's expected (net) health outcome. More realistically, there may be uncertainty about both the effect of the polymorphism on the outcome and whether the patient has the polymorphism (since genetic tests are not 100% sensitive and specific). It is, therefore, important to emphasize that EVICθi assumes that the information at the individual level is accurate (i.e., sensitivity and specificity of 1 at the individual level). 43 In principle, the uncertainty around the value that a covariate takes can be represented as another EVPPI for parameters relating to the diagnostic test. The EVPPI calculation is computationally demanding for the population analysis and may be even more difficult in the case of subgroups.
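The computational demand mentioned above comes from the nested structure of the EVPPI calculation: an outer loop over the parameter of interest and an inner loop over the remaining uncertainty. A minimal two-level Monte Carlo sketch for a hypothetical 2-treatment model (the model, parameter distributions, and threshold are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n_outer, n_inner = 500, 2_000

# Hypothetical model: NHB of the new treatment depends on an effect
# parameter theta (the parameter of interest) and a cost parameter phi,
# with costs valued at a 20,000 threshold; standard care has NHB = 0.
def nhb_new(theta, phi):
    return theta - phi / 20_000

# Outer loop: fix theta; inner loop: average over remaining uncertainty
# (phi), then choose the best treatment conditional on theta.
inner_max = np.empty(n_outer)
for i in range(n_outer):
    theta = rng.normal(0.10, 0.05)
    phi = rng.normal(1_500.0, 400.0, n_inner)
    inner_max[i] = max(0.0, nhb_new(theta, phi).mean())

# Baseline: best treatment under current (unconditional) information.
theta_all = rng.normal(0.10, 0.05, 100_000)
phi_all = rng.normal(1_500.0, 400.0, 100_000)
max_expected = max(0.0, nhb_new(theta_all, phi_all).mean())

evppi_theta = inner_max.mean() - max_expected
```

Repeating this calculation within every subgroup multiplies the nested loops by the number of subgroups, which is why subgroup-level EVPPI quickly becomes expensive.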
An important concept mentioned in this article is that subgroup-specific parameters can be exchangeable or nonexchangeable. A parameter (θi) is exchangeable if the information used to estimate it conditional on one subgroup (θi|s) can also be used to estimate it for another subgroup (θi|(1−s)). The
