Abstract
Markov decision-analytic models 1–3 are a widely used modeling approach in cost-effectiveness analysis 4 and are typically built in spreadsheet-based packages or commercial packages such as TreeAge. 5 Spreadsheets, especially Microsoft Excel, have the advantage of being familiar, widely available, simple to use, and easy to share with others. However, the calculations are often spread over several (linked) sheets rather than being contained together on one page. It can be hard to keep track of modifications to the model and the resulting output when cells are changed, especially if they are altered accidentally. In addition, each time a parameter is changed, the previous analysis is lost, so retaining a full record of the analysis requires multiple workbooks to be kept.
This article demonstrates how a cost-effectiveness analysis can be carried out within a multi-state modeling survival analysis framework using the statistical software
The aims of this tutorial article are 1) to introduce the “state-arrival extended” multi-state model as a tool to test the Markov property and 2) to provide a step-by-step guide to how multi-state modeling can be used for carrying out a cost-effectiveness analysis, including discounting of costs/benefits and deterministic and probabilistic sensitivity analyses. The
Decision-Analytic Modeling Expressed in the Multi-state Model Survival Analysis Framework
Multi-state modeling builds survival regression models for each of the transitions. Survival times are treated as continuous variables, rather than being measured in discrete cycles as is usually the case in decision-analytic modeling. Therefore, the arrows normally seen in a model diagram that leave and reenter a state—to indicate patients who remain in that state for the length of a cycle—are not applicable.
Two ways to treat time in multi-state modeling are the clock-forward and clock-reset approach. 10 With the clock-forward approach, time is measured from the initial state, whereas with the clock-reset approach, every time a patient reaches a new state, the clock is set back to zero, thereby only measuring time in the current (new) state. The clock-forward approach is a Markov model because the property that movement from the present state does not depend on history is inherent. The timescale of the clock-reset approach does depend on history, and models fitted using this approach are referred to as semi-Markov rather than Markov. 10
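The difference between the two timescales can be illustrated with a minimal Python sketch. It is not the authors' code: the `Transition` record, the helper names, and the patient history are hypothetical, chosen only to show how the same observed history yields different (start, stop) intervals under each approach.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    """One observed stay in a state (hypothetical record layout)."""
    from_state: str
    to_state: str
    entry_time: float  # years since entering the initial state
    exit_time: float   # years since entering the initial state


def clock_forward(tr):
    """(start, stop) measured from entry into the initial state."""
    return (tr.entry_time, tr.exit_time)


def clock_reset(tr):
    """(start, stop) with the clock set back to zero on entering from_state."""
    return (0.0, tr.exit_time - tr.entry_time)


# Hypothetical patient: progresses at 1.5 years and dies at 4.0 years.
history = [
    Transition("progression-free", "progression", 0.0, 1.5),
    Transition("progression", "death", 1.5, 4.0),
]

forward = [clock_forward(tr) for tr in history]
reset = [clock_reset(tr) for tr in history]

for tr, f, r in zip(history, forward, reset):
    print(f"{tr.from_state} -> {tr.to_state}: clock-forward {f}, clock-reset {r}")
```

Under clock-forward the second interval starts at 1.5 years (delayed entry), whereas under clock-reset it starts at zero and only the 2.5 years spent in progression are measured.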
One way of testing whether the Markov property is violated is to include in the model a covariate representing history. Such models have been termed
Figure 1 shows an algorithm that can be used to perform health economic modeling in a multi-state modeling survival analysis framework. All of the functions included in Figure 1 are adaptations written by the authors of the functions already available in the

Figure 1. Algorithm for health economic modeling using a multi-state modeling framework.
An explanation of each of the steps is given as follows.
Step 1: Deciding Whether to Accept the Markov Property
Building a Cox Markov state-arrival extended multi-state model can aid the decision of whether the Markov assumption is reasonable or not. Such models are Cox in the usual semi-parametric sense that the baseline hazard does not follow a specified distribution, and Markov in the sense that time is measured from the initial state for every transition. The state-arrival extended aspect of the modeling of a particular transition could, for example, include a covariate for time spent in the immediate previous state. A statistically significant covariate effect would then provide evidence that the Markov property does not hold. However, if the effect is small such that it is not of practical importance, then the analysis could safely proceed as if the Markov property did hold.
Step 2: Building Parametric Multi-state Models
Many models contemplate death as an absorbing state (i.e., a state from which a transition to another state is not possible). If the period of follow-up of the study is such that for every patient, his or her whole lifetime since entry into the study is represented, then the functions in
Often death is not observed for every patient due to limitations of follow-up. When this is the case and the analysis has a lifetime horizon, extrapolation of survival is required, as recommended by the UK National Institute for Health and Care Excellence (NICE) Decision Support Unit. 11 This is necessary because the cost-effectiveness calculations need an estimate of mean survival. A popular choice to achieve the necessary extrapolation is to assume a parametric distribution when modeling the hazards. The
Step 3: Calculating State Occupancy Probabilities
The
Step 4: Visual Assessment of Fits
Visual assessments of fits can help in choosing the distribution to use for each transition. A balance between a good fit to the observed data and the necessary extrapolation to the time horizon is desirable. This can be assessed by plotting the observed proportion in a given state alongside the predicted probability of being in that state from the model(s) over the observed period of the trial and then again extended to the target time horizon.
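As a rough illustration of the comparison described above, the following Python sketch computes a Kaplan-Meier estimate of the observed proportion with the event and sets it beside the prediction from a fitted exponential model. It is a generic sketch, not the authors' functions: the data are hypothetical, and the exponential distribution is used only because its maximum likelihood estimate (events divided by total follow-up time) is available in closed form.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up times; events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def exponential_survival(t, rate):
    """Survival function of an exponential distribution."""
    return math.exp(-rate * t)


# Hypothetical follow-up data (years) for one transition and one arm.
times = [0.5, 1.0, 1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1, 0, 1]

observed = kaplan_meier(times, events)
rate = sum(events) / sum(times)  # exponential maximum likelihood estimate

# (time, observed proportion with event, predicted proportion with event)
comparison = [(t, 1.0 - s, 1.0 - exponential_survival(t, rate)) for t, s in observed]
for t, obs, pred in comparison:
    print(f"t={t:.1f}  observed={obs:.3f}  predicted={pred:.3f}")
```

Evaluating `exponential_survival` at times beyond the last observed event gives the extrapolated part of the curve, which is where candidate distributions tend to diverge.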
The
Step 5: Estimating Life Years and Quality-Adjusted Life Years
The mean life years in a particular state for a particular treatment can be calculated using the
Step 6: Estimating Incremental Costs and Cost-Effectiveness
An
Step 7: Structural/One-Way Sensitivity Analysis
One-way sensitivity analysis can be achieved by repeated analyses, modifying arguments in the
Step 8: Uncertainty in the Survival Regression Model Parameters
The function
Step 9: Estimating Life Years and QALYs
The
Step 10: Estimating Incremental Costs and Cost-Effectiveness
Cost parameters are usually also considered in probabilistic sensitivity analyses. 4 While no
Step 11: Visualizing Results I: Cost-Effectiveness Plane
The incremental (discounted) QALY needed for a plot of the cost-effectiveness plane 4 can be obtained using repeated evaluations of the
Step 12: Visualizing Results II: Cost-Effectiveness Acceptability Curve
The total (discounted) QALY for each treatment arm needed for a plot of the cost-effectiveness acceptability curve 16 can be obtained using repeated evaluations of the
Illustrative Example
Data Set Used for Illustration
The data used in this article are based on a trial comparing rituximab in combination with fludarabine and cyclophosphamide (RFC) v. fludarabine and cyclophosphamide alone (FC) for the first-line treatment of chronic lymphocytic leukemia (CLL-8). 17 It was the main source of data used by the manufacturer in their submission to NICE in the United Kingdom for the specific technology appraisal TA174. 18,19 The trial reported the outcomes of progression-free survival and overall survival for each patient, allowing the focus to be on 3 states (progression-free, progression, and death) and the transitions between them. There were 403 patients in the RFC arm and 407 patients in the FC arm. There were 106 progressions, 23 deaths after progression, and 21 deaths without progression among those in the RFC arm. In the FC arm, there were 148 progressions, 27 deaths after progression, and 26 deaths without progression. Patients were in the trial for up to 4 years, and not all were observed to the end of their lives. This meant extrapolation of survival was necessary to obtain a representation of the whole duration of life since entry into the trial. It was estimated that only 1.3% of the cohort would survive beyond 15 years, 19(p109) and therefore a time horizon of 15 years was used. The published Kaplan-Meier curves in the manufacturer’s report 19 were digitized using Engauge 20 to generate the data for the analysis.
Transitions Modeled and Initial Modeling
The state transition model is illustrated in Figure 2.

Figure 2. Transition diagram for multi-state model showing the 3 transitions: 1) progression-free to progression, 2) progression-free to death, and 3) progression to death.
The Web page http://www.gla.ac.uk/hehta/reports/cwilliams includes the syntax with the
The analysis began by considering whether the proportional hazards (PH) assumption was reasonable for each transition, to determine whether it was appropriate to consider PH models. Figure 3 shows a log-log plot for treatment, the only covariate under consideration, for each of the respective transitions.

Figure 3. Log-log plots for each transition. FC, fludarabine and cyclophosphamide; RFC, rituximab, fludarabine, and cyclophosphamide.
It can be seen in each of the plots (Figure 3a–c) that the lines were reasonably parallel, with any crossing of the lines due to the lack of a treatment effect rather than any major violation of the PH assumption.
Figure 4 shows a cumulative hazard v. time plot for treatment in each of the respective transitions.

Figure 4. Cumulative hazard plots for each transition. FC, fludarabine and cyclophosphamide; RFC, rituximab, fludarabine, and cyclophosphamide.
The progression → death plot (Figure 4c) reflected the lack of a treatment effect. The other plots (Figure 4a,b) showed lines that diverged, indicating that a distribution that accommodates increasing hazards would be appropriate to consider.
Since there was no suggestion of a severe violation of the PH assumption from Figure 3 or 4, proportional hazards models were considered for the analysis.
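The quantities behind such plots can be sketched in a few lines of Python. The example below is illustrative only: it uses the Nelson-Aalen estimator of the cumulative hazard H(t) (an assumption, since the text does not say which estimator was plotted) and hypothetical data for two arms. Because log(−log S(t)) equals log H(t), roughly constant vertical gaps between the arms on the log scale are what parallel lines in a log-log plot correspond to.

```python
import math


def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time (1 = event, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    cumulative = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            cumulative += n_events / at_risk
            curve.append((t, cumulative))
        at_risk -= n_leaving
    return curve


def step_value(curve, t):
    """Value at time t of the right-continuous step function (0 before the first jump)."""
    value = 0.0
    for time, h in curve:
        if time > t:
            break
        value = h
    return value


def log_hazard_gaps(curve_a, curve_b, check_times):
    """log H_a(t) - log H_b(t) at each check time where both are positive."""
    gaps = []
    for t in check_times:
        ha, hb = step_value(curve_a, t), step_value(curve_b, t)
        if ha > 0 and hb > 0:
            gaps.append((t, math.log(ha) - math.log(hb)))
    return gaps


# Hypothetical follow-up data (years) for two treatment arms.
arm_a = nelson_aalen([1, 2, 3, 4, 5, 6], [1, 1, 0, 1, 1, 0])
arm_b = nelson_aalen([1, 2, 3, 4, 5, 6], [0, 1, 0, 0, 1, 0])

gaps = log_hazard_gaps(arm_a, arm_b, [2, 4, 6])
for t, gap in gaps:
    print(f"t={t}: log cumulative hazard gap = {gap:.3f}")
```

With real trial data the gaps would be judged by eye across the whole follow-up period, as in Figure 3, rather than at three time points.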
To help decide whether the Markov property held, a Cox Markov state-arrival extended model for progression → death was initially fitted. Table 1 shows the results of fitting this model.
Table 1. Results of a Cox Markov State-Arrival Extended Model
The time spent progression-free was found to have a statistically significant association with death after progression (
Base-Case Analysis
Parametric semi-Markov models were then fitted to allow both a relaxation of the Markov assumption and extrapolation of survival. The choice of distribution for each transition was considered. The progression → death transition was considered first because it did not involve a competing risk scenario.
Figure 5 shows, over the trial observation period, the observed and predicted proportion of deaths after progression using 6 different candidate distributions. For brevity, only the RFC treatment arm is shown. It can be seen that the predictions were reasonably similar. The lowest AIC value was seen for the log-logistic distribution, suggesting that it provided the best fit of the candidate distributions, although there was little to choose between the distributions (Table 2).

Figure 5. Observed and predicted proportions for progression → death: trial observation period (RFC treatment only). KM, Kaplan-Meier; RFC, rituximab, fludarabine, and cyclophosphamide.
Table 2. Akaike Information Criterion (AIC) Statistics from Modeling Progression → Death
The time horizon of the model was 15 years. Therefore, in addition to comparing the fit over the period of the observed data, interest was also in assessing how the survival estimates extend out to 15 years. Figure 6 shows observed v. predicted probabilities over 15 years. The Weibull, Gompertz, and exponential distributions all appeared to represent a time horizon of 15 years in the sense that the probability of death was close to 1 by 15 years.
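To make the role of the time horizon concrete, the Python sketch below evaluates a Gompertz survival curve at 15 years and integrates it to obtain (discounted) mean life years up to the horizon. It is a sketch under stated assumptions rather than the fitted model: the shape and rate values are hypothetical, the hazard is parameterised as rate × exp(shape × t), and the 3.5% annual discount rate is assumed for illustration.

```python
import math


def gompertz_survival(t, shape, rate):
    """S(t) for a Gompertz hazard h(t) = rate * exp(shape * t)."""
    if shape == 0:
        return math.exp(-rate * t)  # reduces to the exponential distribution
    return math.exp(-(rate / shape) * (math.exp(shape * t) - 1.0))


def discounted_life_years(survival, horizon, discount_rate, steps=1500):
    """Trapezoidal integral of S(t) / (1 + r)^t over [0, horizon]."""
    dt = horizon / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * dt
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * survival(t) / (1.0 + discount_rate) ** t
    return total * dt


# Hypothetical parameter values, for illustration only.
SHAPE, RATE = 0.1, 0.2
HORIZON = 15.0
DISCOUNT_RATE = 0.035  # assumed annual discount rate


def fitted(t):
    return gompertz_survival(t, SHAPE, RATE)


prob_dead_by_horizon = 1.0 - fitted(HORIZON)
undiscounted = discounted_life_years(fitted, HORIZON, 0.0)
discounted = discounted_life_years(fitted, HORIZON, DISCOUNT_RATE)

print(f"P(death by {HORIZON:.0f} years) = {prob_dead_by_horizon:.4f}")
print(f"mean life years: {undiscounted:.3f} undiscounted, {discounted:.3f} discounted")
```

A distribution whose probability of death at the horizon is well below 1 leaves part of the survival curve outside the integral, which is why the candidates were compared on this criterion.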

Figure 6. Observed and predicted proportions for progression → death: extrapolation to 15 years (RFC treatment only). KM, Kaplan-Meier; RFC, rituximab, fludarabine, and cyclophosphamide.
After inspecting the plots for the other group (not shown), a Gompertz distribution appeared to be a reasonable distribution to use for this transition. Consideration of the distributions to use for the progression-free → progression and progression-free → death transitions is contained in Appendix 1 as online supplementary material. These 2 transitions involved competing risks, and because AIC statistics are not appropriate when modeling transition hazards in a competing risks scenario, the choice of distributions to use was based on visual assessment of the plots alone. The Gompertz and generalized gamma distributions appeared to provide a reasonable fit for the progression-free → progression and progression-free → death transitions, respectively, over the trial observation and extrapolation period.
Table 3 shows the mean costs used with the multi-state modeling approach.
Table 3. Assumptions Used for Mean Costs
FC, fludarabine and cyclophosphamide; PFS, progression-free survival; RFC, rituximab, fludarabine, and cyclophosphamide.
Source: British National Formulary 56. http://www.medicinescomplete.com/mc/bnf/current/
Source: Department of Health. NHS reference costs 2006/2007. http://webarchive.nationalarchives.gov.uk/20130107105354/http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_082571
Sources: Agrawal S, Davidson N, Walker M, et al. Assessing the total costs of blood delivery to hospital oncology and haematology patients. Curr Med Res Opin. 2006;22(10):1903–9. Curtis L. Unit costs of health and social care. Personal Social Services Research Unit, Kent, United Kingdom; 2007.
Most of the mean costs were not related to the time spent in relevant health states. However, cost of supportive care while progression-free, cost of supportive care while in progression, and cost of second-line and subsequent therapy while in progression were all associated with time spent in relevant states. Therefore, the mean life years in the appropriate states from the multi-state model were used in the calculation of these costs. All other costs were taken from the original manufacturer’s submission. 19(pp127–31)
The cost per life year or QALY gained, commonly known as the incremental cost-effectiveness ratio (ICER), was then calculated. Table 4 shows the results of the base-case analysis in terms of the mean life years in each state, the mean QALYs in each state, the mean costs, the cost per life year gained, and that per QALY gained. Utilities of 0.8 and 0.6 were assumed for progression-free and progression, respectively. 21
Table 4. Base-Case Analysis Results
FC, fludarabine and cyclophosphamide; QALY, quality-adjusted life year; RFC, rituximab, fludarabine, and cyclophosphamide.
An increment in mean life years (QALYs) of 0.74 (0.59) was found while progression-free. However, a relatively large decrement in mean life years (QALYs) of −0.53 (−0.32) was found while in progression. Therefore, the overall benefits, a gain of 0.21 mean life years and 0.28 mean QALYs, were relatively small. This led to a cost per QALY gained of close to £38,000, in excess of the £30,000 willingness-to-pay threshold commonly used in the United Kingdom, and therefore the treatment was deemed not cost-effective.
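The arithmetic linking life years, utilities, and the ICER can be sketched as follows. The utilities (0.8 and 0.6) and the incremental life years (0.74 and −0.53) are the rounded figures reported in the text. The incremental cost is a hypothetical placeholder, because the cost figures from Table 4 are not reproduced here, so the printed ICER is illustrative only.

```python
# Utilities and incremental mean life years as reported in the text (rounded).
UTILITY = {"progression_free": 0.8, "progression": 0.6}
incremental_life_years = {"progression_free": 0.74, "progression": -0.53}

# QALYs in a state = utility of that state x life years spent in it.
incremental_qalys = {
    state: UTILITY[state] * incremental_life_years[state] for state in UTILITY
}
total_incremental_life_years = sum(incremental_life_years.values())
total_incremental_qalys = sum(incremental_qalys.values())


def icer(incremental_cost, incremental_effect):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if incremental_effect <= 0:
        raise ValueError("ICER is not meaningful for a non-positive incremental effect")
    return incremental_cost / incremental_effect


# Hypothetical incremental cost in pounds; NOT a figure from the article.
HYPOTHETICAL_INCREMENTAL_COST = 10_000.0

print(f"incremental life years: {total_incremental_life_years:.2f}")
print(f"incremental QALYs:      {total_incremental_qalys:.3f}")
print(f"illustrative ICER:      £{icer(HYPOTHETICAL_INCREMENTAL_COST, total_incremental_qalys):,.0f} per QALY")
```

The rounded inputs give about 0.27 incremental QALYs, against the reported 0.28; the gap presumably reflects rounding in the published figures.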
Structural/One-Way Sensitivity Analyses
Table 5 details the base-case assumptions and one-way sensitivity analyses. A more comprehensive sensitivity analysis specifically considering the distributions used for each transition is contained in Appendix 2 as online supplementary material.
Table 5. Structural/One-Way Sensitivity Analyses
FC, fludarabine and cyclophosphamide; IV, intravenous; PFS, progression-free survival; QALY, quality-adjusted life year.
It is not always appropriate to assume that the treatment effect observed within the trial persists beyond that period. Therefore, as part of the sensitivity analysis, we considered an alternative assumption whereby the treatment effect no longer persists in the unobserved period. It can be seen from Table 5 that there was some uncertainty with regard to the cost-effectiveness, especially if a willingness-to-pay threshold of £30,000 per QALY gained was used. When the gap between the utilities was widened, the cost per QALY gained was £25,808. In all other analyses, there was no change in the conclusion that the treatment was not cost-effective.
Probabilistic Sensitivity Analysis
Each cost parameter involved in the manufacturer’s probabilistic sensitivity analysis was assumed to follow a Beta Pert distribution. 22 For the purposes of this illustration, we used the same distributions. However, analysts are free to consider a range of different distributions, as appropriate, with their own studies. Table 6 shows the mean base-case estimates together with the ranges used to generate the distributions. The particular Beta Pert distribution chosen for the cost of monthly supportive care and second-line and subsequent therapy while in progression was dependent on the mean life years in progression. All other distribution parameter values were as presented by the manufacturer. 19(pp137–8)
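For readers unfamiliar with the Beta Pert distribution, the Python sketch below draws from one using the standard construction: a Beta distribution rescaled to the interval [minimum, maximum], with shape parameters determined by the most likely value. This is an assumption about the parameterisation (the text describes the distributions by mean and range, and the manufacturer's exact construction is not given here), and the cost values are hypothetical.

```python
import random


def pert_shape_parameters(minimum, mode, maximum):
    """Shape parameters of the standard Beta Pert distribution (weight 4 on the mode)."""
    if not minimum <= mode <= maximum or minimum == maximum:
        raise ValueError("require minimum <= mode <= maximum and minimum < maximum")
    span = maximum - minimum
    alpha = 1.0 + 4.0 * (mode - minimum) / span
    beta = 1.0 + 4.0 * (maximum - mode) / span
    return alpha, beta


def sample_beta_pert(rng, minimum, mode, maximum):
    """Draw one value from a Beta Pert distribution on [minimum, maximum]."""
    alpha, beta = pert_shape_parameters(minimum, mode, maximum)
    return minimum + (maximum - minimum) * rng.betavariate(alpha, beta)


# Hypothetical monthly cost (pounds): minimum 80, most likely 100, maximum 150.
rng = random.Random(2024)
draws = [sample_beta_pert(rng, 80.0, 100.0, 150.0) for _ in range(20_000)]

sample_mean = sum(draws) / len(draws)
theoretical_mean = (80.0 + 4.0 * 100.0 + 150.0) / 6.0  # Pert mean formula

print(f"sample mean {sample_mean:.2f}, theoretical mean {theoretical_mean:.2f}")
```

In a probabilistic sensitivity analysis, one such draw per cost parameter would be combined with one draw of the survival model parameters to give a single simulated cost and QALY pair.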
Table 6. Beta Pert Distributions Used in Probabilistic Sensitivity Analysis for Cost Parameters
PFS, progression-free survival.
Source: Department of Health. NHS reference costs 2006/2007. http://webarchive.nationalarchives.gov.uk/20130107105354/http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_082571
Source: Agrawal S, Davidson N, Walker M, et al. Assessing the total costs of blood delivery to hospital oncology and haematology patients. Curr Med Res Opin. 2006;22(10):1903–9.
Figure 7 shows the cost-effectiveness plane in this illustration. The probabilistic sensitivity analysis involved 1000 draws with 10% excluded due to computational difficulties related to differences in cumulative hazards between consecutive time points that were greater than 1. All draws resulted in the treatment of interest being more costly, and therefore the northwest and northeast quadrants are shown. The cost-effectiveness acceptability curve is shown in Figure 8. It can be seen in Figure 8 that, given a maximum willingness to pay of £100,000 per quality-adjusted life year gained, the probability that the treatment was cost-effective compared with the control was 0.60.
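The calculation behind a cost-effectiveness acceptability curve can be sketched briefly. For a comparison of two treatments, the probability of being cost-effective at a given willingness to pay is the proportion of draws with positive incremental net monetary benefit (willingness to pay × incremental QALYs − incremental cost). The Python below is a generic sketch with hypothetical draws, not the authors' implementation, which works from total QALYs per arm.

```python
def acceptability_curve(incremental_costs, incremental_qalys, thresholds):
    """Return [(wtp, probability cost-effective)] across willingness-to-pay values."""
    if len(incremental_costs) != len(incremental_qalys) or not incremental_costs:
        raise ValueError("need equal-length, non-empty lists of draws")
    n_draws = len(incremental_costs)
    curve = []
    for wtp in thresholds:
        # A draw favours the new treatment when its net monetary benefit is positive.
        favourable = sum(
            1
            for cost, qaly in zip(incremental_costs, incremental_qalys)
            if wtp * qaly - cost > 0
        )
        curve.append((wtp, favourable / n_draws))
    return curve


# Hypothetical probabilistic draws (incremental cost in pounds, incremental QALYs).
costs = [10_000.0, 10_000.0, 10_000.0, 10_000.0]
qalys = [0.50, 0.25, 0.10, -0.10]

curve = acceptability_curve(costs, qalys, [0, 30_000, 50_000, 150_000])
for wtp, prob in curve:
    print(f"WTP £{wtp:>7,}: P(cost-effective) = {prob:.2f}")
```

Draws with negative incremental QALYs and positive incremental cost (the northwest quadrant) never count as cost-effective, so the curve need not reach 1 however high the threshold.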

Figure 7. Cost-effectiveness plane. QALY, quality-adjusted life year; WTP, willingness to pay.

Figure 8. Cost-effectiveness acceptability curve.
Discussion
This article has introduced the state-arrival extended multi-state model and highlighted its potential as a tool to formally test the Markov property. We demonstrated its use in aiding the decision as to whether the Markovian assumption was reasonable, through the inclusion of a covariate representing patients’ history. In our illustration, the Markov property was not thought to be reasonable, and therefore the test result helped direct us to a semi-Markov, rather than Markov, approach to modeling.
The article has also provided a step-by-step guide to carrying out cost-effectiveness analysis in a multi-state modeling survival analysis framework and has provided
The existing functions in the
In using state-arrival extended models specifically to test the Markov property, we have used the effect of a covariate for time in the previous state to help decide whether the property held. A balance needs to be struck between the hypothesis testing results (assessment of a
Proportional hazards models have been involved in the modeling of transition hazards. Before considering such models, it is important to check that the PH assumption is reasonable for each of the covariates in the model. This illustration has involved the eyeballing of log-log and cumulative hazard v. time plots. We also recommend the use of the Grambsch-Therneau test for a more objective assessment of the PH assumption. If the PH assumption is not met, it is still possible to conduct an analysis. Options include fitting accelerated failure time models or, for covariates that do not satisfy the PH requirement, fitting separate models for each level of the covariate, using time-dependent covariates, or including an interaction with time.
When data at the individual patient level are not available, data can be obtained by digitizing published Kaplan-Meier curves, using, for example, the software Engauge, 20 as in this illustration. Alternatively, Guyot and others 23 have produced an algorithm that can be used to approximately reconstruct the data from a published Kaplan-Meier curve so that analysis can still take place. However, regardless of the method, there would need to be enough survival curves to represent all event times of interest. For instance, for the model illustrated in this article, a Kaplan-Meier estimate of overall survival, progression-free survival,
If individual patient-level data are not available, and the published survival curves are not comprehensive enough to approximate the data accurately, it may still be possible to undertake some analysis. If survival regression models are required, parameter estimates from previously published models can be used. If results of a semi-parametric Cox regression model are available, then cumulative hazards at each time point of interest can be derived and used with the built-in functions in
This particular illustration was used primarily to demonstrate an application of the method rather than to focus on the results of the cost-effectiveness analysis. In this example, clinical information was used to determine the time horizon. More generally a commonsense approach is needed to decide on the time horizon and the extrapolation to that point, particularly when there is uncertainty surrounding the cost-effectiveness in the unobserved period as in this case. External information such as registry data and/or expert opinion can be used to help provide extrapolation that is sensible.24,25
The state-arrival extended approach demonstrated in this article to test the Markov property required individual patient-level data in the sense that knowledge of (some function of) time in the previous state was necessary. If analysts wish to use a state-arrival extended approach, we recommend careful consideration of the covariate used to ensure it is clinically relevant.
In the probabilistic sensitivity analysis, we excluded 10% of the 1000 draws, due to computational difficulties related to differences in cumulative hazards between consecutive time points that were greater than 1. Therefore, bias may have been introduced. Including more time points at which to evaluate the transition probabilities would have rectified this. However, this was beyond the scope of this illustration. We believe the problem related to the cumulative hazards was primarily due to the shape and scale parameters in the Gompertz distributions used and that other distributions would not exhibit the problem to the same extent.
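The effect of the evaluation grid on cumulative hazard increments can be shown with a small Python sketch. It uses a Gompertz cumulative hazard with hypothetical shape and rate values (parameterised as hazard = rate × exp(shape × t)) and reports the largest difference between consecutive time points on a coarse and a fine grid. One plausible reason increments above 1 cause trouble, offered here as an assumption rather than a statement about the software, is that calculations involving terms of the form 1 − ΔH then produce negative values.

```python
import math


def gompertz_cumulative_hazard(t, shape, rate):
    """H(t) for a Gompertz hazard h(t) = rate * exp(shape * t), shape != 0."""
    return (rate / shape) * (math.exp(shape * t) - 1.0)


def largest_increment(cumulative_hazard, horizon, n_intervals):
    """Largest H(t_k) - H(t_{k-1}) over an equally spaced grid on [0, horizon]."""
    grid = [horizon * k / n_intervals for k in range(n_intervals + 1)]
    return max(
        cumulative_hazard(later) - cumulative_hazard(earlier)
        for earlier, later in zip(grid, grid[1:])
    )


# Hypothetical draw of Gompertz parameters with a steeply increasing hazard.
SHAPE, RATE = 0.3, 0.1
HORIZON = 15.0


def drawn_hazard(t):
    return gompertz_cumulative_hazard(t, SHAPE, RATE)


coarse = largest_increment(drawn_hazard, HORIZON, 15)    # yearly time points
fine = largest_increment(drawn_hazard, HORIZON, 1500)    # 100 points per year

print(f"largest increment, yearly grid: {coarse:.2f}")
print(f"largest increment, fine grid:   {fine:.3f}")
```

For this hypothetical draw the yearly grid gives an increment well above 1 late in the horizon, while the finer grid keeps every increment below 1, consistent with the remark that more time points would have rectified the problem.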
Although health economic modeling in spreadsheet packages has its advantages, we consider this multi-state modeling approach an attractive alternative worth considering. We hope this article, and the accompanying functions, will encourage health economists to use this approach.
