Abstract
Single case designs (SCDs) are increasingly recognized as important tools in behavior modification research and other fields, enabling researchers to model changes in psychological or behavioral variables over time (Franklin, Allison, & Gorman, 2014). Common approaches to the analysis of SCDs are based on comparing means before and after an intervention or modeling the slope of a change using regression-based techniques (Manolov & Moeyaert, 2017a). These techniques have the advantage that they are familiar to many researchers and are readily available in statistical packages. However, the models underlying these most commonly employed analysis techniques have assumptions that are often violated. Specifically, an intervention effect on a psychological construct typically manifests neither as a discontinuous shift from one value to another (the model underlying comparison of means), nor as a linear unbounded change over time (the model underlying linear regression). Instead, intervention effects often reflect a shift in a psychological construct where both the initial and the final values are more or less stable over time. Accurate modeling of this shift provides more information about treatment effects than comparison of means or estimating the slope of a change. In this article, we introduce a technique for such modeling as well as freely available, user-friendly functions implemented in R (R Core Team, 2018). We illustrate this technique using two data sets and provide a brief tutorial to make these techniques widely accessible.
SCDs are important because they provide a means to determine the effectiveness of interventions at an individual level (Barlow, Nock, & Hersen, 2009). Much methodological research has been devoted to effect size measures in SCDs because an accurate effect size supports the development of evidence-based interventions (Parker et al., 2005; Parker & Hagan-Burke, 2007; Parker, Vannest, & Davis, 2011). An effect size can be considered accurate if it provides a reliable indication of, for instance, the improvement of a patient after or during treatment. The type of effect size is closely related to which type of analysis of SCDs is chosen (Lenz, 2015; Vannest & Ninci, 2015). Two basic classes of analyses can be distinguished: first, parametric regression-based methods, including multilevel analysis (Baek et al., 2014); and second, nonparametric methods. A recent overview of analysis techniques for SCDs is given by Heyvaert and Onghena (2014).
Comparison of means before and after an intervention represents the most straightforward analysis, but the underlying model holds that a change manifests as an instantaneous shift from one stable value to another, which is often not realistic. In addition, this analysis cannot infer from the data when such a shift may occur: The user must specify which data points to aggregate in each mean. Thus, although this approach’s familiarity may partly explain its prevalence, its assumption of instantaneous change from one otherwise stable value to another is rarely realistic, and it yields little information about the treatment.
More advanced regression-based approaches for analyzing SCDs usually (but not necessarily) consider a linear model, and, therefore, can accommodate incremental change, no longer imposing an instantaneous shift of the criterion from one value to another. For example, in a pre–post design, the piecewise regression (PWR) model (Center, Skiba, & Casey, 1985; Huitema & McKean, 2000) can be used to model a linear trend separately for two phases: one before the intervention and one during or after the intervention. This model then compares the intercepts and slopes between both phases of the design, and intervention effects are derived from the differences in slopes and intercepts. Although many more sophisticated PWR models exist, for example, piecewise splines (e.g., Friedman, 1991), these more complex methods remain relatively uncommon, perhaps, ironically, because their sophistication renders them less accessible to many researchers. The commonly used two-phase method suffers from the problem that it assumes that the change manifests in a linear fashion: the slope only changes once and then remains the same. This is unrealistic for two reasons.
First, in many situations, for instance, when the effects of a therapy are monitored, the criterion is measured by some sort of questionnaire or other operationalization that has a limited range. The observed or measured improvement of clients in a therapeutic setting is, therefore, artificially constrained by the scale of the instrument. A 7-point Likert-type scale is an example of such an instrument. If clients rate how they feel with a maximum score of seven, there is no further room for improvement. The score of seven in this example constitutes a ceiling in the therapy effect. Likewise, such an instrument has a floor, which is the minimum value of the scale. A straight line would break through the ceiling (or floor, when the criterion decreases over time), unless the therapy has no effect.
Second, treatments and interventions in clinical and health psychology practice often follow standardized protocols and, as a consequence, have a natural limit to their effectiveness. This is the case because they are designed based on knowledge of the behavior, cognitions, or affective associations that are targeted. If implemented properly, they will affect the areas of human psychology for which they were designed, thereby improving the target behavior or condition. However, no psychological theory or combination of theories explains behavior or psychopathology completely. Therefore, evidence- and theory-based interventions and treatments are necessarily limited in terms of the effect they can have: at most, they can have the maximum achievable effect in all areas they target, and they do not target the whole of human psychology. This characteristic manifests as a constraint on treatment effectiveness. For example, an exposure therapy treatment for an anxiety disorder based on inhibitory learning (Craske, Treanor, Conway, Zbozinek, & Vervliet, 2014) cannot be expected to address dysfunctional self-regulation patterns that may have emerged over the course of the anxiety disorder. Because theory by definition deals only with a bounded aspect of reality, theory-based treatment is necessarily constrained in its maximum effectiveness. Such a constraint on effectiveness means that the association between time in treatment and treatment effectiveness is unlikely to be linear: change is likely to slow as treatment approaches its maximum possible effect.
Thus, treatment effects realistically manifest as a shift in the targeted construct(s) from a more or less stable level to a new more or less stable level. This shift likely decelerates as the treatment reaches its maximum possible effect. This underlying model is neither accurately captured by comparison of means, nor by a two-phase PWR model. Using more sophisticated spline regression models can address this, but these suffer from two disadvantages. First, they are often not accessible to researchers without more advanced statistical training (which may partly explain the prevalence of the mean comparison and two-phase PWR models despite their relative inadequacy). Second, they are relatively unparsimonious and require estimating a large number of parameters compared with the small number of data points often available in SCD data sets.
In this article, we present a technique that addresses both these points: it is freely available and designed to be easily accessible and usable for a wide variety of researchers and potentially practitioners, and it estimates the same number of parameters as a two-phase PWR model. Specifically, this model is based on an optimization function applied to a generalized logistic model. This enables the estimation of effects in a pre–post SCD design when the criterion is constrained (e.g., has a floor or a ceiling). We will first present an example of the model and show its mathematical characteristics. Then we present two clinical examples in which we compare the proposed model with the PWR model. Finally, we discuss some possibilities for future research.
The Problem With Ceilings
The distribution shown in Figure 1 illustrates a likely model for an intervention process.

Example with generated data from the generalized logistic model.
A simple model assuming a linear relationship seems to predict these data rather well (see Figure 2), as indicated by the deviance (the sum of the squared residuals) of the linear model.

Example with generated data from the generalized logistic model, with linear fit added.
However, the residuals from the “straight-line” model seem to show a cyclic or autocorrelated pattern, as Figure 3 clearly shows. One of the assumptions underlying linear regression estimates is independence of the residuals, and in this example this assumption is violated. This is an indication that the “straight-line” model is not the correct model to describe these data.

Residuals from linear model of example data.
Despite the high squared multiple correlation, the line misses some important information, in particular the strong increase in scores somewhere between the 15th and 20th point. It is good practice to test the model assumptions. When these assumptions are violated, another model should be fitted to the data.
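The pattern described above is easy to reproduce. The article's tooling is implemented in R; as a language-neutral illustration (all parameter values are hypothetical), the following Python sketch generates data from a generalized logistic curve, fits a straight line, and quantifies the structure left in the residuals:

```python
import numpy as np

# Illustrative sketch: generate data from a generalized logistic curve
# (floor 2, ceiling 6, growth rate 0.5, inflection at t = 17; values hypothetical),
# fit a straight line, and inspect the residuals for a systematic pattern.
rng = np.random.default_rng(1)
t = np.arange(1, 31)
truth = 2 + (6 - 2) / (1 + np.exp(-0.5 * (t - 17)))
y = truth + rng.normal(0, 0.2, size=t.size)

# Ordinary least-squares straight-line fit.
slope, intercept = np.polyfit(t, y, 1)
residuals = y - (intercept + slope * t)

# A cyclic/autocorrelated pattern shows up as long runs of same-sign residuals
# and a high lag-1 autocorrelation, even though R^2 is high.
lag1 = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]
r2 = np.corrcoef(t, y)[0, 1] ** 2
print(f"R^2 = {r2:.2f}, lag-1 residual autocorrelation = {lag1:.2f}")
```

Even with a high squared multiple correlation, the strongly positive lag-1 autocorrelation of the residuals signals that the straight line systematically misses the S-shaped rise.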
Consequences for Tests of the Intervention Effect
To test the effect of the intervention, a naive approach is to compare the two means before and after (or during) the intervention. The effect size of the intervention in this approach is Cohen’s (1992) d, the standardized difference between the two phase means.
However, claiming an intervention effect because the means in both phases are different is not correct (Huitema & McKean, 2007). When there appears to be a trend in the data (e.g., scores increase over time, independent of the intervention) simply comparing the means of the outcomes in the two phases may lead to wrong conclusions (Center et al., 1985). The trend, instead of an intervention effect, may be responsible for the different means in the two phases. Therefore, it is important to incorporate a trend effect in a research model for SCD data.
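This caveat is easy to demonstrate: a series with a steady trend and no intervention at all still produces a large between-phase mean difference. A minimal Python sketch with hypothetical values:

```python
import numpy as np

# Illustrative sketch: a steady upward trend with NO intervention effect still
# produces a large standardized mean difference between phases.
rng = np.random.default_rng(2)
t = np.arange(1, 21)
y = 0.3 * t + rng.normal(0, 0.5, size=t.size)  # pure trend, no intervention

pre, post = y[:10], y[10:]  # intervention nominally after point 10
pooled_sd = np.sqrt((pre.var(ddof=1) + post.var(ddof=1)) / 2)
d = (post.mean() - pre.mean()) / pooled_sd
print(f"Cohen's d = {d:.2f}")  # large despite the absence of any intervention effect
```

Here the trend alone yields a d well above conventional "large" thresholds, which is why a trend effect must be incorporated in the model.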
To adequately model such trends, a PWR model (Center et al., 1985; Huitema & McKean, 2000) can be used. PWR models two linear trends, separately for both phases. That is, the intercepts and slopes of two regression lines are compared before and after the intervention. See Figure 4 for an illustration. This model is given by

y_t = β0 + β1·T_t + β2·D_t + β3·(T_t − n1 − 1)·D_t + e_t,

where T_t denotes the time of measurement, D_t is a dummy variable coded 0 in the baseline phase and 1 in the treatment phase, n1 is the number of baseline observations, and e_t is the error term; β0 and β1 are the intercept and slope of the baseline phase.

Piecewise regression on example data with trend effect.
In the PWR model, β2 represents the change in level (intercept) at the start of the intervention, and β3 represents the change in slope between the two phases.
When only a level effect is present in the data, as assumed by the mean comparison approach and shown in another example in Figure 5, the slope parameters are essentially zero and the intervention effect is captured by the level-change parameter.

Piecewise regression on example data with instantaneous phase effect.
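The two-phase PWR model can be fitted by ordinary least squares with an explicit design matrix (an intercept column, a baseline-slope column, a level-change column, and a slope-change column). The following Python sketch uses simulated data with hypothetical parameter values:

```python
import numpy as np

# Illustrative sketch of the two-phase piecewise regression (PWR) model,
# fitted by ordinary least squares via an explicit design matrix.
rng = np.random.default_rng(3)
n1 = 10                      # number of baseline observations
t = np.arange(1, 26)
d_phase = (t > n1).astype(float)          # 0 = baseline, 1 = treatment phase
y = 1.0 + 0.05 * t + 2.0 * d_phase + 0.15 * (t - n1 - 1) * d_phase \
    + rng.normal(0, 0.3, size=t.size)

# Columns: intercept, baseline slope, level change, slope change.
X = np.column_stack([np.ones_like(t, dtype=float),
                     t,
                     d_phase,
                     (t - n1 - 1) * d_phase])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
print("level change =", round(beta[2], 2), "slope change =", round(beta[3], 2))
print("R^2 =", round(r2, 2))
```

The level-change and slope-change coefficients recover the simulated effects, and the proportion of explained variance serves as a natural fit measure for the model.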
For this PWR method, the following effect size is defined (Parker & Brossart, 2003):

R² = 1 − Σ_t (y_t − ŷ_t)² / Σ_t (y_t − ȳ)²,

where ŷ_t are the values predicted by the PWR model and ȳ is the mean of all observed scores.
In many situations, it is not only important to know that there exists an effect and how strong it is, but also at what point in time the improvement due to the intervention started, how fast the change occurred, and when the improvement stabilized. For such questions, it is better to fit a curve to the data, which has the form of a sigmoid function, because it reflects the empirical process more accurately and allows for more flexibility.
The Generalized Logistic Model
A sigmoid function can be defined in many ways. Here we choose the generalized logistic (GL) function, which is defined as follows:

y_t = A_b + (A_t − A_b) / (1 + e^(−B(t − x_0))).

This model has the advantage that it is parametrized relatively straightforwardly: the analysis estimates the initial plateau and the postintervention plateau as well as when the change starts and stops. Specifically, A_b is the initial plateau (the base, or floor, of the curve), A_t is the postintervention plateau (the top, or ceiling), B is the growth rate, and x_0 is the point in time at which the change is fastest (the inflection point).
The generalized logistic function was fitted to the example data; the resulting curve is shown in Figure 6.

Generalized logistic function fitted to the example data.
At about measurement 17 (12 measurements after the intervention started), the rate of increase in scores is largest. The growth rate is 0.2.
A general effect size could be defined in line with Cohen’s d (Cohen, 1992):

ES_1 = (A_t − A_b) / SD(y),

where SD(y) is the standard deviation of all observed scores. A second effect size relates the estimated shift to the range of the measurement scale:

ES_2 = (A_t − A_b) / (y_max − y_min),

where y_max and y_min are the maximum and minimum possible scores of the instrument. This effect size indicates the proportion of the scale that is improved according to the floor and ceiling of the fitted function.
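Both effect sizes are simple functions of the estimated plateaus. A Python sketch, assuming hypothetical GL estimates (base 2, top 6) for scores on a 1-to-7 scale:

```python
import numpy as np

# Illustrative sketch of the two effect sizes, computed from hypothetical GL
# estimates (Ab = 2, At = 6) and simulated observed scores y on a 1-7 scale.
rng = np.random.default_rng(5)
t = np.arange(1, 31)
y = 2 + 4 / (1 + np.exp(-0.5 * (t - 17))) + rng.normal(0, 0.2, size=t.size)

Ab_hat, At_hat = 2.0, 6.0                      # assume these were estimated by the GL fit
es1 = (At_hat - Ab_hat) / np.std(y, ddof=1)    # d-like: shift in SD units
es2 = (At_hat - Ab_hat) / (7 - 1)              # proportion of the scale range
print(f"ES1 = {es1:.2f}, ES2 = {es2:.2f}")
```

The d-like measure expresses the shift relative to the variability of the scores, while the second measure expresses how much of the instrument's possible range the improvement covers.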

Examples of different growth rates, with the other parameters held fixed.
The function that implements this analysis is freely available in R. The optimization minimizes the deviance between the observed scores and the GL curve, subject to boundary constraints on the parameters. Default starting values for the parameters are derived from the data, and both the starting values and the boundary constraints can be overridden by the user.
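To make the estimation concrete: the article's implementation is in R; the sketch below uses SciPy's general-purpose nonlinear least-squares routine as a stand-in optimizer, with data-driven starting values and explicit boundary constraints (the 1-to-7 scale and all data are assumptions for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative sketch of fitting the generalized logistic model by nonlinear
# least squares. Starting values and bounds are made explicit, mirroring the
# role of default start and boundary values in the R function.
def genlog(t, Ab, At, B, x0):
    return Ab + (At - Ab) / (1 + np.exp(-B * (t - x0)))

rng = np.random.default_rng(6)
t = np.arange(1, 31, dtype=float)
y = genlog(t, 2.0, 6.0, 0.5, 17.0) + rng.normal(0, 0.2, size=t.size)

# Data-driven starting values: plateaus from the data extremes,
# inflection point at the middle of the series.
p0 = [y.min(), y.max(), 0.1, t.mean()]
bounds = ([1, 1, 0, t.min()], [7, 7, 5, t.max()])  # 1-7 scale assumed
params, _ = curve_fit(genlog, t, y, p0=p0, bounds=bounds)
deviance = np.sum((y - genlog(t, *params)) ** 2)
print("Ab, At, B, x0 =", np.round(params, 2), "deviance =", round(deviance, 2))
```

The deviance of the converged solution is the fit measure reported alongside the effect sizes in the analyses below.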
Empirical Examples
Example 1: Singh Data
In their extensive review paper about SCD and methodologies to analyze them, Manolov and Moeyaert (2017a) analyzed a data set from Singh et al. (2007), see Figure 8. In this article, we will also use these data to illustrate the GL model and compare the results with those presented in the Manolov and Moeyaert paper. The data were obtained from three individuals measuring their verbal and physical aggression before and after an intervention, which consisted of mindfulness training for controlling aggressive behavior. The individuals were diagnosed with several mental disorders such as depression, schizoaffective disorder, borderline personality, and antisocial personality. These data are considered representative for single case data in the literature (Shadish & Sullivan, 2011).

Representation of the six data sets obtained from Singh et al. (2007).
In Table 1, the results are presented of the three effect size statistics and the deviance obtained from the GL analyses, and these are compared with the effect size measures of the PWR analysis and Cohen’s d.
Comparison of Fit Measures and Effect Sizes Between GL, PWR, and Cohen’s d.
The question we want to answer in this example is whether the most important characteristics in Figure 9 are captured by the fit measures. Does the information obtained from fitting the GL model provide us with another kind of insight compared with the PWR model or Cohen’s d?
Comparison of the Model Parameters Between GL and PWR Model.

Data from Singh et al. (2007) analyzed with the GL model.
From visual inspection, we learn that Jason has made the biggest improvement, both with respect to verbal and physical aggression. However, this effect is based on only three measurements in the baseline phase.
Both Tim’s aggression behaviors are fitted less well than the other subjects’ behaviors. This is true for GL and PWR, but GL fits slightly worse than PWR, as can be seen from the deviance values in Table 1.
ES
The ES
Example 2: Sex Therapy Data
The data for this example were obtained from a study about the effectiveness of sex therapy (van Lankveld, Leusink, & Peters, 2017). The data are from a single person who provided scores on several variables at 38 time points during a year. The measurement points were not equally spaced in time. During the baseline period, a measurement was obtained every few days, after which the intermeasurement intervals were gradually increased to up to a month at the end of the study. Because dates were available for each measurement, it was possible to take these differential intervals into account when modeling the treatment effects.
In this example, we will show three of the eight variables that were measured in this study, specifically self-esteem, intimacy toward the partner and experience of masturbation. The GL and PWR model were used to analyze these variables. The relevant output of the GL analysis of the three variables is presented in Table 3.
Result of the Analyses of the Sex Therapy Data by the GL Model.
For the variable self-esteem, the deviance is slightly better than that of the PWR model.
The therapy effect size for self-esteem is larger than for the other two variables. Figure 10 demonstrates the results graphically. We zoomed in on the small effects for intimacy and masturbation experience (note the small range on both y-axes).

Analyses of self-esteem, intimacy, and experience of masturbation in sex therapy study.
For self-esteem, the GL model does not appear to be appropriate, as the fitted curve is almost a straight line. The PWR model seems more appropriate here although the fit values are also quite low.
In analyzing these data, we found that changing the start and boundary values may influence the outcomes. A small change in start values may result in a different curve, which implies that the optimization process for fitting the GL model suffers from local minima. To explore the influence of local minima, one should run a sensitivity analysis. This can be done by simply setting different start values and then inspecting the fit and effect sizes. The analysis yielding the best fit to the data should then be taken as the preferred one. In the appendix, we provide visual tools to inspect how the default and tweaked start values for the parameters influence the resulting estimates.
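Such a sensitivity analysis can be sketched in a few lines: refit the model from a grid of starting values and retain the solution with the lowest deviance. SciPy again serves as a stand-in for the R implementation, and all data and grid values are hypothetical:

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative sketch of a sensitivity analysis over starting values:
# refit the GL model from several start values, keep the best (lowest) deviance.
def genlog(t, Ab, At, B, x0):
    return Ab + (At - Ab) / (1 + np.exp(-B * (t - x0)))

rng = np.random.default_rng(7)
t = np.arange(1, 31, dtype=float)
y = genlog(t, 2.0, 6.0, 0.5, 17.0) + rng.normal(0, 0.3, size=t.size)

best = None
for B0 in (0.05, 0.5, 2.0):            # vary the growth-rate start value
    for x00 in (5.0, 15.0, 25.0):      # vary the inflection-point start value
        try:
            params, _ = curve_fit(genlog, t, y,
                                  p0=[y.min(), y.max(), B0, x00],
                                  bounds=([1, 1, 0, t.min()],
                                          [7, 7, 5, t.max()]))
        except RuntimeError:           # some start values may fail to converge
            continue
        dev = np.sum((y - genlog(t, *params)) ** 2)
        if best is None or dev < best[0]:
            best = (dev, params)

print("best deviance =", round(best[0], 2), "estimates =", np.round(best[1], 2))
```

If several start-value combinations converge to clearly different curves with similar deviances, that is a warning sign that the estimates should be interpreted with caution.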
Discussion
This article discusses a new method to analyze experimental single case data based on a generalized logistic model. The underlying assumption of this method is that intervention effects represent the shift of an individual’s scores from one plateau to another, and that the individual’s scores are limited by floors and ceilings, which are caused by the measurement instrument and by natural limits of the process under study. This implies that linear models for estimating the intervention effect are at best suboptimal because their assumptions are violated and, relatedly, they fit the data poorly. The generalized logistic model seems better equipped to deal with these floor and ceiling aspects of the measurement instruments. Another new aspect of this model is the estimation of the onset and the end of the intervention effects.
To test the proposed method, we built a freely available R function implementing the generalized logistic analysis.
Based on a well-known single case data set (Singh et al., 2007), we illustrated the generalized logistic model. The Singh data are also discussed in Manolov and Moeyaert (2017a) and used to compare a wide variety of single case methods. The model was applied to these data and compared with the PWR model. The generalized logistic model provided sensible outcomes that seem to add to the understanding of the intervention process. Based on these analyses, we recommend combining the result of the model fit with the estimated growth parameter and the second effect size, which is based on the range of the data, to obtain informative outcomes.
A second example (van Lankveld et al., 2017) also illustrated that the generalized logistic model can be helpful in analyzing the data. However, this example also made it clear that in some situations the given start and boundary values can be very influential. The parameter estimates of the generalized logistic model are not robust in the sense that they depend on parameter constraints and starting values. With relatively few data points and four parameters to estimate, this is not surprising. Fixing the floor and ceiling values after visual inspection can improve the robustness of the remaining parameters. We also recommend running sensitivity analyses to explore to what extent the outcomes depend on the start values of the optimization process.
For valid interpretation of the GL results, we recommend first inspecting the deviance and the visual correspondence between the fitted curve and the data.
When the data contain many discontinuities, for instance when scores go up and down several times, other approaches, such as PWR splines, are flexible alternatives for fitting the data. Splines are more general because they can fit discontinuities, which might be a necessary property for fitting data that show complex patterns. However, for the generalized logistic model we assume situations, such as therapy situations, in which there is a more or less gradual increase (or decrease) in behavior or attitude. Furthermore, flexible cubic splines need more parameters to estimate than the GL model, which may become problematic when there are only a small number of data points, as is common in single case research (James, Witten, Hastie, & Tibshirani, 2013). Finally, the interpretation of the coefficients from the spline approach is more complex than for the GL model.
With multiple single case data (i.e., replicated designs), extending the model to a multilevel framework is a promising direction for future research.
In this article, we have presented another tool to add to the already wide collection of SCD approaches (Manolov & Moeyaert, 2017b). It is based on the idea that most effects of interventions have a natural limit. Based on this simple premise, we have proposed a model that represents this idea. The software we have presented is Free and Open Source Software, implemented in the popular statistical environment R, and easy to apply, with additional support in a short tutorial (see the appendix and https://osf.io/8gcjz/?view_only=5b7a3c11bf8d4fe7a85410ad0a3d1447).
Supplemental Material
Supplemental material for “Applying the Generalized Logistic Model in Single Case Designs: Modeling Treatment-Induced Shifts” by Peter Verboon and Gjalt-Jorn Ygram Peters in Behavior Modification.
