Abstract
Keywords
1 Introduction
Perinatal epidemiological studies investigating the possible effects of some gestational exposure on a postnatal condition can roughly be split into two categories. Those where the condition is observable immediately after birth and those where it may take years before the condition is ascertained, if ever. This paper is concerned with the latter. Smoking and low birth weight; infant supine position and sudden infant death syndrome; and foetal alcohol spectrum disorders fall in the first category. The association between prenatal marijuana exposure on neuropsychological conditions 1 and the association between prenatal exposure to pharmaceuticals and neurodevelopmental disorders belong to the second category. The present study was motivated by the hypothesis linking gestational exposure to paracetamol and an increased risk of neurodevelopmental disorders, attention-deficit hyperactivity disorder (ADHD) in particular,2–5 hypotheses that are pertinent examples of the latter category.
From a statistical modelling perspective, the main difference between these two types of hypotheses is that the data in the latter are plagued by censoring. That is, the outcome in studies in the second category may be unknown at the time of study due to a lack of follow-up. Thus, for studies in the first category, standard regression analysis is a natural choice (e.g. linear, Poisson, logistic), while for the latter type of studies, the need to handle censoring often leads to survival analysis methods being employed (e.g. the Cox model). A consequence of opting for a survival analysis model is that the outcome is defined as the
In the statistical literature, cure models have received much attention in recent years.6–12 The name stems from medical applications where some patients never experience a relapse of the disease under study, and these patients are therefore considered cured. Cure models have also been proposed in the field of reproductive epidemiology to account for the possibility of some of the individuals under study being sterile. 13
It is worth noting that the motivation typically underlying cure models is rather different from the argument we put forward in this paper. Typically, cure models are solidly anchored in the survival analysis world, while our approach, which is focused on the probability of belonging to the susceptible group, is more akin to a misclassification- or missing data problem. In other words, in this paper, we are less interested in survival quantities such as hazard rates and survival functions per se, but view them as nuisance parameters that must be tended to in order to make inferences on the parameters determining whether a child is born susceptible or not. See Farewell 14 for an early paper advocating for cure models in a similar manner.
The article proceeds as follows. In Section 2, we provide a brief introduction to the cure model, and motivate this class of models in light of the hypothesis linking paracetamol and ADHD (hereafter referred to as the paracetamol–ADHD hypothesis). This section also contains some theoretical results on simple logistic and Cox models when such are fitted to data that contain a cure fraction. These results are illustrated with two small simulation studies. In Section 3, we fit different cure models to the data on gestational exposure to paracetamol and ADHD, and compare these with a logistic regression and a Cox regression model. The aim of this application is to investigate whether our reading of the paracetamol–ADHD hypothesis finds empirical backing, and illustrate the fact that all three classes of models are likely to lead to rather similar conclusions about the paracetamol–ADHD hypothesis.
2 The cure model and ADHD
In this section, we first, using the paracetamol–ADHD hypothesis as our example, elaborate on why we find the class of cure models appropriate for the perinatal studies discussed in this paper. Subsequently, we give a brief introduction to the standard mixture cure model.
2.1 The paracetamol–ADHD hypothesis
The use of cure models in perinatal epidemiological studies can be motivated by the directed acyclic graph (DAG) in Figure 1. In this DAG,

A DAG illustrating the data generating mechanism presented in Section 2.1. The exposure of interest (paracetamol) is
The
2.2 The standard cure model
As above, let
Both
For the perinatal studies that are the object of this paper, two features of the model in equation (1) should be pointed out. First, by using this model, we are assuming that the nonsusceptible individuals are never diagnosed with the condition in question, that is, we assume that there are no false positives in the sample. In the case of ADHD, this assumption may be questioned. In the US, there is evidence of ADHD overdiagnosis in some communities, 15 meaning that the prevalence of ADHD is higher than the standard 3–5% prevalence estimate.16–18 In the data set we analyse in Section 3, only about 2.3% of the children are diagnosed with ADHD. Since this number is well below the standard prevalence estimates, it would lead one to believe that false positives are not a major issue in our data.
Notice that if
The log-likelihood function of the model in equation (2) is
If
2.3 Fitting logistic and Cox models to cure data
In this section, we provide some insight on the bias incurred in the parameter estimates when the data stem from a cure model, but a logistic regression model or a Cox regression model, is chosen.
Suppose that the data
Consider fitting a logistic regression model to independent data
If
Now, consider fitting a Cox regression model with hazard rate
Let
If the second term on the right is zero, which it is when the model
In summary, when it comes to estimating
To illustrate this, we performed two simulations studies with varying parameter values. In both, the data were simulated from a cure model of the form given in equation (3), with the parameter of interest set to
In the simulations reported in Figure 2, we set

Upper panel: estimates of
In the simulations reported in Figure 3, we set

Upper panel: estimates of
In the data set we analyse in Section 3, only
3 Data analysis
The cure models we fit to the paracetamol–ADHD data have population survival functions
For comparison, we also fit a logistic model and a Cox model to the paracetamol–ADHD data. As elaborated on in Section 2.3, we have reason to expect a nominal similarity between the exposure estimates from these models to those of the corresponding cure models. This is because the prevalence of ADHD in the data is low, and because most children diagnosed with ADHD are diagnosed quite early in life.
The data used in this analysis stem from the Norwegian Mother, Father and Child Cohort Study (MoBa) conducted by the Norwegian Institute of Public Health. Information about the ADHD diagnoses was obtained from the Norwegian Patient Registry (NPR). The analyses of this section are motivated by and use essentially the same data as Ystrom et al., 2 and a more elaborate discussion of the MoBa and the NPR can be found therein.
After having removed observations with missing values, the sample consisted of
The 11 birth year cohorts included in the data, size of cohort and number of children within each cohort with a diagnosis of ADHD.
Note: The last column is the percentage of mothers in the data who consumed paracetamol at least once during pregnancy.
The cure models we fit have population survival functions of the form (equation (5)), with the baseline survival function being either nonparametric or that of a gamma distribution with density
Summary of covariates.
ADHD: attention-deficit hyperactivity disorder.
Note: All the covariates are binary (0–1). For an individual, a value of 1 means, respectively, that paracetamol was consumed at least once during gestation, the mother has higher education, the mother consumed alcohol at least once a month during gestation and that paracetamol has been consumed to alleviate fever.
We fitted three different cure models for each of the two specifications of the baseline survival function
Table 3 reports the parameter estimates and estimated standard errors of these for all eight models. Figure 4 displays estimates of the proper survival functions (that is,
Estimates based on
Note: The semiparametric models were fitted using the smcure-package in R, with standard errors based on 100 bootstrap samples. The gamma density of the three parametric cure models is

Estimates of the survival curve of the susceptible children, i.e. the proper survival functions
In Table 3, the first thing to notice is that in the cure models that include covariates on both the logistic and the survival part (Gamma 2 and Semipara. 2), the estimated effects of paracetamol on the logistic part are significant at the
The results reported in Table 3 are interesting because they can be seen as corroborating the reading of the paracetamol–ADHD hypothesis expounded in Section 2.1, namely that gestational exposure to paracetamol determines whether or not a child is susceptible, while being unimportant for the time to diagnosis. In other words, given susceptibility the time to diagnosis appears to be independent of the exposure.
The important issue of identifiability of the semiparametric cure model should be pointed out. Loosely speaking, for the fraction of susceptible children to be accurately estimated, we must assume that the (covariate dependent) distribution function of the survival times of the susceptible individuals reaches unity before the distribution function of the censoring times. 12 In effect, for identifiability reasons, when we fit the semiparametric cure models, the survival functions are set to zero for all survival times above the largest observed diagnosis time. No such fix is demanded when fitting fully parametric cure models. See Section 4 for further discussion of these issues, and Amico and Van Keilegom 12 for a thorough discussion of identifiability in semiparametric cure models.
4 Discussion and concluding remarks
In this section, we briefly discuss the above findings and introduce some topics for possible future research.
The cure model was motivated by arguing that the scientifically interesting question in many perinatal studies is how the exposure relates to a partly unobservable variable indicating whether or not the child is susceptible to the condition or disease of interest.
The empirical analysis of the paracetamol–ADHD hypothesis of Section 3 indicates that the diagnosis times are independent of the exposure when susceptibility is accounted for. These findings have important implications for studies on most childhood long-term outcomes as there will always be a fraction of the children that is never diagnosed with the condition studied, and among these many should, for all practical purposes, be regarded as nonsusceptible to the condition in question. When a fraction of the offspring are nonsusceptible, conventional survival analysis methods will give biased effect estimates.
4.1 Remark 1
As discussed in Section 2.1, when using the cure model, we assume that we do not have data on the absence of the condition or disease, i.e.
If
4.2 Remark 2
A class of survival models that can give estimates of continuous levels of susceptibility are so called first hitting time models.22,23 One example is the following. Consider a Wiener process
4.3 Remark 3
We have argued that in the perinatal studies discussed in this paper, the quantity of scientific interest is
