Abstract
Keywords
Introduction
Longitudinal data occurs frequently in clinical settings where patients are monitored over time. The combination of longitudinal data with a time-to-event outcome has gained popularity due to the increasing use of joint modeling techniques.1,2 However, in certain clinical settings, the outcome is considered
This study is motivated by a real-world clinical example in which a cardiac biomarker is measured sparsely (2–5 times) and irregularly in the first 24 hours after coronary artery bypass grafting (CABG) surgery to determine the presence of a complication in the form of perioperative myocardial infarction (PMI). Detecting a PMI at an early stage enables clinicians to promptly intervene and mitigate potential harm.10 Therefore, the development of a model capable of dynamically classifying patients as PMI or non-PMI based on accumulating information from biomarker measurements could greatly assist clinicians in the early detection of PMI.
Longitudinal data presents an additional challenge compared to cross-sectional data, as it requires the selection of an appropriate model for the longitudinal profile. In the literature, commonly described approaches involve fitting a linear mixed-effects (LME) model to the longitudinal profile and utilizing the output of the LME model in a discriminant function.3,11-14 Extensions to multiple longitudinal profiles have also been developed using multivariate LME models,15-17 as well as non-parametric approaches in the form of functional discriminant analysis.18,19 Instead of employing a discriminant function for classifying individuals into groups, alternative approaches involve using summary measures derived from an LME model, such as subject-specific slopes or intercepts, as covariates in a logistic regression model. This can be achieved through a two-stage approach20,21 or a joint modeling approach,4,5,22,23 where the parameters of the mixed model and the logistic regression model are estimated simultaneously. In the two-stage approach, some researchers have employed non-parametric or non-LME models to effectively capture longitudinal trajectories.24-27 In addition to mixed-effects models, a more straightforward approach disregards the multilevel structure and serial correlation (i.e. historical information) of the longitudinal data. This is achieved by using a varying-coefficient model, in this case a (logistic) regression model with an interaction term between time and the covariate of interest.28,29 Finally, in clinical settings, reference growth charts are widely utilized as a tool to differentiate abnormal growth patterns in infants. Using specific percentile threshold values, measurements can be categorized as normal or abnormal, helping to identify atypical growth patterns.30 Standard reference growth charts, which do not take into account covariates or past history, are not specifically intended for screening purposes. In contrast, conditional growth charts that consider growth history are recommended as a diagnostic tool for detecting and screening unusual growth patterns.31,32 Although growth charts are inherently designed to detect abnormal growth, the underlying technique of quantile regression (QR) can be applied to any covariate, extending its applicability beyond growth assessment.33
This article is organized as follows. First, we describe the different modeling approaches that are compared. Second, we describe the method used to simulate data from a mechanistic model and present the results of the different modeling approaches applied to the simulations. Finally, we apply the approaches to the motivating example as an illustration and show the potential clinical benefit. The article concludes with a discussion.
Methods
We consider the situation in which a single cardiac biomarker is measured sparsely (for some patients down to one or two measurements) and irregularly up to 24 hours after surgery. During this period, a complication in the form of a PMI can occur. However, the exact time at which the complication occurs is unknown and the diagnosis is (dis)confirmed at a later time (taking multiple clinical factors into account). Therefore, the outcome is considered binary, indicating whether the patient was (at some point) diagnosed with a PMI. Our goal is to actively monitor patients for a potential PMI, based only on the accrual of information from biomarker measurements. To monitor patients, we require a
Raw value
Arguably the most straightforward approach is to use the raw value
Modeling approaches
Since we are interested in the potential benefit of different statistical modeling approaches, we use a flexible model for the longitudinal profile of
SGC can be estimated through QR. QR aims at fitting the
In this study, we implement the CGC as described by Wei et al.32 in the QGAM framework of Fasiolo et al.36 We expand (4) as follows:
An alternative to the growth chart approach that directly models the probability of the outcome
The GFLM is described by Müller24 for observations recorded on dense grids of points.24 The general idea is to reduce the dimension of the longitudinal data by an orthogonal expansion of the random effects and to use the first few components of the expansion as covariates in a generalized linear model, for example, a logistic regression model. This procedure can also be applied to our study, with some modification, as observations are irregularly and sparsely observed. We model
The LDA approach consists of two steps. In the first step, the inherently infinite-dimensional curves are projected onto a low-dimensional space; in the second step, the low-dimensional representation is used to perform discriminant analysis. In this study, we implement both a covariance pattern longitudinal discriminant analysis (COV-LDA) as described by Roy et al.39 and a functional linear discriminant analysis (F-LDA) as described by James and Hastie.18
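The two steps can be illustrated with a deliberately simplified sketch. Here the projection step is a per-subject least-squares line (intercept and slope), which is only a stand-in assumed for illustration and is neither the COV-LDA nor the F-LDA projection; the second step is a standard linear discriminant rule with a pooled covariance matrix. All function names and the toy curves are hypothetical.

```python
import math

def line_coefficients(times, values):
    """Step 1: project a sparse curve onto a 2-dimensional space (intercept, slope).

    Needs at least two distinct measurement times per subject.
    """
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return (y_bar - slope * t_bar, slope)

def mean_vec(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def fit_lda(features, labels):
    """Step 2: linear discriminant analysis with a pooled 2x2 covariance matrix."""
    groups = {g: [f for f, y in zip(features, labels) if y == g] for g in (0, 1)}
    means = {g: mean_vec(rows) for g, rows in groups.items()}
    s = [[0.0, 0.0], [0.0, 0.0]]
    for g, rows in groups.items():
        for r in rows:
            d = [r[0] - means[g][0], r[1] - means[g][1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(features) - 2                      # two group means estimated
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [means[1][j] - means[0][j] for j in range(2)]
    w = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    mid = [(means[0][j] + means[1][j]) / 2 for j in range(2)]
    prior = len(groups[1]) / len(features)
    b = -(w[0] * mid[0] + w[1] * mid[1]) + math.log(prior / (1 - prior))
    return w, b

def posterior_pmi(model, feature):
    """Posterior probability of group 1 under the fitted linear discriminant rule."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w[0] * feature[0] + w[1] * feature[1] + b)))

# Hypothetical toy curves: (times in hours, biomarker values, group label).
curves = [
    ([1, 6, 12], [1.0, 1.4, 1.1], 0),
    ([2, 8, 20], [0.8, 1.5, 0.9], 0),
    ([1, 5, 18], [1.2, 1.0, 0.7], 0),
    ([2, 7, 15], [1.0, 2.5, 4.0], 1),
    ([1, 9, 22], [1.5, 3.0, 6.5], 1),
    ([3, 10, 16], [0.9, 2.8, 3.9], 1),
]
features = [line_coefficients(t, y) for t, y, _ in curves]
model = fit_lda(features, [g for _, _, g in curves])
p_rising = posterior_pmi(model, line_coefficients([2, 6, 14], [1.1, 2.0, 3.6]))
p_flat = posterior_pmi(model, line_coefficients([1, 7, 13], [1.0, 1.2, 1.0]))
```

With such a small and well-separated toy sample the posteriors are extreme; the point is only the structure of the procedure, in which a new subject's measurements are first reduced to a low-dimensional summary that is then passed to the discriminant rule.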
COV-LDA
The COV-LDA model consists of a linear additive model with a parametric intercept term, a factor smooth interaction term, and a covariance pattern (i.e. correlation structure) to model dependence among observations within a single patient. The factor smooth interaction term allows for separate smooths for both PMI and non-PMI patients:
The alternative F-LDA approach does not assume a correlation structure for the residual error but utilizes random effects to capture variability between patients. In the F-LDA approach, we model the profile
After fitting (11) and (12), we can obtain estimates for a new subject given a set of measurement times
To compare the different approaches, we simulate data based on a biexponential (pharmacokinetic) model that reflects the release of a cardiac biomarker after surgery and clearance from the circulation by the kidneys; see the following equation:

Plot of simulated data from the non-linear mixed effects (NLMEs) bi-exponential model. Left: marginal/fixed effects for perioperative myocardial infarction (PMI) and non-PMI patients. Right: fixed effects combined with sampled subject-specific effects and residual error.
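As a rough illustration of how data of this kind could be generated, the sketch below simulates sparse, irregularly sampled profiles from a generic biexponential curve with subject-specific random effects. The functional form `amp * (exp(-k_el * t) - exp(-k_rel * t))`, every parameter value, the lognormal random effects, and the names `biexp`, `simulate_subject`, and `simulate_dataset` are assumptions made for illustration; they are not the model specification or the values used in this study.

```python
import math
import random

def biexp(t, amp, k_rel, k_el):
    """Generic biexponential curve: release at rate k_rel, elimination at rate k_el."""
    return amp * (math.exp(-k_el * t) - math.exp(-k_rel * t))

def simulate_subject(rng, pmi, n_obs):
    """Simulate one sparsely and irregularly sampled profile on [0.5, 24] hours."""
    # Hypothetical fixed effects: PMI patients get a larger amplitude and slower clearance.
    amp = (3.0 if pmi else 1.0) * math.exp(rng.gauss(0.0, 0.3))   # lognormal random effect
    k_el = (0.03 if pmi else 0.08) * math.exp(rng.gauss(0.0, 0.2))
    k_rel = 0.5 * math.exp(rng.gauss(0.0, 0.2))
    times = sorted(rng.uniform(0.5, 24.0) for _ in range(n_obs))  # irregular sampling times
    # Multiplicative residual error on top of the subject-specific curve.
    values = [biexp(t, amp, k_rel, k_el) * math.exp(rng.gauss(0.0, 0.1)) for t in times]
    return {"pmi": pmi, "times": times, "values": values}

def simulate_dataset(n_subjects, event_rate, seed=1):
    """Simulate a dataset with a given (hypothetical) outcome event rate."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        pmi = rng.random() < event_rate
        n_obs = rng.randint(2, 5)   # sparse: 2 to 5 measurements per subject
        data.append(simulate_subject(rng, pmi, n_obs))
    return data

data = simulate_dataset(n_subjects=200, event_rate=0.1)
```

Changing `event_rate` or the range passed to `rng.randint` gives a simple way to vary the outcome event rate and the degree of sparsity in such a toy setup.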
Since the goal of this study is to perform dynamic longitudinal classification, we focus on the performance of the approaches in a dynamic fashion. We generate separate training and test sets and fit all the approaches to the training sets. Then, we evaluate the time-dependent area under the ROC curve (AUC) by calculating the AUC at each
Implementation
All approaches are implemented using
Results
The time-dependent AUC is plotted over time for different degrees of sparsity in Figure 2 and for different event rates in Figure 3. Numerical values for different degrees of sparsity/event rates can be found in the Supplemental Material. The functional regression approaches (GFLM and F-LDA) show a clear benefit in discriminative ability starting from

Time-dependent area under the ROC curve (AUC) under high and low degrees of sparsity. Each approach was fitted on a training set and performance was evaluated in a time-dependent fashion on a test set. This process was repeated 50 times to obtain means and standard errors. SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.

Time-dependent area under the ROC curve (AUC) under low and high outcome (PMI) event rates. Each approach was fitted on a training set and performance was evaluated in a time-dependent fashion on a test set. This process was repeated 50 times to obtain means and standard errors. PMI: perioperative myocardial infarction; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
The dynamic classification AUC is reported in Tables 1 and 2 for different degrees of sparsity and event rates, respectively. These tables reflect the performance when taking the maximum value of the monitoring statistic for each patient in the time interval
The AUC with standard error in round brackets, when using each approach to dynamically classify new patients under a high degree of sparsity (75% missingness) and a low degree of sparsity (10% missingness).
Each approach was fitted on a training set and performance was evaluated on a test set. This process was repeated 50 times to obtain means and standard errors. AUC: area under the ROC curve; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
The AUC with standard error in round brackets, when using each approach to dynamically classify new patients under a low event rate (5%) and a high event rate (20%).
Each approach was fitted on a training set and performance was evaluated on a test set. This process was repeated 50 times to obtain means and standard errors. AUC: area under the ROC curve; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
In Tables 3 and 4, the sensitivity, specificity, threshold, and average run length (ARL) of each approach are given under varying degrees of sparsity and outcome event rates. The threshold is based on the Youden index. The F-LDA and GFLM approaches show the best results in terms of combining high sensitivity and high specificity, but at the expense of a somewhat longer ARL than the raw value.
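The stopping rule and the resulting ARL can be sketched as follows. The threshold maximizes the Youden index J = sensitivity + specificity - 1; that it is selected on each patient's maximum monitoring statistic, and that the ARL is averaged over flagged patients only, are assumptions of this sketch, as are the function names and the toy data.

```python
def youden_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing J = sens + spec - 1.

    A patient is classified positive when the score is strictly above the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > thr)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= thr)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec - 1 > best[0]:
            best = (sens + spec - 1, thr, sens, spec)
    return best[1], best[2], best[3]

def first_alarm_time(times, stats, thr):
    """Time of the first measurement whose monitoring statistic exceeds thr, else None."""
    for t, s in zip(times, stats):
        if s > thr:
            return t
    return None

# Hypothetical toy data: (measurement times in hours, monitoring statistic, PMI label).
patients = [
    ([1, 4, 9], [0.10, 0.20, 0.15], 0),
    ([2, 6, 12], [0.05, 0.30, 0.25], 0),
    ([1, 5, 10], [0.20, 0.60, 0.80], 1),
    ([3, 8, 20], [0.10, 0.35, 0.70], 1),
]
max_stats = [max(stats) for _, stats, _ in patients]
labels = [y for _, _, y in patients]
thr, sens, spec = youden_threshold(max_stats, labels)

# Apply the threshold as a stopping rule along each patient's measurement sequence.
alarms = [first_alarm_time(t, s, thr) for t, s, _ in patients]
flagged = [a for a in alarms if a is not None]
arl = sum(flagged) / len(flagged)   # mean time until a positive classification
```

In this toy example the selected threshold is 0.3, both PMI patients are flagged (at 5 and 8 hours), and the ARL is 6.5 hours. A lower threshold would shorten the ARL at the cost of specificity, which is the trade-off reported in the tables.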
Sensitivity, specificity, threshold, and ARL in hours with standard error in round brackets, for each approach for high and low degrees of sparsity.
A threshold based on the Youden index was used as a stopping rule. That is, if, for a patient, the monitoring statistic of an approach rises above this threshold, the patient is classified as positive. Note that the monitoring statistic represents a value in case of the raw value, a quantile in case of the SGC and CGC approaches and a probability in the VCM, GFLM, COV-LDA, and F-LDA approaches. The mean time until a positive classification is represented by the ARL. ARL: average run length; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
Sensitivity, specificity, threshold, and average run length (ARL) in hours with standard error in round brackets, for each approach for low and high event rates.
A threshold based on the Youden index was used as a stopping rule. That is, if for a patient the monitoring statistic of an approach rises above this threshold, the patient is classified as positive. Note that the monitoring statistic represents a value in case of the raw value, a quantile in case of the SGC and CGC approaches and a probability in the VCM, GFLM, COV-LDA, and F-LDA approaches. The mean time until a positive classification is represented by the ARL. ARL: average run length; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
As described in Section 1, this study is motivated by the need to detect patients who experience PMI after having undergone CABG surgery, based on serial measurements of a cardiac biomarker. After surgery, cardiac biomarkers are repeatedly sampled in patients to detect a possible PMI. A PMI is defined as a procedural myocardial infarction whose pathogenesis is multifactorial and can be graft-related or non-graft-related.42,43 Examples of graft-related PMI include graft failure due to occlusion, kinking, or overstretching. Non-graft-related PMI can result from procedural difficulties like trauma from surgical manipulation or inadequate myocardial protection. The post-operative rise of cardiac biomarkers, in particular cardiac troponin (cTn), can reflect myocardial damage originating from either (early) graft failure or non-graft-related causes. In the former case, minimizing the time to a re-intervention is crucial to save viable myocardium. However, all patients experience an unavoidable increase in cardiac biomarkers, simply as a result of the procedure itself. For this study, a dataset of 639 patients who underwent CABG surgery at Catharina Hospital in Eindhoven, The Netherlands, is available. For more details on the study, see Deneer et al.44 For each patient, cardiac troponin T (cTnT) was sampled up to 24 hours after surgery and the outcome (PMI yes/no) was recorded. Sampling of cTnT was irregular and more frequent in the first 6 hours after surgery (see Figure 4(b)). Patients with a PMI generally show a sustained release of cTnT from damaged myocardium, instead of a rising-and-falling trend in the first 24 hours after surgery44-46 (see Figure 4(a)). As this is a clinical case study and there are only a small number of cases, leave-one-subject-out cross-validation was used to estimate performance. To calculate the time-dependent AUC, only patients who had at least two measurements before

Overview of study data. (a) Spaghetti plot of time after aortic unclamping against the

Predictions from the varying-coefficient model (VCM) and generalized functional linear model (GFLM) approaches. (a) Contour plot with contour lines in red, reflecting the predicted probability of a perioperative myocardial infarction (PMI) by the VCM approach. Note that the predictions are on the linear predictor scale, which can be converted to a probability by applying the inverse logit (logistic) function. For example, the “0” contour line represents the line with a probability of a PMI of 0.5. (b) This plot shows the three eigenfunctions that explain 95% of the variance, extracted by the FACEs approach of the GFLM. The first eigenfunction
The area under the ROC curve (AUC) and the 95% confidence interval for each of the non-historic approaches applied to the study dataset using leave-one-subject-out cross-validation and information up to time
cTnT: cardiac troponin T; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models.
The area under the ROC curve (AUC) and the 95% confidence interval for each of the historic approaches applied to the study dataset using leave-one-subject-out cross-validation and information up to time
cTnT: cardiac troponin T; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
The area under the ROC curve (AUC) and the 95% confidence interval of the maximum value for the study dataset using leave-one-subject-out cross-validation.
cTnT: cardiac troponin-T; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
Performance of different modeling approaches when defining a threshold equal to the sensitivity of the cTnT guideline.
TP: true positives; FP: false positives; TN: true negatives; ARL: average run length; cTnT: cardiac troponin-T; SGCs: static growth charts; CGCs: conditional growth charts; VCMs: varying-coefficient models; COV-LDA: covariance pattern longitudinal discriminant analysis; F-LDA: functional longitudinal discriminant analysis; GFLM: generalized functional linear model.
In this study, we described and compared several popular semi-parametric modeling approaches that combine irregularly and sparsely sampled measurements with a binary outcome. Our results show that functional regression models that implicitly incorporate historic information through the estimation of a covariance function outperform models that do not incorporate historic information. The GFLM performed best among the approaches that incorporate historic information, while the VCM and static growth charts performed best among the approaches that do not. The degree of sparsity affects the ability of the functional regression approaches to outperform the non-historic approaches. Under settings with very high degrees of sparsity (down to a few measurements per subject), there is little benefit in incorporating historic information through (functional) random effects. The event rate of the outcome also has an impact on discriminative ability; lower event rates reduce discriminative ability, but this effect is equal across approaches. Except for settings with a high degree of sparsity and for conditional growth charts, all modeling approaches show a benefit in terms of discriminative ability in a dynamic classification setting, compared to the clinical practice of using a fixed threshold on the raw measured value.
CGCs appear to be less suitable for (dynamic) classification of irregularly and sparsely sampled curves. In part, this is because growth charts are not developed with classification in mind.32 The CGC approach, which explicitly incorporates historical information through autoregressive (AR) terms, seems to offer a benefit in early detection of cases but not for later time points (see Figures 2 and 3). The CGC model used in this study, as defined in (6), is referred to as a “global model” by Wei et al.32 This model is restrictive in the sense that it assumes that AR coefficients are linear functions of measurement time distances. Wei et al. describe several generalizations of the global model, for example, allowing the AR coefficients to be functions of measurement time distances. These generalizations could improve the performance of the CGC model. Moreover, in this example, an AR(1) model was used, restricting historic information to the previous measurement only; including higher-order lags could improve performance. This also explains the counterintuitive finding depicted in Figure 2, where the CGC approach performs worse under
The functional regression approaches (F-LDA and GFLM) performed best on the simulated data. Although the F-LDA and GFLM approaches perform similarly in this study, this may not always be the case. Hughes et al.47 compared three different approaches to calculate a patient’s posterior group membership based on random effects and concluded that the marginal approach (comparable to our F-LDA approach) performs best when the mean profile is noticeably different between groups. The GFLM approach could be a better option if the difference between the groups is characterized by the variability around the mean profile. The GFLM approach uses the principal component (PC) scores as covariates in a logistic regression model. However, using PC scores as predictors is not without downsides. With PC scores, there is no guarantee that the groups are separated best in the direction of the PC scores with the highest variance.48 In this study, we choose PC scores based on the percentage of variance explained, but an alternative approach could be to apply a variable selection technique to choose PC scores based on their ability to separate groups. The F-LDA also outperformed the COV-LDA approach; this could be a result of the continuous-time AR(1) correlation structure being too simple to effectively model the data. Different correlation structures could improve the performance of the COV-LDA approach, as could incorporating class-specific correlation structures.
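The difference between the two ways of choosing PC scores can be made concrete with a small sketch. The first function keeps the leading components until a target share of variance is reached; the second ranks components by how well their scores separate the groups, here measured by an absolute standardized mean difference, which is one simple criterion assumed for illustration and not a technique taken from this study. Eigenvalues, scores, and function names are hypothetical.

```python
import statistics

def n_components_for_variance(eigenvalues, target=0.95):
    """Smallest number of leading components whose cumulative variance share reaches target."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += ev
        if cumulative / total >= target:
            return k
    return len(eigenvalues)

def separation(scores0, scores1):
    """Absolute standardized mean difference of one PC score between two groups."""
    pooled_var = (statistics.variance(scores0) + statistics.variance(scores1)) / 2
    return abs(statistics.mean(scores1) - statistics.mean(scores0)) / pooled_var ** 0.5

# Hypothetical eigenvalues of an estimated covariance function.
k = n_components_for_variance([6.0, 3.0, 0.8, 0.2])

# Hypothetical PC scores per group: PC1 has the larger variance, PC2 separates the groups.
pc_scores = [
    {"non_pmi": [-2.0, 0.0, 2.0], "pmi": [-1.5, 0.5, 2.5]},   # PC1
    {"non_pmi": [-0.5, 0.0, 0.5], "pmi": [1.5, 2.0, 2.5]},    # PC2
]
ranking = sorted(
    range(len(pc_scores)),
    key=lambda j: separation(pc_scores[j]["non_pmi"], pc_scores[j]["pmi"]),
    reverse=True,
)
```

In this toy example three components are needed to reach 95% of the variance, while the separation-based ranking places the second component ahead of the first, illustrating that the highest-variance direction need not be the most discriminative one.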
In different simulation scenarios, we investigated the (potential) effect of the degree of sparsity and event rates on the performance of the dynamic longitudinal classification approaches. When comparing high versus low degrees of sparsity (the high scenario corresponds to an average of four measurements per subject and the low scenario to 12 measurements per subject), the functional regression approaches suffer the most in terms of discriminative ability. This is not surprising, as there is less historic information available to estimate random effects in a sparse setting. However, this is not to say that the functional approaches offer no benefit in very sparse situations, but rather that it takes longer for this benefit to appear in a longitudinal setting. As can be seen in Figure 2, the functional approaches outperform the other approaches at
The GFLM and F-LDA approaches did not outperform the growth chart and VCM approaches in the clinical case study. However, this does not invalidate the conclusions from the simulations for several reasons. First, the clinical case study is comparable in the number of measurements per subject to the simulated data with a high degree of sparsity. Therefore, approaches based on historic information are more affected in terms of predictive performance, as can also be seen in the simulation results (Table 1). Second, while the clinical case study is comparable to the high degree of sparsity in terms of measurements per subject, the sparsity in the clinical case study is not evenly distributed over the time interval. There is more dense sampling at
Supplemental Material
sj-pdf-1-smm-10.1177_09622802251374288 - Supplemental material for A comparison of semi-parametric statistical modeling approaches to dynamic classification of irregularly and sparsely sampled curves
Supplemental material, sj-pdf-1-smm-10.1177_09622802251374288 for A comparison of semi-parametric statistical modeling approaches to dynamic classification of irregularly and sparsely sampled curves by Ruben Deneer, Zhuozhao Zhan, Edwin Van den Heuvel, Astrid GM van Boxtel, Arjen-Kars Boer, Natal AW van Riel and Volkher Scharnhorst in Statistical Methods in Medical Research
Supplemental Material
sj-pdf-2-smm-10.1177_09622802251374288 - Supplemental material for A comparison of semi-parametric statistical modeling approaches to dynamic classification of irregularly and sparsely sampled curves
Supplemental material, sj-pdf-2-smm-10.1177_09622802251374288 for A comparison of semi-parametric statistical modeling approaches to dynamic classification of irregularly and sparsely sampled curves by Ruben Deneer, Zhuozhao Zhan, Edwin Van den Heuvel, Astrid GM van Boxtel, Arjen-Kars Boer, Natal AW van Riel and Volkher Scharnhorst in Statistical Methods in Medical Research
