Abstract
Introduction
Recognizing sport-related concussion (SRC) is multifactorial and involves reviewing the mechanism of injury and kinematics, assessing the athlete's signs and symptoms, and performing a thorough clinical examination. The examination of an athlete suspected of sustaining a concussion has evolved and now includes a variety of assessments that evaluate symptoms, clinical signs, ocular motor and vestibular function, and neurocognitive performance.1,2 Many of these examinations are compiled within the Sport Concussion Assessment Tool-5 (SCAT5),3–5 including the Standardized Assessment of Concussion (SAC). 6 The SAC is reportedly the most utilized objective concussion test, but a recent study identified that the SAC had poor diagnostic accuracy and reliability. 7
Another assessment instrument attracting increasing interest is the King-Devick test (K-DT), a rapid number naming task that functions as a pseudo-reading test, broadly capturing aspects of afferent visual function, attention, language, visual fixation, and saccadic eye movements.8,9 An advantage of the K-DT is that it can be utilized by attending clinicians with reasonable ease and has been validated as a sensitive sideline performance measure for concussion detection.1,10–17 The K-DT has been reported to be a high-performing objective sideline concussion screening test, having the highest reliability, sensitivity, and specificity, 7 and is the only vision and ocular assessment tool whose diagnostic accuracy has been investigated within an adult sporting population. 1 The summary sensitivity and specificity of the K-DT were 0.77 and 0.82, respectively. 1 In contrast, Fuller et al. 18 showed a sensitivity of 59.6 and specificity of 39.2 in rugby players diagnosed with concussion, and it is suggested that the K-DT should be one of several assessments used in concussion recognition and not a stand-alone test. 18
Outside its application for concussion, the K-DT exhibited strong test-retest reliability in the Eddy et al. 19 study with an exercise or rest intervention; however, that study reported a high false-positive rate, leading the authors to suggest that clinicians use caution when interpreting its results. The association between the K-DT and exertion or exercise is important because the test is intended for use in sporting activities, which have exertion as an integral component, and exertion may play a role in test performance.20–22 For example, Rist et al. 23 showed improved K-DT performance after 15 min of high-intensity (80% of predicted maximal heart rate) exercise but not moderate-intensity (65% of predicted maximal heart rate) exercise. This is supported by a meta-analysis and systematic review 24 that showed improved K-DT scores of 1.4 s, in the pooled dataset, following vigorous exercise in the absence of concussion.
The K-DT was expanded to include supplementary hardware to its computer-based number-reading test in 2014; the K-DT with integrated infrared video-oculography-based eye tracker (K-D ET) leverages a portable light bar that directly mounts onto a laptop
Although the K-D ET single-package unit has been available for a few years, there is a paucity of literature reporting its validity against other similar, or even research-grade high-resolution, eye-tracking units to establish inter-tracker validity, or its reliability in research trials on concussion or with exercise or exertion. Exceptions include the Onge et al. 30 study, which utilized two identical K-D ETs with first-generation eye trackers, with participants alternating between the systems on successive trials. In a secondary phase of that study, participants performed the testing on five successive days and twice on each system. The results showed that several of the systems’ ET measurements lacked face validity, and the authors concluded the systems could not be used for scientific research. However, the K-D ET unit utilized in the Onge et al. 30 study has been discontinued. Hecimovich et al. 31 utilized the current second-generation eye tracking unit, an infrared-based, video-oculographic rig (120Hz VT3-Mini, EyeTech Digital Systems, Mesa, AZ) and a laptop, as part of their study of the K-D ET in youth (ages 13-14) Australian footballers. Their study reported that participants who had sustained a head impact recorded a slower mean total K-D ET time, fewer mean total saccades, and more mean blinks when compared with their baseline scores. The Hecimovich et al. 31 study identified total mean blinks as the most sensitive measure for potential SRC.
In light of the increasing interest in the K-DT and, by extension, its combined rapid number naming task and eye tracking unit, coupled with limited reliability evidence, it is vital to establish the test-retest reliability of the K-D ET system if it is to gain greater use in the recognition of SRC. Therefore, the aim was to provide evidence on the measurement properties of the K-D ET system. Specifically, the objectives were to measure the test-retest reliability of the King-Devick Eye Tracking system and determine if exercise influences the variables assessed by the King-Devick Eye Tracking unit.
Methods
Participants
Participants (N = 61; 26 male, 35 female; age range 19-25) were recruited on a university campus during the 2020-2021 academic year. Exclusion criteria included the use of bifocal, progressive, or other multi-focal corrective lenses at time of testing, or presence of an intraocular implant. Participants were required to have normal or corrected vision and ability to perform two cardiovascular activities at a vigorous rate. There were no exclusion criteria based on gender, ethnicity, race, or other demographics. Participants were allocated to the exercise or sedentary groups dependent upon the first contact with the primary investigator (i.e. first to exercise, second to sedentary etc.) until all participants were enrolled. Those in the exercise group (n = 31) were instructed to wear clothing and shoes for treadmill walking while those in the sedentary group (n = 30) were requested to bring a smartphone, with earphones, in order to listen to music. The project was approved by the Institutional Review Board (IRB 20-0131).
King-Devick Eye tracker test
Each participant sat in front of the K-D ET System, including an eye tracking unit, an infrared-based, video-oculographic rig (120Hz VT3-Mini, EyeTech Digital Systems, Mesa, AZ) and a laptop. Measurements from head to camera were made in order to ensure proper eye contact with the infrared camera. The K-D ET rig was positioned close to the bottom of the laptop screen while ensuring that it did not cover the display. A tape measure was used to position the subject 60 centimeters away from the K-D ET rig in a high back chair. Green lights on the K-D ET indicated correct positioning. For optimum conditions, the overhead lighting in the room was dimmed and outside window glare was minimized by darkened window shades. The participant was positioned in front of the screen and their eyes aligned to the center of the computer screen, in focus and clear. Participants were instructed to complete a calibration or pre-test by looking straight ahead and then at yellow targets in each corner of the screen. If the pupil, corneal reflections and/or crosshairs were not visible, out of focus, or unsteady in any of the viewing locations, the participant was cycled through the spatial location again to optimize the settings. At this point calibration commenced. With this, the participant followed a red target across the screen with their eyes.
Following calibration, a validation was conducted by verifying the eye position at 5 spatial ‘check’ points on the screen. Accurate viewing with a proper calibration changed the spatial check points blue. If successfully accomplished, the K-D ET test began. If the calibration was not accurate, calibration was repeated for best results. Corrective lenses were worn only if required for reading. Participants were instructed to read aloud the single-digit numbers displayed from left to right, top to bottom, as quickly as they could without making any errors. Participants were instructed not to use their hand or finger to help follow the numerical pattern. If the participant made an error and quickly corrected it, no error was recorded. An error was recorded for each omission, commission, and reversal. The K-D ET system provides post hoc eye movement and test analysis for the following measures: total saccades, average saccade velocity, peak saccadic velocity, total fixations, inter-saccadic interval (average fixation duration), average fixation polyarea, individual card, the number of blinks and overall completion time.
Intervention protocol
Participants were allocated into either an exercise group or a sedentary group (see Figure 1). Prior to testing, all participants had their vision assessed with a Snellen Eye Chart at 20 feet with corrective lenses, if applicable. Those who were assigned to the exercise group completed the 2020 PAR-Q+, an evidence-based pre-participation screening tool to determine if participants are able to participate in physical activity or exercise. 32

Participant allocation and exercise/rest protocol for the King-Devick eye tracking tests.
Participants in the exercise group completed two baseline K-D ET measurements and then moved into a separate room with a zero-grade horizontal treadmill. Once positioned on the treadmill, the participants were instructed to inform the investigator when they reached a level of 6 of 10 on a rate of perceived exertion scale (Borg CR-10 Scale), 33 indicating a level between strong (5) and very strong (7). This was achieved by increasing the speed of the treadmill in a relatively short period (< 1 min) until they reached the level of 6. Once they reached level 6, the participant walked for 10 min, without rest, with periodic adjustments to the speed to maintain the 6 level. This was classified as exercise session 1.
Participants in the sedentary group completed two baseline K-D ET measurements, remained in the testing room, and were instructed to listen to music on their own electronic device, with earphones; they were not permitted to use the electronic device visually. During this session, the room lights were dimmed, and they remained in the room listening to music for 10 min. Upon completion of sedentary session 1, they were tested on the K-D ET system and re-tested approximately 5 min later under identical conditions. Upon completion of sedentary session 1 and the K-D ET test, they were instructed to listen to their own music without electronic visual stimulation for 10 min. Upon completion of sedentary session 2, they were tested on the K-D ET system and re-tested approximately 5 min later under identical conditions. In all, each participant was tested six times on the K-D ET system.
Statistical analysis
Data analyses were conducted using SPSS v27 (IBM Corp. released 2020). The following dependent variables were computed by the K-D ET test: time to complete (seconds), total saccades, average saccade velocity (°/s), total fixations, inter-saccadic interval (msec) (average fixation duration) and average fixation polyarea (mm2). The K-D ET test records the variables for each card within the test, and the variables were subsequently transformed in different ways depending on what they represented: time to complete was the overall time to complete across all cards, total saccades was the sum of the values from all three cards, average saccade velocity was the average of the values from all three cards, peak saccadic velocity was the average peak saccadic velocity across all three cards, total fixations was the sum of the values from all three cards, inter-saccadic interval was the average of the values from all three cards, average fixation polyarea was the average of the values from all three cards and total blinks was the sum of the values from all three cards. An assessment of normality was performed using the Shapiro-Wilk test. The mean (SD) for all dependent variables was determined.
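As an illustration of the per-card transformations described above, the following minimal Python sketch sums or averages per-card values into one value per trial. The field names and numbers are hypothetical and chosen for illustration only; they are not the K-D ET export format or study data, and treating completion time as the sum of the card times is an assumption of this sketch.

```python
from statistics import mean

# Variables summed across the three cards. Treating completion time as the
# sum of card times is an assumption of this sketch.
SUMMED = ("time_s", "saccades", "fixations", "blinks")
# Variables averaged across the three cards.
AVERAGED = ("avg_saccade_velocity", "peak_saccade_velocity", "isi_ms", "polyarea_mm2")


def summarise_trial(cards):
    """Collapse per-card values into one value per variable for a single trial."""
    trial = {name: sum(card[name] for card in cards) for name in SUMMED}
    trial.update({name: mean(card[name] for card in cards) for name in AVERAGED})
    return trial


# Hypothetical per-card values (illustrative only, not study data).
example_cards = [
    {"time_s": 14.0, "saccades": 40, "fixations": 42, "blinks": 1,
     "avg_saccade_velocity": 150.0, "peak_saccade_velocity": 400.0,
     "isi_ms": 300.0, "polyarea_mm2": 1.0},
    {"time_s": 15.0, "saccades": 45, "fixations": 46, "blinks": 0,
     "avg_saccade_velocity": 160.0, "peak_saccade_velocity": 420.0,
     "isi_ms": 310.0, "polyarea_mm2": 1.5},
    {"time_s": 16.0, "saccades": 50, "fixations": 51, "blinks": 2,
     "avg_saccade_velocity": 170.0, "peak_saccade_velocity": 440.0,
     "isi_ms": 320.0, "polyarea_mm2": 2.0},
]

example_trial = summarise_trial(example_cards)
```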
The test-retest reliability of the K-D ET dependent variables was calculated from Trial 1 and Trial 2 (i.e. the trials before introducing the intervention). Reliability was not assessed using the repeated measures (i.e. trials three through six) following the introduction of the intervention, as the variables were deemed reliable from the pre-intervention trials and this method avoided potential confounders related to the intervention. The intraclass correlation coefficient (ICC) and 95% confidence interval of two measurements, ICC (3, 2), were calculated using a two-way, consistency, mixed-effects model. The ICC was considered to be poor, moderate, good and excellent if it was <0.50, 0.50-0.75, 0.75-0.90 and >0.90, respectively. 34
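For readers who wish to reproduce this type of reliability estimate outside SPSS, the sketch below computes the two-way mixed-effects, consistency ICC for two trials from the standard ANOVA mean squares, returning both the single-measure and the average-measure forms. It is a minimal illustration, not the SPSS routine used in this study; the function names are hypothetical, confidence intervals are omitted, and the handling of values that fall exactly on a band boundary is an assumption of this sketch.

```python
def icc_consistency(trial_1, trial_2):
    """Return (single-measure, average-measure) two-way mixed, consistency ICCs."""
    rows = list(zip(trial_1, trial_2))  # one row per participant
    n, k = len(rows), 2                 # participants, trials
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    # Two-way ANOVA without replication: participants (rows) by trials (columns).
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average


def classify_icc(icc):
    """Map an ICC to the bands used in the text (boundary handling is assumed)."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Illustrative values only (not study data).
single_icc, average_icc = icc_consistency([1, 2, 3, 4], [2, 1, 4, 3])
```

Because the consistency definition ignores systematic shifts between trials, a constant offset between Trial 1 and Trial 2 (for example, a uniform learning effect) does not lower the coefficient.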
The correlations between completion time and the variables assessed by the K-D ET test (total saccades, average saccade velocity, peak saccadic velocity, total fixations, inter-saccadic interval, average fixation polyarea and total blinks) were assessed using Pearson's r. The trial selected for the correlation analysis was the final trial (trial six) because of the possible learning effect reported within this study; it was therefore the trial most likely to represent the relationship between time to complete and each K-D ET variable.
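The correlation step can be sketched in a few lines of Python. The function below is a plain implementation of the Pearson product-moment formula, and the example values are hypothetical, not study data; significance testing is omitted.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical trial-six values for four participants (illustrative only).
completion_time_s = [40.0, 42.0, 44.0, 46.0]
total_saccades = [120, 126, 132, 138]  # rises with completion time
r_saccades = pearson_r(completion_time_s, total_saccades)
```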
The baseline score for the time to complete the test was determined as the fastest time from T1 and T2. For all other dependent variables, the baseline score was calculated as the average of the two pre-intervention K-D ET tests. The difference following intervention trial one was calculated as the ‘baseline score minus the post intervention score (T3)’ and are presented as mean (SD). The difference following intervention trial two was calculated as the ‘post intervention score for trial one after rest (T4) minus the post intervention score (T5)’ and are presented as mean (SD).
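The baseline and difference-score rules above can be written out as a short Python sketch. The function names and the example times are hypothetical and serve only to make the arithmetic explicit.

```python
def baseline_time(t1, t2):
    """Baseline completion time: the faster (smaller) of the two pre-intervention trials."""
    return min(t1, t2)


def baseline_other(v1, v2):
    """Baseline for all other variables: mean of the two pre-intervention trials."""
    return (v1 + v2) / 2


def intervention_differences(baseline, t3, t4, t5):
    """Return (difference after intervention one, difference after intervention two).

    Intervention one: baseline minus T3.
    Intervention two: T4 (trial one after rest) minus T5.
    """
    return baseline - t3, t4 - t5


# Hypothetical completion times in seconds for one participant (not study data).
base = baseline_time(45.0, 43.5)
diff_one, diff_two = intervention_differences(base, t3=42.0, t4=42.5, t5=41.0)
```

With this sign convention, a positive difference in completion time means the participant was faster after the intervention than at the reference trial.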
The differences following intervention one and intervention two were entered into a generalized estimating equation (GEE) as the dependent variable, with trial included as a repeated measure and participant as the subject variable to account for the repeated measurements. The intervention (either rest or exercise) and participant gender (female or male) were included as factors. Goodness of fit was examined using the Quasi-likelihood under Independence Model Criterion (QIC), with a lower value representing a better fit.
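For orientation, the model described above can be written in general GEE form. This is a sketch under stated assumptions: the identity link and all symbols are introduced here for illustration, and the working correlation structure is left unspecified because it is not stated in the text.

```latex
% d_{ij}: difference score of participant i at intervention trial j (j = 1, 2).
% Exercise_i and Female_i are indicator variables (rest = 0, male = 0);
% Trial_j indicates the second intervention trial. Identity link is assumed.
\[
  \mathrm{E}[d_{ij}] = \mu_{ij}
    = \beta_0 + \beta_1\,\mathrm{Exercise}_i + \beta_2\,\mathrm{Female}_i + \beta_3\,\mathrm{Trial}_j
\]
% Estimating equation solved for \beta, where d_i and \mu_i stack the repeated
% measures of participant i and V_i is their working covariance matrix.
\[
  \sum_{i=1}^{N} D_i^{\top} V_i^{-1}\bigl(d_i - \mu_i(\beta)\bigr) = 0,
  \qquad D_i = \frac{\partial \mu_i}{\partial \beta}
\]
```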
Results
A total of 61 participants (mean age 21.3
King-Devick eye-tracker test dependent variables for baseline and subsequent trials reported by mean and standard deviation.
SD = Standard Deviation; Baseline = fastest of trial 1 and 2; s = seconds; n = number; °/s = degrees per second; msec = milliseconds; mm2 = millimeters squared.
There were significant correlations reported between the completion time and five variables of the K-D ET test: total saccades; average saccade velocity; peak saccadic velocity; total fixations; inter-saccadic interval (Table 2). Total saccades, total fixations and the inter-saccadic interval increased as the test duration increased whereas the average saccadic velocity and peak saccadic velocity decreased as the test duration increased.
Correlation of King-Devick eye tracking variables to completion time.
*Statistically significant with p < 0.01.
Although the mean ± SD of the differences in the completion time across both trials was −0.04 ± 2.75, there were no observable differences within the GEE for intervention (
Generalized estimating equation for differences in the K-D ET by completion time, total saccades, average saccade velocity, fixation count, inter-saccadic interval, and fixation polyarea with quasi-likelihood under independence model criterion (QIC) for intervention, gender and trial, reported as beta coefficient (β), standard error (SE) with 95% confidence interval and probability value (p).
(a) = Rest set to 0, (b) = Male set to 0. β = Beta coefficient; SE = Standard Error; CI = Confidence Interval; *p < 0.05.
The mean ± SD of the difference of the total fixations before and after the intervention across both trials was 1.04 ± 3.63 and there was an observable difference detected by the GEE in the trial number (
The reliability of the peak saccade velocity and the total blinks was poor to moderate. The average fixation polyarea and total saccades had good test-retest reliability (see Table 4). All other variables tested showed excellent test-retest reliability.
Test-retest reliability of components of the King-Devick eye tracking test.
ICC = Intraclass correlation coefficient; CI = Confidence interval.
Discussion
The broad aim of this study was to provide evidence on the measurement properties of the K-D ET system, specifically to measure the test-retest reliability and determine if exercise influences the variables of the combined eye tracking unit.
The results of this study provide evidence on the reliability of the K-D ET, and in particular the eye-tracking components. Although previous research19,23,35 on the number naming task (time only) has shown the K-DT to be a reliable tool under an exercise intervention, these studies did not directly measure eye tracking components. This is vital as the use of the K-DT may be increasing and the combination of the K-DT and eye tracking as one single package highlights the need to specifically measure the reliability of this combined unit.
The evidence of reliability brought forth in this study needs to be accompanied by supporting validity evidence in future research. It is important to note that the K-D ET unit is not classified as a research-grade, or high-speed, system, which can range from 250 Hz to 2000 Hz in sampling rate.36–40 There is support in the literature that sampling rates below 200 Hz are not able to provide as many measurements as a high-speed system, in particular for peak saccade velocity, 38 as saccades are quick (900+ °/s)41,42 and brief (< 100 msec) 42 movements of the fovea from one fixation point to another. 43 Thus, the capacity for accurate measurement is limited by the available technology. The current study recorded eye measurements with a 120 Hz sampling rate and an angular error of approx. 0.5° / Drift <0.3°; this was fixed given the infrared oculographic tracking unit from a third party specifically utilized for the K-D ET. Therefore, the saccadic activity measured by the K-D ET unit may arguably be adequate.39,44 However, it is reasonable to report on these measurements for use in future research comparing the combined unit to a research-grade system, keeping in mind that until more definitive evidence is available for both reliability and validity, caution needs to be used when interpreting the results for clinical decision-making.
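The effect of sampling rate on how many measurements fall within a single saccade can be made concrete with simple arithmetic. The Python sketch below counts the samples captured in a window of a given duration; the 100 msec window reflects the upper bound cited above, while the 30 msec window is a hypothetical shorter saccade added for illustration.

```python
def sample_interval_ms(rate_hz):
    """Time between consecutive samples, in milliseconds."""
    return 1000.0 / rate_hz


def samples_in_window(duration_ms, rate_hz):
    """Whole number of samples that fit in a window of the given duration."""
    return int(duration_ms * rate_hz // 1000)


# 120 Hz unit versus a 1000 Hz research-grade unit.
samples_120_hz = samples_in_window(100, 120)     # 100 msec window at 120 Hz
samples_1000_hz = samples_in_window(100, 1000)   # 100 msec window at 1000 Hz
samples_short = samples_in_window(30, 120)       # hypothetical 30 msec saccade at 120 Hz
```

With only a handful of samples per saccade at 120 Hz, the true velocity peak can fall between samples, which is one reason peak saccade velocity is the measure most affected by lower sampling rates.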
There are several eye tracking units available on the market in the 30 Hz to 60 Hz range; these may be less than ideal for clinical use, a view supported in the literature. For instance, results from the Ooms et al. 38 study measuring the accuracy and precision of fixations revealed that an eye-tracking unit with a sampling rate of 30 Hz is less accurate than a 60 Hz system. Raynowski et al. 37 used the K-DT rapid number naming task with a research-grade eye tracking unit (EyeLink 1000+) against low-resolution units at 30 Hz and 60 Hz and reported that the low-resolution units yielded significantly fewer detectable saccades and greater variance of ISI. Leube et al. 45 compared 60 Hz and 120 Hz sampling rates against a research-grade 1000 Hz eye-tracker during a reading task measuring saccade characteristics and showed higher accuracy in the detection of fast eye movements and fixation durations with the 120 Hz system.
Although the aforementioned evidence supports the use of higher sampling rates, there are studies that have utilized eye-tracking units with lower sampling rates. For example, Lirani-Silva et al. 46 measured saccadic frequency, mean and peak velocity, and fixation frequency and duration during walking in people with mild traumatic brain injury using a Tobii Pro Glasses 2, 100 Hz eye tracking unit. They reported that participants with mild traumatic brain injury showed reduced saccade frequency, duration and peak velocity compared with healthy controls. However, in their study, they used a custom-made validated velocity-based saccade detection algorithm, which may have yielded more robust data. Recently, research using the K-D ET has been reported. Marchant et al. 47 used the K-D ET unit and reported on saccade velocity and fixation time (ISI) in their study assessing ankle and finger somatosensation and lower limb muscle activity assessing visuomotor control in both conditions. Tejani et al. 48 utilized the K-D ET combined unit when assessing baseline differential eye movements and visual contrast acuity in competitive athletes and reported on total number and frequency of saccades and completion time, but not saccade velocities. As noted previously, the Hecimovich et al. 31 study with young Australian footballers measured number of saccades and blinks.
The use of video-oculographic eye tracking has been studied28,49–52 in relation to concussion, and clinicians may employ it with other eye tracking devices, for example Tobii, SyncThink, and EyeLink. However, having the combined K-D ET unit may make for easier clinical utility, considering how effective the K-DT is for SRC evaluation.1,7 Furthermore, eye movement abnormalities can persist in the absence of other post-concussion symptoms;53–55 therefore, monitoring ocular motor function may assist return-to-play decision making. Unfortunately, current evidence does not have sufficient strength to inform clinical decision-making, and research needs to establish guidelines for each specific unit as they differ in task. 1
The results from the current study indicated that there were no observable differences in any of the eye tracking components analysed when comparing participants' gender and the intervention. However, when comparing saccadic changes, there were meaningful differences (saccades,
Although the results for total fixations revealed a meaningful difference in the trial number (
The K-D ET is specific to a horizontal reading task, and thus comparisons to studies using other methods are difficult. The K-D ET unit provides measurements of duration (time taken to complete the test), number of saccades, average saccade velocity (°/s), peak saccade velocity (°/s), number of fixations, ISI (ms), average fixation polyarea (mm2) and number of blinks. The average saccade velocity and average fixation polyarea are company-based metrics and were reported in the Onge et al. 30 and Marchant et al. 47 (average saccade velocity only) studies, both of which used the K-D ET, keeping in mind that the Onge et al. 30 study used the older, discontinued model. Average saccade velocity was mentioned in the Tad et al. 63 study; however, the usefulness of these two metrics needs to be established in the literature. There have been several studies26,60,64,65 that have utilized the rapid number naming component of the K-DT with other eye tracking units. The results reported in these studies, such as the number of saccades and ISI, are similar to those reported in the current study. For example, Rizzo et al. 64, investigating eye movements during sandbagging, utilized the K-DT with an EyeLink 1000 Plus infrared-based video-oculographic camera (500 Hz). They reported median ISI in the 300 ms to 400 ms range, number of saccades in a 130-172 range, and average peak velocities of 299.5°/s. Gold et al. 65 assessed the relation between ISI prolongation and prolonged K-DT testing with results showing median ISI values of 379
The limitations of this study need to be highlighted. For instance, participants in the exercise group were initially to perform the exercise activity at 60-70% HRmax; however, due to the COVID-19 restrictions set forth by the institutional review board, this was not permitted. As a result, a perceived exertion scale (Borg CR-10 Scale) 33 was utilized instead and may not have been sufficient to replicate fatiguing exercise. Participants in this study were a homogeneous group comprised of university students and were not representative of a wider population. This study appeared to be appropriately powered, with acceptable Quasi-likelihood under Independence Model Criterion for most variables within the GEE; however, even with a larger sample the effect sizes did not seem clinically relevant. Although reading errors were documented, participants were not required to redo their test if they had two or fewer errors per test. This decision was made to limit the amount of time participants needed to complete all the requirements. The K-D ET unit utilized in the current study was a second-generation model, whereas the Onge et al. 30 study utilized an older, now discontinued version, highlighting the need for further research on this new version with a focus on validating it against other eye tracking units. We did not measure recovery following the rest period for the exercise group and are unable to determine if ten minutes was sufficient. The 10-min time frame for the exercise sessions may have impacted the results, as Galetta et al. 11 reported that the K-DT has learning effects associated with repeat testing, with an improvement of 3.4 s in median times. However, by developing a baseline and assessing change based on the most recent test, rather than always comparing the time to baseline, we do not think this is likely to have impacted results from our generalized estimating equations.
Finally, the baseline for the K-D ET duration was the fastest time recorded, which replicates previous studies using the K-D test, 1 and this may influence results (as opposed to using the mean of two tests or the slowest time); however, given that the reliability of the test duration using these measures was reported as an ICC = 0.917, the influence on the results would likely be negligible.
Conclusion
The ICC results indicate the K-D ET system to be a reliable measure; however, because the testing sessions were close to each other, learning effects may have contributed to these results. Nonetheless, the combination of the K-DT number reading task and eye tracking unit has potential as a post sideline-assessment tool (to be used after standard sideline tests such as the K-DT number reading task and SCAT5) to aid in detecting SRC and monitoring recovery. However, empirical evidence to guide clinical decisions on eye movement abnormalities is lacking. Furthermore, research now needs to focus on the combined unit's ability to detect changes that occur with SRC to provide confidence for its clinical use.
Future research needs to focus on the reliability of the K-D ET combined unit and include a comparison against research-grade high-resolution eye tracking units to establish validity ahead of widespread clinical use in the athletic setting for concussion recognition and monitoring.
