
Abstract

Background

To explore the accuracy of a system combining virtual reality (VR) and artificial intelligence (AI) for screening pediatric strabismus.

Methods

A total of 131 subjects aged 3 to 18 years were enrolled in this study, of whom 110 were included in the final data analysis. Among them, 28 were normal, 60 had external strabismus, 18 had internal strabismus, and 4 had vertical strabismus. Each patient was independently diagnosed and evaluated by two strabismus and pediatric ophthalmologists, and the mean of their measurements was used as the gold standard. All patients completed the AI system examination within 2 minutes. All data were statistically analyzed using SPSS and MedCalc. The agreement between the two methods for the diagnosis and classification of strabismus was assessed with the Kappa consistency test. The agreement between the two methods for measuring the angle of ocular deviation was assessed by Bland-Altman plots and the intraclass correlation coefficient (ICC), and linear regression plots were used to analyze the correlation between the two methods.

Results

The system screened for strabismus with a sensitivity of 83.0%, a specificity of 79%, and moderate agreement with manual results (Kappa = 0.562, p < 0.001). The system performed well in the diagnosis and classification of strabismus (Kappa = 0.749 for external strabismus, Kappa = 0.898 for internal strabismus, and Kappa = 1 for vertical strabismus, p < 0.001). In measuring the angle of ocular deviation in esotropia, the two methods were strongly correlated (R = 0.7595) and highly consistent (p > 0.05) in the near mode; in the far mode, although the correlation was high (R = 0.7652) and the ICC indicated high consistency (ICC = 0.689), the Bland-Altman plots showed poorer consistency. In the exotropia group, the two methods were correlated in the near and far modes (R = 0.731 and 0.561), but the agreement was low (ICC < 0.4, p < 0.05). In the vertical strabismus group in the near mode, the correlation between the two methods was weak (R = −0.2455) and the ICC was not statistically significant, although the agreement was good (p > 0.05 for the Bland-Altman plot). There was no correlation or agreement between the two methods in the far mode in the vertical strabismus group (R = 0, p < 0.05, ICC not statistically significant).

Conclusions

The system combining VR and AI can be used clinically to screen pediatric strabismus with high sensitivity and specificity. The system performs well in the diagnosis and classification of strabismus and can accurately calculate the ocular deviation angle in patients with esotropia. However, the calculation of the ocular deviation angle in patients with exotropia and vertical strabismus is still deficient and needs further development.

Keywords

Virtual reality, artificial intelligence, strabismus, pediatric

Introduction

Strabismus is a common pediatric ophthalmic disease¹ that presents as ocular deviation,² leading to impaired visual function and affecting the physical and mental health of children. Lee et al.³ showed that there was a moderate association between each strabismus type (esotropia, exotropia, and hypertropia) and anxiety disorder, schizophrenia, bipolar disorder, and depressive disorder. During clinical treatment, we found that most children with strabismus came to the clinic only after their parents noticed the strabismus during medical check-ups or in daily life. In some cases, the development of stereoscopic vision had been affected by late detection of strabismus. Studies have shown that visual development in children occurs from birth through 7 to 8 years of age, and eye disease during this period can lead to irreversible consequences.⁴ Therefore, early detection and treatment of strabismus is essential. Currently, the commonly used clinical examination for strabismus is the prismatic covering test: the type of strabismus is determined by the alternating covering test (ACT) and the cover-uncover test (CUT), and the angle of strabismus is determined by the alternating prism covering test (APCT). Given the young age and poor cooperation of children with strabismus, it is difficult for nonspecialists to diagnose strabismus quickly and accurately through APCT. Moreover, the small number of specialists in strabismus and pediatric ophthalmology cannot meet the demand for large-scale screening of the large population of children in China, and there is currently no clinical medical equipment that can replace manual strabismus diagnosis.

Artificial intelligence was first proposed by McCarthy in 1956, referring to technology used to mimic human behavior. Building on this, Arthur Samuel introduced the concept of machine learning (ML) in 1959, emphasizing the importance of systems learning automatically from experience rather than being explicitly programmed. Deep learning (DL), a subfield of ML, uses neural networks with multiple processing layers to learn latent features in data, in a manner loosely analogous to the human brain.⁵ DL has been applied in ophthalmology for image recognition (e.g. to classify diabetic retinopathy and retinopathy of prematurity⁶) and to improve the accuracy and efficiency of examinations⁷ (e.g. visual field examinations for glaucoma⁸). Studies have shown that AI technology can quickly build models, process data, and even simulate the manual strabismus cover test,⁹ which makes it possible to develop strabismus screening equipment. On this basis, combined with VR to build a virtual scenario, we designed a new technology for strabismus screening and collected data from 110 children aged 3 to 18 years in the clinic to evaluate the accuracy of this system for clinical pediatric strabismus screening.

Methods

Participants

The study protocols adhered to the Declaration of Helsinki and were approved by the Ethics Committee of the Beijing Tongren Hospital, Capital Medical University (No. TRECKY2020-088). This study is a clinical diagnostic trial with a gold standard.

Based on the number of previous outpatient visits, the ratio of strabismus patients to nonstrabismus patients at the first visit to the Strabismus and Pediatric Ophthalmology Department was estimated to be 1:2, giving a prevalence of 1/3. With a pre-estimated sensitivity of the diagnostic test (SN) of 0.9, a permissible error (L) of 0.10, and a test level of α = 0.05, the formula gave a total sample size N1 of 105 cases. With a pre-estimated specificity of the diagnostic test (SP) of 0.85, a permissible error (L) of 0.10, and a test level of α = 0.05, the formula gave a total sample size N2 of 74 cases. Taking the larger of the two values, at least 105 cases needed to be included in this study. Allowing for incomplete data caused by children's noncooperation during the examination, and assuming a 90% completeness rate, a total sample size of N = 105/0.9 = 117 cases was required.
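The text does not state which formula was used. As a minimal sketch, assuming the standard sample size formula for diagnostic accuracy studies (subgroup size = z² · p(1 − p) / L², then divided by the share of the cohort in that subgroup) with z = 1.96, the reported figures can be reproduced. The function name and the rounding order (round the subgroup size up first, then scale by prevalence) are assumptions chosen because they match the reported 105, 74, and 117.

```python
import math
from fractions import Fraction


def total_sample_size(expected, margin, cohort_share, z=1.96):
    """Total cohort size needed to estimate `expected` (sensitivity or
    specificity) within +/- `margin`, when only `cohort_share` of the
    cohort belongs to the relevant subgroup (diseased or non-diseased)."""
    # Subjects needed inside the subgroup itself, rounded up.
    subgroup = math.ceil(z ** 2 * expected * (1 - expected) / margin ** 2)
    # Scale up to the whole cohort, rounded up again.
    return math.ceil(subgroup / cohort_share)


prevalence = Fraction(1, 3)  # strabismus : nonstrabismus = 1 : 2

n1 = total_sample_size(0.90, 0.10, prevalence)       # driven by sensitivity
n2 = total_sample_size(0.85, 0.10, 1 - prevalence)   # driven by specificity
required = max(n1, n2)

# Inflate for an assumed 90% data completeness rate.
enrolled = math.ceil(required / Fraction(9, 10))

print(n1, n2, required, enrolled)
```

Exact fractions are used for the prevalence and completeness rate so that the final rounding is not affected by floating-point error.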

Because of the low completeness rate of outpatient data, 131 subjects were ultimately enrolled consecutively between February 2023 and February 2024 at the outpatient clinic of strabismus and pediatric ophthalmology of Beijing Tongren Hospital, China. These subjects were children aged 3 to 18 years who were attending an ophthalmology clinic for the first time and were able to fixate with both eyes. All subjects had normal intellectual development, had no mental illness, and were able to cooperate with the examination; their guardians agreed to participate in the project and signed an informed consent form.

System description

The system consists of a main computer, a screen, a head-mounted VR device, and purpose-built AI analysis software. The head-mounted VR device (Figure 1) was the “Qiyou 2S” from Aqiyi corporation, with a resolution of 3840 × 2160. It includes an infrared camera that captures eye movement data in real time and displays it on the computer screen; the AI analysis software uses these data to diagnose strabismus and calculate the angle of ocular deviation.

Figure 1.

VR device (Aqiyi Qiyou 2S). VR: virtual reality.

The AI software includes alternating covering test, covering-uncovering test, and ocular motility examination, all of which include two modes of looking near and looking far. Various examination procedures can be selected by the interface of the VR device or projected on the computer screen and selected by the doctor. The time to complete all the examinations is <2 minutes.

We refer to and update the calculations of Yeh et al.¹⁰ and Miao et al.¹¹ by setting up a coordinate system that captures the deviation of the pupil in the horizontal and vertical directions and incorporating it into the following equation: $Dev_{\Delta} = \frac{DE_{mm}}{DE_{p}} \times dpMM \times Dev_{p}$

where $DE_{p}$ is the detected iris diameter in pixels; $DE_{mm}$ is the average adult iris diameter, 11 mm; dpMM = 15Δ is the average deviation in prism diopters per millimeter of displacement; $Dev_{p}$ is the deviation in pixels; and $Dev_{\Delta}$ is the deviation in prism diopters.
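The conversion above can be sketched in a few lines. This is an illustration of the equation only, not the system's software: the function name and the pixel values in the example are hypothetical, while the constants 11 mm and 15Δ per millimeter come from the definitions above.

```python
ADULT_IRIS_DIAMETER_MM = 11.0   # DE_mm in the equation
PD_PER_MM = 15.0                # dpMM in the equation


def deviation_in_prism_diopters(dev_px, iris_diameter_px):
    """Convert a pupil deviation measured in pixels to prism diopters.

    dev_px           -- Dev_p, deviation of the pupil in pixels
    iris_diameter_px -- DE_p, detected iris diameter in pixels
    """
    if iris_diameter_px <= 0:
        raise ValueError("iris diameter in pixels must be positive")
    # The iris acts as a ruler: known size in mm / measured size in pixels.
    mm_per_pixel = ADULT_IRIS_DIAMETER_MM / iris_diameter_px
    return mm_per_pixel * PD_PER_MM * dev_px


# Hypothetical frame: iris spans 110 px, pupil is displaced by 12 px.
example_pd = deviation_in_prism_diopters(dev_px=12, iris_diameter_px=110)
print(example_pd)
```

Because the iris diameter is used as the scale reference, the result does not depend on camera distance or image resolution, only on the ratio of the two pixel measurements.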

The AI model used in this study is a medical image segmentation network called SS-SwinuNet, which is used for image segmentation, recognition, and computation. After the eye movement video is captured by the infrared camera built into the VR device, the key frames are extracted and the eye region in each image is segmented by Swinunet. The model recognizes the pupil and iris and then applies image-processing algorithms to obtain the eye movement offsets. The development set of the model comprises 10,000 images from TEyeD and 5000 images of children with clinical strabismus; 70% of the dataset was used as the training set, 15% as the validation set, and 15% as the test set. TEyeD is a large publicly available dataset of ocular images, while the 5000 images of children with strabismus were obtained from our hospital database. The model achieves an average recognition and segmentation accuracy of 95.36% on the development set. To evaluate the accuracy of SS-SwinuNet in recognizing strabismus in children, an external validation set was created in this study by consecutively collecting eye movement data from 131 patients.
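The 70/15/15 partition of the 15,000 development images can be illustrated as follows. The text does not say how images were assigned to each subset, so the seeded random shuffle and the function name here are assumptions made for the sketch.

```python
import random


def split_indices(n_items, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle item indices and cut them into train / validation / test.

    The test set receives whatever remains after the first two cuts.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_items * train_frac)
    n_val = round(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# 10,000 TEyeD images plus 5000 clinical images.
train, val, test = split_indices(10_000 + 5_000)
print(len(train), len(val), len(test))
```

In practice one might also split by patient rather than by image, so that frames from the same child do not appear in more than one subset; the source does not say whether this was done.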

Examination process

All subjects were examined by two strabismus and pediatric ophthalmologists in separate outpatient clinics, and the results were recorded. Subjects were then examined in a separate room with the VR equipment, and the results were recorded. Subjects wore the VR equipment and gazed at the reticle in a white virtual scene. The operator selected the appropriate mode for the test, and covering was simulated by switching off the screen. For ACT, the screen was switched off for 2 seconds in front of one eye and then for 2 seconds in front of the other eye, for 4 sets. In CUT, the screen of one eye was switched off for 2 seconds and then restored for 2 seconds, which constituted 1 set; this was repeated for 4 sets and then for 4 sets for the other eye. The system automatically recorded the changes in the positions of the two eyes and calculated the result.

Figures 2 and 3 show the eye movement data displayed in real time on the screen while the subjects were performing the ACT in near vision mode.

Figure 2.

ACT in near mode. ACT: alternating covering test.

Figure 3.

ACT in near mode. ACT: alternating covering test.

Statistical analysis

All data analysis was performed with SPSS 29.0. When the two doctors disagreed on the type of strabismus, the diagnosis was decided after discussion between them. The angle of ocular deviation was taken as the average of the two doctors' measurements. The manual results were used as the gold standard against which the AI system was compared. Agreement in strabismus diagnosis was assessed using the Kappa consistency test, and p < 0.05 was considered statistically significant. The agreement of the strabismus angles obtained by the AI system with the gold standard was assessed using Bland–Altman plots and intraclass correlation coefficients (ICC) with 95% confidence intervals, based on a two-way random-effects model with absolute agreement and single measurements.
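The ICC form named here (two-way random effects, absolute agreement, single measurement) is commonly written ICC(2,1) and can be computed from the mean squares of a two-way ANOVA. The sketch below is a plain-Python illustration of that textbook formula, not the SPSS procedure itself; the paired angles are hypothetical and confidence intervals are omitted.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` holds one row per subject, each row with one value per method.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of methods (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, methods, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical paired angles in PD: (manual, AI) for five subjects.
pairs = [(20, 18), (35, 30), (12, 15), (45, 41), (28, 25)]
print(round(icc_2_1(pairs), 3))
```

Unlike a correlation coefficient, this form penalizes a systematic offset between the two methods through the between-method mean square, which is why two methods can correlate strongly yet have a low ICC.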

Results

In this study, 131 subjects were enrolled consecutively. Ten patients had combined horizontal and vertical strabismus and were not included in the statistical analysis, and 11 patients with incomplete data were also excluded. The final data of 110 subjects were included and are shown in Table 1.

Table 1.

Demographic information of all patients (manual result).

Parameter	Patients (N = 110)
Age (year)	7.60 ± 2.90
Sex
Male, no (%)	50 (45.5)
Female, no (%)	60 (54.5)
Type of strabismus^a
Normal, no (%)	28 (25.5)
Exotropia, no (%)	60 (54.5)
Esotropia, no (%)	18 (16.4)
Vertical, no (%)	4 (3.6)

^a All were diagnosed by PACT at distance (6 m) or near (33 cm). Exotropia: <−10 PD horizontally; esotropia: >10 PD horizontally; vertical: >5 PD vertically; normal: ≤10 PD horizontally and ≤5 PD vertically.

PACT: prism alternating cover test; PD: prism diopter.

The demographic characteristics of the study subjects are listed in Table 1. Mean age was 7.60 ± 2.90 (standard deviation) years; 50 (45.5%) were male and 60 (54.5%) were female. The sensitivity of the AI system for screening strabismus was 83%, the specificity was 79%, and the accuracy was 82%, showing moderate agreement with the manual results (Kappa = 0.562, p < 0.001).
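To make the relationship between these screening figures and the Kappa statistic concrete, the sketch below computes Cohen's kappa from a 2 × 2 table. The cell counts are not reported in the text; the values used here (68, 14, 6, 22) are an assumption, back-calculated from 82 strabismus and 28 normal subjects so that they are consistent with a sensitivity of about 83%, a specificity of about 79%, and an accuracy of about 82%.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for agreement between two binary classifications."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement: product of the marginal totals for each category.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


# Assumed counts (not reported in the source), manual diagnosis as reference.
tp, fn, fp, tn = 68, 14, 6, 22

sensitivity = tp / (tp + fn)
specificity = tn / (fp + tn)
accuracy = (tp + tn) / (tp + fn + fp + tn)
kappa = cohen_kappa(tp, fn, fp, tn)

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} "
      f"accuracy={accuracy:.3f} kappa={kappa:.3f}")
```

Kappa is lower than the raw accuracy because, with roughly three quarters of the cohort having strabismus, a substantial share of agreement would be expected by chance alone.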

In this study, the data in the exotropia group were tested for normality by the Kolmogorov–Smirnov (K-S) test, and the normal, esotropia, and vertical strabismus groups were tested for normality by the Shapiro–Wilk (S-W) test. The results showed that the AI data of the exotropia group did not conform to normal distribution in both near and far viewing modes (p = 0.0125 and p = 0.0015), and the manual results conformed to normal distribution (p = 0.0697 and p = 0.0175). The esotropia group conformed to normal distribution in both near and far modes (p > 0.05). The vertical strabismus group conformed to normal distribution in the near mode (p > 0.05) but did not fully conform to normal distribution in the far mode (AI: p = 0.598 and manual: p = 0.0239). The normal group did not conform to normal distribution in either the near or the far mode (p < 0.05).

The data of the patients with strabismus were further analyzed and the results are shown in Table 2.

Table 2.

Basic information of AI results.

Parameter	Exotropia (n = 60)	Esotropia (n = 18)	Vertical (n = 4)
Age (year)	8.45 ± 2.75	7.28 ± 3.34	6.25 ± 3.40
Sex
Male, no (%)	30 (50.0)	8 (44.4)	2 (50.0)
Female, no (%)	30 (50.0)	10 (55.6)	2 (50.0)
Sensitivity (%)	76.7	88.9	100
Specificity (%)	100	98.9	100
Near mode (33 cm)
Amount of deviation of PACT (PD)^a	−18.9 ± 12.3	24.0 ± 14.6	8.3 ± 2.4
ICC	0.319	0.817	0.221
95% CI	(−0.098,0.655)	(0.561,0.931)	(−0.675,0.917)
p-Value	<0.001	<0.001	0.353
Median	−16.5	25.5	9
Interquartile range	15.75	23.75	4.25
Range	(−55, 2)	(−1, 47)	(5, 10)
Far mode (6 m)
Amount of deviation of PACT (PD)^a	−18.7 ± 14.9	31.3 ± 17.5	24.0 ± 13.7
ICC	0.334	0.764	0.004
95% CI	(−0.067,0.618)	(0.456,0.910)	(−0.316,0.790)
p-Value	<0.001	<0.001	0.496
Median	−15	34.5	23.5
Interquartile range	18.75	31.375	26
Range	(−62, 13)	(−2, 61)	(11, 39)

^a Mean ± standard deviation. Exotropia: <−10 PD horizontally; esotropia: >10 PD horizontally; vertical: >5 PD vertically; normal: ≤10 PD horizontally and ≤5 PD vertically.

AI: artificial intelligence; ICC: intraclass correlation coefficient; PACT: prism alternating cover test; PD: prism diopter.

The demographic characteristics of the 60 exotropia subjects are listed in Table 2. Mean age was 8.45 ± 2.75 (SD) years (range, 4–18 years); 30 (50%) were male and 30 (50%) were female. The sensitivity for diagnosing exotropia was 76.7%, the specificity was 100%, and the accuracy was 87.2%, with strong agreement with the manual results (Kappa = 0.749, p < 0.001). The mean strabismus angle was −18.9 ± 12.3 (SD) PD at 33 cm and −18.7 ± 14.9 (SD) PD at 6 m (range, −10∼ −85 PD). Agreement of the AI system with the manual results, expressed as ICC, was low for the exotropic near mode (ICC = 0.391; 95% CI, −0.098 to 0.655) and far mode (ICC = 0.334; 95% CI, −0.067 to 0.618).

Although the AI system showed low agreement in measuring exotropia, it performed better in esotropia. As listed in Table 2, the mean age of the 18 esotropia subjects was 7.28 ± 3.34 (SD) years (range, 3–18 years); 8 (44.4%) were male and 10 (55.6%) were female. The sensitivity for diagnosing esotropia was 88.9%, the specificity was 98.9%, and the accuracy was 97.3%, with strong agreement with the manual results (Kappa = 0.898, p < 0.001). The mean strabismus angle was 24.0 ± 14.6 (SD) PD at 33 cm and 31.3 ± 17.5 (SD) PD at 6 m (range, 5∼55 PD). The ICC was high for the esotropic near mode (ICC = 0.817; 95% CI, 0.561 to 0.931) and far mode (ICC = 0.764; 95% CI, 0.456 to 0.910).

The demographic characteristics of the 4 vertical strabismus subjects are also listed in Table 2. Mean age was 6.25 ± 3.40 (SD) years (range, 3–11 years); 2 (50%) were male and 2 (50%) were female. The sensitivity for diagnosing vertical strabismus was 100%, the specificity was 100%, and the accuracy was 100%, with strong agreement with manual results (Kappa = 1, p < 0.001). The mean strabismus angle was 8.3 ± 2.4 (SD) PD at 33 cm (range, 6–10 PD) and 24.0 ± 13.7 (SD) PD at 6 m (range, 10–39 PD). The ICC was not statistically significant for either the near or the far mode.
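The per-type sensitivity, specificity, and accuracy reported above treat each strabismus type in turn as the positive class and every other subject as negative. A minimal sketch of that one-versus-rest calculation from paired labels follows; the function name and the eight-subject label lists are hypothetical and are not the study data.

```python
def one_vs_rest(manual, ai, target):
    """Sensitivity, specificity, and accuracy for one class against all others.

    `manual` is the reference label per subject and `ai` the system's label.
    Assumes the target class and at least one other class occur in `manual`.
    """
    labelled = list(zip(manual, ai))
    tp = sum(m == target and a == target for m, a in labelled)
    fn = sum(m == target and a != target for m, a in labelled)
    fp = sum(m != target and a == target for m, a in labelled)
    tn = sum(m != target and a != target for m, a in labelled)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(labelled)
    return sensitivity, specificity, accuracy


# Hypothetical labels for eight subjects.
manual = ["exo", "exo", "exo", "eso", "eso", "normal", "normal", "vertical"]
ai     = ["exo", "exo", "normal", "eso", "eso", "normal", "eso", "vertical"]

for label in ("exo", "eso", "vertical"):
    sens, spec, acc = one_vs_rest(manual, ai, label)
    print(f"{label}: sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

With this framing, a rare class such as vertical strabismus contributes very few positive cases, so its sensitivity rests on a handful of subjects even when specificity is estimated from the whole cohort.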

Because only the esotropia group conformed to a normal distribution in the near and far viewing modes, the ICC results described above only roughly reflect the agreement between the two methods. To ensure the accuracy of the results, Bland–Altman plots and linear correlation analysis plots were used in this study, and the results are shown below.

The linear regression plots for the two methods in the near and far modes in the exotropia group are shown in Figures 4 and 5. The exotropia group showed a strong correlation between the two methods in near mode (R = 0.731, p < 0.0001) and a weaker, moderate correlation in far mode (R = 0.561, p < 0.0001).

Figure 4.

Linear correlation plots of the manual results versus AI results in 33 cm for the exotropia group. AI: artificial intelligence.

Figure 5.

Linear correlation plots of the manual results versus AI results in 6 m for the exotropia group. AI: artificial intelligence.

As shown in Figures 6 and 7, the mean deviation in the exotropia group was −23.2 PD in the near viewing mode, with 95% limits of agreement of −49.8 to 3.5 PD. The mean deviation was −17.9 PD in the far viewing mode, with 95% limits of agreement of −50.2 to 14.4 PD, which is well outside the range acceptable for accurate assessment. In addition, the difference between the measurements of the two methods was statistically significant (p < 0.0001) for exotropia in both the near and far viewing modes, indicating that there was no consistency between the two methods in the exotropia group.
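The mean deviation and 95% limits of agreement in a Bland–Altman analysis are conventionally the mean of the paired differences and that mean ± 1.96 standard deviations. The sketch below illustrates the calculation with hypothetical angles; taking the difference as manual minus AI follows the figure captions, and the use of the sample standard deviation is an assumption.

```python
import statistics


def bland_altman_limits(manual, ai):
    """Return (mean difference, lower limit, upper limit) for paired data."""
    diffs = [m - a for m, a in zip(manual, ai)]  # manual minus AI
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)            # sample SD (n - 1)
    half_width = 1.96 * sd_diff
    return mean_diff, mean_diff - half_width, mean_diff + half_width


# Hypothetical exotropia angles in PD (negative = outward deviation).
manual_pd = [-20.0, -35.0, -12.0, -45.0, -28.0]
ai_pd = [-15.0, -22.0, -14.0, -30.0, -19.0]

mean_diff, lower, upper = bland_altman_limits(manual_pd, ai_pd)
print(f"mean difference {mean_diff:.1f} PD, limits {lower:.1f} to {upper:.1f} PD")
```

Wide limits mean that for an individual child the two methods may disagree by tens of prism diopters, even when the two sets of measurements are correlated across the group.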

Figure 6.

Bland–Altman plot of the difference between manual and AI results versus the average of the manual and AI results in 33 cm for exotropia. Upper and lower dotted lines represent the 95% limits of agreement. The solid line represents the mean difference, which was −23.2 PD. AI: artificial intelligence; PD: prism diopter.

Figure 7.

Bland–Altman plot of the difference between manual and AI results versus the average of the manual and AI results in 6 m for exotropia. Upper and lower dotted lines represent the 95% limits of agreement. The solid line represents the mean difference, which was −17.9 PD. AI: artificial intelligence; PD: prism diopter.

Figures 8 and 9 demonstrate the linear regression plots for the esotropia group in the near and far viewing modes. The esotropia group showed a strong correlation between the two methods in both near and far viewing (R = 0.7595, R = 0.7652, p < 0.001).

Figure 8.

Linear correlation plots of the manual results versus AI results in 33 cm for the esotropia group. AI: artificial intelligence.

Figure 9.

Linear correlation plots of the manual results versus AI results in 6 m for the esotropia group. AI: artificial intelligence.

As shown in Figures 10 and 11, the esotropia group showed excellent agreement between the two modalities in the near looking mode, but not in the far looking mode.

Figure 10.

Bland–Altman plot of the difference between manual and AI results versus the average of the manual and AI results in 33 cm for esotropia. Upper and lower dotted lines represent the 95% limits of agreement. The solid line represents the mean difference, which was 4.3 PD. AI: artificial intelligence; PD: prism diopter.

Figure 11.

Bland–Altman plot of the difference between manual and AI results versus the average of the manual and AI results in 6 m for esotropia. Upper and lower dotted lines represent the 95% limits of agreement. The solid line represents the mean difference, which was −8.7 PD. AI: artificial intelligence; PD: prism diopter.

In the near mode, the difference between the measurements of the two methods was not statistically significant (p = 0.0957). The Bland–Altman regression equation was y = 3.3729 + 0.03454x; the 95% CIs for the intercept (−8.1734 to 14.9193) and slope (−0.3572 to 0.4263) both contained 0, with p-values greater than 0.05. Neither the intercept nor the slope differed significantly from 0, indicating that there was no proportional difference between the two methods of measurement. The Coefficient of Repeatability (CR) and its 95% CI were 21.3161 (16.1067 to 31.5228). In the far mode, the mean deviation was −8.7 PD and the 95% limits of agreement were −30.8 to 13.5 PD. The difference between the two methods’ measurements was statistically significant (p = 0.0046), indicating that there was no consistency between the two methods.
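The Bland–Altman regression mentioned here fits the paired differences (y) against the paired averages (x); a slope that differs from 0 indicates a proportional difference between the methods. The following is a minimal ordinary least squares sketch with hypothetical data. It returns only the intercept and slope, and the confidence intervals and p-values reported in the text are not computed.

```python
def bland_altman_regression(manual, ai):
    """Fit difference = intercept + slope * average by least squares."""
    diffs = [m - a for m, a in zip(manual, ai)]         # y values
    averages = [(m + a) / 2 for m, a in zip(manual, ai)]  # x values
    n = len(diffs)
    x_bar = sum(averages) / n
    y_bar = sum(diffs) / n
    sxx = sum((x - x_bar) ** 2 for x in averages)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(averages, diffs))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical esotropia angles in PD.
manual_pd = [12.0, 20.0, 25.0, 33.0, 41.0, 47.0]
ai_pd = [10.0, 14.0, 22.0, 27.0, 38.0, 40.0]

intercept, slope = bland_altman_regression(manual_pd, ai_pd)
print(f"y = {intercept:.4f} + {slope:.4f} x")
```

A slope near 0 means the disagreement is roughly constant across small and large deviations, whereas a clearly nonzero slope means the disagreement grows or shrinks with the size of the angle.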

Figures 12 and 13 show the linear regression plots of the two methods for the vertical strabismus group in the near and far viewing modes. There was no correlation between the two methods in the vertical strabismus group in both near and far looking modes (R = 0.2455, R = 0, p > 0.1).

Figure 12.

Linear correlation plots of the manual results versus AI results in 33 cm for the vertical strabismus group. AI: artificial intelligence.

Figure 13.

Linear correlation plots of the manual results versus AI results in 6 m for the vertical strabismus group. AI: artificial intelligence.

As shown in Figure 14, in the near mode for vertical strabismus the mean deviation was 2 PD and the 95% limits of agreement were −5 to 9 PD. The difference between the measurements of the two methods was not statistically significant (p = 0.3429). The Bland–Altman regression equation was y = −2.8525 + 0.5246x; the 95% CIs for the intercept (−45.6751 to 39.9702) and slope (−4.0051 to 5.0543) both included 0, with p-values greater than 0.05. Neither the intercept nor the slope differed significantly from 0, indicating that there was no proportional difference between the two methods of measurement. The CR and its 95% CI were 7.2015 (4.3147 to 20.6939). As shown in Figure 15, in the far mode the mean deviation was −15 PD and the 95% limits of agreement were −41.0 to 11.9 PD. The difference between the two methods of measurement was not statistically significant (p = 0.1169). The Bland–Altman regression equation was y = 17.3101 − 1.9582x, with 95% CIs for the intercept and slope of (−0.3823 to 35.0025) and (−2.9661 to −0.9503), respectively. The interval for the intercept contained 0, but the slope differed significantly from 0 (p = 0.0140), indicating a proportional difference between the two methods of measurement. The CR and its 95% CI were 37.5226 (22.4810 to 107.8233). This suggests that the vertical strabismus group has high agreement between the two methods in the near mode, but that there is a proportional error in the far mode. The difference between the two methods decreases with increasing strabismus until approximately 12 PD; beyond that point, the agreement between the two methods decreases as the degree of strabismus increases.

Figure 14.

Bland–Altman plot of the difference between manual and AI results versus the average of the manual and AI results in 33 cm for vertical strabismus. Upper and lower dotted lines represent the 95% limits of agreement. The solid line represents the mean difference, which was 2 PD. AI: artificial intelligence; PD: prism diopter.

Figure 15.

Bland–Altman plot of the difference between manual and AI results versus the average of the manual and AI results in 6 m for vertical strabismus. Upper and lower dotted lines represent the 95% limits of agreement. The solid line represents the mean difference, which was −15 PD. AI: artificial intelligence; PD: prism diopter.

Discussion

This article describes the diagnostic accuracy of a system combining VR and AI for a single type of pediatric strabismus. The system diagnosed strabismus with moderate agreement with manual results (Kappa = 0.562) and high sensitivity (83%) and specificity (79%). The system performed better in the diagnosis of the esotropia and vertical strabismus groups, with high sensitivity and specificity and high agreement with manual results (Kappa = 0.898 for the esotropia group and Kappa = 1 for the vertical strabismus group), although the number of patients in the vertical strabismus group was small and this result was not clinically representative. In the exotropia group, although the specificity of the system was as high as 100%, the sensitivity was only 76.7%, which could be related to wandering gaze given the young age of the patients. The sensitivity of only 88.9% in the esotropia group may also be related to this factor. Taken together, the system performs well in the diagnosis of strabismus and the classification of single strabismus.

In terms of measuring the angle of ocular deviation, the system was most accurate in patients with esotropia in the near mode. The agreement between the two methods was high (ICC = 0.689, p > 0.05) and they were strongly correlated. This may be related to the clinical presentation of esotropia, which is generally more influenced by accommodation and shows a more stable, constant deviation, and therefore a more stable angle of ocular deviation during occlusion. However, the consistency and correlation between the two methods were poor in the other groups. The system had very poor agreement in the exotropia group, although the correlation was high. This may be related to the type of strabismus: there were 46 patients with intermittent exotropia in this study. The deviation angle of intermittent exotropia is affected by control, accommodation, and attention, so its presentation is unstable, and it needs to be measured after fusion has been sufficiently broken in order to obtain stable results. Currently, the monocular occlusion time of the system is 2 seconds in all cases, which does not sufficiently break fusion and may lead to inaccurate measurement of the deviation angle in the exotropia group. The results are consistent with the clinical manifestations and pathogenetic features of the disease, so it is feasible to determine the ocular deviation angle with this system, but the exact degree of strabismus in patients with exotropia needs to be determined by specialists in further studies. The vertical strabismus group performed poorly in both near and far modes, but the results are not clinically representative because of the small sample size.

Clinical measurement of the strabismus angle usually relies on APCT, whose accuracy depends on the examiner's skill and which takes a long time to perform. With the development of technology in recent years, many studies have begun to explore the application of eye-tracking technology to strabismus diagnosis as a replacement for APCT. In 2018, Kohen et al.¹² reported a technique to diagnose strabismus by video, which has a high accuracy. However, the study was conducted on adults and the accuracy in children needs to be further investigated. Park et al. developed a technique for recognizing exotropia through video screenshots, which showed an excellent correlation with the manual results (R = 1), and the Bland–Altman plots also showed no discrepancy, suggesting that the technique is highly accurate in diagnosing exotropia.¹³ The results are complementary to those of the present study, and the technique should be drawn on in subsequent studies. However, the study population in that article was mainly adults and the diagnostic efficacy for strabismus in children is unclear. Rai et al.¹⁴ developed an eye-tracking system that can be used for the diagnosis of strabismus in patients aged 3 to 41 years, and their results were positively correlated and consistent. This suggests that it is feasible to measure strabismus using an eye-tracking system. However, how to ensure that all children understand and cooperate with the examination is a problem that remains to be solved. Zou et al.¹⁵ applied eye-tracking technology to children and found that it could be used as an alternative to APCT for the diagnosis and measurement of strabismus in children. With the development of technology, both the video-oculography (VOG) technique and the Gazelab technique have matured, and a study by Cerdan et al.¹⁶ showed that VOG and APCT correlate well and that VOG can be used as an alternative to manual examination of strabismus. Palazón et al.¹⁷ compared these two techniques with the covering test separately; the VOG Perea (VP) had very high concordance, while the Gazelab (GL) system had poor concordance. The new system for strabismus screening in this study was more accurate than the GL but still slightly less accurate than the VP technique, and it needs to be further improved in subsequent studies. In addition, VR and AR technologies have developed rapidly in recent years. In this study, a virtual scene built with VR technology was used to diagnose strabismus, while Nixon et al. diagnosed ocular deviation with an AR device and explored the performance of AR technology in 19 patients with strabismus and 7 normal controls. The results showed moderate correlation (R = 0.62) between the AR technique and the APCT technique.¹⁸ The diagnostic accuracy of that study was lower than that of the present study, but the accuracy of the measurements in this study for different types of strabismus still needs to be improved. Similarly, Li et al.¹⁹ investigated the diagnostic efficacy of a video ophthalmoscope with low-cost hardware for strabismus in children. The results showed that the video glasses can support telemedicine, diagnose strabismus, and measure the angle of ocular deviation, which makes it possible to disseminate low-cost strabismus screening devices widely in the future. Currently, this study focuses on the diagnosis of common strabismus, whereas the eye-tracking technique developed by Orduna et al.²⁰ can examine paralytic strabismus with more accurate results than manual examination. Further refinement of the diagnosis of multiple strabismus types is needed in subsequent studies.

This study still has some limitations. First, although this study can meet the need for large-scale screening of pediatric strabismus, it did not consider the effect of the Kappa angle, which may lead to false-positive results and waste of medical resources due to the individual differences in children’s development. Second, there were only four patients in the vertical strabismus group, and none of the results were clinically representative, requiring further study.

Conclusion

Footnotes

Acknowledgements

The authors would like to thank the Iqiyi company for providing the equipment. They would also like to thank each author for their hard work on this research.

ORCID iD

Yu-Meng Wang

Author contributions

JF was involved in study design and protocol draft; WWC and XXL in study protocol draft; YMW in research design, research execution, data manipulation, and manuscript preparation; and MXJ and JWC in data acquisition and research execution. All the authors reviewed the study protocol and approved the final manuscript.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was financially supported by the Beijing Hospitals Authority Clinical Medicine Development of special funding support (codes: XMLX202103 and YGLX202506).

References

1. Stidwill. Epidemiology of strabismus. Ophthalmic Physiol Opt 1997; 17: 536–539.

2. Kraus, Kuwera. What is strabismus? JAMA 2023; 329: 856.

3. Lee, Repka, Borlik, et al. Association of strabismus with mood disorders, schizophrenia, and anxiety disorders among children. JAMA Ophthalmol 2022; 140: 373–381.

4. Reid, Eaton. Artificial intelligence for pediatric ophthalmology. Curr Opin Ophthalmol 2019; 30: 337–346.

5. LeCun, Bengio, Hinton. Deep learning. Nature 2015; 521: 436–444.

6. Ting DSW, Pasquale, Peng, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol 2019; 103: 167–175.

7. Tao, Khodeiry, et al. An extensive-form game paradigm for visual field testing via deep reinforcement learning. IEEE Trans Biomed Eng 2024; 71: 514–523.

8. Tao, Khodeiry, et al. Deep reinforcement learning for optimized visual field analysis. Invest Ophthalmol Visual Sci 2021; 62: 2.

9. Yehezkel, Belkin, Wygnanski-Jaffe. Automated diagnosis and measurement of strabismus in children. Am J Ophthalmol 2020; 213: 226–234.

10. Yeh P-H, Liu C-H, Sun M-H, et al. To measure the amount of ocular deviation in strabismus patients with an eye-tracking virtual reality headset. BMC Ophthalmol 2021; 21: 246.

11. Miao, Jeon, Park, et al. Virtual reality-based measurement of ocular deviation in strabismus. Comput Methods Programs Biomed 2020; 185: 105132.

12. Kohen, Orge. Strabismus evaluation with a new videooculograph device (GazeLab). J AAPOS 2018; 22: 4.

13. Park, et al. A quantitative analysis method for comitant exotropia using video-oculography with alternate cover. BMC Ophthalmol 2018; 18: 80.

14. Rajendran, Rai, Shetty, et al. Comparison of measurements between manual and automated eyetracking systems in patients with strabismus—a preliminary study. Indian J Ophthalmol 2022; 70: 3625–3628.

15. Zou, Tian, Wygnanski-Jaffe, et al. Effectiveness and repeatability of eye-tracking-based test in strabismus measurement of children. Semin Ophthalmol 2022; 37: 502–508.

16. Canto-Cerdan, Martinez-Abad, Siverio-Colomina, et al. Comparative analysis of strabismus measurement using a video oculagraphy system and alternate prism cover test. Asia Pac J Ophthalmol (Phila) 2023; 12: 582–590.

17. Palazón, Ventosa, Moreno, et al. Study of reliability and validity of VOG Perea and GazeLab and calculation of the variability of their measurements. Arch Soc Esp Oftalmol (Engl Ed) 2021; 96: 127–132.

18. Nixon, Thomas PBM, Jones. Feasibility study of an automated Strabismus screening Test using Augmented Reality and Eye-tracking (STARE). Eye (Lond) 2023; 37: 3609–3614.

19. Nguyen, Kolin, et al. Evaluation of streamed hardware-to-software telemedicine strabismus consultations utilizing video glasses. Clin Ophthalmol 2022; 16: 3927–3933.

20. Orduna-Hospital, Maurain-Orera, Lopez-de-la-Fuente, et al. Hess Lancaster screen test with eye tracker: an objective method for the measurement of binocular gaze direction. Life (Basel) 2023; 13: 668.

Accuracy of a system combining virtual reality and artificial intelligence for screening pediatric strabismus
