Abstract
Keywords
Introduction
Orange juice is a popular product among consumers, with a per capita consumption of 16.6 pounds in 2023. 1 California is the leading cultivator of oranges in the U.S., followed by Florida with a production forecast of 1.84 million tons in 2023.2,3 As 42% of orange production undergoes processing, 2 the industry evaluates a range of parameters that affect nutritional quality, productivity, and accessibility.4,5 Comprehensive analysis is necessary for optimizing processing methods, elevating product quality, maximizing overall efficiency, and determining consumer acceptance.
The quality parameters frequently employed for assessing orange juice are titratable acidity (TA), Brix level, sugar content, and organic acid concentration.6,7 TA measures the total acid concentration in food and serves as a maturity indicator. 8 Organic acids such as ascorbic acid (Vitamin C) and citric acid are important quality indicators as well as nutritional parameters for oranges. 9 Sugars play an important role in the classification of fruits during citrus marketing, contributing significantly to the overall quality. Sugars account for 85% of the soluble solids present in oranges, 10 and glucose, fructose, and sucrose are the primary sugars found in orange juice, in a proportion of approximately 1:1:2 on a weight basis. 11
These quality parameters are often analyzed in a laboratory environment through multiple steps. Careful attention is paid to sample pre-processing, optimizing the extraction procedure, and developing methods unique to the sample, all of which require time and expertise in operating the various equipment. Vibrational spectroscopy overcomes major drawbacks of conventional analytical techniques by minimizing or eliminating sample preparation, producing data over a relatively short collection time, enabling a comprehensive analysis of multiple constituents with a single spectral measurement, and providing “environmentally clean” analysis.12–15 Thus, it increasingly satisfies industry demand for rapid and cost-effective monitoring of quality traits to streamline production processes. 12
Vibrational spectroscopy, i.e., near-infrared (NIR), mid-infrared (mid-IR), and Raman, involves excitation of vibrational modes in molecules using electromagnetic radiation (EMR) ranging from 14 000 cm–1 to 700 cm–1 and analyzing the spectra obtained from the absorbed or scattered EMR. 16 Mid-IR (4000–700 cm–1) has been extensively used for the characterization of quality parameters from fruit juices and pulps 10 due to its ability to capture fundamental vibrations. However, in a few circumstances, particularly for quantitative analysis, NIR is preferred over mid-IR because it has a longer pathlength 17 compared to attenuated total reflection Fourier transform infrared spectroscopy (ATR FT-IR), making it suitable for non-destructive analysis of samples involving minimal preparation. Also, advancements in the miniaturization of NIR spectrometers have decreased their size, weight, and cost, making them applicable for in-field and on-site analyses. Finally, NIR bands represent overtones and combinations of the major X–H bonds (typically C–H, O–H, N–H, and S–H), 18 enabling the analysis of a large range of organic materials. 19
In this study, we utilized a handheld NIR spectrometer assembled with seven tungsten halogen lamps, offering optimal power effectiveness and low heat generation. 19 Additionally, the device incorporates a micro-electro-mechanical system (MEMS) based Michelson interferometer to compact the module and a temperature stabilized indium gallium arsenide (InGaAs) detector. 20 The InGaAs detector can recognize the long-wave near-infrared (LWNIR, 1100–2500 nm) region, which exhibits better absorptivity of combination bands. 21 This wavelength range provides superior analytic capabilities compared to other NIR devices that incorporate silicon detectors, yielding a narrower NIR region (SWNIR, 780–1100 nm) and lacking unique signatures of first or second overtones and combination vibrations addressed by LWNIR. 20
Most studies involving miniaturized NIR sensors have used a wavelength range spanning the Vis-SWNIR region, collected through diffuse reflectance or transmission modes. Researchers reported the use of transmission mode 22 and diffuse reflectance spectral analysis 23 for online evaluation of soluble solids content (SSC) of intact oranges, obtaining a comparable correlation coefficient (R) and root mean square error of prediction (RMSEP). Zhang et al. 22 worked on predicting SSC by assessing visible–near-infrared (Vis-NIR, 325–1100 nm) full transmittance hyperspectral imaging (HSI), which integrates information from the fruit's image and spectral data to compensate for the size parameter while quantifying SSC. Cen et al. 24 placed juice samples in a glass container and collected reflectance spectra at equidistant positions of the container, reaching R ≥ 0.94 and RMSEP ≤ 0.596 for citric and tartaric acid, while Xia et al. 25 evaluated ascorbic acid using reflectance spectra of intact fruits with a prediction correlation coefficient of 0.95 and RMSEP of 3.9 mg/100 g. Similarly, Li et al. 26 analyzed citric acid, malic acid, and individual sugars of dried orange juice collected in transmission mode, obtaining a notable correlation (r ≥ 0.90). Individual sugars were assessed by Lanza et al. 27 using orange juice measured in transmission mode, reporting Rcal greater than 0.68. Previous studies have thus relied on NIR sensors that are limited to research environments and are not suitable for field deployment by end users, or that operate in the SWNIR region.
The handheld NIR device incorporated in this study collected transflectance spectra across the LWNIR region, corresponding to combination-rich absorption bands of the X–H bonds. NIR transflectance spectroscopy combines the principles of transmission and reflection modes by directing radiation through the liquid sample enclosed within a transflection accessory (spacer), followed by reflection from the surface of the spacer.28,29 The light path is roughly twice as long as the distance between the entry point and the reflector. The pathlength is defined based on the complete light path traversed through the sample. The main advantage of transflectance over transmission measurements of liquid samples is that it measures the radiation backscattered by the samples before reaching the reflecting plate, providing a reliable measurement of all radiation not absorbed by the sample. 30
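The double-pass geometry described above can be illustrated with a minimal Python sketch. It assumes an idealized beam that crosses the sample layer exactly twice at normal incidence and ignores scattering; the function names are hypothetical and are not part of the instrument software.

```python
def optical_path_mm(layer_mm: float) -> float:
    """Idealized transflectance path: the beam crosses the sample layer twice."""
    return 2.0 * layer_mm


def layer_for_path_mm(path_mm: float) -> float:
    """Sample-layer thickness implied by a given total (round-trip) pathlength."""
    return path_mm / 2.0


# Pathlengths of the three accessories used in the study (total light path).
for path in (0.50, 0.80, 2.0):
    layer = layer_for_path_mm(path)
    print(f"total path {path:.2f} mm -> sample layer of about {layer:.2f} mm")
```

Under these assumptions, a 0.50 mm pathlength corresponds to a sample layer of about 0.25 mm between the window and the reflector. Real juice scatters light, so the effective path will deviate from this idealized value.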
An interactive web interface that gives user-friendly access to the regression models allows end users to predict the taste indicators from the NIR spectra of orange juice. The web application developed in this study using the R programming language can be freely accessed online on any electronic device (mobile, computer, or tablet) without requiring programming or data analysis knowledge. Furthermore, it can be used simultaneously by multiple users without the need to install R software, 31 and any updates or improvements to the model can be easily synchronized to the cloud.
To the best of our knowledge, the potential of a field-deployable handheld NIR device is yet to be explored for the analysis of taste indicators in orange juice. Investigating different pathlengths of light defined by the spacer can result in an improved performance of the models. Hence, the aim of this study is to generate predictive models using a handheld NIR device for simultaneous quantification of multiple taste indicators (soluble solids, TA, ascorbic acid, citric acid, sucrose, glucose, and fructose) 32 in orange juice and compare the model performances with spectra obtained in transflectance mode at different pathlengths. Subsequently, a web application will be developed integrating these models to estimate the taste indicators from NIR spectra of orange juice acquired using a 0.50 mm pathlength spacer.
Experimental
Material and Methods
A total of 123 orange fruits were supplied by Tropicana Farms in 12 shipments over an eight-month period (August 2021–March 2022). They included Hamlin and Valencia varieties. Each shipment comprised approximately 30 fruits, randomly harvested from Tropicana farms in Florida, of which 10 samples per shipment were selected for analysis. The industry partner primarily focused on fruits that aligned with its needs, ensuring that the sample set reflected real-world conditions and variations and supporting the models' practical applicability. Samples were refrigerated and analyzed within one to two weeks of procurement. The fruits were screened for defects that occurred during transportation, and the sorted oranges were squeezed using a manual press juicer. The obtained juice was centrifuged using a mini-centrifuge system (Thermo Fisher Scientific, USA) that held six samples, rotating at 6000 rpm for 2 min at room temperature (20 ± 5 °C). The supernatant (liquid) was immediately used for characterization (SSC, TA, organic acids, and sugars) and spectral analysis.
Reference Methods
The samples were analyzed for SSC, TA, sugars (sucrose, glucose, and fructose), and organic acids (ascorbic acid and citric acid). For soluble solids analysis, a temperature-controlled automatic refractometer, RX 5000i (ATAGO, USA), was used. TA was measured by an automatic titrator system, Easy pH Titrator (Mettler Toledo, USA), using a 0.1 N sodium hydroxide (NaOH) solution. The TA data from the titrator were converted into g citric acid/100 g juice. All samples were analyzed in duplicate.
Sugar reference concentrations were obtained by reverse-phase high-performance liquid chromatography (HPLC). The supernatant was filtered through 0.45 µm pore-size Whatman nonsterile syringe filters into HPLC vials and frozen until analysis. Filtered supernatant (2.5 µL) was injected into an Agilent 1100 HPLC (Agilent Technologies) equipped with a G1322A degasser, G1311A quaternary pump, G1313A autosampler, G1316A column compartment, and a G7162A 1260 refractive index detector (Agilent Technologies). Agilent's Hi-Plex Pb column (Agilent Technologies) with 300 × 7.7 mm dimensions was used for chromatographic separation of sugars at 80 °C in a 25-min run time. Milli-Q water was used as the solvent at an optimized flow rate of 0.7 mL/min. External calibration curves were created to determine sugar concentrations using sucrose (2.5–70 mg/mL) and fructose (2.5–50 mg/mL) from Fisher Scientific (Thermo Fisher Scientific), and glucose (2.5–50 mg/mL) from Sigma Aldrich (Sigma-Aldrich). Chromatograms were analyzed using OpenLAB ChemStation software version C.01.11 (Agilent Technologies). For sugar quantification, a single HPLC run was conducted for each sample.
Ascorbic acid and citric acid concentrations were determined by reverse-phase HPLC on an Agilent 1100 series system (Agilent Technologies). Around 0.5 g of supernatant was extracted with 1 mL of 4.5% metaphosphoric acid (MPA) since ascorbic acid (AA) is more stable in MPA. 33 Any dehydroascorbic acid (DHAA) formed during sample preparation was reduced back to AA using 0.1 mL of 100 mM tris(2-carboxyethyl)phosphine (TCEP) reagent. After leaving the samples overnight at 4 °C, they were centrifuged at 10 000 rpm for 15 min at 4 °C. Collected supernatants were filtered through 0.45 µm Agilent polytetrafluoroethylene (PTFE) syringe filters into the HPLC vials. Filtered supernatants (10 µL) were injected into an Agilent HPLC (Agilent Technologies Inc.) equipped with a quaternary pump, a diode array detector (DAD), an autosampler, and a degasser. A C18-based Prevail organic acids column (5 µm, 150 × 4.6 mm) (Hichrom, UK) was used for separation of the different acids during a 30-min run time. HPLC water acidified to pH 2.2 with sulfuric acid was used as the mobile phase at a flow rate of 0.8 mL/min. A calibration curve was created to determine acid concentrations of samples using different concentrations of reagent grade ascorbic acid (Sigma Aldrich) and citric acid (Fisher Scientific, USA). Ascorbic acid was detected at 245 nm and citric acid at 210 nm. HPLC chromatograms were analyzed using ChemStation software (Agilent Technologies Inc.). Two sample
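As an illustration of the TA conversion mentioned in this section, the sketch below expresses a NaOH titration volume as g citric acid/100 g juice. It is a hedged example rather than the titrator's own calculation: the function name, the titrant volume, and the sample mass are hypothetical, and it assumes anhydrous citric acid (192.12 g/mol, three acidic protons, hence 64.04 g per equivalent).

```python
# Equivalent weight of anhydrous citric acid: 192.12 g/mol divided by 3 protons.
CITRIC_ACID_EQ_WEIGHT = 64.04  # g per equivalent (assumption: anhydrous form)


def titratable_acidity(naoh_ml: float, naoh_normality: float, sample_g: float) -> float:
    """Return TA as g citric acid per 100 g of juice.

    naoh_ml        -- titrant volume consumed at the end point (mL)
    naoh_normality -- titrant concentration (eq/L), e.g. 0.1 N NaOH
    sample_g       -- mass of juice titrated (g); hypothetical, not given in the text
    """
    equivalents = naoh_ml / 1000.0 * naoh_normality  # acid equivalents neutralized
    acid_g = equivalents * CITRIC_ACID_EQ_WEIGHT      # grams expressed as citric acid
    return 100.0 * acid_g / sample_g


# Hypothetical titration: 10 mL of 0.1 N NaOH for 10 g of juice.
print(round(titratable_acidity(10.0, 0.1, 10.0), 4))
```

With these illustrative inputs the result is about 0.64 g citric acid/100 g, which falls within the TA range reported later for the samples.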
Spectral Acquisition
The NIR spectra were acquired by the handheld NeoSpectra Scanner system (Figure 1a) developed by Si-Ware (Egypt) equipped with seven concavely arranged tungsten lamps (Figure 1b) designed to concentrate the illumination on the sample and a centrally positioned detector to ensure the scattered light is efficiently captured (Figure 1c). It uses a single-chip Michelson interferometer with monolithic opto-electro-mechanical structure in its core-engine coupled with a single uncooled InGaAs photodetector. The optical components are intrinsically aligned by lithography on the chip. Spectra collected by this sensor cover a wavelength range of 1350 to 2500 nm with a signal-to-noise ratio (S/N) greater than 2000:1 at 2350 nm and 16 nm resolution.

Figure 1. (a) NeoSpectra scanner and spectra acquisition through a mobile application connected to the NIR scanner via Bluetooth. The three reflectors used in the study are shown along with their respective sample holders: an aluminum reflector from Perkin Elmer with a 0.50 mm pathlength, a Spectralon reflector customized to provide a 0.80 mm light path, and a NeoSpectra transflector with a 2 mm pathlength. (b) Optical window located on top of the NIR scanner. (c) Schematic of the transflectance approach. The yellow arrows depict the light path. Light from the source passes through the sample enclosed within the spacer and reflects off the surface of the transflector. The reflected light travels back through the sample thickness and reaches the detector.
For the spectral analysis, a glass dish was placed on top of the optical window of the NIR device (Figure 1b). A juice sample (>0.5 ml) was pipetted onto the glass dish, and a reflector was positioned over the sample to achieve the transflectance approach. The reflectors, along with their corresponding sample dishes, are shown in Figure 1a.
Aluminum diffuse reflectors from two commercial companies were considered, one with a 0.50 mm (Perkin Elmer) and the other with a 2 mm pathlength (Si-Ware). Since these pathlengths represent two extremes in light path, we also considered a reflector providing an intermediate pathlength. A PTFE Spectralon reflector was customized to achieve the required 0.80 mm pathlength. While the aluminum (∼93%–98% 34 ) and PTFE diffuse reflectors (>95% 35 ) exhibited different reflectivity due to their surface material, the PTFE Spectralon reflector enabled us to assess the effect of a gradual increase in pathlength on the performance of the models.
The three reflectors (Figure 1a) were used to develop a transflectance approach in which the incident light crosses the sample captured within the spacer and is reflected off the reflector's surface located on the opposite side, traveling back through the sample before reaching the detector (Figure 1c). Background spectra were acquired using a PTFE standard material for the 0.50 mm and 0.80 mm pathlengths, while for the 2 mm pathlength, background spectra were collected using a sample holder containing the 2 mm reflector, as recommended by Si-Ware. The spectra were acquired in duplicate with each of these spacers.
Spectral Data Analysis
Pirouette 4.5 software (Infometrix Inc.) was used to develop regression models based on the partial least squares regression (PLSR) algorithm. PLSR correlates two matrices consisting of independent variables (
Regression models were built for each parameter individually. Calibration models were developed with 80% of the data set, while the remaining 20% was used for external validation. The data set was split into calibration and validation subsets using the “Random” algorithm provided by the Pirouette software, which employs a random process to generate the subsets. 37 The independent data set (spectra) was mean-centered and transformed using a Savitzky–Golay (SG) second derivative, SG smoothing, and normalization to optimize the performance of the models. Additionally, spectral pretreatment enhanced outlier detection. 38 To avoid overfitting, the number of latent variables was kept small: factors were added only until further inclusion produced no noticeable improvement in performance. Calibration models were internally validated using a leave-one-out cross-validation approach. Model performance was assessed using the correlation coefficients of cross-validation (rcv) and external prediction (rpre), the standard error of cross-validation (SECV), and the standard error of prediction (SEP).
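The Savitzky–Golay second derivative used as a pretreatment can be sketched in plain Python. This is an illustrative implementation, not the Pirouette or R routine: it assumes a quadratic local polynomial (the polynomial order is not stated in the text), unit spacing between spectral points, and it returns only the interior points where a full window fits. The default width of 15 points matches the window reported for the web application models.

```python
def sg_second_derivative(y, window=15):
    """Savitzky-Golay second derivative with a local quadratic fit.

    For offsets i = -m..m around each point, the least-squares quadratic
    a0 + a1*i + a2*i**2 has a2 = (n*sum(i^2*y) - S2*sum(y)) / (n*S4 - S2^2),
    and the second derivative at the window center is 2*a2.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    if len(y) < window:
        raise ValueError("signal is shorter than the window")
    m = window // 2
    offsets = list(range(-m, m + 1))
    n = window
    s2 = sum(i ** 2 for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    denom = n * s4 - s2 ** 2
    # Convolution weights: 2*a2 written as a weighted sum of the window values.
    weights = [2.0 * (n * i ** 2 - s2) / denom for i in offsets]
    return [
        sum(w * y[k + i] for w, i in zip(weights, offsets))
        for k in range(m, len(y) - m)
    ]


# Sanity check on a synthetic parabola: the second derivative of x**2 is 2.
parabola = [float(x * x) for x in range(20)]
print(sg_second_derivative(parabola))
```

A purely linear baseline yields zeros, which is why second-derivative pretreatment removes offset and slope effects before regression.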
Developing an Interactive Web Application
Creating a freely accessible website that integrates calibration algorithms for predicting multiple attributes based on NIR spectra encourages utility of these models and expands their application globally. Furthermore, the convenience of accessing the user-friendly interface on any electronic platform enhances the versatility 39 of implementing the models.
The web interface was developed using the R programming language. 40 Several packages were used to build the regression models, including shiny, 41 tidyverse, 42 readxl, 43 pls, 44 and signal. 45 The script used to develop the web application is provided in the Supplemental Material. The interactive web application allows users to upload an Excel file containing NIR spectral data. It then predicts the output using the regression models and displays the results in a table (Figure 5b).
The Excel (Microsoft) sheet is read by the “read_xlsx” function of the readxl package; the signal package applies the SG algorithm for the second derivative transformation through its “sgolayfilt” function; and the tidyverse package provides functions to efficiently structure the execution of multiple operations. 46
The pre-processing treatments (normalization, second derivative with a 15-point window, and mean centering) and spectral regions applied in building the calibration models in R were similar to those used in developing the models in the Pirouette software. The
The interface is designed to predict the taste indicators from NIR spectra acquired with a 0.50 mm pathlength spacer and is published online at https://spectroscopy.shinyapps.io/Orange/. Note that the NIR spectra of the samples should be arranged row-wise in the Excel sheet, with 257 wavelength points ranging approximately from 1350 to 2550 nm.
Results and Discussion
Composition of Taste Indicators
The variation in concentration of the parameters can be attributed to differences in factors such as maturity index, variety, tree position, 47 and environmental and climatic conditions. The flavor perception of citrus fruits is primarily characterized by the blend of sweetness and sourness related to sugar-acid distribution. 47 As we incorporated both immature and mature orange samples, our data exhibited a wider standard deviation and had higher TA values for each variety (0.7 ± 0.4 g CA/100 ml for Hamlin, 1.38 ± 0.6 g CA/100 ml for Valencia) compared to the study by Niu et al., 6 who reported 0.53 ± 0.004 g CA/100 ml for Hamlin and 0.7 ± 0.008 g CA/100 ml for Valencia. Valencia had significantly higher TA and organic acid levels than Hamlin, in agreement with the findings of Niu et al., 6 Nisperos-Carriedo et al., 48 and Jamshidi et al. 49
Overall, the average TA of the oranges in this study (1.0 ± 0.6 g CA/100 ml) was consistent with the data obtained by Kelebek et al., 50 who reported values of 0.9 ± 0.001 g CA/100 ml for mature oranges of the Kozan variety. At 210 nm in the HPLC-DAD chromatogram, CA and AA contributed 35% and 19% of the total organic acids area, respectively, in a ratio of 1.8:1, making them the major organic acids. The values obtained for Brix (4.3–12.30, average = 8.7 °Brix), citric acid (0.15–3.9%, average = 1.2%), and AA (9.2–103.4 mg/100 ml, average = 51.5 mg/100 ml) closely aligned with the findings of Borba et al., 10 who employed IR spectroscopies to determine sugars and acids. These results also corresponded to the studies conducted by Kelebek et al. 50 and Nisperos-Carriedo et al., 48 both of which explored organic acid detection using HPLC. Jamshidi et al. 51 reported Brix levels ranging from 7.90 to 11.00 (mean = 9.60 °Brix), while Niu et al. 6 documented a slightly higher Brix average (10.2 ± 0.20 for Hamlin, 9.7 ± 0.20 for Valencia).
In general, we found that Valencia had significantly less sucrose (29 ± 7 mg/ml) and higher glucose (13.9 ± 3 mg/ml) than Hamlin, although comparable levels of Brix and fructose were observed, in agreement with the results reported by Lee and Coates. 52 Since we included unripened samples in our study, lower values of soluble solids, sucrose (11.5–57.4 mg/ml, mean = 32.8 mg/ml), glucose (6.3–24.5 mg/ml, mean = 14 mg/ml), and fructose (7.8–31.3 mg/ml, mean = 17.8 mg/ml) were obtained compared to those reported by Borba et al., 10 Kelebek et al., 50 Niu et al., 6 and Lee and Coates. 52 Sucrose was found to be the dominant sugar, contributing 50% of total sugars. The proportion of sucrose, glucose, and fructose was 2:0.9:1, which agreed well with the literature findings of Kelebek et al., 50 Lee and Coates, 52 and Niu et al. 6
Spectral Characterization
Figure 2 shows representative spectra of juice obtained using different transflectance accessories (pathlengths of 0.50 mm, 0.80 mm, and 2 mm). The figure indicates that the absorption intensity, measured in absorbance units (AU), is highest for the 2 mm pathlength, followed by 0.80 mm and 0.50 mm. The absorption increases with increasing optical pathlength due to enhanced scattering of light by insoluble solids present in the sample. 53 For each spacer, absorption patterns were qualitatively similar across all samples. The spectra captured unique features for targeted analysis of taste indicators, providing a versatile analytical platform for screening breeding traits.

Figure 2. Representative NIR spectra of orange juice obtained in transflectance mode using different reflectors providing 2 mm, 0.80 mm, and 0.50 mm pathlengths.
The high water content and extensive overlapping of numerous bands led to broad peaks in the NIR spectra. The water bands appear broader with the 2 mm pathlength transflectance accessory than with the 0.50 mm and 0.80 mm spacers. This difference can be ascribed to the amplified absorption of NIR signals resulting from the double pass of light through the aqueous sample, leading to a decrease in the signals reaching the detector. 54 Prominent bands around 5000, 5300, and 7000 cm–1 can be associated with the first overtone of O–H, C–H, and O–H stretching, respectively, attributable to water and organic substances. 55 The typical pattern in the 5000–5300 cm–1 band can be ascribed to saturation effects. 56 The spectral characteristics of orange juice resembled the absorption pattern of apple juice,57,58 bayberry juice, 59 strawberry juice, 60 and commercial fruit juices 61 in the 1350–2500 nm spectral region.
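Because the band positions above are quoted in wavenumbers while the instrument range is given in nanometers, a short conversion sketch may help in locating them. It relies only on the identity wavelength (nm) = 10^7 / wavenumber (cm–1); the function names are illustrative.

```python
def wavenumber_to_nm(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm1


def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


# Band positions quoted in the text, checked against the 1350-2500 nm range.
for band in (5000, 5300, 7000):
    nm = wavenumber_to_nm(band)
    inside = 1350 <= nm <= 2500
    print(f"{band} cm-1 is about {nm:.0f} nm (within instrument range: {inside})")
```

All three bands fall inside the 1350–2500 nm window covered by the sensor, at roughly 2000, 1887, and 1429 nm.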
Regression Models
Table I shows the central tendency and deviation values for each parameter considered for generating regression models. As can be seen from the table, the mean and standard deviation of the calibration and validation sets were comparable. Table II lists the pre-treatments and the number of latent variables used to develop models with the three different spacers, along with the corresponding performance statistics of the calibration (Rcal, RMSECV) and validation (Rpre, RMSEP, RPD, RER, SEP/SEC) sets for each quality parameter. The optimal combination of wavelengths and preprocessing methods was determined by selecting the PLSR model with a high correlation coefficient, a small RMSEP, and a small number of factors while capturing sufficient data variance. 62 During the development of each model, the data were mean-centered and derivatized. Mean centering makes it more convenient to observe inter-variable relationships through differences referenced to the mean rather than through absolute values.37,63 Second derivatization resolves overlapping peaks and enhances identification of specific peaks associated with the response variable, while removing components related to baseline error.64,65 Smoothing and normalization were selectively employed based on the model's performance. Smoothing denoised the high-frequency signals, leading to an improvement in chemical information, 66 while normalization scaled the band signals to a parameter of 100. The important spectral region relevant for estimating sample concentration (Table II) was obtained by eliminating noisy data and signals that were not selective for the target analyte. This independent variable selection has been associated with improved regression performance.67,68 Outliers were identified based on extreme leverage and studentized residuals of the cross-validated objects; on average, fewer than 7% of samples were excluded as outliers when generating the prediction models.
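Mean centering, as described above, can be sketched as follows. This is a minimal illustration with hypothetical absorbance values; it centers each spectral variable (column) on its calibration-set mean and returns those means so that new spectra can be centered with the same reference.

```python
def mean_center(spectra):
    """Subtract the column-wise mean from every spectrum.

    spectra -- list of rows, one row per sample, one column per wavelength.
    Returns (centered_rows, column_means).
    """
    n = len(spectra)
    means = [sum(column) / n for column in zip(*spectra)]
    centered = [[value - mean for value, mean in zip(row, means)] for row in spectra]
    return centered, means


def apply_centering(spectrum, means):
    """Center a new spectrum using means learned from the calibration set."""
    return [value - mean for value, mean in zip(spectrum, means)]


# Hypothetical absorbances for two samples at two wavelengths.
calibration = [[1.0, 2.0], [3.0, 6.0]]
centered, means = mean_center(calibration)
print(means)     # column means of the calibration set
print(centered)  # deviations from those means
```

After centering, each column sums to zero, so the regression describes deviations from the average spectrum rather than absolute absorbance.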
Table I. Statistics (mean, standard deviation, and number of samples) of quality parameters analyzed for calibration and validation data sets.
Table II. Pre-processing techniques considered for generating regression models, and model performance of different transflectance accessories (0.50 mm, 0.80 mm, and 2 mm) for assessing taste indicators in orange juice.
The optimal number of latent variables identified at the first local minimum of RMSECV 69 ranged from 3 to 8. The quality of prediction relies on selecting an appropriate model complexity that strikes a balance between underfitting and overfitting. Too few descriptors may not capture enough variance to explain the data, resulting in underfitting, while an excessive number of factors can introduce random noise, leading to overfitting of the model.70,71 Among the performance statistics, the 0.50 mm and 0.80 mm pathlength transflectance accessories showed better performance compared to the 2 mm pathlength, with Rcal ≥ 0.84 and low RMSE. Furthermore, their performance on the validation set was comparable, with Rpre ≥ 0.82. A low RMSE signifies good precision and predictive ability, 70 whereas a high R reflects superior predictive accuracy of the regression model. 71 The correlation coefficient for ascorbic acid (Rcal ≥ 0.83, Rpre ≥ 0.77) was lower compared to other parameters, which can be attributed to its relatively poor stability over the storage period. 72 Tropsha et al. 73 state that a regression model's cross-validated coefficient of determination must be greater than 0.7 for the model to be considered reliable and predictive of the desired attribute. The performance with the 2 mm pathlength declined (0.76 ≤ Rcal ≤ 0.90, higher RMSECV), likely due to reduced signals reaching the detector caused by an increase in light-scattering solids embedded in the extended spacer.53,54
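The rule of choosing the number of latent variables at the first local minimum of RMSECV can be sketched as a small helper. The function name and the RMSECV values below are hypothetical; the curve would normally come from cross-validating models with 1, 2, 3, ... factors.

```python
def first_local_minimum(rmsecv):
    """Return the 1-based number of latent variables at the first local minimum.

    rmsecv[k] is the cross-validation error of the model with k + 1 factors.
    The scan stops as soon as adding one more factor no longer lowers the error.
    """
    if not rmsecv:
        raise ValueError("rmsecv must contain at least one value")
    for k in range(len(rmsecv) - 1):
        if rmsecv[k] <= rmsecv[k + 1]:
            return k + 1
    return len(rmsecv)  # error kept decreasing over the whole range tested


# Hypothetical RMSECV curve for models with 1 to 6 latent variables.
curve = [0.90, 0.60, 0.40, 0.31, 0.33, 0.30]
print(first_local_minimum(curve))
```

In this illustrative curve the error rises after the fourth factor, so four latent variables are selected even though the sixth gives a marginally lower value; preferring the first minimum favors the simpler model and limits overfitting.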
Regarding Brix, both 0.50 mm and 0.80 mm spacers exhibited similar performance with high correlation coefficients around 0.99, and the RMSE of cross validation and prediction ranged from 0.15 to 0.27. However, when using a 2 mm pathlength, the correlation coefficient dropped to 0.82 with an increase in the RMSEP.
To determine soluble solids content and sugars, specific absorption bands at 1584 nm, 2120 nm, and 2270 nm were utilized. These bands are known to be characteristic of glucose, fructose, and sucrose in aqueous solutions, as previously reported. 57 For TA, the 0.50 mm (Rcal = 0.96, RMSECV = 0.18) and 0.80 mm (Rcal = 0.91, RMSECV = 0.18) spacers performed better, although the 2 mm spacer showed comparable performance with an Rcal of 0.90 and an RMSECV of 0.27. Similar trends were observed for citric acid. The results for citric acid were comparable to those of Cen et al. 24 (Rp = 0.94, RMSEP = 0.6), who collected spectra from orange juice in reflectance mode using a Vis-NIR source. For ascorbic acid, a lower correlation coefficient ranging from 0.77 to 0.86 was observed when using the 0.50 mm and 0.80 mm spacers. It was challenging to establish a strong correlation between spectral features from the 2 mm spacer and the reference values of ascorbic acid; hence, the corresponding data are not displayed. Regarding sugars, the 0.50 mm and 0.80 mm spacers delivered better performance, achieving Rcal ranging from 0.84 to 0.90. Nevertheless, acceptable results were obtained with the 2 mm spacer (Rcal ranging from 0.76 to 0.87), with RMSECV values comparable to those obtained with the other two spacers. For sucrose, fructose, glucose, and soluble solids content, our results showed superior performance to studies 27,74 that analyzed these parameters in orange juice in transmission mode (Rcv ranging from 0.68 to 0.72 for sugars and Rcv of 0.89 for SSC). This encourages the adoption of a simple, rapid, handheld NIR device that reliably assesses the taste indicators in orange juice through the transflectance mode.
We assessed the models' performance using additional metrics: the ratio of performance to deviation (RPD), the range error ratio (RER), and SEP/SEC. RPD is calculated by dividing the standard deviation of the reference values in the test set by the RMSEP. An excellent model typically achieves an RPD greater than 2. 75 A higher RPD is advantageous, although its accuracy and comparability rely on the data following a normal distribution. 76 With the inclusion of more unripened fruits, the attribute distributions exhibited right skewness, except for soluble solids, which maintained a normal distribution. Consequently, higher RPD values were attained for Brix using the 0.50 mm (RPD = 8.3) and 0.80 mm (RPD = 7.6) spacers. For the other parameters except ascorbic acid, the shorter pathlengths (0.50 mm and 0.80 mm) demonstrated a good RPD, with values equal to or greater than 2. An RER ≥ 4 suggests that a model is suitable for screening calibration;76,77 apart from ascorbic acid, all our validation models had an RER equal to or greater than 4. The recommended threshold for SEP/SEC is less than 1.2; 76 our models approximately met this guideline except for sucrose with the 0.80 mm pathlength (SEP/SEC = 2.0) and ascorbic acid. Figure 3 displays PLSR scatter plots for each parameter using the 0.50 mm pathlength. These plots depict associations between measured and predicted values (internal and independent external validation), presenting a visual representation of variability. 78 In the correlation plots, the validation samples, represented as lighter points, lie within the range of the calibration data set, indicating that the trained models encompass the practical variability expected in the parameters.
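The three ratios discussed above can be computed with a few lines of Python. This is a hedged sketch with hypothetical reference values and errors: RPD follows the definition given in the text, RER is taken as the range of the reference values divided by SEP (its customary definition, which the text does not spell out), and the sample standard deviation is assumed.

```python
import statistics


def rpd(reference, rmsep):
    """Ratio of performance to deviation: SD of reference values over RMSEP."""
    return statistics.stdev(reference) / rmsep


def rer(reference, sep):
    """Range error ratio: range of reference values over SEP (assumed definition)."""
    return (max(reference) - min(reference)) / sep


def sep_sec_ratio(sep, sec):
    """Ratio of prediction error to calibration error."""
    return sep / sec


# Hypothetical validation-set Brix values and error estimates.
brix = [8.0, 9.0, 10.0, 11.0, 12.0]
print(round(rpd(brix, rmsep=0.5), 2))   # compare with the RPD > 2 guideline
print(rer(brix, sep=0.5))               # compare with the RER >= 4 guideline
print(sep_sec_ratio(sep=0.5, sec=0.45)) # compare with the < 1.2 guideline
```

With these illustrative numbers all three guidelines quoted above would be met. Because RPD depends on the spread of the reference values, a skewed or unusually wide data set can inflate it, which is the caveat raised in the text.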

Figure 3. PLSR correlation plots of TA, Brix, citric acid, ascorbic acid, sucrose, glucose, and fructose regression models obtained using the 0.50 mm pathlength. The plots show associations between measured and predicted values for the calibration and validation set samples (validation set samples are shown as lighter points).
A loading vector plot (Figure 4) visually represents the influence of independent variables on the cumulative factors considered in model development. 37 Wavelengths with higher relative coefficients (either positive or negative deviations) are significantly related to the response. Examination of the loadings plot for TA and citric acid revealed that absorption around 2339 nm and 2281 nm strongly influenced spectral variation, which is associated with combination overtones of –CH, –CH2, and –CH3 groups. 79 Additional features at 2456 nm, 1938 nm, and 1859 nm for ascorbic acid represent overtone or combination bands of fundamental O–H and C–H vibrations. The spectral features observed in the loadings plot for Brix and sugars appeared in the region ranging from 2000 nm to 2340 nm, corresponding to characteristic absorbances of carbohydrate O−H and C−H groups, as previously reported by Rodriguez-Saona et al. 80

Loadings vector plot of TA, Brix, citric acid (CA), ascorbic acid (AA), sucrose, glucose, and fructose regression models obtained using the 0.50 mm pathlength.
Web application
The web interface allows NIR sensor users to browse for an Excel sheet containing the spectra of orange juice (Figure 5a). The server runs the regression models on the user-supplied spectral data and generates an output table in the right panel listing the predicted parameters (Figure 5b). Calibration models developed using R gave RMSECV values (Brix = 0.19, ascorbic acid = 8, citric acid = 0.25, TA = 0.15, glucose = 1.66, sucrose = 4, fructose = 1.7) similar to those of the models developed using the Pirouette software. However, comparatively higher RMSEP values (Brix = 0.98, ascorbic acid = 20.2, citric acid = 0.30, TA = 0.19, glucose = 3.6, sucrose = 3.9, fructose = 4) were observed in the R models, despite applying the same pre-processing treatments. This difference may be attributed to variations in matrix pre-processing, fitting methods, and other algorithms used internally by the functions in each software package to generate the correlation.
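The prediction step performed by the server can be illustrated with a short sketch. It is written in Python for illustration, whereas the models in this work were built in R. It assumes that each PLSR model has been reduced to an intercept and one regression coefficient per wavelength, and that the spectra have already been pre-processed in the same way as the calibration data. All names and numeric values below are hypothetical.

```python
def predict_attribute(spectrum, intercept, coefficients):
    """Apply a linear regression vector to one pre-processed spectrum."""
    if len(spectrum) != len(coefficients):
        raise ValueError("spectrum and coefficient vector differ in length")
    return intercept + sum(x * b for x, b in zip(spectrum, coefficients))


def predict_table(spectra, models):
    """Build an output table: one row per sample, one column per attribute.

    spectra: dict mapping sample id -> list of absorbance values
    models:  dict mapping attribute name -> (intercept, coefficients)
    """
    return {
        sample_id: {
            attribute: predict_attribute(spectrum, intercept, coefficients)
            for attribute, (intercept, coefficients) in models.items()
        }
        for sample_id, spectrum in spectra.items()
    }


# Hypothetical three-wavelength models and spectra, for illustration only.
example_models = {
    "Brix": (1.0, [2.0, 0.5, -1.0]),
    "TA": (0.2, [0.1, 0.0, 0.3]),
}
example_spectra = {"juice_01": [1.0, 2.0, 3.0]}
print(predict_table(example_spectra, example_models))
```

In a deployed application, the spectra would be read from the uploaded Excel sheet rather than defined inline.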

Illustration of the workflow of the web application developed to predict nonvolatile flavor attributes of orange juice from NIR spectral data. (a) The interactive interface prompting the user to browse for the Excel sheet containing the NIR spectra for prediction. (b) The regression model output (right panel) displaying the parameters predicted for the input NIR spectra.
Conclusion
In this study, we validated the application of a handheld NIR spectrometer to rapidly (10 s) quantitate important taste indicators (TA, Brix, organic acids, and individual sugars) in orange juice. We used a transflectance measurement approach and examined model performance using three different spacers providing pathlengths of 0.50 mm, 0.80 mm, and 2 mm. The regression models developed using the 0.50 mm and 0.80 mm spacers effectively correlated the NIR spectra with the reference values, resulting in correlation coefficients (Rpre) exceeding 0.82. An interactive web application was built to make these regression models available to a wider audience. Notably, our findings using the transflectance mode were comparable to, or in certain instances outperformed, previous studies in which spectra were obtained in transmission or reflectance modes. This result underscores the reliability of novel, miniaturized NIR systems as a platform for evaluating multiple flavor characteristics of orange fruit using a few drops of juice, allowing breeders to screen for unique traits in the field with levels of reliability and sensitivity equivalent to those of conventional analytical techniques. In addition, the results promote the application of transflectance sensing in the screening of breeding traits and in situ quality prediction using field-deployable NIR spectrometers.
Supplemental Material
Supplemental material, sj-docx-1-app-10.1177_27551857251356534 for Rapid Assessment of Taste Indicators in Orange Juice Using a Handheld Near-infrared Scanner in Transflectance Mode by Shreya Madhav Nuguri, Celeste Matos Gonzalez, Gio Ijpkemeule, Christopher Ball and Luis Rodriguez-Saona in Applied Spectroscopy Practica
Footnotes
Acknowledgments
The authors would like to thank Tropicana for providing orange fruit samples and supporting the work.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by PepsiCo, Grant no. GR125605.
Supplemental Material
Supplemental material for this article is available online.
References
