Sage Journals: Discover world-class research

Abstract

The intricate nature of the global food supply chain and the presence of regulations spanning multiple jurisdictions contribute to an increased likelihood of food adulteration. This underscores the need for effective monitoring methods to guarantee the safety and nutritional quality of our food. In this context, the application of infrared spectroscopy-based techniques emerges as an environmentally friendly, non-invasive, and waste-minimizing solution for authenticating food products. Infrared spectra serve as unique molecular fingerprints, offering a multidimensional representation of how chemical bonds in the material interact with infrared light. Chemometrics, which are primarily linear-based models, play a crucial role in extracting essential information from spectral data, enabling dimensionality reduction, classification, and predictive analysis. Recent progress in the field of big data science and artificial intelligence has brought forth machine learning and deep learning algorithms explicitly designed to uncover features from complex multidimensional data, encompassing both linear and nonlinear relationships. These advancements have the potential to enhance the detection of adulterants in food products. This study assesses the accuracy of various shallow machine learning models and a deep learning model based on a one-dimensional convolutional neural network (1D CNN). The evaluation is conducted using Raman and infrared spectral data obtained from ground turmeric samples that were deliberately adulterated with five distinct substances. The study highlights the improved classification accuracy achieved through the implementation of the 1D CNN model.

Keywords

Infrared spectroscopy, turmeric, adulteration, machine learning, deep learning

Introduction

Spices have traditionally been an expensive commodity. Consumers’ exposure to, and acceptance of, a larger diversity of culinary flavors has increased demand for spices of worldwide provenance, resulting in a more complex supply chain and increasing the business's vulnerability to adulteration.^1,2 According to public opinion polls, food safety is one of the top consumer concerns.^3,4 Although consumers are happy with their access to a variety of spice preferences, safety concerns exist among the general public, as well as regulatory authorities responsible for preserving and ensuring the safety and purity of spices. Recent reports⁵ showed the presence of many adulterants in spices that are considered highly toxic to human health. Cowell et al.⁶ reported widespread poisonous lead chromate (LC) contamination of turmeric (TU) powder sold in U.S. grocery stores with actual cases of lead poisoning in children across many U.S. states linked to the consumption of the contaminated products.

With more stakeholders in the spice supply chain and a greater risk of misrepresentation, effective authenticity testing and monitoring methods are required to deal with the growing challenges. In recent years, much effort has gone into developing vibrational spectroscopy fingerprint techniques for determining food quality and authenticity.^7,8 In terms of analytical speed, cost, and environmental impact, vibrational spectroscopy-based techniques offer a viable and reliable alternative to traditional methods. Furthermore, these approaches rely on direct bulk analysis, which entails working directly on the food item with little or no sample preparation, as opposed to the extraction and concentration steps often required by traditional testing methods. Fingerprint spectral matching to a large database is the most basic method for substance identification. This approach is very useful when working with pure compounds or with consistent ingredients. When working with randomly mixed materials, however, this strategy becomes less effective. As a result, the strength of infrared (IR) spectroscopy is derived from its combined use with chemometrics, which are primarily linear-based models capable of extracting and summarizing underlying features in spectral data sets of both pure and mixed materials.^9,10 With recent advances in data science, a variety of machine learning (ML) techniques for extracting information from multivariate data have been developed. Because ML and chemometrics perform similar tasks, their integration has created a larger pool of algorithms capable of linear and nonlinear feature extraction for improved classification and prediction.¹⁰

Turmeric (TU) is made from the rhizomes of the tropical plant Curcuma longa Linn. As one of the most widely used spices, TU has been the subject of adulteration with a range of plant-based substances and synthetic compounds.^6,11 A combination of vibrational spectroscopy and ML has been used in several studies to verify the authenticity of TU^6,11 and other ground spices such as paprika¹² and curry powder.¹³

Principal component analysis (PCA) and partial least squares (PLS) regression are probably the two linear approaches most commonly employed to reduce dimensionality and extract features for data exploration, as well as for subsequent classification and prediction analyses of food adulterants. The t-distributed stochastic neighbor embedding (t-SNE) method,¹⁴ a more recent nonlinear approach, has also been shown to be very versatile and well-suited for transforming high-dimensional data sets into low-dimensional plots, especially for data exploration tasks.^15,16 As a popular artificial intelligence (AI) tool, t-SNE has found uses in a wide range of scientific disciplines, including medicine and bioinformatics,¹⁷ and it could potentially be used in food safety investigations.

Feature reduction is a technique used to improve the computational efficiency and accuracy of multivariate data analysis. When applied to spectral data, this means reducing the number of wavenumber variables by selecting the most important ones and excluding those that are redundant and have little explanatory value for analysis. The combinations of principal component-linear discriminant analysis (PC–LDA) and partial least-squares-discriminant analysis (PLS–DA) are examples of algorithms that combine feature extraction and classification tasks. Instead of the original data, these algorithms construct classification models based on chosen latent variables from PCA and PLS analysis. The use of ensemble learning models, which combine the decisions of several models to enhance the final overall output, is another technique used to improve classification and prediction outcomes. Random forest (RF) is a common ensemble learning model based on decision trees that is noted for its performance.¹⁸

Deep learning (DL), a branch of ML and AI,¹⁹ has gained significant popularity in recent years. It is based on supervised algorithms called artificial neural networks (ANNs), which are computational models inspired by the structure of biological neural networks.^20,21 ANNs were introduced decades ago as data-driven, self-learning computing systems, but their use was limited by a lack of processing power and by other factors such as limited data availability, computational efficiency, and the difficulty of optimizing the network architecture and hyperparameters. However, with technological advancements, they have become a leading state-of-the-art algorithm. In recent years, DL applications have taken over numerous critical processes we depend on daily, including computer vision, speech recognition, internet search, fraud detection, email/spam filtering, financial risk modeling, medical diagnosis, self-driving vehicles, navigation, drug discovery, item identification, and more.²²

The convolutional neural network (CNN) is one of the most widely used and established DL architectures, particularly in the area of computer vision. DL neural networks such as CNNs are described as end-to-end learners that require less human intervention. This means they can be trained on raw data without the manual data preprocessing that is an essential step when using ML algorithms.²³ The CNN technique was initially developed for problems that generate two-dimensional (2D) data, such as image recognition, where it is widely used. The effective application of the one-dimensional (1D) CNN, which applies to processes that generate 1D signals such as spectral data, is gaining popularity in several domains.²⁴ In comparison to 2D CNN network topologies, shallow 1D CNN topologies involve less complex matrix operations and do not demand highly specialized hardware, making them easier to comprehend, train, and implement. The potential use of spectral data and DL for assessing food authenticity and traceability has been examined.^25,26 So far, studies using the method have indicated that CNN outperforms other ML systems in classification and prediction tasks,^25–27 which is consistent with the trend found in other disciplines.^16,27

To maximize generalizability and decrease prediction error, DL models are often trained using large training data sets. However, there has been substantial discussion over the amount of data necessary for DL applications. Variables such as the diversity of input data, the nature of the problem the model is designed to address, model complexity, and error margin tolerance are said to influence the size of training data.^28,29 Although the data produced by spectroscopy encompass a wide range of features (wavenumbers), the number of samples gathered to address specific scope problems is often smaller, resulting in a low sample-size-to-feature ratio. Recently, a 1D CNN model has been effectively applied to IR spectra-based data with relatively small training data sets.^30,31 Many strategies have been proposed to mitigate the risk of overfitting caused by a small training data set. Regularization techniques may be used to reduce the complexity of the model network and data set features and, thus, reduce the danger of model overfitting,^32,33 where the model learns the training data too well, capturing noise and random variations rather than the underlying patterns. Data augmentation is a strategy of generating synthetic data by adding small disturbances to the actual data to increase the robustness of a DL model.³⁴ In the case of spectral data, augmentation can be achieved by artificially introducing minor variations to the slope or baseline, as well as adding random noise.^34,35

Many spectroscopy-based research papers spend a significant amount of time studying how different pre-processing techniques influence the quality of their prediction or classification when utilizing ML algorithms. A CNN can be trained on raw data because of its ability to learn the variable features in the data set.³⁶ This removes the need for the extensive spectral data pre-processing that is required when employing other ML methods. Despite the effectiveness of traditional ML methods on some data sets, investigating the use of CNN can still bring further benefits due to its end-to-end learning capability and the possibility of improved performance.

In this study, Fourier transform IR spectroscopy (FT-IR) and Raman spectra were obtained from powdered TU samples that had been experimentally substituted at different percentage levels with five distinct types of possible adulterants. The outcomes of ML methods’ visual pattern recognition and classification were compared to the outcomes of a 1D CNN implementation with and without data augmentation.

Experimental

Materials and Methods

Sample Preparation. Ground TU samples were obtained from local Toronto, Ontario, supermarkets. For all simulated adulteration experiments, a single ground TU product from a reputable brand was used. Several packages of this brand were combined and thoroughly mixed to ensure that enough uniform material was available for all substitution experiments. The following substances were considered as potential TU adulterants for the study: Starch (ST) powder (Fisher, certified ACS), lead (II) chromate (Thermo Scientific, 98%), metanil yellow (MY; Acros Organics), Sudan III (SU; Sigma-Aldrich, technical grade), and Acid Orange 7 (OR; orange II sodium salt) (Sigma-Aldrich, >85%). Preparation was similar for all samples except LC powder, which was handled following special safety guidelines. Substitutions were made at 0.5 g final weight using the following TU-filler proportions: 0% (pure TU), 1%, 2%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 100% (pure adulterant) (w/w). Weight measurements were made using an Ohaus Explorer Semi-Micro analytical balance (Ohaus Corp.). The substituted samples were put in glass tubes and mixed for 30 s using a Fisher Scientific Mini-Roto vortex mixer. Because vortexing alone tends to result in a density-based separation of the TU and adulterant, the mixture was manually stirred from the top with a rod while vortexing to generate a homogeneous blend. The stirring also aids in the breakdown of small clumps that tend to form during vortexing. The mixed sample was then pressed into a tablet disc by placing the mixed powder in a 13 mm stainless steel evacuable pellet die (Carver Inc.) and pressing at 725.7 kg (1600 lb.) force using a Carver 4350 manual bench-top pellet press (12-ton) (Carver Inc.). This sample preparation was repeated three times (three replicates).
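As a worked illustration of the substitution scheme, the sketch below computes the adulterant and TU masses for each level at a 0.5 g final weight. The function names are hypothetical, and the spectrum tally assumes that every level was prepared in triplicate and measured at three spots, which is our reading of the text and not a stated protocol detail.

```python
# Substitution levels (% w/w adulterant) and final sample weight from the text.
LEVELS_PERCENT = [0, 1, 2, 5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100]
TOTAL_MASS_G = 0.5


def component_masses(level_percent, total_mass_g=TOTAL_MASS_G):
    """Return (adulterant_g, turmeric_g) for one substitution level."""
    adulterant_g = total_mass_g * level_percent / 100.0
    turmeric_g = total_mass_g - adulterant_g
    return adulterant_g, turmeric_g


def expected_spectra(n_adulterants, replicates=3, spots=3):
    """Spectra count, assuming levels x replicates x spots per adulterant."""
    return len(LEVELS_PERCENT) * replicates * spots * n_adulterants


for level in LEVELS_PERCENT:
    adulterant_g, turmeric_g = component_masses(level)
    print(f"{level:>3}%: {adulterant_g * 1000:.1f} mg adulterant, "
          f"{turmeric_g * 1000:.1f} mg TU")
```

Under these assumptions, four adulterants give 540 spectra and five give 675, which agrees with the row counts reported for the IR and Raman data sets in Data Preparation.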

Lead chromate (LC) was handled following laboratory safety standards, including the use of recommended personal protective equipment and a fume hood. To minimize contamination, aluminum foil was used as a drop sheet during weighing, mixing, and pressing. The resulting mixed sample was formed into a pressed tablet disc and placed in a 10 mL clear soda-lime glass vial (VWR). All contaminated waste was collected and labeled for proper disposal. Raman spectroscopy was performed directly on samples in the glass vials. The effect of the glass on the Raman signal was negligible.

Infrared Spectroscopy

For spectrum capture, a modular Thermo Nicolet iS50 FT-IR Spectrometer was used. For attenuated total reflectance (ATR) FT-IR measurement, the spectrometer was fitted with a single-bounce Smart iTX accessory with a diamond crystal. A small TU sample was deposited on the ATR crystal and pressed using the ATR press device. The spectra were obtained in the wavenumber range 4000–500 cm⁻¹ at a resolution of 4 cm⁻¹, with 10 scans averaged per spectrum. The measurement was conducted three times at three distinct tablet locations for every sample, with the crystal thoroughly cleaned each time using isopropanol, water, and tissue paper.

Raman spectra were collected by fitting the iS50 FT-IR spectrometer with a Thermo Scientific iS50 Raman Module equipped with an IR 1064 nm laser. The sample disks were placed on the four-cell plate accessory. Data were collected in triplicate by moving the laser focus over the surface of the sample. The TU and adulterated TU samples were sensitive to heating by the 1064 nm laser. To minimize heat generation, the laser power was set to its lowest level (0.05 W) for the LC-substituted TU samples. Additionally, the defocusing lens option was used to further mitigate the heating effect during spectrum collection. Despite these precautions, some heating effect was still observed that particularly affected the spectrum over the 3000 cm⁻¹ range. The impact of heating by the IR laser was lower in all other samples, and a slightly higher laser power of 0.10 W was used to enhance the strength of the Raman signal. The Raman spectra were recorded in a wavenumber range of 3696–120 cm⁻¹ at 4 cm⁻¹ resolution, averaging 10 successive scans per spectrum, at three different spots on the sample.

Due to the safety concerns of an exposed sample during IR spectral measurement, only Raman measurement was done for the LC-substituted TU samples. This was done by placing the sealed glass vials containing the sample disc on the four-cell Raman vial sample plate accessory. Raman measurements of empty vials indicated that the possible effect of the signal from the glass vials was negligible. The sample preparation and measurements for both IR and Raman spectra were repeated three times.

Data Preparation

Spectral data were organized into a data frame with four label columns (number, group, level, and replicate) and the remaining columns as variables (wavenumbers). The IR spectral data set had 7209 variables × 540 rows and the Raman data set had 3710 variables × 675 rows. Except for the CNN, all data analysis was performed in RStudio v.4.0.2³⁷ after the following preprocessing steps were applied. Baseline correction was done using the “spmblp” function from the “spftir” package³⁸ with a degree of polynomial = 2, rep = 100, and tolerance = 0.001. The “prospectr” package³⁹ was used to convert the baseline-corrected data into a hyperspec object for smoothing and generating derivative spectra using the gap-segment derivative (gapDer algorithm). The gapDer algorithm from this package performs Savitzky–Golay smoothing within specified segments followed by derivative calculation.³⁹ The gapDer algorithm was run with three options: the filter length (w), i.e., the spacing between points over which the derivative is computed, set to 11; the smoothing segment size (s), i.e., the range over which the points are averaged, set to 10; and the order of the derivative (m). The first-order derivative (m = 1) was used for all shallow models. For the CNN implementation, the data were used in their raw form, with normalization as the sole preprocessing step.
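The idea behind a first-order gap-segment derivative can be sketched in plain Python. This is a simplified stand-in and not the prospectr implementation: it averages `segment` points on each side of a `gap`, takes the difference, and divides by the distance between the segment centers. The function name, the scaling, and the edge handling (only full windows are evaluated) are assumptions made for illustration; the defaults mirror the w = 11 and s = 10 settings quoted above.

```python
def gap_segment_derivative(spectrum, gap=11, segment=10):
    """Simplified first-order gap-segment derivative of a 1D sequence.

    For each window, the mean of `segment` points on the left is subtracted
    from the mean of `segment` points on the right; the two segments are
    separated by `gap` points. Dividing by (gap + segment), the distance
    between the segment centers, gives a slope per point.
    """
    window = 2 * segment + gap
    if len(spectrum) < window:
        raise ValueError("spectrum is shorter than the filter window")
    derivative = []
    for start in range(len(spectrum) - window + 1):
        left = spectrum[start:start + segment]
        right = spectrum[start + segment + gap:start + window]
        difference = sum(right) / segment - sum(left) / segment
        derivative.append(difference / (gap + segment))
    return derivative


# A linear baseline has a constant first derivative, and a constant offset
# contributes nothing, which is why derivatives suppress baseline offsets.
ramp = [2.0 * i + 5.0 for i in range(100)]
slopes = gap_segment_derivative(ramp)
```

The segment averaging is what provides the smoothing, while the gap controls how far apart the two averaged regions sit.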

Data Analysis

Unsupervised Pattern Recognition

Principal Component Analysis (PCA). PCA is a linear algebra-based method for dimensionality reduction, which simplifies data analysis by projecting data into a lower-dimensional space. PCA extracts hidden features from multidimensional data into new latent variables called principal components (PCs) that retain the maximum variance features of inter-correlated data such as IR spectra. Each PC explains a portion of the variance in the data set, with the first PC capturing the greatest variance and subsequent PCs capturing decreasing amounts of explained variance. PCA was performed using the “prcomp” function on pre-processed data, and visualization plots were generated using ggplot2⁴⁰ and the autoplot function from the “ggfortify” package.⁴¹
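To make the notion of explained variance concrete, the following minimal sketch finds the first PC of a small data matrix by power iteration on the covariance matrix. It is an illustration only: prcomp computes PCs through a singular value decomposition, and the function name, iteration count, and toy data here are assumptions.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (direction, explained_variance_ratio) for PC1."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p) of the mean-centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeatedly apply cov and renormalize.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance


# Toy data in which the second variable is exactly twice the first,
# so a single component carries all of the variance.
toy = [[float(x), 2.0 * x] for x in range(10)]
direction, ratio = first_principal_component(toy)
```

For strongly inter-correlated variables such as neighboring wavenumbers, the first few components typically capture most of the variance in the same way.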

t-Distributed Stochastic Neighbor Embedding (t-SNE). t-SNE is a nonlinear feature extraction algorithm that provides a simplified visualization of patterns in the complete data set making it ideal for visualizing similarity data and revealing important global structures such as clusters at multiple scales.¹⁶ Unlike PCA, t-SNE retains the local structure of the data. t-SNE reduces data dimensionality by creating a probability distribution on pairs in higher dimensions and iteratively replicating the same distribution on lower dimensions until the Kullback–Leibler divergence is minimized.¹⁴ t-SNE was implemented using the “Rtsne” package in R.
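The quantity that t-SNE minimizes can be written down in a few lines. The sketch below computes the Kullback–Leibler divergence between two discrete distributions, standing in for the pairwise-similarity distributions in the high- and low-dimensional spaces; the function name and the toy probabilities are assumptions for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * ln(p_i / q_i), skipping terms with p_i = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


high_dim = [0.5, 0.3, 0.2]   # similarities in the original space
good_map = [0.5, 0.3, 0.2]   # an embedding that reproduces them exactly
poor_map = [0.2, 0.3, 0.5]   # an embedding that distorts them

print(kl_divergence(high_dim, good_map))
print(kl_divergence(high_dim, poor_map))
```

The divergence is zero only when the two distributions agree, so lowering it drives the low-dimensional map toward the neighborhood structure of the original data.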

Interactive Application for Visualization of PCA and t-SNE Outputs

An in-house application was developed using Shiny (an open-source app development framework using R code) that enables interactive and visual representation of PCA and t-SNE results. This application is designed to assist in the selection and visualization of the most distinctive spectral band regions after reducing the dimensionality of the input spectral data. Its user interface includes a graphical slider to select data range and spectral band regions. Once a band region is selected, the data from the selected region is used as an input for PCA and t-SNE models, which subsequently generate and display PCA and t-SNE plots for visualization. The slider can be operated manually or set to animate, providing real-time updates to the output display as the band region selection is adjusted. This application is crafted to streamline non-targeted analysis through interactive exploration of spectral data sets. By dynamically selecting different spectral band regions, the method aids in uncovering underlying features that would otherwise be difficult to distinguish using static charts.

Supervised Pattern Recognition

Supervised pattern recognition involves training a classification model, selecting important features, and evaluating the model with a held-back test set. LDA optimizes the ratio of between-class variation to within-class variation, while PCA finds the direction with the largest total variance. PC–LDA combines PCA and LDA by first using PCA to reduce the dimensionality of the data, and then using the selected PCs as predictors in LDA. Sparse PLS (sPLS) is a PLS version that includes sparse linear combinations of original predictors from large complex data sets to improve prediction,⁴² while sPLS–DA uses variable selection to improve prediction in a multiclass classification problem.⁴³ RF is a prominent supervised algorithm for classification and prediction applications that employs an ensemble learning approach to find a solution to complex data sets.^18,43 The study explores the potential application of these techniques in classifying TU samples that were adulterated with various fillers, using Raman and IR spectral measurements.

One-Dimensional Convolutional Neural Network (1D CNN)

Artificial neural networks (ANNs) are made up of different numbers of nodes or active units arranged in multiple layers. The three basic layers are the input layer, where the input data is introduced to the network; the hidden or processing layer, where the algorithm processes data to establish relationships using matrix multiplication and activation functions; and the output layer, which generates the results for the given input as classification or prediction. In their simplest form, deep neural networks involve multiple hidden layers.⁴⁴ A CNN is a special form of ANN that consists of one or more convolutional layers attached to the fully connected ANN.⁴⁵ CNN uses convolution rather than generic matrix multiplication in the convolution layer, where filters of a given size, known as “kernels”, are moved systematically through the whole input data to extract the key features using the convolution process. This may be done several times with different kernels. The extracted feature summary (feature map) is transferred to the next layer, and the process is continued with different kernels until the end of the convolutional layers present in the model. Finally, the convolutional layers' summary of extracted features is passed to the fully connected ANN, which generates the final prediction or classification results.^36,46
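The sliding-kernel operation described above can be sketched without any DL library. The example applies one hand-picked kernel to a short 1D signal with no padding, then a ReLU activation and max pooling. The kernel values and the toy signal are assumptions for illustration; in a trained CNN the kernel weights are learned, and DL libraries generally slide the kernel without flipping it, as done here.

```python
def conv1d_valid(signal, kernel, stride=1):
    """Slide `kernel` over `signal` without padding ("valid" mode)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]


def relu(values):
    """Rectified linear unit: negative responses are set to zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2, stride=2):
    """Keep the largest value in each pooling window."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, stride)]


# A toy "spectrum" with two peaks and a kernel that responds to sharp peaks.
signal = [0.0, 1.0, 3.0, 1.0, 0.0, 2.0, 5.0, 2.0, 0.0]
kernel = [-1.0, 2.0, -1.0]

feature_map = conv1d_valid(signal, kernel)   # length 9 - 3 + 1 = 7
activated = relu(feature_map)
pooled = max_pool1d(activated)
```

The feature map is shorter than the input by the kernel size minus one, and pooling roughly halves it again, which is the same bookkeeping that determines the output shapes of a stacked 1D CNN.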

A compact 1D CNN model was implemented in this study using the Keras (version 2.9.0) package, with TensorFlow backend using Python (v.3.10) and Jupyter Notebook in Visual Studio Code (v.1.7.1). The model was adapted for this data set from the 1D CNN architectures for spectroscopy by Bjerrum et al.,³⁴ Liu et al.,⁴⁸ and Murphy.³⁵ This simple, feed-forward 1D CNN model, which was used for both the IR and Raman data sets, consists of an input layer, three 1D convolutional layers, a batch normalization (BN) layer, an activation layer, a 1D pooling layer, a dropout layer, a fully connected layer, and a Softmax layer for classification (Table Ia and b).

Table I.

Structure of 1D-CNN used for (a) Raman and (b) IR data sets.

(a)
Layer	Output shape	Number of parameters	Kernel size	Stride	Activation
GaussianNoise	(None, 3709, 1)	0	—	—	—
BatchNormalization	(None, 3709, 1)	4	—	—	—
Conv1D	(None, 3707, 8)	32	3	1	relu
MaxPooling1D	(None, 1853, 8)	0	2	2	—
Conv1D	(None, 1838, 16)	2064	16	1	relu
MaxPooling1D	(None, 919, 16)	0	2	2	—
Conv1D	(None, 888, 32)	16 416	32	1	relu
MaxPooling1D	(None, 444, 32)	0	2	2	—
Dropout	(None, 444, 32)	0	—	—	—
Flatten	(None, 14 208)	0	—	—	—
Dense	(None, 64)	909 376	—	—	relu
Dropout	(None, 64)	0	—	—	—
Dense	(None, 6)	390	—	—	softmax
Total parameters: 928 282.
Trainable parameters: 928 280.

(b)
Layer	Output shape	Number of parameters	Kernel size	Stride	Activation
GaussianNoise	(None, 7208, 1)	0	—	—	—
BatchNormalization	(None, 7208, 1)	4	—	—	—
Conv1D	(None, 7206, 8)	32	3	1	relu
MaxPooling1D	(None, 3603, 8)	0	2	2	—
Conv1D	(None, 3588, 16)	2064	16	1	relu
MaxPooling1D	(None, 1794, 16)	0	2	2	—
Conv1D	(None, 1763, 32)	16 416	32	1	relu
MaxPooling1D	(None, 881, 32)	0	2	2	—
Dropout	(None, 881, 32)	0	—	—	—
Flatten	(None, 28 192)	0	—	—	—
Dense	(None, 64)	1 804 352	—	—	relu
Dropout	(None, 64)	0	—	—	—
Dense	(None, 5)	390	—	—	softmax
Total parameters: 1 823 386.
Trainable parameters: 1 823 320.
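The layer arithmetic behind Table I can be checked with a short calculation. The sketch below is not the authors' code: the helper names are hypothetical, and it assumes unpadded convolutions with stride 1, pooling with size and stride 2, and four parameters for the single-channel batch normalization layer (scale, shift, moving mean, and moving variance).

```python
def conv1d_layer(length, in_channels, filters, kernel_size, stride=1):
    """Output length and parameter count of an unpadded 1D convolution."""
    out_length = (length - kernel_size) // stride + 1
    params = (kernel_size * in_channels + 1) * filters  # weights + biases
    return out_length, params


def max_pool_length(length, pool_size=2, stride=2):
    """Output length of 1D max pooling without padding."""
    return (length - pool_size) // stride + 1


def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return (n_inputs + 1) * n_units


def summarize(input_length, n_classes,
              conv_specs=((8, 3), (16, 16), (32, 32)), dense_units=64):
    """Return pooled lengths, flattened size, and total parameter count."""
    length, channels = input_length, 1
    total = 4 * channels  # batch normalization on the input channel
    pooled_lengths = []
    for filters, kernel_size in conv_specs:
        length, params = conv1d_layer(length, channels, filters, kernel_size)
        total += params
        channels = filters
        length = max_pool_length(length)
        pooled_lengths.append(length)
    flat = length * channels
    total += dense_params(flat, dense_units)
    total += dense_params(dense_units, n_classes)
    return pooled_lengths, flat, total


raman = summarize(3709, n_classes=6)
ir_lengths, ir_flat, _ = summarize(7208, n_classes=5)
```

With these assumptions the Raman network reproduces the output shapes and the 928 282 total parameters of Table I(a), and the IR network reproduces the output shapes of Table I(b). Note that the same formula gives (64 + 1) × 5 = 325 parameters for a five-class output layer, whereas 390 corresponds to six outputs.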

Optimization of ML and DL Models

The gapDer-transformed derivative spectral data, scaled and centered, were used in subsequent dimensionality reduction for exploratory data analysis and in predictive modeling using the shallow ML algorithms. Derivative spectra were chosen for their ability to remove baseline offset, separate overlapping peaks, and sharpen spectral features.

For PCA, a graphical technique based on the scree plot (not shown), which displays the eigenvalues in decreasing order,⁴⁸ was used to select the optimal number of PCs, both for PCA itself and during PC–LDA and PLS–DA model optimization. LDA and PC–LDA were performed using the MASS package.

For implementing sPLS–DA, the mixOmics package⁴⁹ was used. Model tuning was done using the “perf” and “tune.splsda” functions. The functions were set to compute five-fold of 50 repeat leave-one-out cross-validation scores on a grid to determine optimal values (keepX) for the sparsity parameters.⁴⁹ The final sPLS–DA model was determined using the optimized number of components (optimal ncomp) and the optimal number of variables (keepX) to maximize classification accuracy.

For t-SNE, perplexity, which refers to the number of close neighbors at each point, is an important tunable parameter. According to van der Maaten and Hinton,¹⁴ it has a typical value that falls between 5 and 50. The t-SNE by the “Rtsne” package was run with 1000 iterations, a perplexity parameter of 10, and an output dimension of 2. The embeddings showed a wide range of patterns as the perplexity value was varied. The perplexity value of 10 was chosen based on repeated trials for the best class separation for both Raman and IR spectral data sets. The output was in the form of a matrix with 540 (FT-IR) and 675 (Raman) rows and two columns corresponding to t-SNE dimension 1 and dimension 2, which were used to generate the t-SNE plot.
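The reading of perplexity as an effective number of neighbors follows from its definition as two raised to the Shannon entropy (in bits) of the neighbor distribution at a point. The sketch below illustrates this with toy distributions; the function name and the example probabilities are assumptions.

```python
import math


def perplexity(probabilities):
    """Perplexity 2**H(P), with the entropy H measured in bits."""
    entropy_bits = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return 2.0 ** entropy_bits


# Ten equally weighted neighbors give a perplexity of about 10.
uniform_ten = [0.1] * 10
# When one neighbor dominates, the effective neighbor count drops.
peaked = [0.91] + [0.01] * 9

print(perplexity(uniform_ten))
print(perplexity(peaked))
```

A perplexity setting of 10 therefore asks the algorithm to choose each point's kernel width so that roughly 10 neighbors carry the similarity weight.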

For RF, the number of trees was left at the default of 500, as increasing the number of trees beyond this value did not lead to a substantial reduction in the prediction error or improvement in model accuracy. The mtry value was likewise set to the recommended default, the square root of the number of predictors,⁵⁰ which was 84 and 60 for IR and Raman spectral data, respectively.
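The quoted mtry values follow directly from the variable counts given in Data Preparation when the square root is rounded down, as the short check below shows. The function name is hypothetical.

```python
import math


def default_mtry(n_predictors):
    """Square root of the predictor count, rounded down, and at least 1."""
    return max(1, math.isqrt(n_predictors))


ir_mtry = default_mtry(7209)     # IR data set variables
raman_mtry = default_mtry(3710)  # Raman data set variables
```

At each split, a tree therefore considers only a small random subset of the wavenumber variables, which decorrelates the trees in the ensemble.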

The TU spectral data set utilized in this study is relatively small yet comprises a large number of features (spectral wavenumbers), a common situation in spectroscopy studies. To address overfitting, a simpler model was developed for the CNN. Overfitting may occur when employing a complex model, resulting in accurate predictions on a small calibration data set but insufficient generalization to larger data sets.²⁹ The implemented low-complexity model consists of three convolutional layers with 8, 16, and 32 filters and kernel sizes of 3, 16, and 32, respectively (Table Ia and b). The 1D convolution was performed with a default stride length of 1, and the “valid” padding option was used, meaning no padding was applied and the convolution was carried out only on valid input data. The spectral data formatted as a 1D vector was used as the input layer. The target classes (group or adulterants) were encoded with a unique number between 0 and n_class − 1, where n_class = 5 for IR spectral data, corresponding to TU, MY, OR, ST, and SU, and n_class = 6 for Raman spectral data, which include the additional LC adulteration. The encoding was done using LabelEncoder from the scikit-learn library. To aid regularization, i.e., to improve the model's ability to generalize, the data were first passed through a Gaussian noise layer,^35,51,52 and BN was applied so that the layer inputs have approximately zero mean and unit variance³³ before being passed through the 1D convolution layers.
To improve feature extraction efficiency, a Max Pooling layer was used after each convolutional layer.⁵³ Leaky ReLU was used as the activation function to enable both linear and nonlinear learning in the CNN.⁵⁴ The Adam optimizer⁵⁵ was used along with an exponential step decay learning rate starting at 0.01 and updating by −0.01 every five epochs, based on good performance.³¹ Adam is an optimization algorithm that updates network weights during training and is an extension of the classical stochastic gradient descent procedure.^55,56 The PlotLossesKeras callback was used to monitor live training accuracy and loss. Only normalization was applied to the spectral data before being fed into the CNN. The data set was split 70:30 into training and test sets using scikit-learn's train_test_split function, with 30% reserved for evaluating the model's accuracy. During training, the Keras validation_split argument was used to reserve 20% of the training data to monitor model performance at the end of each epoch.
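The integer encoding of the target classes described above can be mimicked with the standard library. The sketch below is a stand-in for scikit-learn's LabelEncoder, which likewise assigns codes in sorted class order; the function name and the example label list are assumptions.

```python
def encode_labels(labels):
    """Map class names to integers 0..n_class-1 in sorted class order."""
    classes = sorted(set(labels))
    index = {name: code for code, name in enumerate(classes)}
    return [index[name] for name in labels], classes


# IR data: five classes; Raman data add LC as a sixth class.
ir_codes, ir_classes = encode_labels(["TU", "MY", "OR", "ST", "SU", "TU"])
raman_codes, raman_classes = encode_labels(
    ["TU", "MY", "OR", "ST", "SU", "LC"])
```

The number of distinct classes returned here is what fixes the width of the final softmax layer, five units for IR and six for Raman under the class lists given in the text.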

Data Augmentation for CNN Model

Data augmentation was used to increase the size of the training data by adding slightly perturbed copies of the measured data, which improves model generalization.³⁴ Spectral data augmentation was implemented through perturbations on the vertical and horizontal axes using offsets, slopes, and multiplication, increasing the training sample size five- to 10-fold following the procedure of Bjerrum et al.³⁴ Its impact on CNN model performance was analyzed.
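A minimal sketch of offset, slope, and multiplication augmentation is given below. It is not the published procedure: the perturbation magnitudes, the function names, and the choice to scale perturbations by each spectrum's standard deviation are assumptions, and only the intensity axis is perturbed.

```python
import random
import statistics


def augment_spectrum(spectrum, rng, offset_sd=0.10, slope_sd=0.05,
                     scale_sd=0.10):
    """Return one perturbed copy of a spectrum (needs at least 2 points)."""
    n = len(spectrum)
    spread = statistics.pstdev(spectrum)
    offset = rng.gauss(0.0, offset_sd) * spread   # vertical shift
    slope = rng.gauss(0.0, slope_sd) * spread     # tilted baseline
    scale = 1.0 + rng.gauss(0.0, scale_sd)        # multiplicative change
    return [scale * y + offset + slope * (i / (n - 1) - 0.5)
            for i, y in enumerate(spectrum)]


def augment_dataset(spectra, labels, copies=4, seed=0):
    """Keep the originals and append `copies` perturbed versions of each."""
    rng = random.Random(seed)
    out_x, out_y = list(spectra), list(labels)
    for spectrum, label in zip(spectra, labels):
        for _ in range(copies):
            out_x.append(augment_spectrum(spectrum, rng))
            out_y.append(label)
    return out_x, out_y


spectra = [[0.0, 1.0, 2.0, 1.0], [1.0, 3.0, 2.0, 0.5]]
labels = ["TU", "MY"]
aug_x, aug_y = augment_dataset(spectra, labels, copies=4, seed=1)
```

With `copies=4` the training set grows fivefold (originals plus four copies each), and `copies=9` gives a 10-fold increase. Augmentation of this kind would normally be applied to the training split only, so that the test set contains measured spectra exclusively.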

Evaluation of Classification Performance

To assess the performance of both the shallow ML and 1D CNN classification models, the main evaluation metrics employed were accuracy and precision, defined as follows:

$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$ (1)

$\mathrm{Precision} = \frac{TP}{TP + FP}$ (2)

where the true positive (TP) indicates correct identifications of positive instances, the true negative (TN) signifies accurate predictions of negative instances, the false positive (FP) represents incorrect predictions of positive instances (false alarms), and the false negative (FN) denotes incorrect predictions of negative instances (missed detections).

Accuracy quantifies the ratio of correctly classified samples (TP + TN) to the total number of samples (TP + FP + TN + FN) and, in the context of this study, gauges the overall classification performance. Precision assesses the reliability of positive predictions as the ratio of correctly predicted positive observations (TP) to the total predicted positive observations (TP + FP).

Balanced accuracy, a metric in classification model evaluation, becomes especially relevant in scenarios of class imbalance. It provides a balanced perspective on a model's accuracy by incorporating both sensitivity (TP rate, or recall) = TP/(TP + FN), which measures the proportion of actual positive instances correctly identified by a classification model, and specificity (TN rate) = TN/(TN + FP), which measures the proportion of actual negative instances correctly identified by a classification model. Balanced accuracy is the average of sensitivity and specificity:

$\mathrm{Balanced\ Accuracy} = \frac{\mathrm{Sensitivity} + \mathrm{Specificity}}{2}$ (3)
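Equations 1 to 3 can be evaluated from the four confusion-matrix counts of a single class, as in the sketch below. The counts are made up for illustration and are not results from this study; the function name is hypothetical, and all denominators are assumed to be nonzero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, specificity, balanced accuracy."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),        # Eq. 1
        "precision": tp / (tp + fp),                        # Eq. 2
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,  # Eq. 3
    }


# Hypothetical counts for one class treated as "positive" versus the rest.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a multiclass setting such as this one, the counts would be taken per class in a one-versus-rest fashion, which is an assumption about how the per-class values are obtained and not a detail stated in the text.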

Results and Discussion

Infrared and Raman Spectra

The molecular structures and the Raman and IR spectra of pure TU and the various adulterants examined in this study are shown in Figures 1 and 2. IR light absorption leads to bond vibrations in the molecules, with bonds in functional groups absorbing energy at frequencies that correspond to their own vibrational frequencies. Consequently, when different compounds possess diverse bonds and functional groups that absorb IR light at distinct band frequencies, they can be more effectively distinguished from one another. Even though TU and some of its lookalike adulterants have similar colors, they differ significantly in their functional groups and composition. The bright yellow color of TU comes from fat-soluble polyphenolic pigments called curcuminoids, which are among its prominent chemical components and account for about 2–6% of the TU rhizome.⁵⁷ Curcumin is a large molecule containing a series of alternating double and single bonds, known as a conjugated system. Many pigments make use of conjugated bonds to absorb visible light and produce vibrant colors. The chemical structure of curcuminoids in general consists of two aromatic ring systems containing o-methoxy phenolic groups, connected by a seven-carbon linker consisting of an α,β-unsaturated β-diketone moiety.⁵⁸ Curcumin, the principal curcuminoid, exists in two molecular configurations (the bis-keto form and the enolate form).⁵⁹ Other curcuminoids found in TU include demethoxycurcumin and bisdemethoxycurcumin.⁶⁰

Figure 1.

Chemical structures of curcumin and the adulterants used for substitution of ground TU samples: (a) curcumin, (b) MY, (c) OR, (d) ST, (e) SU, and (f) LC.

Figure 2.

(a) Raman and (b) IR spectra of pure TU and the fillers/adulterants used in the study. TU: turmeric; ST: starch; MY: metanil yellow; OR: Acid Orange 7; SU: Sudan III; LC: lead chromate.

The TU adulterants MY, OR, and SU III used in this study (Figure 1) are members of the group of chemicals commonly known as azo dyes. Azo dyes are characterized by the presence of an azo group, –N=N–, which forms a bridge between organic moieties, at least one of which is aromatic.⁶¹ MY is an azo dye derived from the reaction of metanilic acid with diphenylamine.⁶² Despite their quite different chemical compositions, a similar degree of extended conjugation is responsible for the yellow color of both MY and the curcumin in TU.⁵⁹ MY, unlike curcumin, has three nitrogen atoms (N=N and –NH) and one sulfonate (SO₃) group; it lacks methylene (CH₂) and methyl (CH₃) groups, with oxygen located at the SO₃ site (Figure 2). OR is very similar to MY in that it has two nitrogen atoms (N=N) and a SO₃ group. In the structure of OR, a hydroxyl group is located in the ortho-position to the azo group. Unlike MY, OR contains a naphthalene group, which strongly affects its IR vibrational modes. SU I, II, III, and IV are related red dyes that have somewhat differing bond configurations.⁶³ The molecule employed in this work, SU III, is a 2-naphthol-substituted bis(azo) compound, which is similar to OR in containing the naphthalene group. It differs from OR in that it has two –N=N– groups, with four nitrogen atoms in total, and lacks the SO₃ group present in MY and OR. Lead(II) chromate (PbCrO₄) is a relatively simple crystalline inorganic compound, yellow-orange in color, that occurs naturally (as crocoite) or is produced synthetically. It has a strong IR signal that can easily be identified by the presence of distinct lead and chromium vibrational modes. ST (derived from flour) and TU share common organic compounds with the same functional groups and chemical bonds, leading to certain comparable vibrational frequencies in IR spectroscopy.
This resemblance arises because, although curcumin constitutes ∼2–6% of TU,⁵⁷ the TU rhizome consists mainly of carbohydrates, including some ST, along with protein, fat, and fibers.⁶⁴ This similarity makes ST substitution in TU harder to differentiate than the other adulterants investigated in this study, especially at lower levels of substitution.

Raman and IR Spectra of the Ground TU and Adulterants

Band position assignments are based on literature reports; where the band position measured in this study is shifted up or down, the measured value is shown in parentheses. The prominent Raman bands that are characteristic of TU/curcumin are 966 (967) cm⁻¹, attributed to C–O–H, and 1185 (1187) cm⁻¹, attributed to C–O–C vibrations.^65,66 The strong peaks around 1600 (1602) and 1630 (1632) cm⁻¹ are associated with the vibrational modes of the aromatic benzene ring (Figure 2) and originate from C=C and C=O stretching vibrations. The keto and enol forms of curcumin are distinguished by the Raman band at 1250 cm⁻¹, which is shown by the keto-enol form, whereas the Raman band around 1430 (1431) cm⁻¹ is characteristic of phenol C–O. In addition, the methoxy group (OCH₃) shows a vibrational band at 573 (575) cm⁻¹.⁶⁷ The Raman spectrum of LC is straightforward. The most intense Raman peak is at 840 cm⁻¹ and is associated with CrO₄²⁻ symmetric stretching, whereas bands at 400 (401) cm⁻¹, 379 (375) cm⁻¹, 360 (359) cm⁻¹, 338 (340) cm⁻¹, and 327 (331) cm⁻¹ are associated with CrO₄²⁻ bending modes and are diagnostic for LC.^68,69 The presence of LC in TU can be easily detected because its Raman bands do not overlap with those of TU.
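
The band comparison described above can be sketched in a few lines of Python. The literature and measured positions are the LC and TU values quoted in this paragraph; the function name `match_bands` and the ±5 cm⁻¹ tolerance are assumptions made for illustration, and in practice the measured positions would come from a peak-picking step that is not shown here.

```python
# Literature Raman band positions (cm^-1) quoted in the text
LC_BANDS = [840, 400, 379, 360, 338, 327]

# Measured positions quoted in the text (values given in parentheses)
MEASURED_LC = [840, 401, 375, 359, 340, 331]
MEASURED_TU = [575, 967, 1187, 1250, 1431, 1602, 1632]


def match_bands(measured, reference, tol=5.0):
    """Map each reference band to the nearest measured peak, keeping only
    pairs that lie within `tol` cm^-1 (tolerance is an assumed value)."""
    matches = {}
    for ref in reference:
        nearest = min(measured, key=lambda peak: abs(peak - ref))
        if abs(nearest - ref) <= tol:
            matches[ref] = nearest
    return matches


# All six LC marker bands are recovered in the LC spectrum ...
print(match_bands(MEASURED_LC, LC_BANDS))
# ... and none of them coincides with a listed TU band.
print(match_bands(MEASURED_TU, LC_BANDS))
```

An empty result for the TU peak list is consistent with the statement that the LC marker bands do not overlap with those of TU.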

The four most prominent Raman bands in MY are 1606 (1605) cm⁻¹, 1452 (1455) cm⁻¹, 1437 (1436) cm⁻¹, and 1147 (1146) cm⁻¹, which are all related to the N=N group.¹³ The Raman band around 997 (996) cm⁻¹, which is associated with breathing of ring II, and the Raman band around 1402 cm⁻¹, which is associated with the SO₃ group (S=O stretching), were previously proposed as the most distinctive bands and can be used as confirmatory bands for the presence of MY in TU.¹³

Acid Orange 7 (OR) is another azo dye containing the N=N group and is related to MY and SU. According to Xie et al.,⁷⁰ some of the distinctive OR Raman bands result from bond vibrations related to the benzene ring, the naphthalene ring, or both. These include the Raman bands around 990 cm⁻¹ (C=C stretching), 1182 (1184) cm⁻¹ (C–S), 1240 (1233) cm⁻¹ (C–N and C=C stretching), 1344 (1337) cm⁻¹ (C=C stretching and C–H plane swing), 1394 (1386) cm⁻¹ (C–H non-plane swing), 1508 (1497) cm⁻¹ (C–N, C=C stretching, C–H asymmetric stretching and rocking, and C–O rocking), and 1602 (1598) cm⁻¹ (C–N non-plane rocking). Most of these are major bands that can be used to identify the presence of OR in TU. Figure 4 reveals that the Raman bands at 990 cm⁻¹, 1337 cm⁻¹, and 1386 cm⁻¹ are unaffected by TU bands and may be used as quick confirmatory Raman bands for the presence of OR in TU.

Since ST is characterized by multiple coupled vibrational modes, it is usually difficult to ascribe some of the Raman and IR bands of ST to a single vibrational mode.⁷¹ As a result, band regions are often given for a group of vibrational modes, which gives ST its distinctive Raman and IR spectra. For example, ST displays a 2930 cm⁻¹ C–H stretching band, but it is not exclusive to ST as the band can also be found in various other organic compounds (not shown). Raman bands arising from C–O and C–C stretching and from C–C–H and C–O–H deformations are found between 1300 and 1400 cm⁻¹ and between 1200 and 1340 cm⁻¹. The 800–1200 cm⁻¹ band region, also known as the fingerprint region, is assigned to stretching and deformation modes of the C–O and C–C glycosidic bonds.^72,73 The 480 cm⁻¹ (475–485 cm⁻¹) Raman band, which is quite distinctive and has been related to ring vibration, has been frequently used as an ST marker. This Raman band was found to be weak in TU powder but became more prominent with increasing substitution by the purified ST used for adulteration. However, the band is not unique in this case and cannot be unambiguously used as a marker. While the natural variation in curcumin levels may complicate the analysis, the primary consequence of ST substitution is dilution. This dilution reduces the relative amount of curcuminoids in the sample, which is easily noticeable because of the prominent Raman and IR spectral signals exhibited by these compounds.

Sudan III (SU) is derived from naphthalene (1-phenyl-azo-2-naphthol) and has characteristic bands associated with the naphthalene group, including the Raman bands around 1227 (1226) cm⁻¹ from C–O stretching and C–C–H scissoring bending, 1388 (1387) cm⁻¹ from C=N stretching vibration and C–H in-plane bending, and 1495 (1496) cm⁻¹ from C=N and N–N stretching.⁷⁴ SU also lacks the SO₃ group and the corresponding band at 1402 cm⁻¹ that is found in MY and OR.

The IR spectra of TU and the adulterants are relatively more complex than the Raman spectra. The most intense IR bands, 1185 cm⁻¹ in MY, 1181 cm⁻¹ and 1118 cm⁻¹ in OR, and 1127 cm⁻¹ and 1206 cm⁻¹ in SU, appear to be less affected by the IR bands of TU. All azo dyes exhibit more intricate IR vibrational modes over the whole spectrum than TU does. Below 900 cm⁻¹, all three azo dyes (MY, OR, and SU) exhibit strong IR bands that hardly overlap with TU bands, which could be taken as an indication of potential adulteration by these chemicals. Each of the azo dye bands in this region also has a specific vibrational mode that could be used to distinguish among the dyes. As in the Raman spectra, ST and TU have the most similar IR bands. The contribution of curcuminoid-related vibrational modes to the TU bands, particularly in the 1100 to 1700 cm⁻¹ range, decreases as the level of ST substitution increases, while the ST-related IR bands in other parts of the spectrum progressively increase (not shown).
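
The dilution effect of ST substitution can be pictured with a simple mixing sketch. It assumes, as a simplification that real powders may not strictly obey, that the spectrum of a mixture is a linear combination of the pure-component spectra weighted by their fractions. The function name and the relative intensities below are hypothetical and are not measured values from this study.

```python
def mixed_spectrum(pure_tu, pure_st, st_fraction):
    """Linear mixture of two spectra sampled at the same band positions
    (linear mixing is an assumption made for illustration)."""
    return [(1.0 - st_fraction) * tu + st_fraction * st
            for tu, st in zip(pure_tu, pure_st)]


# Hypothetical relative intensities at two band positions:
# index 0 = a curcuminoid-dominated band, index 1 = an ST-dominated band
pure_tu = [1.0, 0.2]
pure_st = [0.0, 0.6]

for level in (0.0, 0.1, 0.3, 0.5):
    curcuminoid_band, st_band = mixed_spectrum(pure_tu, pure_st, level)
    print(f"{level:.0%} ST: curcuminoid band {curcuminoid_band:.2f}, "
          f"ST band {st_band:.2f}")
```

Under this assumption the curcuminoid-related intensity falls in proportion to the ST fraction while the ST-related intensity rises, which mirrors the trend described above.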

Interactive Visualization for Pattern Recognition

Figures 3 and 4 show the PCA and t-SNE plots generated through the Shiny interactive application. These plots correspond to the band region selected using the band region selector. Both PCA and t-SNE plots were used to group the spectral data according to the adulterant present in TU. These plots make it easy to visually recognize similarity groups through associated data clusters. For both Raman and IR spectral data, t-SNE provides well-defined clusters for visualization on a single two-dimensional plot, whereas PCA may require plotting combinations of many PCs to reveal further patterns in the data set. The separation of the different classes of substituted TU samples increased with the level of substitution, with the pure TU data positioned at the center. Some overlap of the data at the lower-end substitution levels of 1, 2, and 5% with the pure TU and ST data was evident, suggesting that these concentration levels are near the detection limits of IR and Raman spectroscopy for bulk analysis under the most basic setup conditions.

Figure 3.

An interactive Shiny platform dashboard for non-targeted analysis of TU samples substituted with different adulterants and analyzed using Raman spectroscopy. It features selectors for data range and band regions, and displays PCA and t-SNE plots. Symbol size indicates the degree of adulteration, and the central plot presents first-order derivative spectra. Output plots adapt dynamically to changes in the band region selector. “Group” denotes the adulterant data class, while “Variable” includes the adulterant type and level. “Spectral range” refers to the chosen spectral band regions. Radio buttons enable data exploration with scaling options, and dropdown boxes at the bottom facilitate the exploration of different PC plot combinations. An averaged data set was used for the application.

Figure 4.

An interactive Shiny platform dashboard for non-targeted analysis of TU samples substituted with different adulterants and analyzed using IR spectroscopy. The remaining details are as described in the caption of Figure 3.

The simultaneous implementation of both PCA and t-SNE allows one to exploit the specific advantages of each method. PCA preserves large pairwise distances while discarding the least significant ones, thereby maximizing variance. The scores generated by PCA can be used for further classification analysis. Meanwhile, t-SNE preserves local similarities and generates an excellent visual summary of high-dimensional data. However, unlike PCA, t-SNE is primarily a data exploration tool and does not generate scores that can be used in further data analysis. In contrast to static charts, the interactive capabilities of the Shiny-based dashboard enabled a more thorough analysis of the data through interactive band area selection, instantaneous dimensionality reduction, and the generation of visual displays that help identify trends in the data set.
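
To make concrete what is meant by PCA scores that can be reused in later analysis, the sketch below computes the scores of mean-centered spectra on the first principal component by power iteration, using only the Python standard library. It is an illustrative stand-in and not the R/Shiny implementation used in this study; the function name `first_pc_scores` and the toy spectra are hypothetical.

```python
import math


def first_pc_scores(spectra, n_iter=200):
    """Loadings and scores of the first principal component of
    mean-centered spectra, found by power iteration on X^T X."""
    n, p = len(spectra), len(spectra[0])
    means = [sum(row[j] for row in spectra) / n for j in range(p)]
    X = [[row[j] - means[j] for j in range(p)] for row in spectra]

    # Starting direction; it must not be orthogonal to the first PC,
    # otherwise the iteration cannot move toward it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        scores = [sum(x[j] * v[j] for j in range(p)) for x in X]  # X v
        w = [sum(X[i][j] * scores[i] for i in range(n))
             for j in range(p)]                                   # X^T (X v)
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [wj / norm for wj in w]

    scores = [sum(x[j] * v[j] for j in range(p)) for x in X]
    return v, scores


# Three toy "spectra" of three points each (hypothetical values)
toy = [[1.0, 2.0, 3.0],
       [2.0, 4.0, 6.0],
       [3.0, 6.0, 9.0]]
loadings, scores = first_pc_scores(toy)
print(loadings)
print(scores)
```

The returned scores are ordinary numbers per sample, so they could be passed to a classifier such as LDA, which is the idea behind PC–LDA; t-SNE offers no comparable reusable projection.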

Classification Analysis of Adulterated TU Samples Using ML Models

Figure 5 depicts the classification accuracy of the trained models on the test data set as confusion matrices that relate the predicted values to the ground truth. A confusion matrix is a cross table designed so that correctly predicted test samples fall on the diagonal, whereas incorrectly predicted (misclassified) samples fall on either side of the diagonal. Normalized accuracy values are displayed as percentages in each category. The colors of the confusion matrix grids are proportional to the number of instances. Except for the TU class, the number of samples in each substitution class is balanced. Because the kappa statistic (Cohen's kappa) corrects for bias in overall accuracy, including that arising from unbalanced class distributions,⁷⁵ it was employed as a cross-reference on classification results. The accuracy of class prediction on the test-set data by LDA with no feature selection, compared with the feature selection-based PC–LDA and sPLS–DA and the ensemble learning model RF, is shown in Figure 5a, and that of the CNN in Figure 5b.
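
The relationship between overall accuracy and Cohen's kappa can be illustrated with a short Python sketch that computes both from a confusion matrix. The function name and the two confusion matrices are hypothetical and were chosen only to show how kappa discounts agreement expected by chance; they are not results from this study.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows = true classes, columns = predicted classes)."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Agreement expected by chance from the row and column totals
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(k)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Balanced classes with 80% of samples on the diagonal
print(accuracy_and_kappa([[40, 10],
                          [10, 40]]))

# Imbalanced classes where every sample is assigned to the majority class
print(accuracy_and_kappa([[90, 0],
                          [10, 0]]))
```

In the second, deliberately extreme case the overall accuracy is high simply because of the class imbalance, while kappa shows that the agreement is no better than chance.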

Figure 5.

Confusion table showing classification accuracy of trained models using Raman spectral data to categorize adulterated ground TU test samples. Panel (a) shows results from shallow ML algorithms (LDA, PC–LDA, sPLS–DA, and RF), while panel (b) displays outcomes from a DL 1D CNN. The right-side plot for 1D CNN depicts model accuracy and loss versus epoch during training and validation.

Based on the overall accuracy values, feature selection strategies increased the classification accuracy for the Raman data: LDA performed poorly for this data set (66%), compared with 74% and 79% for PC–LDA and sPLS–DA, respectively. The effect of variable selection was smaller for the FT-IR data set, where the accuracy achieved by LDA (85%) was only slightly improved by PC–LDA (86%) and sPLS–DA (87%). For both Raman and IR spectral data sets, the overall classification accuracy achieved by RF was distinctly higher (94% and 93%, respectively; Table II). This result positions RF above the other shallow ML methods assessed in this study, aligning with the general view of RF as a high-performing ML algorithm. The prediction accuracy achieved by the 1D CNN implementation was 97% for both Raman and IR spectral data sets, surpassing all of the ML techniques tested for both data sets (Figures 5 and 6). The algorithm achieved 100% accuracy in predicting the classes of LC and all three azo dyes when using the Raman spectral data set, regardless of their level of substitution (Table II). The only misclassifications occurred between samples of TU and ST, both of which are plant-derived materials that closely resemble each other. The superior performance of the CNN model observed here is in line with the results reported in previous related research.^48,76 The sample size to feature ratio of the data utilized in this study is small. To mitigate overfitting, BN and dropout layers were implemented, which are recommended methods for addressing this issue in DL models.^34,76 With these techniques, the model can better generalize to unseen data despite the limited size of the data set. The application of a fivefold to 10-fold data augmentation, following the procedure described by Bjerrum et al.,³⁴ did not result in any significant increase in the accuracy of predictions over the use of the un-augmented data set in this particular case (not shown).
However, because processing the augmented data sets was time-consuming on the standard PC employed in this work, the impact of data augmentation on model performance was not thoroughly investigated. Comparison of model performance during training with and without Gaussian noise injection, BN, and dropout layers indicated that these regularization strategies positively affected model performance. The learning rate is an important tunable hyperparameter in CNN.⁷⁷ A low learning rate provides smooth convergence but may leave the model in a local minimum rather than the global optimum, whereas a high learning rate can accelerate learning but may impede convergence.⁷⁸ Finding the right hyperparameters is an iterative process that requires experience, but it can significantly reduce training time while enhancing performance.⁷⁹ Here, the model's accuracy improved when an exponential step-decay learning rate⁷⁷ was utilized and when a relatively larger batch size was used during model training. The greater variability evident in the validation curve of the IR spectral data (Figure 6) indicates a potential concern with overfitting, especially when contrasted with the CNN model utilizing Raman data (Figure 5), despite the regularization efforts described above. This suggests that the IR data displayed more variability than its Raman counterpart. To improve the model's performance, future work could explore additional measures, such as incorporating more training data and fine-tuning the hyperparameters more thoroughly.
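
A step-decay learning rate schedule of the kind mentioned above can be sketched as follows. The text does not report the schedule's parameters, so the initial rate, drop factor, and step length in this example are assumed values for illustration only, and the function name is hypothetical.

```python
def step_decay(epoch, initial_lr=1e-3, drop=0.5, epochs_per_drop=10):
    """Learning rate that is multiplied by `drop` once every
    `epochs_per_drop` epochs (all default values are assumptions)."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


for epoch in (0, 9, 10, 25, 40):
    print(epoch, step_decay(epoch))
```

The rate stays constant within each block of epochs and falls geometrically between blocks, which allows large updates early in training and smaller, more stable updates later.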

Figure 6.

Confusion table showing classification accuracy of trained models using IR spectral data to categorize adulterated ground TU test samples. The remaining details are as described in the caption of Figure 5.

Table II.

The classification metrics for the analysis of Raman and IR spectral data using both shallow ML algorithms (LDA, PC–LDA, sPLS–DA, and RF) and DL 1D CNN.

	Performance	Raman							FT-IR
Algorithm	metrics	Overall	Statistics by class						Overall	Statistics by class
			LC	MY	OR	ST	SU	TU		MY	OR	ST	SU	TU
LDA	Accuracy	0.66							0.85
	Kappa	0.59							0.80
	Precision		0.82	0.76	0.66	0.63	0.76	0.57		0.90	0.84	0.79	0.95	0.55
	Balanced accuracy		0.90	0.90	0.84	0.80	0.90	0.61		0.96	0.93	0.89	0.94	0.67
PC–LDA	Accuracy	0.74							0.86
	Kappa	0.69							0.82
	Precision		0.74	0.66	0.76	0.68	0.84	0.79		0.97	1.00	1.00	1.00	0.31
	Balanced accuracy		0.94	0.96	0.97	0.76	0.98	0.62		0.96	0.93	0.91	0.91	0.88
sPLS–DA	Accuracy	0.79							0.87
	Kappa	0.75							0.83
	Precision		1.00	0.67	0.79	0.71	0.81	0.85		1.00	1.00	0.94	1.00	0.28
	Balanced accuracy		0.99	0.96	0.98	0.91	0.98	0.61		0.91	0.96	0.91	0.96	0.84
RF	Accuracy	0.94							0.93
	Kappa	0.92							0.91
	Precision		1.00	1.00	1.00	0.85	1.00	0.50		0.97	1.00	1.00	1.00	0.40
	Balanced accuracy		1.00	1.00	0.99	0.89	1.00	0.79		1.00	1.00	0.89	0.98	0.97
CNN	Accuracy	0.97							0.97
	Kappa	0.96							0.96
	Precision		1.00	1.00	1.00	0.88	1.00	0.80		1.00	1.00	1.00	1.00	0.69
	Balanced accuracy		1.00	1.00	1.00	0.96	1.00	0.80		1.00	1.00	0.93	1.00	0.98

The table reports overall accuracy and kappa, along with class-level precision and balanced accuracy. Accuracy signifies the overall correctness of predictions, precision indicates the proportion of true positives among predicted positives, kappa reflects agreement beyond chance, and balanced accuracy averages sensitivity (TP rate) and specificity (TN rate), which is especially important in imbalanced class scenarios.

The strong performance of the CNN model observed in this study, despite being trained on a relatively small data set, aligns with the idea that the sample size required for CNN model training depends on the model's complexity, the scope of the problem, and the characteristics of the data.^27,29 The CNN results reported here were based on hyperparameter combinations that were deemed optimal during the training process. Although the CNN model achieved a higher accuracy than the parallel ML-based analyses, there is still scope for improving the model's performance, as multiple methods exist for fine-tuning it. The use of a smaller research data set in conjunction with a simpler 1D CNN makes the method more accessible and ready to be tested on a wider range of research problems without requiring specialized computational resources. Here, a simple 1D CNN applied to spectral data gave superior results compared with the shallow ML methods. Considering that the exploration of DL applications in food safety using IR spectroscopy is still in its early stages, the positive outcomes achieved thus far should be subject to further validation through additional research on this relatively novel approach.

Conclusion

This study assessed the effectiveness of traditional chemometric algorithms, contemporary ML, and DL techniques based on 1D CNN in reducing dimensionality and improving the classification accuracy of ground TU samples adulterated with LC, azo dyes (MY, OR, and SU III), and ST, which are frequently cited as possible color enhancers and bulking agents. The unsupervised dimensionality reduction and pattern recognition performed using the nonlinear ML algorithm t-SNE, in parallel with the standard linear PCA, revealed detailed patterns in the data that provided more insight during data exploration. Supervised classification was compared among LDA, which considers the entire data set; the feature selection-based PC–LDA and sPLS–DA; and RF, which is based on ensemble learning. The 1D CNN consistently demonstrated higher classification accuracy, outperforming the other algorithms in both Raman and FT-IR-based analyses, with RF closely following in multiple runs. This improved accuracy underscores the 1D CNN's proficiency in extracting meaningful features from complex spectral data. Furthermore, the 1D CNN distinguishes itself as the most efficient, undergoing training and testing on raw data, which sets it apart from the other algorithms compared. These capabilities suggest its potential to further empower spectroscopy in food safety research.

Footnotes

Acknowledgments

The author thanks Todd Marrow, Director, and Jason Gotera, Supervisor, Canadian Food Inspection Agency, Greater Toronto Area Laboratory, for their encouragement and support, and is grateful to all manuscript reviewers. Special acknowledgment goes to Pavisha Kumaravel and Jeremy Van Buskirk for assistance in sample preparation and collecting IR spectra.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the CFIA Technology Development (TD) Project (Project ID: P000427).

ORCID iD

Thomas A. Teklemariam

References

1. Black, Haughey S.A., Chevallier O.P., Galvin-King, Elliott C.T. “A Comprehensive Strategy to Detect the Fraudulent Adulteration of Herbs: The Oregano Approach”. Food Chem. 2016. 210: 551–557. https://doi.org/10.1016/j.foodchem.2016.05.004

2. Sharifi-Rad, Rayess Y.E., Rizk A.A., Sadaka, et al. “Turmeric and Its Major Compound Curcumin on Health: Bioactive Effects and Safety Profiles for Food Pharmaceutical Biotechnological and Medicinal Applications”. Front. Pharmacol. 2020. 11: 01021. https://doi.org/10.3389/fphar.2020.01021

3. Sattar, Das, Hossain, Sarower, Uddin. “Study on Consumer Perception towards Quality of Spices Powder Available in Bangladesh”. Open J. Saf. Sci. Technol. 2019. 9(4): 137–144. https://doi.org/10.4236/ojsst.2019.94009

4. Sutherland, Sim, Gleim, Smyth S.J. “Consumer Insights on Canada’s Food Safety and Food Risk Assessment System”. J. Agric. Food Res. 2020. 2: 100038. https://doi.org/10.1016/j.jafr.2020.100038

5. Roy, Mukherjee, Sircar, Roy, et al. “Adulteration in Spices: A Threat to Human Health and Well Being”. Am. J. Appl. Bio-Tech. Res. 2020. 1(3): 25–28. https://doi.org/10.15864/ajabtr.1303

6. Cowell, Ireland, Vorhees, Heiger-Bernays. “Ground Turmeric as a Source of Lead Exposure in the United States”. Public Health Rep. 2017. 132(3): 289–293. https://doi.org/10.1177/0033354917700109

7. Lohumi, Lee, Cho B.-K. “A Review of Vibrational Spectroscopic Techniques for the Detection of Food Authenticity and Adulteration”. Trends Food Sci. Technol. 2015. 46(1): 85–98. https://doi.org/10.1016/j.tifs.2015.08.003

8. Arendse, Nieuwoudt, Magwaza L.S., Nturambirwe J.F.I., et al. “Recent Advancements on Vibrational Spectroscopic Techniques for the Detection of Authenticity and Adulteration in Horticultural Products with a Specific Focus on Oils Juices and Powders”. Food Bioprocess. Technol. 2021. 14(1): 1–22. https://doi.org/10.1007/s11947-020-02505-x

9. Mozaffari M.H., Tay. “Raman Spectral Analysis of Mixtures with One-Dimensional Convolutional Neural Network”. ArXiv. 2021. https://doi.org/10.48550/arXiv.2106.05316

10. Sjögren. “Synergies Between Chemometrics and Machine Learning”. [Doctoral Thesis]. Umeå, Sweden: Umeå University Faculty of Science and Technology Department of Chemistry, 2021. https://urn.kb.se/resolve?urn=urn%3Anbn%3Ase%3Aumu%3Adiva-182683 [accessed Feb 8 2024].

11. Dhakal, Schmidt W.F., Kim, Tang, et al. “Detection of Additives and Chemical Contaminants in Turmeric Powder Using FT-IR Spectroscopy”. Foods. 2019. 8(5): 143. https://doi.org/10.3390/foods8050143

12. Galvin-King, Haughey S.A., Elliott C.T. “The Detection of Substitution Adulteration of Paprika With Spent Paprika by the Application of Molecular Spectroscopy Tools”. Foods. 2020. 9(7): 944. https://doi.org/10.3390/foods9070944

13. Dhakal, Chao, Schmidt, Qin, et al. “Detection of Azo Dyes in Curry Powder Using a 1064 nm Dispersive Point-Scan Raman System”. Appl. Sci. 2018. 8(4): 564. https://doi.org/10.3390/app8040564

14. van der Maaten L.J.P., Hinton G.E. “Visualizing High-Dimensional Data Using t-SNE”. J. Mach. Learn. Res. 2008. 9(86): 2579–2605.

15. Pfisterer K.J., Amelard, Wong. “Intuitive Data-Driven Visualization of Food Relatedness via t-Distributed Stochastic Neighbor Embedding”. J. Comp. Vis. Imag. Sys. 2018. 4(1): 1–3.

16. Nallan Chakravartula S.S., Moscetti, Bedini, Nardella, Massantini. “Use of Convolutional Neural Network (CNN) Combined with FT-NIR Spectroscopy to Predict Food Adulteration: A Case Study on Coffee”. Food Control. 2022. 135: 108816. https://doi.org/10.1016/j.foodcont.2022.108816

17. Oliveira F.H.M., Machado A.R.P., Andrade A.O. “On the Use of T-Distributed Stochastic Neighbor Embedding for Data Visualization and Classification of Individuals with Parkinson’s Disease”. Comput. Math. Methods Med. 2018. 2018: 8019232. https://doi.org/10.1155/2018/8019232

18. Breiman. “Random Forests”. Mach. Learn. 2001. 45(1): 5–32. https://doi.org/10.1023/A:1010933404324

19. LeCun, Bengio, Hinton. “Deep Learning”. Nature. 2015. 521(7553): 436–444. https://doi.org/10.1038/nature14539

20. Mazzoni, Andersen R.A., Jordan M.I. “A More Biologically Plausible Learning Rule for Neural Networks”. Proc. Natl. Acad. Sci. U.S.A. 1991. 88(10): 4433–4437. https://doi.org/10.1073/pnas.88.10.4433

21. Agatonovic-Kustrin, Beresford. “Basic Concepts of Artificial Neural Network (ANN) Modeling and Its Application in Pharmaceutical Research”. J. Pharm. Biomed. Anal. 2000. 22(5): 717–727. https://doi.org/10.1016/S0731-7085(99)00272-1

22. Choudhary, DeCost, Chen, Jain, et al. “Recent Advances and Applications of Deep Learning Methods in Materials Science”. npj Comput. Mater. 2022. 8(1): 59. https://doi.org/10.1038/s41524-022-00734-6

23. Wang, Kuen, et al. “Recent Advances in Convolutional Neural Networks”. Pattern Recognit. 2018. 77(C): 354–377. https://doi.org/10.1016/j.patcog.2017.10.013

24. Kiranyaz, Avci, Abdeljaber, Ince, et al. “1D Convolutional Neural Networks and Applications: A Survey”. Mech. Syst. Signal Process. 2021. 151: 107398. https://doi.org/10.1016/j.ymssp.2020.107398

25. Zhou, Zhang, Liu, Qiu. “Application of Deep Learning in Food: A Review”. Compr. Rev. Food Sci. Food Saf. 2019. 18(6): 1793–1811. https://doi.org/10.1111/1541-4337.12492

26. Liang, Sun, Zhang, Qiu. “Advances in Infrared Spectroscopy Combined with Artificial Neural Network for the Authentication and Traceability of Food”. Crit. Rev. Food Sci. Nutr. 2022. 62(11): 2963–2984. https://doi.org/10.1080/10408398.2020.1862045

27. Liu, Zhou, Han, et al. “Detection of Adulteration in Infant Formula Based on Ensemble Convolutional Neural Network and Near-Infrared Spectroscopy”. Foods. 2021. 10(4): 785. https://doi.org/10.3390/foods10040785

28. Olson, Wyner A.J., Berk. “Modern Neural Networks Generalize on Small Data Sets”. Advances in Neural Information Processing Systems (NeurIPS). 2018. https://proceedings.neurips.cc/paper_files/paper/2018/file/fface8385abbf94b4593a0ed53a0c70f-Paper.pdf [accessed Jan 24 2024].

29. Brigato, Iocchi. “A Close Look at Deep Learning with Small Data”. Paper presented at: 2020 25th International Conference on Pattern Recognition (ICPR). Milan, Italy; 10–15 January 2021. https://doi.org/10.1109/ICPR48806.2021.9412492

30. Yang, Zhang, et al. “Deep Learning for Vibrational Spectral Analysis: Recent Progress and a Practical Guide”. Anal. Chim. Acta. 2019. 1081: 6–17. https://doi.org/10.1016/j.aca.2019.06.012

31. Wang, Tian, et al. “End-to-End Analysis Modeling of Vibrational Spectroscopy Based on Deep Learning Approach”. J. Chemom. 2020. 34(10): e3291. https://doi.org/10.1002/cem.3291

32. Srivastava, Hinton, Krizhevsky, Sutskever, Salakhutdinov. “Dropout: A Simple Way to Prevent Neural Networks from Overfitting”. J. Mach. Learn. Res. 2014. 15(56): 1929–1958. https://doi.org/10.5555/2627435.2670313

33. Ioffe, Szegedy. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”. ArXiv. 2015. https://arxiv.org/pdf/1502.03167.pdf [accessed Feb 8 2024].

34. Bjerrum E.J., Glahder, Skov. “Data Augmentation of Spectral Data for Convolutional Neural Network (CNN) Based Deep Chemometrics”. 2017. https://arxiv.org/pdf/1710.01927 [accessed Jan 24 2024].

35. Murphy. “Deep Learning and Chemometrics: Quantitative and Qualitative Spectroscopy Interpretation of Aqueous Solutions”. https://curf.github.io/assets/docs/CS615_Project__CM.pdf [accessed Jan 24 2024].

36. Alzubaidi, Zhang, Humaidi A.J., Al-Dujaili, et al. “Review of Deep Learning: Concepts CNN Architectures Challenges Applications Future Directions”. J. Big Data. 2021. 8(1): Article no. 53. https://doi.org/10.1186/s40537-021-00444-8

37. RStudio Team. “RStudio: Integrated Development Environment for R”. RStudio, PBC, Boston, MA, 2020. http://www.rstudio.com/ [accessed Feb 8 2024].

38. Valenzuela, Rodriguez-Llamazares. “Spftir (Version 0.1.0): Pre-Processing and Analysis of Mid-Infrared Spectral Region”. https://www.rdocumentation.org/packages/spftir [accessed Jan 24 2024].

39. Stevens, Ramirez-Lopez. “An Introduction to the Prospectr Package”. https://cran.r-project.org/web/packages/prospectr/vignettes/prospectr.html [accessed Jan 24 2024].

40. Wickham. ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag, 2016.

41. Tang, Horikoshi. “ggfortify: Unified Interface to Visualize Statistical Results of Popular R Packages”. R Journal. 2016. 8(2): 474–485. https://doi.org/10.32614/rj-2016-060

42. Chun, Keleş. “Expression Quantitative Trait Loci Mapping with Multivariate Sparse Partial Least Squares Regression”. Genetics. 2009. 182(1): 79–90. https://doi.org/10.1534/genetics.109.100362

43. Altman, Krzywinski. “Ensemble Methods: Bagging and Random Forests”. Nat. Methods. 2017. 14(10): 933–934. https://doi.org/10.1038/nmeth.4438

44. Cao K.-A.L., Rossouw, Robert-Granié, Besse. “A Sparse PLS for Variable Selection When Integrating Omics Data”. Stat. Appl. Genet. Mol. Biol. 2008. 7(1): 35. https://doi.org/10.2202/1544-6115.1390

45. Poggio, Mhaskar, Rosasco, Miranda, Liao. “Why and When Can Deep-but Not Shallow-Networks Avoid the Curse of Dimensionality: A Review”. Int. J. Autom. Comput. 2017. 14(5): 503–519. https://doi.org/10.1007/s11633-017-1054-2

46. Mozaffari M.H., Tay. “A Review of 1D Convolutional Neural Networks toward Unknown Substance Identification in Portable Raman Spectrometer”. ArXiv. 2020. https://doi.org/10.48550/arXiv.2006.10575

47. O’Shea, Nash. “An Introduction to Convolutional Neural Networks”. ArXiv. 2015. https://doi.org/10.48550/arXiv.1511.08458

48. Liu, Osadchy, Ashton, Foster, et al. “Deep Convolutional Neural Networks for Raman Spectrum Recognition: A Unified Solution”. Analyst. 2017. 142(21): 4067–4074. https://doi.org/10.1039/C7AN01371J

49. Lê Cao K.-A., Boitard, Besse. “Sparse PLS Discriminant Analysis: Biologically Relevant Feature Selection and Graphical Displays for Multiclass Problems”. BMC Bioinf. 2011. 12(1): 253. https://doi.org/10.1186/1471-2105-12-253

50. Probst, Wright M.N., Boulesteix A.-L. “Hyperparameters and Tuning Strategies for Random Forest”. WIREs Data Min. Knowl. Discov. 2019. 9: e1301. https://doi.org/10.1002/widm.1301

51. Holmstrom, Koistinen. “Using Additive Noise in Back-Propagation Training”. IEEE Trans. Neural Netw. 1992. 3(1): 24–38. https://doi.org/10.1109/72.105415

52. Liu. “Adaptive Gaussian Noise Injection Regularization for Neural Networks”. In: Han, Qin, Zhang, editors. Advances in Neural Networks (ISNN 2020). Cham: Springer, 2020. https://doi.org/10.1007/978-3-030-64221-1_16

53. Zafar, Aamir, Mohd Nawi, Arshad, et al. “A Comparison of Pooling Methods for Convolutional Neural Networks”. Appl. Sci. 2022. 12(17): 8643. https://doi.org/10.3390/app12178643

54. Wang, Chen. “Empirical Evaluation of Rectified Activations in Convolutional Network”. ArXiv. 2015. https://doi.org/10.48550/arXiv.1505.00853

55. Kingma D.P. “Adam: A Method for Stochastic Optimization”. ArXiv. 2014. https://doi.org/10.48550/arxiv.1412.6980

56. Ruder. “An Overview of Gradient Descent Optimization Algorithms”. ArXiv. 2016. https://doi.org/10.48550/arXiv.1609.04747

57. Monton, Charoenchai, Suksaeree, Sueree. “Quantitation of Curcuminoid Contents Dissolution Profile and Volatile Oil Content of Turmeric Capsules Produced at Some Secondary Government Hospitals”. J. Food Drug Anal. 2016. 24(3): 493–499. https://doi.org/10.1016/j.jfda.2016.01.007

58. Anand, Kunnumakkara A.B., Newman R.A., Aggarwal B.B. “Bioavailability of Curcumin: Problems and Promises”. Mol. Pharmaceutics. 2007. 4(6): 807–818. https://doi.org/10.1021/mp700113r

59. Nebrisi E.E. “Neuroprotective Activities of Curcumin in Parkinson’s Disease: A Review of the Literature”. Int. J. Mol. Sci. 2021. 22(20): 11248. https://doi.org/10.3390/ijms222011248

60. Ghodke S.V., Pawar V.N. “Studies on Food Value Base Curcumin Extraction for Commercial Exploration”. J. Pharmacog. Phytochem. 2018. 7(6): 1173–1176. https://www.phytojournal.com/archives/2018/vol7issue6/PartT/7-6-85-365.pdf [accessed Feb 8 2024].

61. Allen R.L.M. “The Chemistry of Azo Dyes”. In: Colour Chemistry. Boston: Springer, 1971. Chap. 3, Pp. 21–36.

62. Ghosh, Singh P.S., Firdaus S.B., Ghosh. “Metanil Yellow: The Toxic Food Colorant”. Asian Pac. J. Health Sci. 2017. 4(4): 65–66. https://doi.org/10.21276/apjhs.2017.4.4.16

63. Heinze T.M., Paine D.D., Cerniglia, Chen. “Sudan Azo Dyes and Para Red Degradation by Prevalent Bacteria of the Human Gastrointestinal Tract”. Anaerobe. 2010. 16(2): 114–119. https://doi.org/10.1016/j.anaerobe.2009.06.007

64. Balakrishnan K.V. “Postharvest Technology and Processing of Turmeric”. In: Ravindran P.N., Nirmal Babu, Sivaraman, editors. Turmeric: The Genus Curcuma. Boca Raton, Florida: CRC Press, 2007. Chap. 8, Pp. 193–256.

65. Kolev T.M., Velcheva E.A., Stamboliyska B.A., Spiteller. “DFT and Experimental Studies of the Structure and Vibrational Spectra of Curcumin”. Int. J. Quantum Chem. 2005. 102(6): 1069–1079. https://doi.org/10.1002/QUA.20469

66. Hoang, Xuan Hung, Thắng, Chinh, et al. “Fabrication and Vibration Characterization of Curcumin Extracted from Turmeric (Curcuma longa) Rhizomes of the Northern Vietnam”. Springerplus. 2016. 5(1): 1147. https://doi.org/10.1186/s40064-016-2812-2

67. Nguyen T.A., Tang Q.D., Doan D.C.T., Dang M.C. “Micro and Nano Liposome Vesicles Containing Curcumin for a Drug Delivery System”. Adv. Nat. Sci.: Nanosci. Nanotechnol. 2016. 7(3): 035003. https://doi.org/10.1088/2043-6262/7/3/035003

68. Frost R.L. “Raman Microscopy of Selected Chromate Minerals”. J. Raman Spectrosc. 2004. 35(2): 153–158. https://doi.org/10.1002/jrs.1121

69. Erasmus S.W., van Hasselt, Ebbinge L.M., van Ruth S.M. “Real or Fake Yellow in the Vibrant Colour Craze: Rapid Detection of Lead Chromate in Turmeric”. Food Control. 2021. 121: 107714. https://doi.org/10.1016/j.foodcont.2020.107714

70. Xie, Chen, Guo, Cheng, et al. “Rapid SERS Detection of Acid Orange II and Brilliant Blue in Food by Using Fe₃O₄@Au Core–Shell Substrate”. Food Chem. 2019. 270: 173–180. https://doi.org/10.1016/j.foodchem.2018.07.065

71. van Soest J.J.G., Tournois, de Wit, Vliegenthart J.F.G. “Short-Range Structure in (Partially) Crystalline Potato Starch Determined with Attenuated Total Reflectance Fourier-Transform IR Spectroscopy”. Carbohydr. Res. 1995. 279: 201–214. https://doi.org/10.1016/0008-6215(95)00270-7

72. Nikonenko N.A., Buslov D.K., Sushko N.I., Zhbankov R.G. “Spectroscopic Manifestation of Stretching Vibrations of Glycosidic Linkage in Polysaccharides”. J. Mol. Struct. 2005. 752(1): 20–24. https://doi.org/10.1016/j.molstruc.2005.05.015

73. Wiercigroch, Szafraniec, Czamara, Pacia M.Z., et al. “Raman and Infrared Spectroscopy of Carbohydrates: A Review”. Spectrochim. Acta, Part A. 2017. 185: 317–335. https://doi.org/10.1016/j.saa.2017.05.045

74. Esme, Sagdinc S.G. “The Vibrational Studies and Theoretical Investigation of Structure Electronic and Non-Linear Optical Sudan III [1-[4-(phenylazo) phenyl]azo-2-Naphthalenol]”. J. Mol. Struct. 2013. 1048: 185–195. https://doi.org/10.1016/j.molstruc.2013.05.022

75. Cohen. “A Coefficient of Agreement for Nominal Scales”. Educ. Psychol. Meas. 1960. 20(1): 37–46.

76. Chen Y.-Y., Wang B.-Z. “End-to-End Quantitative Analysis Modeling of Near-Infrared Spectroscopy Based on Convolutional Neural Network”. J. Chemom. 2019. 33(5): e3122. https://doi.org/10.1002/cem.3122

77. Wang, Tian, Yang S.X., Zhu, et al. “Improved Deep CNN With Parameter Initialization for Data Analysis of Near-Infrared Spectroscopy Sensors”. Sensors. 2020. 20(3): 874. https://doi.org/10.3390/s20030874

78. Mohtashami, Jaggi, Stich S.U. “Special Properties of Gradient Descent with Large Learning Rates”. In: Proceedings of the 40th International Conference on Machine Learning. Honolulu, Hawaii: 23–29 July 2023. Vol. 202. Pp. 25082–25104. https://proceedings.mlr.press/v202/mohtashami23a.html [accessed Feb 8 2024].

79. Smith L.N. “A Disciplined Approach to Neural Network Hyper-Parameters: Part 1. Learning Rate, Batch Size, Momentum, and Weight Decay”. U.S. Naval Res. Lab. Tech. Rep. 5510-026. ArXiv. 2018. https://doi.org/10.48550/arXiv.1803.09820 [accessed Jan 24 2024].

Raman and Mid-Infrared Spectroscopy Coupled With Machine–Deep Learning for Adulterant Detection in Ground Turmeric
