Sage Journals: Discover world-class research

Abstract

This study introduces a novel approach employing Graph Attention Networks (GAT) to detect pre-installation defects in built-in spindles. Traditional quality control relies heavily on manual inspections and basic mechanical testing, which often miss subtle defects. Using vibration datasets from 13 spindles with identical specifications, this research applies a GAT-based diagnostic method, transforming vibration signals into graph representations. GAT’s attention mechanism effectively extracts essential node and structural features, enabling early and accurate defect identification. Experimental results demonstrate that the proposed GAT model significantly outperforms conventional techniques such as k-nearest neighbors (KNN), Support Vector Machines (SVM), and Graph Convolutional Networks (GCN). By adding noise to simulate harsh operational conditions, the GAT model maintained superior clustering and classification performance, achieving 100% accuracy under noiseless conditions and exhibiting exceptional robustness even at a challenging −5 dB signal-to-noise ratio (SNR). Additionally, GAT effectively handles imbalanced datasets and displays strong generalization capabilities, underscoring its practical industrial potential. This research marks a notable advancement in spindle manufacturing quality control, highlighting promising future directions for deep learning in industrial diagnostics.

Keywords

Graph attention networks, defect detection, built-in spindles, quality control, deep learning

Introduction

In CNC machine tool manufacturing, the spindle is a critical component whose quality directly impacts machining precision and performance.¹ Defects in the spindle can result in compromised performance, increased maintenance costs, and potential machine failures. Thus, ensuring that only high-quality, defect-free spindles are assembled into machine tools necessitates stringent quality control on the production line before shipment. Traditionally, spindle quality control has relied on manual inspection and basic mechanical testing.² However, these conventional methods lack the sensitivity required to detect subtle and latent defects in newly assembled spindles, particularly in light of the increasing complexity of modern manufacturing requirements. Due to their rich informational content and ease of measurement, vibration signals have been extensively used in machinery fault diagnosis.^3–5 Recent advancements have been facilitated by integrating deep learning techniques into this domain.

Among these techniques, GATs have shown considerable potential for managing complex mechanical system defect detection due to their unique advantages. Traditional machinery fault diagnosis methods typically depend on manually extracted time-domain and frequency-domain features from vibration signals,^6,7 combined with machine learning models such as SVM and KNN for fault classification.^8,9 These methods face challenges such as cumbersome feature extraction processes, reliance on expert experience, and difficulty handling complex working conditions. The rapid development of deep learning technology has significantly improved automatic feature extraction techniques, enhancing diagnostic efficiency and accuracy. For instance, some studies have employed Convolutional Neural Networks (CNN) integrated with physical characteristics of bearing acceleration to improve fault classification accuracy.¹⁰ Despite advancements, CNNs struggle with sequential or temporal dependencies, which are critical in time-series data. Recurrent Neural Networks (RNN) have been used to better manage time-series information, addressing the limitations of CNNs in temporal information processing.¹¹ Although RNNs excel in handling temporal information, they face issues such as vanishing gradients, which hinder their ability to capture long-term dependencies, and their sequential processing nature leads to increased training times.

Additionally, some studies propose advanced feature learning methods to enhance diagnostic performance.^12–16 However, these approaches often treat vibration signals as images, potentially overlooking the complex interrelationships among components within mechanical systems. In this context, the use of GAT models represents a significant advancement. This study investigates the application of GAT models for defect detection in built-in spindles before installation. By analyzing vibration signals from the spindle assembly line, potential defects can be identified prior to spindle installation in machine tools, ensuring product quality at the source and offering an innovative solution for defect detection in spindle production. This approach improves the overall efficiency and reliability of traditional spindle production lines. GAT models capture the complex dependencies and interactions between different components by representing vibration signals as graphs rather than images.^17,18 This method leverages the attention mechanism to focus on the most relevant parts of the signal, thereby enhancing the accuracy and robustness of defect detection. By adopting GAT models, the limitations of traditional and other deep learning-based methods can be addressed, providing a more comprehensive and precise fault diagnosis framework for spindle production. Graph Neural Networks (GNN) can effectively model the relationships among components within mechanical systems. For instance, sensor networks or mechanical components can be represented as nodes in a graph, with their interactions depicted as edges, thus capturing system state information more comprehensively. Some studies have proposed multi-receptive field GCN, which represents sample differences through weighted graphs. 
However, the unweighted graph structure limits their application.^19,20 Other studies use spectral analysis to construct spatiotemporal graphs and employ GCN to extract deep features, thereby enhancing fault diagnosis performance.^21,22 Early GCNs, however, applied fixed weights to neighbor nodes during information aggregation, which limited the model’s expressive power. To address the issues of fixed neighborhood aggregation and information propagation loss in GCNs, GATs, as proposed by Veličković et al.,²³ introduce an attention mechanism. As an improved GNN model, GAT dynamically adjusts the weights of neighbor nodes based on node features and neighbor relationships, capturing critical information in the graph more flexibly and accurately. Specifically, GAT uses learnable attention coefficients to measure the association between nodes and aggregates neighbor node information weighted by these coefficients, effectively addressing the fixed neighborhood aggregation and information propagation loss problems in GCN, thereby enhancing the effectiveness of feature learning.²⁴ For example, in manufacturing data, which typically contains noise and complex dependencies, GAT can adaptively learn the importance of various features through its attention mechanism,²⁵ assigning higher weights to key features and thus improving defect diagnosis accuracy. The evolution of machinery fault diagnosis methods has progressed from traditional machine learning to deep learning and GNNs. Each advancement addresses the limitations of its predecessor. As the latest method, GAT combines the advantages of deep learning and GNNs, enhancing model adaptability and interpretability through its attention mechanism.

This study systematically tracked existing factory quality control processes and customer feedback over 1 year, annotating existing data with a posteriori knowledge. Utilizing datasets from 13 spindles with identical specifications and designs, the GAT model was employed to enhance the traditional diagnostic process for built-in spindles. The neighbor self-attention mechanism in GAT adaptively extracted node and structural features by constructing a graph structure of vibration signals and using it as input data. Classification was then performed to achieve early identification of potential spindle defects, enabling the detection of issues that might only become apparent after installation or during client usage.

The advantage of this study lies in the use of GAT to analyze vibration signals on the spindle assembly line, detecting potential defects before spindle installation on machine tools. The GAT model’s ability to apply varying levels of attention to different parts of the graph allows it to highlight the most relevant features and patterns for defect detection. This adaptability was crucial for identifying subtle indicators of potential defects. While frequency-domain features are commonly used in traditional fault diagnosis due to their physical interpretability (e.g. fault frequencies and harmonics), the proposed method adopts time-domain vibration signals for graph construction. This choice aligns with the goal of capturing local dependencies and structural similarities between signal segments, which are more naturally preserved in the temporal domain. Furthermore, using raw time-domain data avoids the need for manual spectral transformation or hand-crafted feature design, allowing the model to operate in a more flexible and data-driven manner.

This proactive defect detection and intervention system ensures early correction of defects, thereby enhancing the quality control process, improving production efficiency, and increasing product reliability. Integrating these advanced diagnostic technologies into the spindle production process can reduce rework, lower warranty claim costs, and provide more reliable products. This research addresses a significant gap in pre-installation diagnostics, offering valuable insights for both the manufacturing industry and academia. The method reduces rework and maintenance by providing practical solutions for producing defect-free spindles, thereby enhancing product reliability.

Methodology

Definition and types of graphs

A graph is a data structure composed of nodes (also known as vertices) and edges used to represent objects and their relationships. In a graph, nodes represent objects, while edges indicate the connections or relationships between these nodes. A graph G can be represented as $G (V, E, X, A)$ , where each symbol represents different graph components. The components are defined as follows:

$V$ (Vertices): This is the set of nodes in the graph, where each node typically represents an object or entity.

$E$ (Edges): This is the set of edges in the graph, where each edge denotes a connection or relationship between nodes.

$X$ : In some graph representations, $X$ denotes the feature vectors associated with the nodes.

$A$ : This denotes the adjacency matrix, which describes the connections between nodes.

Adjacency Matrix $A$ :

In an unweighted graph, $A$ is a binary matrix where $A_{ij}$ = 1 indicates the presence of an edge between node $i$ and node $j$ , and $A_{ij}$ = 0 indicates the absence of an edge.

In a weighted graph, the value of $A_{ij}$ represents the weight of the edge from node $i$ to node $j$.
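As a minimal sketch of the two cases above, the following Python snippet builds a binary and a weighted adjacency matrix for a small three-node graph. The helper name `adjacency_matrix`, the edge lists, and the weights are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def adjacency_matrix(num_nodes, edges, directed=False):
    """Build an adjacency matrix from (i, j) or (i, j, weight) tuples."""
    A = [[0.0] * num_nodes for _ in range(num_nodes)]
    for edge in edges:
        i, j = edge[0], edge[1]
        weight = edge[2] if len(edge) == 3 else 1.0  # unweighted edges get 1
        A[i][j] = weight
        if not directed:
            A[j][i] = weight  # an undirected graph has a symmetric matrix
    return A


# Hypothetical 3-node graphs (illustrative values only).
A_binary = adjacency_matrix(3, [(0, 1), (1, 2)])
A_weighted = adjacency_matrix(3, [(0, 1, 0.8), (1, 2, 0.3)])
```

Passing `directed=True` would fill only the entry $A_{ij}$, which corresponds to the one-way relationship of a directed graph.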

Graphs have a wide range of applications, including social network analysis, path planning, and mechanical defect detection. The most common types of graph structures include directed graphs, undirected graphs, kNN graphs, and complete graphs, each differing significantly in terms of topology and information propagation. An explanation of these four common graph types is provided below (see Figure 1):

(a) Directed Graph: In a directed graph, edges have a specific direction, indicating a one-way relationship from one node to another. This directionality implies that if there is an edge from node $A$ to node $B$ , traversal is possible only from $A$ to $B$ , and not in the reverse direction. Directed graphs are commonly used to represent unidirectional relationships.

(b) Undirected Graph: In an undirected graph, edges have no direction, signifying a symmetric relationship between nodes. If there is an edge between node A and node B, traversal is possible in both directions. Undirected graphs are often used to represent bidirectional relationships.

(c) kNN Graph: A kNN graph is constructed based on the similarity or distance between data points. In this type of graph, each node is connected to its $K$ most similar or closest nodes. Formally, for each node $v$ in the set of vertices $V$, the $K$ most similar nodes to $v$ are identified, and edges are established between $v$ and these $K$ nodes.

(d) Complete Graph: A complete graph is a graph in which there is an edge between every pair of distinct nodes. In a complete graph with $n$ nodes, there are $n (n - 1) / 2$ edges.
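The kNN and complete graph definitions above can be sketched in a few lines of Python. This is an illustrative example only: Euclidean distance is assumed as the similarity measure, and the function names and the sample points are hypothetical.

```python
import math


def knn_edges(points, k):
    """Connect each point to its k nearest neighbors (Euclidean distance)."""
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by its distance to point i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((i, j))  # edge from i to one of its k nearest neighbors
    return edges


def complete_edges(n):
    """All unordered pairs of distinct nodes: n * (n - 1) / 2 edges."""
    return {(i, j) for i in range(n) for j in range(i + 1, n)}


# Hypothetical 1-D feature points (illustrative only).
points = [(0.0,), (1.0,), (1.5,), (5.0,)]
print(sorted(knn_edges(points, k=1)))
print(len(complete_edges(4)))
```

Note that the kNN relation is not necessarily symmetric: a node can be among the nearest neighbors of another node without the reverse being true, so an undirected kNN graph requires an additional symmetrization step.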

Figure 1.

Four types of graph structures: (a) directed graph, (b) undirected graph, (c) kNN graph, and (d) directed complete graph.

Constructing the adjacency matrix and node feature matrix is essential for understanding how graphs are represented and analyzed. The adjacency matrix describes the connections between nodes, while the node feature matrix contains information about each node’s attributes (see Figure 2). This process is illustrated below:

Figure 2.

Construction of adjacency matrix and node feature matrix.

Comparison of deep learning models: GCN versus GAT

In GCNs and GATs, the use of the adjacency matrix (A) differs significantly. Comparing these two methods is essential because GCNs and GATs represent two critical milestones in the evolution of GNNs, demonstrating that methods for processing graph-structured data have been progressively refined. Both GCNs and GATs emerged from the need to apply deep learning to graphs to effectively aggregate and propagate node features within graph structures.

GCNs and GATs were both proposed to address the limitations of traditional neural networks in handling graph-structured data directly. GCNs, an earlier approach, directly utilize the topology of the graph (through the adjacency matrix) to perform feature aggregation. Conversely, GATs build on GCNs by incorporating an attention mechanism, allowing the model to dynamically learn the significance of node relationships. Understanding the differences between these two methods is essential for grasping the evolution of graph neural networks and selecting the appropriate model for different application scenarios. GCNs are well-suited for graphs with relatively fixed structures, while GATs are more suitable for scenarios requiring dynamic adjustment of the importance of node relationships. Below are the main differences between GCN and GAT:

GCN

In GCNs, the connections in the adjacency matrix (A) typically have equal weights, indicating that the influence of each neighboring node is fixed. GCNs perform graph convolution operations to aggregate information from neighboring nodes. Specifically, the basic operation of a GCN can be represented as^26,27:

$H^{(l + 1)} = \sigma (\hat{A} H^{(l)} W^{(l)})$ (1)

where:

$H^{(l + 1)}$ is the node feature matrix at layer $(l + 1)$.

$W^{(l)}$ is the weight matrix for the linear transformation at layer $l$.

$\sigma$ is the activation function (e.g. ReLU).

$\hat{A}$ is the normalized adjacency matrix, typically represented as $\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$, where $\tilde{A} = A + I_N$ is the adjacency matrix with added self-loops and $\tilde{D}$ is the degree matrix of $\tilde{A}$.

In GCNs, the adjacency matrix (A) aggregates and propagates node features, thereby capturing local information within the graph structure. This matrix uses fixed, structure-based weights, meaning it can only capture linear relationships. Since GCNs rely on pre-established graph structure data, they cannot dynamically adjust the weights between a node and its neighbors based on the features of each node.²⁸
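To make the fixed-weight aggregation concrete, the following sketch evaluates one GCN layer on a three-node path graph. It assumes the common symmetric normalization with self-loops, $\tilde{D}^{-1/2} (A + I) \tilde{D}^{-1/2}$, and ReLU as the activation; the graph, features, and weights are hypothetical, and the code is a plain-Python illustration rather than the implementation used in the study.

```python
import math


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(A, H, W):
    """One GCN layer: ReLU(A_hat @ H @ W)."""
    A_hat = normalize_adjacency(A)
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]


# Hypothetical path graph 0 - 1 - 2 with 2-dimensional node features.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, so the output equals A_hat @ H
H_next = gcn_layer(A, H, W)
```

The aggregation weights in `normalize_adjacency` depend only on node degrees, not on the node features, which is the limitation that the attention mechanism of GAT is intended to address.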

GAT

In GAT, a self-attention mechanism dynamically computes the weights between each node and its neighbors, while the adjacency matrix (A) only determines which nodes are treated as neighbors. GATs are therefore more adaptable than GCNs in accounting for the varying importance of different nodes. Specifically, the basic operation of GAT can be represented as²⁹:

$e_{i, j} = a^{T} [W X_{i} \,\Vert\, W X_{j}], \quad j \in N_{i}$ (2)

$\alpha_{i, j} = \frac{\exp (\mathrm{LeakyReLU} (e_{i, j}))}{\sum_{k \in N_{i}} \exp (\mathrm{LeakyReLU} (e_{i, k}))}$ (3)

where:

$e_{i, j}$ is the attention coefficient derived from the similarity between node $i$ and node $j$ .

$W$ is the weight matrix for the linear transformation used to extract new feature representations.

$a$ is a learnable parameter vector in the attention mechanism.

‖ denotes the concatenation operation of vectors.

$α_{i, j}$ represents the normalized attention coefficient between node $i$ and its neighbor $j$ .

$\mathrm{LeakyReLU} (\cdot)$ is a non-linear activation function.
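The attention computation in equations (2) and (3) can be illustrated with a short plain-Python sketch for a single node. The negative slope of 0.2 for LeakyReLU follows the value used by Veličković et al. and is an assumption here, as are the function names and the toy features, weights, and attention vector.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with an assumed negative slope of 0.2."""
    return x if x > 0 else slope * x


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attention_weights(X, neighbors, W, a, i):
    """Normalized attention alpha_{i,j} over the neighbors j of node i."""
    Wx_i = matvec(W, X[i])
    scores = {}
    for j in neighbors[i]:
        concat = Wx_i + matvec(W, X[j])                    # [W x_i || W x_j]
        e_ij = sum(ak * ck for ak, ck in zip(a, concat))   # Eq. (2)
        scores[j] = math.exp(leaky_relu(e_ij))             # numerator of Eq. (3)
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}      # Eq. (3)


# Hypothetical scalar features for three nodes; node 1 has neighbors 0 and 2.
X = [[1.0], [2.0], [3.0]]
W = [[1.0]]
a = [0.5, 0.5]
neighbors = {1: [0, 2]}
alpha = attention_weights(X, neighbors, W, a, i=1)
```

In this toy case the neighbor with the larger transformed feature (node 2) receives the larger weight, and the weights over the neighborhood sum to one.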

The computed attention coefficients are utilized to weigh the features of neighboring nodes, resulting in an updated representation for each node. The specific calculations are as follows²⁹:

$X^{'}_{i} (K) = \big\Vert_{k = 1}^{K} \sigma \left( \sum_{j \in N_{i}} \alpha_{ij}^{k} W^{k} X_{j} \right)$ (4)

where:

$X_{i}^{'}$ denotes the updated feature of node $i$ .

∥(·) denotes the aggregation operation used to integrate the outputs from K attention heads.

$K$ represents the number of attention heads used for multi-head attention aggregation.

$N_{i}$ denotes the set of neighbors of node $i$ .
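A minimal sketch of the multi-head update in equation (4) is given below. It assumes that the normalized attention weights of each head have already been computed, uses ReLU for $σ$, and represents each head as a hypothetical dictionary with keys `"W"` and `"alpha"`; all numerical values are illustrative.

```python
def multi_head_update(X, neighbors, i, heads):
    """Concatenate the outputs of K attention heads for node i (Eq. 4).

    Each head is a dict with a weight matrix "W" (list of rows) and
    precomputed normalized attention weights "alpha" (neighbor j -> alpha_ij).
    """
    out = []
    for head in heads:
        W, alpha = head["W"], head["alpha"]
        agg = [0.0] * len(W)
        for j in neighbors[i]:
            Wx_j = [sum(w * x for w, x in zip(row, X[j])) for row in W]
            agg = [s + alpha[j] * v for s, v in zip(agg, Wx_j)]
        out.extend(max(0.0, v) for v in agg)  # sigma = ReLU (an assumption)
    return out


# Hypothetical example: node 0 attends to itself and node 1 with K = 2 heads.
X = [[1.0, 0.0], [0.0, 2.0]]
neighbors = {0: [0, 1]}
heads = [
    {"W": [[1.0, 0.0], [0.0, 1.0]], "alpha": {0: 0.5, 1: 0.5}},
    {"W": [[1.0, 1.0]], "alpha": {0: 0.25, 1: 0.75}},
]
updated = multi_head_update(X, neighbors, 0, heads)
```

The output length is the sum of the per-head output dimensions (here 2 + 1 = 3), since the heads are concatenated rather than averaged.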

In GATs, the adjacency matrix (A) is no longer directly used for feature aggregation. Instead, the neighborhood self-attention mechanism dynamically calculates the weights between each node and its neighbors. Specifically, GAT employs a self-attention mechanism to compute the relevance (i.e. attention scores) between each node and its neighbors. These attention scores are used to perform a weighted average of the features of neighboring nodes, thereby generating new node representations. This mechanism allows for dynamic adjustment of the influence of each neighboring node on the target node, enabling more flexible and adaptive feature aggregation that captures essential information within the graph structure. The update process for node 4 when $K = 3$ is illustrated in Figure 3.

Figure 3.

Node update process using multi-head attention mechanism with K = 3.

Defect detection process

Figure 4 illustrates the comprehensive process of transforming vibration signals into graph structures, extracting features, classifying defects based on GAT, and evaluating and visualizing diagnostic results. The process is described in detail as follows:

Step 1: Transformation of Time-Domain Vibration Signals to Graph Structures: In the graph $G (V, E, X, A)$, each node $v_{i} \in V$ represents a segment of the time series, with its feature vector $x_{i}$ containing the vibration signal for that time segment. Edges $e_{i, j} \in E$ connect adjacent time segments, with weights $W_{ij}$ reflecting the correlation between nodes. A sliding window method is employed to construct the graph structure: with a window size of $w$ and a stride of $s$, a time series of length $T$ is divided into $\lfloor (T - w) / s \rfloor + 1$ sub-sequences, each corresponding to a node in the graph and representing the signal segment within a specific time window.

Step 2: Defect Detection Process Based on GAT: After constructing the graph structure, the defect detection process proceeds using GAT. The constructed graph is first input into the GAT input layer, where features are extracted and aggregated through multiple hidden layers. Nodes in each hidden layer update their features based on the features of their neighboring nodes and the weights of the connecting edges. This feature aggregation process effectively captures local patterns and global structures in the time series data.

Step 3: Classification and Evaluation: The high-dimensional features extracted from the GAT model are mapped to the classification space via a fully connected layer (FC Layer) and classified using a SoftMax layer, which outputs the defect category labels for each node. The model’s classification performance is evaluated using a confusion matrix, which compares the classification results with the actual labels. Furthermore, feature visualization techniques are employed to display the distribution of different defect categories in the feature space, further validating the model’s effectiveness.
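The sliding-window construction in Step 1 can be sketched as follows. The example assumes non-overlapping windows (stride equal to the window size of 1024) and a path graph that links each segment to the next one in time; the placeholder signal and the function names are hypothetical.

```python
def sliding_windows(signal, w, s):
    """Split a 1-D signal into windows of length w with stride s."""
    count = (len(signal) - w) // s + 1
    return [signal[k * s : k * s + w] for k in range(count)]


def path_edges(num_nodes):
    """Undirected edges linking each segment to the next one in time."""
    return [(k, k + 1) for k in range(num_nodes - 1)]


signal = list(range(10240))  # placeholder for a signal with T = 10,240 points
nodes = sliding_windows(signal, w=1024, s=1024)  # non-overlapping windows
edges = path_edges(len(nodes))
print(len(nodes), len(edges))
```

With $T = 10{,}240$, $w = 1024$, and $s = 1024$, the window count is $(10{,}240 - 1024) / 1024 + 1 = 10$, giving a graph of 10 nodes joined by 9 edges.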

Figure 4.

Proposed latent defect detection framework.

Data and model configuration

Experimental data description

The dataset utilized in this study originates from the quality control processes of a built-in spindle manufacturing line, documenting actual vibration signals. The analysis focused on 13 built-in spindles with identical design specifications and assembly components. All spindles employed the BBT #40 taper and were designed for a maximum rotational speed of 15,000 RPM. Each spindle underwent an oil lubrication process post-assembly to ensure thorough lubrication of the internal bearings. Accelerometers were mounted on the spindle heads at the run-in station using magnetic bases (as shown in Figure 5), and the spindles were operated at a constant speed of 7500 RPM. Vibration signals were collected at a sampling rate of 25.6 kHz by the data acquisition system. Records from existing factory quality control processes, such as durability and cutting tests, as well as customer feedback over 1 year—including in-factory repairs—were analyzed. Potential latent defects detected on the production line were annotated with a posteriori knowledge. For each spindle, 300 vibration signal samples were collected by segmenting the raw time-domain signal into non-overlapping windows, each with a fixed length of 1024 data points. This resulted in a total of 3900 samples across all 13 spindles. For each spindle, its 300 samples were randomly split into 70% for training and 30% for testing, ensuring that no sample overlap occurred between the two sets. This spindle-wise segmentation strategy was applied consistently to all spindles. Detailed descriptions of the experimental data are provided in Table 1.

Figure 5.

Installation position of the accelerometer on a built-in spindle.

Table 1.

Numbers and built-in spindle defect types.

Label number	Training/testing samples	Defect type	Spindle numbers
1	840/360	Normal	1, 2, 3, 4
2	840/360	Assembly	5, 6, 7, 8
3	630/270	Drawbar	9, 10, 11
4	420/180	Bearing	12, 13

Table 1 lists the data labeling and classification according to the potential defect types of the spindles. Each spindle is uniquely identified by a specific number. The spindles are categorized into four different groups, as indicated by the Label Number in the first column. The Defect Type column describes the specific defect associated with each sample. Except for Label 1, which denotes normal operation, the other labels indicate potential defect issues in new spindles, such as “Assembly,” “Drawbar,” and “Bearing” defects. This classification aids in distinguishing different defect patterns in newly assembled spindles, thereby enhancing the efficiency of repairs on the production line.
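The sample counts in Table 1 follow directly from the per-spindle split described above (300 samples per spindle, 70% for training and 30% for testing). The short check below reproduces them; the variable names are illustrative.

```python
SAMPLES_PER_SPINDLE = 300
TRAIN_PERCENT = 70

# Spindle numbers per label, as listed in Table 1.
spindles_by_label = {
    1: [1, 2, 3, 4],   # Normal
    2: [5, 6, 7, 8],   # Assembly
    3: [9, 10, 11],    # Drawbar
    4: [12, 13],       # Bearing
}

train_per_spindle = SAMPLES_PER_SPINDLE * TRAIN_PERCENT // 100  # 210
test_per_spindle = SAMPLES_PER_SPINDLE - train_per_spindle      # 90

# (training, testing) sample counts per label.
split_counts = {
    label: (len(ids) * train_per_spindle, len(ids) * test_per_spindle)
    for label, ids in spindles_by_label.items()
}
total = sum(train + test for train, test in split_counts.values())
print(split_counts, total)
```

The resulting class sizes are unequal (for example, 1200 samples for Label 1 against 600 for Label 4), which is the imbalance referred to in the later comparison of models.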

GAT parameter configurations

Network architecture and training parameters

The proposed model architecture consists of two primary components: the feature extraction module and the feature classification module. Table 2 summarizes the details of each layer in the network, including their input and output dimensions and descriptions.

Table 2.

Network architecture and parameter settings.

Feature extraction module
No.	Layer	Input size	Output size
1	GATConv1	(10, 1024)	(10, 1024)
2	BN Layer1	(10, 1024)	(10, 1024)
3	ReLU	(10, 1024)	(10, 1024)
4	GATConv2	(10, 1024)	(10, 1024)
5	BN Layer2	(10, 1024)	(10, 1024)
6	ReLU	(10, 1024)	(10, 1024)
Feature classification module
No.	Layer	Input size	Output size
1	FC Layer1	(10, 1024)	(10, 256)
2	ReLU	(10, 256)	(10, 256)
3	FC Layer2	(10, 256)	(10, 4)
4	Softmax	(10, 4)	(10, 4)
Training hyperparameters
Batch size	Epochs	Optimizer	Learning rate
64	100	Adam	0.0001

Feature extraction module

The feature extraction module begins with an initial input feature dimension of 10 × 1024, where the first dimension represents the number of nodes in a visual graph, and the second dimension represents the node feature dimensions. The first Graph Attention Convolution (GATConv) layer processes this input while maintaining the feature dimension at 10 × 1024. This is followed by a batch normalization (BN) layer that normalizes the feature dimensions without altering their size. A ReLU activation function is then applied to introduce non-linearity while preserving the feature dimension. This sequence is repeated in the second GATConv layer, ensuring that the feature dimensions remain consistent throughout the extraction process. The specifics of the feature extraction module are outlined in Table 2.

Feature classification module

Following the extraction module, the feature classification module processes the output features of dimension 10 × 1024. This module includes a fully connected (FC) layer that reduces each node’s feature dimension from 1024 to 256. A ReLU activation function is then applied to introduce non-linearity. Another FC layer transforms the feature dimensions from 10 × 256 to 10 × 4, where 4 represents the number of classification categories. The specific structure of the feature classification module is also detailed in Table 2. The final output is obtained by applying a softmax function to ensure the output is suitable for classification tasks.
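As an illustration of the final classification step, the sketch below applies a numerically stable softmax to the four logits of a single node and selects the most probable label. The logit values are hypothetical, and the code is a generic example rather than the model used in the study.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one node's class logits."""
    m = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical FC Layer2 output for one node (4 logits, one per label).
logits = [2.0, 0.5, -1.0, 0.1]
probs = softmax(logits)
predicted_label = probs.index(max(probs)) + 1  # labels are numbered 1 to 4
```

For a graph of 10 nodes, the same operation would be applied row by row to the 10 × 4 output of the second FC layer.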

Implementation and training environment

The proposed framework was implemented using PyTorch on an RTX 3060 GPU. Time-domain vibration signals were segmented into non-overlapping segments with a length of 1024, ensuring independent data segments. This approach was chosen to maintain simplicity and computational efficiency while avoiding potential data leakage. Each visual graph was constructed using n = 10 nodes, representing 10 sub-samples of similar signal types. The GAT diagnostic model was configured with K = 1 attention head, that is, a single-head attention network.

Training hyperparameters

The model was trained with a batch size of 64 over 100 epochs. The Adam optimizer, known for its adaptive learning rate capabilities, was used with an initial learning rate set at 0.0001. These training parameters were selected to ensure effective and efficient model convergence, as summarized in Table 2.

Experimental results and analysis

Graph construction from time-domain signals and attention visualization

Figure 6 shows how the time-domain vibration signals differ under different spindle conditions, along with the graph representations constructed from each signal. This helps clarify how the time-domain signals are transformed into graph structures, as described in Section “Defect detection process” Step 1.

Figure 6.

Vibration signals and GAT attention visualization for four spindle conditions: (a) normal, (b) assembly, (c) drawbar, and (d) bearing.

In the left column of Figure 6, the time-domain signals show differences in amplitude across conditions. Each signal is divided into fixed-length segments of 1024 points, and each segment is used as a node in the graph. The right column shows the resulting graphs, all constructed using the same undirected path structure, which leads to similar graph topologies across all cases. The node features are the raw signal segments themselves. These features are not visualized directly in the figure, and each node is labeled only by its index for clarity. The edge colors indicate the attention weights computed by the first GAT layer (GATConv1). These weights reflect how much influence each neighboring node has during message passing. Edges with warmer colors (e.g. red) have higher weights, meaning they contribute more when updating node embeddings. This figure shows that although the graph structure remains the same across conditions, the GAT can still learn to focus on important differences in the node features using the attention mechanism.

Training loss and accuracy

Figure 7 illustrates the trends in training loss (black curve) and accuracy (red curve) throughout the training process using the GAT algorithm. The horizontal axis represents the number of epochs, ranging from 0 to 100, while the left vertical axis indicates the training loss, ranging from 0 to 5. The right vertical axis represents the training accuracy, ranging from 75% to 100%. This study uses GAT to train graph data constructed from vibration signal data to detect potential spindle defect types. The observed training loss and accuracy trends reveal that, as the number of training iterations (epochs) increases, the model’s training loss (black dashed line) gradually decreases and stabilizes after approximately 20 epochs, eventually approaching zero. This trend suggests that the model effectively learns the data patterns with minimal overfitting. Concurrently, the training accuracy rapidly improves during the initial stages, reaching nearly 100% accuracy after about 20 epochs. This indicates that the GAT model achieves high accuracy and efficiency in classifying defect categories within the training data, with no apparent signs of overfitting. These results suggest that the GAT model exhibits both good convergence speed and high accuracy in processing vibration signal data. However, further test set evaluation is necessary to assess the model’s generalization capability.

Figure 7.

Training loss and training accuracy curve.

Evaluation of GAT classification capability

To assess the classification performance of the GAT model, the test set was fed into the trained GAT diagnostic model, and a confusion matrix was generated, as depicted in Figure 8. The results from the confusion matrix indicate that the GAT model achieved 100% accuracy across all four categories. Every actual label (true category) was correctly classified, and no misclassifications were observed. Specifically, the diagonal elements of the confusion matrix are 360, 360, 270, and 180, demonstrating that all data points with actual labels 1, 2, 3, and 4 were accurately predicted, resulting in a 100% accuracy rate. The off-diagonal elements are all zeros, further confirming the absence of classification errors in any category. These findings demonstrate that the GAT model exhibits exceptional accuracy and stability in processing the vibration signal dataset.
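The reported accuracy can be recomputed from the confusion matrix entries given above. The sketch assumes that rows correspond to true labels and columns to predicted labels, which is a convention not stated in the text; the function name is illustrative.

```python
def accuracy_from_confusion(cm):
    """Overall accuracy and per-class accuracy from a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    per_class = [cm[k][k] / sum(cm[k]) for k in range(len(cm))]
    return correct / total, per_class


# Diagonal values reported in the text; rows are assumed to be true labels 1 to 4.
cm = [
    [360, 0, 0, 0],
    [0, 360, 0, 0],
    [0, 0, 270, 0],
    [0, 0, 0, 180],
]
overall, per_class = accuracy_from_confusion(cm)
```

The diagonal sums to 1170 test samples, which matches 30% of the 3900 samples in the dataset.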

Figure 8.

Confusion matrix.

Comparative analysis of validation results across different machine learning methods

This study validates the advantages of the proposed method by comparing it with kNN, SVM, and graph neural network models, including GCN and GAT. To ensure fairness in the comparative experiments and mitigate the effects of randomness, all models were tested 10 times, with the classification results on the test set recorded. The specific experimental results are presented as follows:

Binary classification results analysis

In the binary classification task, which simplifies the original four-class classification problem into a two-class problem, all labels except for the “Normal” type were grouped under the “Defect” category. The results of 10 experiments for all models on the test set are illustrated in Figures 9 and 10. In these figures, the horizontal axis represents the number of experiments, and the vertical axis represents the accuracy. The average accuracy of the 10 experiments is summarized in Table 3. Both GAT and GCN achieved an average accuracy of 100%. The graph neural network models consistently distinguished between Normal and Defect categories, highlighting their effectiveness in defect classification tasks. In contrast, the average accuracies for kNN and SVM were slightly lower. Specifically, kNN achieved an average accuracy of 99.69% for the Normal type and 99.92% for the Defect type, while SVM achieved 99.66% for the Normal type and 99.90% for the Defect type. These results indicate that while kNN and SVM perform well in binary classification tasks, graph neural network models exhibit higher accuracy, particularly when handling complex data. This superiority underscores the strong capability of GNNs in defect classification, making them more suitable for scenarios involving intricate data structures.
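The relabeling used for the binary task, and the per-class accuracy reported for it, can be sketched as follows. The labels and predictions in the example are hypothetical and serve only to show the computation; they are not results from the study.

```python
def to_binary(label):
    """Map a four-class label to 'Normal' (label 1) or 'Defect' (labels 2 to 4)."""
    return "Normal" if label == 1 else "Defect"


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly predicted samples within each true class."""
    hits, counts = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        counts[t] = counts.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (1 if t == p else 0)
    return {c: hits[c] / counts[c] for c in counts}


# Hypothetical four-class labels and predictions (illustrative only).
y_true = [1, 1, 2, 3, 4, 4]
y_pred = [1, 2, 2, 3, 4, 1]
acc = per_class_accuracy(
    [to_binary(y) for y in y_true],
    [to_binary(y) for y in y_pred],
)
```

Under this mapping, confusing one defect type with another (for example, Label 3 predicted as Label 4) still counts as a correct "Defect" prediction, which is one reason the binary accuracies can be higher than the four-class ones.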

Figure 9.

Accuracy for normal type (Label 1).

Figure 10.

Accuracy for defect types (Labels 2, 3, and 4 included).

Table 3.

Binary classification results.

Model	Normal type, mean accuracy (%)	Defect type, mean accuracy (%)
KNN	99.69	99.92
SVM	99.66	99.90
GCN	100	100
GAT	100	100
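The grouping used for the binary task can be sketched as follows. This only illustrates the label mapping and the per-class accuracy calculation; the label lists are hypothetical and are not data from the study, and whether the models were retrained on the two-class labels is not shown here.

```python
NORMAL_LABEL = 1  # Label 1 is the "Normal" type; Labels 2-4 are defect types


def to_binary(labels):
    """Map four-class labels to 'Normal' or 'Defect'."""
    return ["Normal" if y == NORMAL_LABEL else "Defect" for y in labels]


def accuracy_by_class(y_true, y_pred):
    """Correct predictions within each true class, as a fraction."""
    totals, hits = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical labels, for illustration only.
true_four = [1, 1, 2, 3, 4, 4]
pred_four = [1, 2, 2, 4, 4, 3]

result = accuracy_by_class(to_binary(true_four), to_binary(pred_four))
print(result)
```

In this toy example, confusions between Labels 3 and 4 no longer count as errors once both are grouped under "Defect", which is one reason binary accuracies tend to be higher than the corresponding multi-class accuracies.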

Multi-class classification results analysis

In the multi-class classification task, the results of 10 experiments for all models on the test set are illustrated in Figures 11 to 14. In these figures, the horizontal axis represents the experiment number, and the vertical axis represents accuracy. The average accuracies from these 10 experiments are summarized in Table 4. The GAT model demonstrated superior classification performance, achieving an average accuracy of 100% across all categories (Label 1 to Label 4). In contrast, the GCN model exhibited slightly lower average accuracies for Label 3 and Label 4, achieving 99.62% and 95.33%, respectively. Despite this, GCN outperformed the traditional kNN and SVM models on these two labels. Specifically, the kNN model achieved average accuracies of 94.25% for Label 3 and 87.72% for Label 4, while the SVM model attained average accuracies of 93.81% for Label 3 and 88.94% for Label 4. These results reveal that traditional machine learning methods, such as kNN and SVM, experience a notable decline in accuracy when handling categories with smaller data volumes, such as Label 4. Graph neural network models, particularly GAT, retain high accuracy on both data-rich and data-sparse categories, whereas the traditional models, though adequate overall, struggle with the low-volume category.

Figure 11.

Label 1 accuracy.

Figure 12.

Label 2 accuracy.

Figure 13.

Label 3 accuracy.

Figure 14.

Label 4 accuracy.

Table 4.

Mean accuracy of model over 10 experiment runs (%).

Model	Label 1 (%)	Label 2 (%)	Label 3 (%)	Label 4 (%)
KNN	99.69	100	94.25	87.72
SVM	99.13	99.74	93.81	88.94
GCN	100	100	99.62	95.33
GAT	100	100	100	100

Noise resistance comparison and analysis

To validate the generalization and noise resistance of the proposed methods, the GCN and GAT diagnostic models were compared under both noiseless conditions and noise conditions with SNR of 0 and −5 dB. The definition of SNR is shown in equation (5), where dB indicates the unit of SNR, $P_{signal}$ represents the signal power, and $P_{noise}$ represents the noise power.

$SNR (dB) = 10 \log_{10} \left( \frac{P_{signal}}{P_{noise}} \right)$ (5)
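As a minimal sketch of how equation (5) can be used to inject noise at a chosen SNR, the following example scales zero-mean Gaussian noise so that the ratio of signal power to noise power matches the target. The choice of white Gaussian noise, the sine test signal, and all function names are assumptions made for illustration; the study's actual noise model is not specified in this section.

```python
import math
import random


def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def add_noise(signal, target_snr_db, rng):
    """Return (noisy signal, noise) with Gaussian noise scaled to the target SNR."""
    # Rearranging equation (5): P_noise = P_signal / 10^(SNR / 10)
    noise_power = mean_power(signal) / (10 ** (target_snr_db / 10))
    sigma = math.sqrt(noise_power)
    noise = [rng.gauss(0.0, sigma) for _ in signal]
    noisy = [s + n for s, n in zip(signal, noise)]
    return noisy, noise


def measured_snr_db(signal, noise):
    """Equation (5): 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical test signal: 50 Hz sine, 10 kHz sampling, 2 s duration.
clean = [math.sin(2 * math.pi * 50 * t / 10_000) for t in range(20_000)]

noisy, noise = add_noise(clean, -5.0, rng)
achieved = measured_snr_db(clean, noise)
print(f"achieved SNR: {achieved:.2f} dB")
```

At −5 dB the noise power is roughly 3.16 times the signal power, so the noise dominates the waveform; at 0 dB the two powers are equal.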

To compare the performance of GCN and GAT under an SNR of −5 dB, noise was added to the original signals, and the results were subsequently analyzed. The added noise simulated harsh operational environments, providing insights into the reliability of each model under adverse conditions. All hyperparameters were kept consistent across the models. The visualizations, generated using t-distributed Stochastic Neighbor Embedding (t-SNE), are divided into two sections: the left side illustrates the positional distribution based on each spindle’s number, while the right side displays the corresponding label number. In the time-domain features visualization, shown in Figure 15, the four categories exhibit a certain degree of overlap, making it difficult to distinguish between them. This indicates that time-domain features alone are insufficient for accurate classification under noisy conditions. In the GCN visualization results (Figure 16), Label 3 and Label 4 features overlap significantly, while Label 1 and Label 2 form distinct clusters, resulting in three prominent clusters. This pattern suggests that GCN can partially separate categories but struggles to distinguish closely related categories under noisy conditions. In contrast, the GAT model’s visualization (Figure 17) shows that it can effectively separate the four category clusters with greater distances between them. This separation suggests that GAT adaptively focuses on relevant features, demonstrating enhanced feature extraction and clustering performance.

Figure 15.

Time-domain features visualization.

Figure 16.

GCN visualization results.

Figure 17.

GAT visualization results.

These observations indicate that, while both models exhibit specific strengths, the GAT model separates and clusters the relevant features more effectively under substantial noise interference, which supports its ability to maintain high classification accuracy.

The study assesses the detection accuracy of the GCN and GAT models in multi-class defect detection tasks under varying SNR conditions. The experimental setup covers a noiseless environment and SNRs of 0 dB and −5 dB, and evaluates detection accuracy across the four categories (Label 1, Label 2, Label 3, and Label 4). Notably, the dataset is imbalanced, with Label 4 containing fewer data points than the other labels.

In the noiseless environment, Table 5 shows that the GCN model achieved detection accuracies of 100%, 100%, 99.62%, and 95.33% for the respective categories. In comparison, the GAT model attained 100% accuracy across all categories, indicating that both models are highly effective at defect identification under ideal conditions, with GAT exhibiting perfect performance. Under the 0 dB SNR condition, the GCN model demonstrated detection accuracies of 99.94%, 98.08%, 100%, and 76.05% for the four categories, while the GAT model achieved accuracies of 100%, 99.44%, 100%, and 98.94%. Although GCN’s accuracy for Label 3 improved to 100%, its performance for Label 4 significantly dropped to 76.05%. In contrast, the GAT model maintained high accuracy across all categories, particularly exhibiting stronger noise resistance for Label 4. Under the −5 dB SNR condition, the detection accuracies of the GCN model further decreased to 99.91%, 96.61%, 100%, and 55.83% for the four categories, respectively. Meanwhile, the GAT model achieved 99.94%, 98.38%, 100%, and 97.11% accuracy. In this high-noise scenario, the GCN model’s accuracy for Label 4 significantly declined while the GAT model maintained high accuracy, particularly demonstrating effective noise resistance for Label 4.

Table 5.

Classification accuracy of GCN and GAT models for multi-class defect detection under varying SNR conditions.

Model	SNR (dB)	Label 1 (%)	Label 2 (%)	Label 3 (%)	Label 4 (%)
GCN	Noiseless	100	100	99.62	95.33
	0	99.94	98.08	100	76.05
	−5	99.91	96.61	100	55.83
GAT	Noiseless	100	100	100	100
	0	100	99.44	100	98.94
	−5	99.94	98.38	100	97.11

From these results, several key conclusions can be drawn. Under noiseless conditions, both models accurately detected all categories, with the GAT model slightly outperforming the GCN model by achieving 100% accuracy throughout. In noisy environments, the GAT model demonstrated superior noise resistance, most clearly for Label 4 at −5 dB SNR. In contrast, the GCN model’s accuracy for Label 4 decreased sharply as noise levels increased, dropping to 55.83%. The smaller number of data points for Label 4 may have contributed to greater fluctuations in accuracy across different SNR conditions. Nonetheless, the GAT model’s performance remained relatively stable across all categories, emphasizing its advantage in handling imbalanced datasets. This performance is attributed to its self-attention mechanism, which dynamically assigns attention weights based on input data features. This capability allows the GAT model to capture critical information more flexibly, even under high-noise and data imbalance conditions, thus maintaining high accuracy. If significant noise is anticipated in practical applications, the GAT model is therefore recommended. Although the GCN model can be effective in low-noise environments, its performance on the minority class deteriorates markedly in high-noise conditions. In conclusion, the experimental results highlight the reliability of the GAT model in noisy and imbalanced data scenarios, underscoring its practical utility in defect detection.
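The attention weighting referred to above can be illustrated with a minimal single-head sketch in the form introduced by the original GAT formulation: a score is computed for each neighbor from the concatenated features, passed through a LeakyReLU, and normalized with a softmax. The feature values and the attention vector `a` below are hypothetical, and the shared linear map is taken as the identity to keep the example short; this is not the trained model from the study.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope commonly used in GAT (0.2)."""
    return x if x > 0 else slope * x


def attention_weights(h_i, neighbours, a):
    """Softmax-normalised attention of node i over its neighbours.

    Score: e_ij = LeakyReLU(a . [h_i || h_j]); the shared linear map W
    is assumed to be the identity in this sketch.
    """
    scores = [
        leaky_relu(sum(w * v for w, v in zip(a, h_i + h_j)))
        for h_j in neighbours
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(neighbours, weights):
    """Attention-weighted sum of neighbour features."""
    dim = len(neighbours[0])
    return [sum(w * h[k] for w, h in zip(weights, neighbours)) for k in range(dim)]


# Hypothetical 2-D node features and attention vector (illustrative only).
h_i = [1.0, 0.5]
neighbours = [[0.9, 0.6], [1.1, 0.4], [-2.0, 3.0]]
a = [0.5, 0.5, 1.0, -1.0]

alpha = attention_weights(h_i, neighbours, a)
h_new = aggregate(neighbours, alpha)
print([round(w, 3) for w in alpha])
```

With these particular values the third neighbor receives the smallest weight, so it contributes least to the aggregated feature. A GCN layer, by comparison, uses fixed weights determined by node degrees rather than by the features themselves.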

Given the imbalanced dataset used in the experiments, where the total number of samples varies across categories, with Label 4 containing relatively fewer samples than the others, the GAT model demonstrated consistent performance in noisy environments. This consistency is attributed to its attention mechanism, which mitigates the impact of noise by assigning higher weights to significant neighboring nodes, thereby potentially enhancing overall model accuracy. Conversely, the GCN model exhibited reduced accuracy under these conditions. To demonstrate that data imbalance leads to reduced accuracy for the unmodified GCN method in noisy environments, the study balanced the sample sizes of the other three categories to match the number of samples in Label 4 through random sampling, as shown in Table 6. The experiment was conducted 10 times, with the average accuracy of all tests used as the final accuracy measure. The results in Table 7 show an improvement in the GCN model’s accuracy with a balanced dataset, highlighting the importance of data balance in improving model performance under noisy conditions.

Table 6.

Sample sizes of built-in spindle defect types adjusted for balanced data.

Label number	Training/testing samples	Defect type	Spindle numbers
1	420/180	Normal	1, 2, 3, 4
2	420/180	Assembly	5, 6, 7, 8
3	420/180	Drawbar	9, 10, 11
4	420/180	Bearing	12, 13
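The balancing step described above, random sampling of the larger categories down to the size of Label 4, can be sketched as follows. The example uses the test-set sizes reported in the confusion matrix (360, 360, 270, and 180); the integer sample IDs are placeholders, and the function name and seed are illustrative assumptions rather than details from the study.

```python
import random


def undersample(samples_by_label, rng):
    """Randomly draw from each class so that all match the smallest class."""
    target = min(len(items) for items in samples_by_label.values())
    return {
        label: rng.sample(items, target)  # sampling without replacement
        for label, items in samples_by_label.items()
    }


# Test-set sizes per label as reported in the confusion matrix (Figure 8).
counts = {1: 360, 2: 360, 3: 270, 4: 180}
# Placeholder sample IDs stand in for the real graph samples.
dataset = {label: list(range(n)) for label, n in counts.items()}

balanced = undersample(dataset, random.Random(42))
sizes = {label: len(items) for label, items in balanced.items()}
print(sizes)
```

Each category ends up with 180 test samples, matching Table 6; the same draw would apply to the training portion to reach 420 samples per category.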

Table 7.

Accuracy of GCN and GAT models with balanced sample sizes across categories under different SNR conditions.

Model	SNR (dB)	Label 1 (%)	Label 2 (%)	Label 3 (%)	Label 4 (%)
GCN	Noiseless	100	100	99.94	99.50
	0	100	96.33	100	94.83
	−5	99.88	94.27	100	80.38
GAT	Noiseless	100	100	100	100
	0	99.94	99.72	100	98.27
	−5	99.16	96.33	100	97.11

Training and testing the GCN model on the balanced dataset reveals performance variations across different labels. Specifically, the accuracy for Label 2 decreased slightly under noisy conditions, while a notable improvement was observed for Label 4. In noiseless conditions, the GCN model’s accuracy for Label 4 increased from 95.33% to 99.50%. Under the 0 dB SNR condition, the accuracy for Label 4 increased by 18.78 percentage points, rising from 76.05% to 94.83%, while the accuracy for Label 1 saw a marginal improvement of 0.06 percentage points. Conversely, the accuracy for Label 2 decreased by 1.75 percentage points. Under the −5 dB SNR condition, the accuracy for Label 4 improved by 24.55 percentage points, from 55.83% to 80.38%, while the accuracies for Label 1 and Label 2 decreased slightly, by 0.03 and 2.34 percentage points, respectively. In contrast, the balanced dataset did not result in significant differences in the performance of the GAT model compared to the imbalanced dataset. Under the 0 dB SNR condition, the GAT model’s accuracy changed only slightly and remained high. Under the −5 dB SNR condition, the GAT model maintained high accuracy across all categories, with particularly notable performance for Label 4, achieving an accuracy of 97.11%, demonstrating strong noise tolerance.
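The changes discussed above are differences in percentage points between Table 5 (imbalanced) and Table 7 (balanced). The following short sketch recomputes them for the GCN model directly from the table values; the dictionary layout and function name are illustrative.

```python
LABELS = ["Label 1", "Label 2", "Label 3", "Label 4"]

# GCN accuracies (%) copied from Table 5 (imbalanced) and Table 7 (balanced).
gcn_imbalanced = {
    "Noiseless": [100, 100, 99.62, 95.33],
    "0 dB": [99.94, 98.08, 100, 76.05],
    "-5 dB": [99.91, 96.61, 100, 55.83],
}
gcn_balanced = {
    "Noiseless": [100, 100, 99.94, 99.50],
    "0 dB": [100, 96.33, 100, 94.83],
    "-5 dB": [99.88, 94.27, 100, 80.38],
}


def deltas(before, after):
    """Change in percentage points (after - before), rounded to 2 decimals."""
    return {
        cond: [round(a - b, 2) for a, b in zip(after[cond], before[cond])]
        for cond in before
    }


change = deltas(gcn_imbalanced, gcn_balanced)
for cond, row in change.items():
    print(cond, dict(zip(LABELS, row)))
```

Positive values indicate that balancing improved the GCN model’s accuracy for that label and condition; negative values indicate a decrease.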

Balancing the dataset led to substantial improvements in the GCN model’s accuracy for Label 4, which previously had fewer samples. However, there were slight decreases in accuracy for other categories. This indicates that data imbalance significantly affects the GCN model’s performance in noisy environments. The GAT model, equipped with a self-attention mechanism, adaptively assigns weights, benefiting categories with fewer samples, such as Label 4. Even with imbalanced data, the GAT model effectively mitigates noise by assigning higher weights to important neighboring nodes, thereby maintaining high accuracy. In practical applications, where defect data is often scarce, achieving a balanced sample size poses challenges. Traditional machine learning models require balanced training datasets to avoid bias toward certain defect types. This study underscores the limitations of the GCN model in noisy environments due to data imbalance, whereas the GAT model effectively leverages all data for training and testing, maintaining high accuracy even with imbalanced data.

This study highlights the significant advantages of the GAT model under high noise and data imbalance conditions, particularly for Label 4. At −5 dB SNR, the GAT model outperforms the GCN model on Label 4 by 16.73 percentage points on the balanced dataset (97.11% versus 80.38%) and by 41.28 percentage points on the imbalanced dataset (97.11% versus 55.83%). This provides valuable insights for selecting and applying mechanical defect detection models. Regardless of dataset balance, the GAT model accurately classifies all categories. This superior performance in defect detection from vibration signal data is attributed to its self-attention mechanism, which effectively assigns weights to essential features and maintains high accuracy even in challenging conditions.

Analysis and comparison of the impact of graph structures on accuracy

The GAT model was used to diagnose and classify various anomalies under noiseless, 0 dB SNR, and −5 dB SNR conditions. All parameters were kept constant except for the graph topology. Each experiment was conducted 10 times, with the average accuracy of these tests recorded to ensure fairness and reproducibility. Figures 18 to 20 present the classification accuracy for four different graph structures (directed, undirected, kNN, complete) across four labels (Label 1, Label 2, Label 3, Label 4) under the noiseless, 0 dB, and −5 dB SNR conditions. The horizontal axis represents accuracy (%), and the vertical axis represents the four different labels, with different legends indicating various graph structures. The average accuracy data is summarized in Table 8.

Figure 18.

Classification accuracy results of four different graph structures under noiseless conditions.

Figure 19.

Classification accuracy results of four different graph structures under 0 dB SNR conditions.

Figure 20.

Classification accuracy results of four different graph structures under −5 dB SNR conditions.

Table 8.

Classification accuracy of four different graph structures for four labels.

Graph structures	Label 1 (%)	Label 2 (%)	Label 3 (%)	Label 4 (%)
Noiseless
Directed	100	100	99.96	97
Undirected	100	100	100	100
kNN	99.72	100	100	98.72
Complete	100	100	100	91.66
0 dB SNR
Directed	99.97	99.27	100	89
Undirected	100	99.44	100	98.94
kNN	99.13	99.33	100	95.66
Complete	96.94	100	100	90.55
−5 dB SNR
Directed	99.41	98.75	100	86.16
Undirected	99.94	98.38	100	97.11
kNN	99.44	99.13	100	92.11
Complete	96.94	99.16	100	86.66

Under noiseless conditions (Figure 18), the directed graph achieved 100% accuracy for Label 1 and Label 2, 99.96% for Label 3, and 97% for Label 4. The undirected graph attained 100% accuracy across all labels. The kNN graph displayed slightly lower accuracy, with 99.72% for Label 1 and 98.72% for Label 4, while achieving 100% for the other labels. The complete graph reached 100% for Labels 1 to 3 but exhibited markedly lower accuracy for Label 4, at 91.66%.

Under the 0 dB SNR condition (Figure 19), most accuracies decreased slightly compared to the noiseless condition. The directed graph’s accuracy for Label 4 decreased to 89%, while it performed well for other labels. The undirected graph maintained high performance, achieving 100% accuracy for Label 1 and 98.94% for Label 4. The kNN graph also exhibited relatively high accuracy, with 99.13% for Label 1 and 95.66% for Label 4. In contrast, the complete graph demonstrated a more pronounced decline for Label 1, dropping to 96.94%, with 90.55% for Label 4.

Under the −5 dB SNR condition (Figure 20), accuracy for Label 4 declined further for every graph structure. The directed graph’s accuracy for Label 4 decreased to 86.16%, a decline of 10.84 percentage points relative to the noiseless condition. The undirected graph maintained high accuracy, with 99.94% for Label 1 and 97.11% for Label 4, showing only minor decreases. The kNN graph experienced a larger reduction, with its accuracy for Label 4 decreasing to 92.11%, a drop of 6.61 percentage points from the noiseless condition. The complete graph exhibited the lowest accuracy for Label 1, which remained at 96.94% (3.06 percentage points below the noiseless condition), while its accuracy for Label 4 dropped to 86.66% (5.00 percentage points below the noiseless condition).

The undirected graph structure demonstrates the highest performance in defect detection and classification, particularly in high-noise environments and when dealing with imbalanced data. This structure effectively leverages comprehensive information and maintains stability. In contrast, the directed graph structure may hinder information propagation due to its directional edges, resulting in less effective use of critical neighboring node information. The kNN graph performs adequately but may overlook essential details by only considering the nearest neighbors. Meanwhile, despite incorporating all possible edges, the complete graph suffers from excessive redundancy and increased noise, which reduces efficiency and accuracy. These findings suggest that employing the undirected graph structure in GAT models significantly improves classification accuracy, especially in scenarios with imbalanced data. This insight is essential for optimizing GAT model applications in defect detection, ensuring more reliable and accurate anomaly classification across various conditions.
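To make the four topologies concrete, the following sketch builds edge lists for a small set of nodes. It assumes a chain layout for the directed and undirected graphs (consecutive nodes linked) and Euclidean distance for the kNN graph; these choices, the node features, and the function names are hypothetical illustrations and may differ from the construction defined in the methodology.

```python
def directed_chain(n):
    """Edges i -> i+1 between consecutive nodes."""
    return [(i, i + 1) for i in range(n - 1)]


def undirected_chain(n):
    """Consecutive nodes linked in both directions."""
    forward = directed_chain(n)
    return forward + [(j, i) for i, j in forward]


def knn_graph(features, k):
    """Each node receives edges from its k nearest nodes (Euclidean distance)."""
    edges = []
    for i, f_i in enumerate(features):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(f_i, f_j)), j)
            for j, f_j in enumerate(features)
            if j != i
        )
        edges += [(j, i) for _, j in dists[:k]]
    return edges


def complete_graph(n):
    """Every ordered pair of distinct nodes is connected."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


# Hypothetical 2-D features for five nodes (illustrative only).
features = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.1, 0.8], [5.0, 5.0]]
n = len(features)

edge_counts = {
    "directed": len(directed_chain(n)),
    "undirected": len(undirected_chain(n)),
    "kNN (k=2)": len(knn_graph(features, 2)),
    "complete": len(complete_graph(n)),
}
print(edge_counts)
```

The edge count grows linearly with the number of nodes for the chain and kNN graphs but quadratically for the complete graph, which is consistent with the redundancy noted above for the complete structure.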

Conclusion

In manufacturing, particularly concerning built-in spindles, detecting defects during pre-installation is crucial. Traditional methods, such as manual visual inspection and basic mechanical tests, often fall short, potentially missing subtle but significant issues. To address this challenge, a novel approach using GAT has been developed. GAT, an advanced deep learning technique, operates on data structured as graphs representing complex relationships within data. In this study, the vibrational signals from spindles were modeled as graphs, allowing the application of a GAT model to analyze and learn from the complex patterns within the data. Data from 13 identical spindles were collected, with their vibration signals converted into graph representations. The GAT model’s ability to apply varying levels of attention to different parts of the graph enables it to highlight the most relevant features and patterns for defect detection. This adaptability proved crucial in identifying subtle indicators of potential defects, surpassing traditional quality control methods.

Experimental results showed that the GAT-based approach outperformed existing techniques, including KNN, SVM, and other graph-based methods such as GCN. The GAT model consistently demonstrated high accuracy in defect identification and classification, showing exceptional resilience to noise, a common issue in industrial environments. Notably, the GAT model’s performance under various SNRs, simulating different background noise levels, remained high. Even in noisy conditions intended to replicate adverse operational environments, the GAT model maintained high classification accuracy across different defect types, significantly outperforming the GCN model, especially in high-noise scenarios. Additionally, the study underscored the critical role of graph structure in GAT model performance. Among the various graph types tested (directed, undirected, KNN, complete), the undirected graph proved to be the most effective in both low-noise and high-noise conditions, achieving an optimal balance between information utilization and stability.

The implications of this research extend beyond spindle manufacturing. Manufacturers can significantly enhance their defect detection capabilities by integrating GAT into quality control processes, ensuring superior product quality and reliability. This precision during pre-installation can reduce post-purchase maintenance and replacements, leading to cost savings and increased consumer trust. This study represents a significant advancement in applying advanced deep-learning techniques to address real-world industrial challenges, offering a powerful tool for improving accuracy, efficiency, and innovation in manufacturing quality control. Future work will focus on further enhancing GAT model performance by refining algorithms to adapt to evolving industrial environments, exploring additional graph structure optimization techniques, and incorporating more sensor data to improve diagnostic capabilities. Testing these enhanced methods across various manufacturing environments will validate their effectiveness and reliability. Continuous refinement of these models and exploration of their applications promise significant improvements in the efficiency and accuracy of defect detection in manufacturing processes.

Footnotes

Handling Editor: Tiago Alexandre Narciso da Silva

ORCID iD

Kuo-Hao Li

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The data cannot be made publicly available upon publication because they contain commercially sensitive information. The data that support the findings of this study are available upon reasonable request from the authors.


Leveraging graph attention networks for enhanced latent defect detection in precision built-in spindle assembly lines
