Abstract

Although image processing is computationally expensive, image information is intuitive and easy to interpret and has therefore attracted considerable attention in machine fault diagnosis. To extract useful features from images accurately, a novel mechanical fault diagnosis method is proposed that combines multi-kernel non-negative matrix factorization with a multi-kernel support vector machine. A genetic algorithm is used to optimize the parameters of both the multi-kernel non-negative matrix factorization and the multi-kernel support vector machine. Experiments were carried out to validate the efficacy of the proposed method. The results show that a multi-kernel function combining the polynomial kernel and the radial basis function kernel describes the fault features in the kernel space more completely than a single kernel function. High accuracy is obtained in bearing fault diagnosis. Compared with a fault diagnosis method based on sparse non-negative matrix factorization, the proposed method is more accurate in rotor condition identification.

Keywords

Multi-kernel method, non-negative matrix factorization, support vector machine, data de-dimension, mechanical fault diagnosis

Introduction

As one of the important parts of engineering machines, the fault diagnosis system monitors the condition of the machine so that it can operate safely and reliably. Because the vibration collected by the monitoring system is non-stationary and non-linear, conventional diagnostic methods based on statistical indexes cannot extract the fault information sufficiently. Therefore, images have gradually been applied to machine monitoring and diagnosis in recent years.

For example, images of the temporal waveform, power or amplitude spectrum, time–frequency distribution, time-scale distribution, and non-linear spectrum have been applied to machine fault diagnosis to varying degrees. Compared with a statistical index, an image presents the feature with a much higher dimension. For example, storage of dimension 1024 × 512 is needed for the time–frequency distribution image when a temporal waveform of 1024 samples is transformed to the time–frequency domain with a window of 512 samples. Because it contains more fault information, the image representation of the vibration signal is more favorable to machine maintenance, although it requires more memory, longer calculation time, and more expertise. Consequently, image-based fault diagnosis methods have attracted considerable attention in the past decades.^1–8 However, the high dimension greatly hinders the use of image-based diagnostics in engineering applications because the dimension of the input vector increases sharply during processing. Reducing the dimension of the input signal is therefore the key step in applying image-based diagnosis methods, and many dimensionality reduction methods, shown in Table 1, have been investigated in different areas.

Table 1.

Dimensionality reduction methods and corresponding applications in recent years.

Dimensionality reduction method	Applications	References
Principal components analysis (PCA)	Face recognition	Thakur et al.⁹ and Buntine¹⁰
Linear discriminant analysis (LDA)	Face recognition	Kim and Kittler,¹¹ Thakur et al.,¹² Belhumeur et al.,¹³ Baudat and Anouar¹⁴ and Xiong et al.¹⁵
Projection pursuit (PP)	Biomedical analysis	Mandelzweig et al.,¹⁶ Yang¹⁷ and Gribonval¹⁸
Kernel principal components analysis (KPCA)	Face recognition	Pan et al.¹⁹ and Lange et al.²⁰
Singular value decomposition (SVD)	Microarray data analysis	Eckart and Young²¹ and Wall et al.²²
Non-negative matrix factorization (NMF)	Microarray data analysis, face recognition	Lin,²³ Zhang et al.²⁴ and Lee and Seung²⁵
Multi-dimensional scaling (MDS)	Scientific visualization, data mining	Borg and Groenen²⁶ and Bronstein et al.²⁷
Locally linear embedding (LLE)	Face, lips and handwritten recognition	Roweis and Saul²⁸
Isomap	Face, hands and handwritten recognition	Tenenbaum et al.²⁹
Local tangent space alignment (LTSA)	Face recognition	Zhang and Zha³⁰ and Feng et al.³¹

As one of the earliest works on dimensionality reduction, principal components analysis (PCA) was proposed by Hotelling³² in 1933 and applied to the statistical analysis of psychological data. Subsequently, a series of dimensionality reduction methods, including linear discriminant analysis (LDA), locally linear embedding (LLE), and local tangent space alignment (LTSA), were developed on the basis of different optimization criteria. However, in engineering applications, non-linear features are embedded in high-dimensional spaces, and the traditional linear dimensionality reduction methods can hardly deal with problems involving non-linearity. Therefore, non-linear dimensionality reduction methods, including non-negative matrix factorization (NMF), manifold alignment, and kernel-based dimensionality reduction methods, have attracted much attention in the past decades. Assuming that the samples are located exactly or approximately on a non-linear manifold in a high-dimensional space, Zhang et al.^33,34 introduced manifold learning to dimensionality reduction for face recognition. By mapping the original data to a higher-dimensional space, the kernel-based method transforms the non-linear problem into a linear one. For very high-dimensional data in engineering applications, kernel-based methods can effectively save the computing time and memory required by regular methods.³⁵ Compared with the manifold learning method, NMF is more stable and mature in dimensionality reduction applications. NMF and its variants, including sparse NMF (SNMF)^36,37 and kernel NMF (KNMF),^38,39 represent the original data by a coefficient matrix of lower dimension. However, a single kernel function cannot represent all features embedded in the data, so useful features may be lost in the mapping and the diagnosis accuracy may decrease.

To address this, a novel fault diagnosis method that combines the multi-kernel method with NMF is proposed in this article and applied to bearing fault diagnosis and rotor condition identification.

Multi-KNMF

NMF

The NMF, which imposes a non-negativity constraint on all elements of the matrices, was proposed by Lee and Seung²⁵ in 1999. The mechanism of the NMF is given as follows.

Suppose that m samples of dimension n are denoted by a matrix $V_{n \times m}$ and that all elements of the matrix are non-negative. Then, the NMF is a linear factorization of the original matrix and is given by

$V_{n \times m} \approx W_{n \times r} H_{r \times m}$ (1)

where $W_{n \times r}$ is called the base matrix, $H_{r \times m}$ is the coefficient matrix, and r is the key dimension of the factorization. Because all elements are non-negative, any column vector of the original matrix V can be represented as an additive combination of the column vectors of the base matrix W, with weights given by the coefficient matrix H. If the key dimension r satisfies r(m + n) < mn, dimensionality reduction can be performed by replacing the original matrix with the coefficient matrix.
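The condition r(m + n) < mn simply compares the number of stored entries before and after the factorization. A minimal sketch follows; the function name and the example sizes are hypothetical and chosen only for illustration.

```python
def nmf_storage(n: int, m: int, r: int) -> tuple[int, int, bool]:
    """Return (entries in V, entries in W and H together, whether r(m + n) < mn)."""
    original = n * m            # V is n x m
    factorized = r * (n + m)    # W is n x r, H is r x m
    return original, factorized, factorized < original


# Hypothetical sizes: 160 samples of dimension 8192 and key dimension r = 16.
original, factorized, reduces = nmf_storage(n=8192, m=160, r=16)
print(original, factorized, reduces)
```

When the condition holds, the pair (W, H) needs fewer entries than V, and each sample is represented by r coefficients instead of n values.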

To obtain the base matrix W and the coefficient matrix H, a loss function is needed to evaluate the similarity between the original matrix V and the product of the factorized matrices. Generally, the loss function of the NMF is calculated by the Euclidean distance or the Kullback–Leibler divergence. When the loss function is determined by the Euclidean distance, the optimization problem of the NMF is given by

$\min_{W, H \geq 0} \| V - WH \|^{2} = \sum_{ij} {[V_{ij} - (WH)_{ij}]}^{2}$ (2)

When the loss function is determined by the Kullback–Leibler divergence, the optimization problem of the NMF is given by

$\min_{W, H \geq 0} D (V \| WH) = \sum_{ij} \left[ V_{ij} \log \frac{V_{ij}}{{(WH)}_{ij}} - V_{ij} + {(WH)}_{ij} \right]$ (3)

It can be seen that the NMF obtains the matrices W and H by solving the optimization problem of equation (2) or (3) under the constraint W, H ≥ 0. Both objective functions are bounded below by zero, and they reach zero only when V equals the product WH.
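The text does not state which solver is used for equations (2) and (3). As an illustration only, the sketch below evaluates both loss functions and minimizes the Euclidean loss with the multiplicative update rules commonly attributed to Lee and Seung; the function names, the iteration count, and the small constant `eps` are assumptions, not part of the original method description.

```python
import numpy as np


def euclidean_loss(V, W, H):
    """Squared Euclidean distance of equation (2)."""
    return float(np.sum((V - W @ H) ** 2))


def kl_loss(V, W, H, eps=1e-12):
    """Kullback-Leibler divergence of equation (3); eps guards log(0)."""
    WH = W @ H
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_multiplicative(V, r, n_iter=200, seed=0, eps=1e-12):
    """Factorize a non-negative V (n x m) into W (n x r) and H (r x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Usage on a synthetic non-negative matrix of rank 3.
rng = np.random.default_rng(1)
A, B = rng.random((20, 3)), rng.random((3, 15))
V = A @ B
W, H = nmf_multiplicative(V, r=3)
print(euclidean_loss(V, W, H), kl_loss(V, W, H))
```

Because the updates are multiplicative, W and H stay non-negative as long as they are initialized with positive values, so the constraint W, H ≥ 0 needs no separate projection step.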

In engineering applications, many signals can be represented by a non-negative matrix, and the memory needed for the factorized matrices is much less than that needed for the original one. Moreover, the NMF implicitly contains the concept that the original matrix is synthesized from the base and coefficient matrices. Therefore, the NMF has attracted much attention in many areas.

Kernel NMF

When the Euclidean distance is used as the loss function of the NMF and a sparsity penalty is added, equation (2) can be rewritten as

$\min_{W, H \geq 0} \frac{1}{2} \| V - WH \|^{2} + λ \sum_{i = 1}^{n} \| h_{i} \|_{1}$ (4)

For a fixed W, equation (4) is a quadratic programming problem in H and can be rewritten as

$\min_{W, H \geq 0} \sum_{i = 1}^{n} \left[ \frac{1}{2} h_{i}^{T} (W^{T} W) h_{i} + {(λ - W^{T} v_{i})}^{T} h_{i} + v_{i}^{T} v_{i} \right]$ (5)
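The step from equation (4) to equation (5) can be checked for a single sample. As a sketch, assume that $v_{i}$ and $h_{i}$ denote the i-th columns of V and H, and that λ in equation (5) stands for λ times the all-ones vector $\mathbf{1}$ (both are readings of the notation rather than statements from the original derivation). Because $h_{i} \geq 0$, the norm $\| h_{i} \|_{1}$ equals $\mathbf{1}^{T} h_{i}$, so

$\begin{aligned} \tfrac{1}{2} \| v_{i} - W h_{i} \|^{2} + λ \| h_{i} \|_{1} &= \tfrac{1}{2} h_{i}^{T} W^{T} W h_{i} - {(W^{T} v_{i})}^{T} h_{i} + \tfrac{1}{2} v_{i}^{T} v_{i} + λ \mathbf{1}^{T} h_{i} \\ &= \tfrac{1}{2} h_{i}^{T} (W^{T} W) h_{i} + {(λ \mathbf{1} - W^{T} v_{i})}^{T} h_{i} + \tfrac{1}{2} v_{i}^{T} v_{i} \end{aligned}$

The last term does not depend on $h_{i}$, so its scaling has no influence on the minimizer.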

It can be seen that only the inner products $W^{T} W$, $W^{T} V$, and $V^{T} V$ are needed when calculating the update of H. From equation (1), the following relations hold

$W^{T} W = {(H^{*})}^{T} V^{T} V H^{*}$ (6)

$W^{T} V = {(H^{*})}^{T} V^{T} V$ (7)

where $H^{*}$ denotes the Moore–Penrose pseudo-inverse of H. Combining equations (5)–(7), it can be seen that only the product $V^{T} V$ needs to be calculated in the optimization of the NMF. Therefore, by replacing the product $V^{T} V$ with the kernel function K(V, V), the NMF is extended to the kernel space and is named KNMF. The KNMF can solve many non-linear problems that the linear NMF cannot.
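Equations (6) and (7) can be checked numerically. The sketch below assumes that the base matrix is taken as W = V H^*, which is what equation (1) implies when H^* is the pseudo-inverse of H; the matrix sizes and the helper name `kernel_products` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.random((50, 12))   # n = 50 features, m = 12 samples (hypothetical sizes)
H = rng.random((4, 12))    # coefficient matrix with key dimension r = 4

H_pinv = np.linalg.pinv(H)  # Moore-Penrose pseudo-inverse, shape (12, 4)
W = V @ H_pinv              # base matrix implied by V ≈ W H


def kernel_products(K, H):
    """Return the analogues of W^T W and W^T V computed from a Gram matrix K."""
    H_pinv = np.linalg.pinv(H)
    return H_pinv.T @ K @ H_pinv, H_pinv.T @ K


# With the linear Gram matrix V^T V, equations (6) and (7) are recovered.
WtW, WtV = kernel_products(V.T @ V, H)
print(np.allclose(WtW, W.T @ W), np.allclose(WtV, W.T @ V))
```

Passing a non-linear kernel matrix K(V, V) to `kernel_products` instead of `V.T @ V` gives the quantities needed by the update of H without ever forming W in the kernel space.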

Multi-KNMF-based mechanical fault diagnosis

Multi-kernel design

In engineering applications, the collected data or signals are produced by many sources or hold heterogeneous features, and a single kernel function cannot sufficiently describe all features embedded in them. In this case, a multi-kernel function can be used to compensate for the shortcomings of a single kernel function, and a new dimensionality reduction method, named multi-KNMF, is proposed here.

In the multi-KNMF, the kernel function is composed of at least two different kernel functions. The construction of the multi-kernel function is therefore the key to the multi-KNMF. The most convenient construction is the convex combination of different kernel functions

$K (x, y) = \sum_{j = 1}^{M} β_{j} {\hat{K}}_{j} (x, y), β_{j} \geq 0, \sum_{j = 1}^{M} β_{j} = 1$ (8)

where the kernel function $K (x, y)$ is a linear combination of the kernel matrices in the set $\{K_{1}, \dots, K_{M}\}$ and satisfies the following conditions

$\begin{matrix} K (x, y) = \sum_{j = 1}^{M} β_{j} {\hat{K}}_{j} (x, y) \\ K (x, y) \geq 0 \\ \mathrm{trace} (K (x, y)) \leq c \end{matrix}$ (9)

where $β_{j}$ is the coefficient of ${\hat{K}}_{j} (x, y)$, and each kernel matrix is normalized by its trace

${\hat{K}}_{j} (x, y) = \frac{K_{j} (x, y)}{\mathrm{trace} (K_{j} (x, y))}$ (10)
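The convex combination of equation (8) with the trace normalization of equation (10) can be sketched as follows. The function names are hypothetical, and the kernel matrices are assumed to be square Gram matrices evaluated on the same set of samples.

```python
import numpy as np


def trace_normalize(K):
    """Divide a Gram matrix by its trace, as in equation (10)."""
    return K / np.trace(K)


def combine_kernels(kernels, betas):
    """Convex combination of trace-normalized Gram matrices, as in equation (8)."""
    betas = np.asarray(betas, dtype=float)
    if np.any(betas < 0) or not np.isclose(betas.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to one")
    return sum(b * trace_normalize(K) for b, K in zip(betas, kernels))


# Usage: combine a linear and a Gaussian Gram matrix of six random samples.
rng = np.random.default_rng(0)
X = rng.random((6, 3))
K_lin = X @ X.T
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_rbf = np.exp(-sq_dist / 2.0)
K = combine_kernels([K_lin, K_rbf], [0.3, 0.7])
print(np.trace(K))
```

After normalization every kernel has unit trace, so the weights $β_{j}$ act on comparable scales and the combined matrix also has unit trace.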

Common kernel functions include the linear kernel, the radial basis function kernel, the polynomial kernel, and the sigmoid kernel, and these functions can be divided into two groups according to their feature description capabilities. The first group consists of global kernel functions, which describe large-scale features of the data, while the second group consists of local kernel functions, which represent small-scale features. The Gaussian radial basis function kernel is a typical local kernel function

$\hat{K} (x, y) = \exp (- \frac{‖ x - y ‖^{2}}{2 σ^{2}})$ (11)

In contrast, the polynomial kernel function is one of the global kernel functions

$\hat{K} (x, y) = {(\langle x, y \rangle + 1)}^{q}$ (12)

By combining global and local kernel functions, the multi-kernel function can describe both the large- and small-scale features, which a single kernel function cannot. Figure 1 illustrates a Gaussian radial basis function kernel, a polynomial kernel, and the combination of both. The center of the test is 0.05. The amplitude of the radial basis function kernel varies mainly in the region around zero as the kernel parameter σ changes, whereas the amplitude of the polynomial kernel varies sharply in the regions away from zero as the kernel parameter q changes. The combined kernel varies in amplitude over the whole axis, which makes it more suitable for data with various features.

Figure 1.

Combination of the Gaussian and polynomial kernel function: (a) Gaussian kernel function, (b) polynomial kernel function, and (c) combination of both kernel functions.

Based on the above analysis, a multi-kernel function is constructed here as the weighted sum of the polynomial kernel and the Gaussian radial basis function kernel in order to cover the various features embedded in the data

$K (x, y) = ρ {\hat{K}}_{G} (x, y) + (1 - ρ) {\hat{K}}_{P} (x, y)$ (13)

where ρ denotes the coefficient of the Gaussian kernel function ${\hat{K}}_{G} (x, y)$, and (1−ρ) is the coefficient of the polynomial kernel function ${\hat{K}}_{P} (x, y)$. Combining equations (11) and (12), the multi-kernel function can be rewritten as

$K (x, y) = ρ \exp (- \frac{\| x - y \|^{2}}{2 σ^{2}}) + (1 - ρ) {(\langle x, y \rangle + 1)}^{q}$ (14)

It can be seen that the performance of the multi-kernel function K(x, y) is determined jointly by the coefficient ρ, the Gaussian kernel parameter σ, and the polynomial kernel parameter q.
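A minimal sketch of the multi-kernel function of equation (13), built from the Gaussian kernel of equation (11) and the polynomial kernel of equation (12), is given below. The function names and the convention that each row of `X` and `Y` is one sample are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma):
    """Gaussian kernel of equation (11) between the rows of X and Y."""
    sq = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(Y ** 2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def polynomial_kernel(X, Y, q):
    """Polynomial kernel of equation (12) between the rows of X and Y."""
    return (X @ Y.T + 1.0) ** q


def multi_kernel(X, Y, rho, sigma, q):
    """Weighted sum of equation (13): rho weights the Gaussian part."""
    return (rho * gaussian_kernel(X, Y, sigma)
            + (1.0 - rho) * polynomial_kernel(X, Y, q))


# Usage with five random three-dimensional samples.
rng = np.random.default_rng(0)
X = rng.random((5, 3))
K = multi_kernel(X, X, rho=0.7, sigma=1.0, q=2)
print(K.shape)
```

Setting `rho` to 1 or 0 recovers the pure Gaussian or the pure polynomial kernel, which makes it easy to compare the combined kernel with either single kernel.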

Multi-KNMF

Based on the multi-kernel function constructed from the Gaussian kernel and the polynomial kernel, a novel fault diagnosis method that combines the multi-KNMF with a multi-kernel support vector machine (SVM) is proposed in this section. The procedure is outlined as follows:

For a given data set V composed of different classes, divide the data randomly into a training set and a test set.

Step 1: Initialize the kernel parameters ρ, σ, and q and the key dimension r of the multi-KNMF, as well as the kernel parameters σ_i, the weights β_i, and the penalty factor C of the SVM.

Step 2: Factorize the training set T by the multi-KNMF with the initialized parameters ρ, σ, q, and r to obtain the base matrix W and the coefficient matrix H of the training set.

Step 3: Input the coefficient matrix H into the multi-kernel SVM classifier.

Step 4: Optimize the parameters of the multi-KNMF and the multi-kernel SVM classifier by maximizing the classification accuracy on the training set. Generally, ρ and β_i lie in [0, 1], and q and r are positive integers (Z⁺). Both lnσ and lnC are restricted to the grid [−10, −9, …, 0, …, 9, 10].

Step 5: The genetic algorithm is used to optimize all these parameters with the following settings: the initial generation is created randomly with a population size of 20; the 60% of each generation with the highest classification accuracy is selected for the crossover operation; the mutation probability is 0.1; the convergence threshold is 0.0001; and the maximum number of generations is 1000.

Step 6: Map the test data onto the base matrix obtained by the multi-KNMF on the training data with the optimal parameters. Then, input the corresponding coefficient matrix into the multi-kernel SVM classifier with the optimal parameters to classify the test data.
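The genetic search of Steps 4 and 5 can be sketched as below, using the stated population size, selection fraction, mutation probability, convergence threshold, and maximum number of generations. Everything else is an assumption made for illustration: the real-valued encoding, uniform crossover, mutation by uniform resampling, elitism, the `patience` stopping rule, and the `fitness` callable, which in the proposed method would return the training classification accuracy.

```python
import numpy as np


def genetic_search(fitness, lower, upper, pop_size=20, select_frac=0.6,
                   p_mut=0.1, tol=1e-4, patience=20, max_gen=1000, seed=0):
    """Maximize fitness(x) over the box [lower, upper] with a simple GA."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pop = rng.uniform(lower, upper, size=(pop_size, dim))
    best_x, best_f, stall = None, -np.inf, 0
    n_parents = max(2, int(round(select_frac * pop_size)))
    for _ in range(max_gen):
        scores = np.array([fitness(x) for x in pop])
        order = np.argsort(scores)[::-1]          # best individual first
        gen_best = scores[order[0]]
        # Count generations without an improvement larger than tol.
        stall = 0 if gen_best > best_f + tol else stall + 1
        if gen_best > best_f:
            best_x, best_f = pop[order[0]].copy(), float(gen_best)
        if stall >= patience:
            break
        parents = pop[order[:n_parents]]          # fittest 60% by default
        children = [best_x.copy()]                # elitism (assumption)
        while len(children) < pop_size:
            a, b = parents[rng.choice(n_parents, size=2, replace=False)]
            child = np.where(rng.random(dim) < 0.5, a, b)    # uniform crossover
            mutate = rng.random(dim) < p_mut
            child = np.where(mutate, rng.uniform(lower, upper), child)
            children.append(child)
        pop = np.array(children)
    return best_x, best_f


# Usage with a toy fitness; integer parameters such as q and r could be
# rounded inside the fitness function.
def toy_fitness(x):
    return -(x[0] - 0.3) ** 2 - (x[1] + 2.0) ** 2


best_x, best_f = genetic_search(toy_fitness, lower=[0.0, -10.0], upper=[1.0, 10.0])
print(best_x, best_f)
```

In this sketch, the search vector would hold ρ, lnσ, q, r, and lnC together, so that the factorization and the classifier are tuned against the same accuracy score.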

Figure 2 illustrates the proposed diagnosis method, which combines the multi-KNMF and the multi-kernel SVM. The parameters of both the multi-KNMF and the multi-kernel SVM are optimized jointly by the genetic algorithm. Optimizing the multi-KNMF finds the optimal input vector for the classifier, while optimizing the multi-kernel SVM finds the optimal classifier for the input vector. Optimizing both parts together yields a classification system in which the input and the classifier are best matched to each other. Therefore, the proposed method tends to classify the data using the most comprehensive features. It is worth noting that the multi-kernel function used in the multi-KNMF can be the same as or different from the one used in the multi-kernel SVM. Ideally, each feature of the classifier input could be mapped to the kernel space by an independent kernel function. However, this approach is time and memory consuming in training and is left for future investigation.
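Step 6 maps the test data onto the base matrix learned from the training data. The sketch below shows one way to do this in the linear case, by keeping W fixed and applying a multiplicative update to the coefficients only; this solver and the function name are assumptions. In the kernel version, the products `W.T @ V_test` and `W.T @ W` would be formed from kernel evaluations instead.

```python
import numpy as np


def project_onto_basis(V_test, W, n_iter=300, seed=0, eps=1e-12):
    """Non-negative coefficients H such that V_test ≈ W H for a fixed W."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_test.shape[1])) + eps
    WtV = W.T @ V_test
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)   # only H is updated; W stays fixed
    return H


# Usage on synthetic data that lies exactly in the span of W.
rng = np.random.default_rng(2)
W = rng.random((30, 4))
V_test = W @ rng.random((4, 6))
H_test = project_onto_basis(V_test, W)
print(np.linalg.norm(V_test - W @ H_test))
```

The columns of `H_test` then play the role of the low-dimensional feature vectors that are passed to the classifier.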

Figure 2.

Fault diagnostic sketch based on the combination of the multi-kernel NMF and multi-kernel SVM.

Case study

Case 1: bearing fault diagnosis

In this section, vibration signals collected by the Bearing Data Center of Case Western Reserve University were used to validate the efficacy of the proposed method. The test rig was composed of an electric motor, a torque transducer, a dynamometer, and the control electronics. The drive end was supported by a deep groove ball bearing of type 6205-2RS JEM SKF, whose geometry is listed in Table 2. To simulate bearing faults, grooves with a depth of 11 mils (279.4 μm) were machined on the surface of the inner race, the outer race, or the rolling element. The diameter of the grooves was set to 7 mils (177.8 μm), 14 mils (355.6 μm), or 21 mils (533.4 μm) to represent different fault severities.

Table 2.

Geometry size of the drive end bearing (mm).

Diameter of the inner race	Diameter of the outer race	Width	Diameter of the rolling element	Pitch diameter
25	52	15	8	39

An accelerometer was mounted on the bearing housing at the drive end. The vertical bearing vibration, produced under various combinations of load (0, 1, 2, and 3 hp), fault type, and fault severity, was collected by a 16-channel DAT data acquisition recorder at a sampling frequency of 48 kHz while the shaft rotated at 30 Hz. In all, 40 groups of vibration signals were recorded, each with a length of 10 s. Ten groups collected at a load of 3 hp were selected at random and divided into nine data sets of samples with a length of 2048 points, according to the combination of fault type and severity. Detailed descriptions of the nine sets are listed in Table 3. The numbers in the name of each set denote the groove diameter on the rolling element, the inner race, and the outer race, in that order. For example, D070707 denotes a data set produced by grooves with a diameter of 7 mils on the rolling element, inner race, and outer race, respectively. DBALL, DINN, and DOUT denote data sets produced by ball, inner race, and outer race faults of different severities, respectively.
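The division of a recording into 2048-point samples and the random split into training and test samples can be sketched as follows. Non-overlapping segments, the function names, and the synthetic signal are assumptions; the counts of 40 training and 78 test samples per condition are taken from Table 3.

```python
import numpy as np


def segment_signal(signal, length=2048):
    """Cut a 1-D signal into non-overlapping segments of the given length."""
    signal = np.asarray(signal, dtype=float)
    n_segments = signal.size // length
    return signal[: n_segments * length].reshape(n_segments, length)


def split_train_test(segments, n_train=40, n_test=78, seed=0):
    """Randomly pick n_train training and n_test test segments."""
    if len(segments) < n_train + n_test:
        raise ValueError("not enough segments for the requested split")
    idx = np.random.default_rng(seed).permutation(len(segments))
    return segments[idx[:n_train]], segments[idx[n_train:n_train + n_test]]


# Usage: a synthetic 10 s record sampled at 48 kHz stands in for real data.
record = np.random.default_rng(0).standard_normal(48_000 * 10)
segments = segment_signal(record)
train, test = split_train_test(segments)
print(segments.shape, train.shape, test.shape)
```

A 10 s record at 48 kHz holds 480,000 points, which is enough for the 118 samples per condition listed in Table 3 when the segments do not overlap.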

Table 3.

Data set composed of various combinations of bearing faults.

Data set name	Number of training samples	Number of test samples	Fault size (mils)	Fault type	Fault label
D070707	40 40 40 40	78 78 78 78	0 7 7 7	N B I O	1 2 3 4
D141414	40 40 40 40	78 78 78 78	0 14 14 14	N B I O	1 2 3 4
D212121	40 40 40 40	78 78 78 78	0 21 21 21	N B I O	1 2 3 4
D071421	40 40 40 40	78 78 78 78	0 7 14 21	N B I O	1 2 3 4
D142107	40 40 40 40	78 78 78 78	0 14 21 7	N B I O	1 2 3 4
D210714	40 40 40 40	78 78 78 78	0 21 7 14	N B I O	1 2 3 4
DBALL	40 40 40	78 78 78	7 14 21	B B B	1 2 3
DINN	40 40 40	78 78 78	7 14 21	I I I	1 2 3
DOUT	40 40 40	78 78 78	7 14 21	O O O	1 2 3

N: normal condition; B: ball fault; I: inner race fault; O: outer race fault.

Figure 3 illustrates typical waveforms produced by different bearing faults. It can be seen that the vibration collected under the normal condition is very similar to that produced by the rolling element fault. In the validation, all samples were decomposed by the DB10 wavelet transform. As shown in Figure 4, the wavelet coefficients of four levels were concatenated into a waveform with a length of 8192 points, which was used as the input image of the proposed method. The results are listed in Table 4. All faults are classified accurately even though the size of the input vector is reduced from 8192 to 16 or fewer. Only six fault features are needed for the accurate classification of ball faults of different severities. Compared with the original size of the wavelet coefficient waveform, the dimensionality of the input vector is sharply reduced by the multi-KNMF. This shows that the proposed method can be applied efficiently to the fault diagnosis of rolling element bearings.

Figure 3.

Waveforms collected at different bearing conditions: (a) normal, (b) rolling element fault, (c) inner race fault, and (d) outer race fault.

Figure 4.

Wavelet coefficients of different bearing conditions: (a) normal, (b) rolling element fault, (c) inner race fault, and (d) outer race fault.

Table 4.

Optimal kernel parameters and corresponding average accuracy of the application of the proposed method for bearing fault diagnosis.

Serial number	Data set	$\ln σ$	q	k	ρ	lnC	Average accuracy
1	D070707	2	1	16	0.7109	−4	100%
2	D141414	0	2	13	0.1406	−6	100%
3	D212121	−2	5	9	0.6953	−5	100%
4	D071421	5	1	14	0.3906	−2	100%
5	D142107	8	1	12	0.1953	2	100%
6	D210714	8	1	12	0.5078	−9	100%
7	DBALL	−13	2	6	0.4609	−4	100%
8	DINN	−1	2	14	0.7891	−5	100%
9	DOUT	6	10	15	0.0391	−6	100%

Case 2: rotor condition identification

The shaft orbit contains both the amplitude and the phase information of the rotor and is therefore more convenient for condition monitoring than a single amplitude curve or amplitude–frequency curve. However, the collected shaft orbit may not be completely closed and may be heavily disturbed by noise. To extract the orbit and identify the rotor conditions automatically, the proposed method was applied to rotor condition identification based on shaft orbit images in this section.

A test rig developed by Xi’an Jiaotong University, composed of a drive system, control system, data acquisition system, and lubrication system, is shown in Figure 5. To simulate rotor eccentricity, a screw of 10.5 g was mounted on the disk at a radial distance of 0.08 m from its center. A plastic ring was mounted outside the disk, so that rubbing between the disk and the ring can occur when the distance between them changes.

Figure 5.

Rotor test rig.

Eddy current displacement sensors were placed in the vertical and horizontal directions to measure the displacement of the rotor. The sensor placement is sketched in Figure 6. On average, 210 orbit images of the rotor were created from the displacements in both directions for each condition. Figure 7 illustrates five different conditions of the rotor; each image contains $420 \times 560$ pixels. It is worth noting that all signals contained vibration components produced by misalignment owing to the imprecision of the rig.

Figure 6.

Schematic diagram of the rotor test rig.

Figure 7.

Shaft orbit images of different rotor conditions: (a) normal condition, (b) rubbing, (c) eccentricity, (d) combination of rubbing and misalignment, and (e) combination of parallel and misalignment.

Each orbit image was first transformed into a vector of dimension 235,200 × 1 and normalized so that its elements sum to one. The genetic algorithm was used to optimize the average accuracy of the condition identification. For comparison, an SVM classifier based on the SNMF was also applied to the rotor condition identification, with the following settings: key dimension r = 60, $α = 0.2$, and $N = 200$. It is worth noting that a feature selection method was applied to the features obtained by the SNMF in order to improve the performance of the SNMF-based method. A total of 35 features were selected to construct the base matrix. The coefficient matrix obtained by mapping the original data onto this base matrix of dimension 35 was input to the SVM classifier. Five-fold cross-validation was used to validate the efficacy of the multi-kernel SVM classifier in both cases, and the results are shown in Table 5.
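The normalization of an orbit image can be sketched as follows. The function name is hypothetical, and the image is assumed to be a non-negative grayscale array of 420 × 560 pixels.

```python
import numpy as np


def image_to_unit_sum_vector(image):
    """Flatten an image into a column vector whose elements sum to one."""
    vec = np.asarray(image, dtype=float).reshape(-1, 1)
    total = vec.sum()
    if total <= 0:
        raise ValueError("image must contain positive intensity")
    return vec / total


# Usage with a synthetic image of the stated size.
image = np.random.default_rng(0).random((420, 560))
v = image_to_unit_sum_vector(image)
print(v.shape, v.sum())
```

Stacking the resulting vectors column by column gives a non-negative data matrix with one column per orbit image, which is the form expected by the factorization.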

Table 5.

Optimal kernel parameters and corresponding average accuracy of the application of the proposed method for rotor condition identification.

Serial number	Rotating speed (r/min)	$\ln σ$	q	k	ρ	lnC	Average accuracy of the proposed method (%)	Average accuracy of SNMF (%)
1	300	−11	1	54	0.3672	−3	97.97	92.57
2	600	−8	1	61	0.4688	−1	98.84	99.42
3	1200	−15	13	29	0.4453	11	100	100

SNMF: sparse non-negative matrix factorization.

It can be seen that both methods achieve very high accuracy in rotor condition identification. Compared with the fault diagnosis method based on the SNMF, the proposed method is generally more accurate, even when the dimension of the input vector is reduced more sharply. When the shaft rotates at 1200 r/min, only 29 features are used by the proposed method and the conditions are identified accurately. When the shaft rotates at 600 r/min, the accuracy of the proposed method is slightly lower than that of the SNMF-based method. However, when the shaft rotates at 300 r/min, the proposed method shows an obvious improvement. It is worth noting that no feature selection is involved in the proposed method. In summary, the proposed method is more powerful and concise than the method based on the SNMF.

Conclusion

To preserve as much fault information as possible during dimensionality reduction, a multi-KNMF was proposed on the basis of a combination of the radial basis function kernel and the polynomial kernel. By inputting the resulting feature vectors into a multi-kernel SVM classifier, a novel fault diagnosis method combining the multi-KNMF and the multi-kernel SVM was further proposed. The genetic algorithm was used to optimize the parameters involved in dimensionality reduction and fault classification. Two experiments were used to validate the efficacy of the proposed method. The results show that the multi-KNMF preserves the fault information effectively. The proposed method shows very high accuracy in bearing fault diagnosis. Compared with the classifier based on the SNMF, the proposed method is more powerful in rotor condition identification even without a feature selection step.

Footnotes

Academic Editor: Anand Thite

Declaration of conflicting interests

The authors declare that there is no conflict of interest.

Funding

The research work described in this article was supported by Major National S&T Program of China under Grant No. 2012ZX04012-031.

References

1. Hong, Yong, Xiao-Guo. Fault diagnosis for IC engines using wavelet packet and image processing. Int J Plant Eng Manag 2003; 8: 154–162.

2. Giger, Karssemeijer, Schnabel. Breast image analysis for risk assessment, detection, diagnosis, and treatment of cancer. Annu Rev Biomed Eng 2013; 15: 327–357.

3. Chen C-M, Chou Y-H, Tagawa. Computer-aided detection and diagnosis in medical imaging. Comput Math Meth Med 2013; 2013: 790608 (2 pp.).

4. Peng, Meng, Chu. Improved wavelet reassigned scalograms and application for modal parameter estimation. Shock Vib 2011; 18: 299–316.

5. Hao, Peng, Feng. Application of support vector machine based on pattern spectrum entropy in fault diagnostics of rolling element bearings. Meas Sci Technol 2011; 22: 045708–045720.

6. Peng, Meng, Lang. Polynomial chirplet transform with application to instantaneous frequency estimation. IEEE Trans Meas Instrum 2011; 60: 3222–3229.

7. Peng, Peter W Tse, Chu. A comparison study of improved Hilbert-Huang transform and wavelet transform: application to fault diagnosis for rolling bearing. Mech Syst Signal Process 2005; 19: 974–988.

8. Peng, Chu, Tse. Detection of the rubbing caused impacts for rotor–stator fault diagnosis using reassigned scalogram. Mech Syst Signal Process 2005; 19: 391–409.

9. Thakur, Sing, Basu. Face recognition using principal component analysis and RBF neural networks. In: First international conference on emerging trends in engineering and technology (ICETET), Nagpur, India, 16–18 July 2008, pp.695–700. New York: IEEE.

10. Buntine. Variational extensions to EM and multinomial PCA. In: Proceedings of the European conference on machine learning (ECML-02; LNAI), Helsinki, Finland, 19–23 August 2002, pp.23–34. Berlin, Heidelberg: Springer-Verlag.

11. Kim T-K, Kittler. Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image. IEEE Trans Pattern Anal 2005; 27: 318–327.

12. Thakur, Sing, Basu. Face recognition using kernel Fisher linear discriminant analysis and RBF neural network. Contemp Comput 2010; 94: 13–20.

13. Belhumeur, Hespanha, Kriegman. Eigenfaces versus fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 1997; 23: 711–720.

14. Baudat, Anouar. Generalized discriminant analysis using a kernel approach. Neural Comput 2000; 12: 2385–2404.

15. Xiong, Swamy MNS, Ahmad. Two-dimensional FLD for face recognition. Pattern Recogn 2005; 38: 1121–1124.

16. Mandelzweig, Demko, Dolenko. A projection method for the visualization of high-dimensional biomedical datasets. In: IEEE Canadian conference on electrical and computer engineering (CCECE), Montreal, QC, Canada, 4–7 May 2003, vol. 3, pp.1453–1456. New York: IEEE.

17. Yang. Visual exploration of large relational data sets through 3D projections and footprint splatting. IEEE Trans Knowl Data Eng 2003; 15: 1460–1471.

18. Gribonval. From projection pursuit and CART to adaptive discriminant analysis. IEEE Trans Neural Network 2005; 16: 522–532.

19. Pan, Liu, Zheng. Face recognition using kernel PCA and hybrid flexible neural tree. In: International conference on wavelet analysis and pattern recognition, Beijing, China, 2–4 November 2007, pp.1361–1366. New York: IEEE.

20. Lange, Biehl, Villmann. Non-Euclidean principal component analysis by Hebbian learning. Neurocomputing 2015; 147: 107–119.

21. Eckart, Young. The approximation of one matrix by another of lower rank. Psychometrika 1936; 1: 211–218.

22. Wall, Rechtsteiner, Rocha. Singular value decomposition and principal component analysis. In: Berrar, Dubitzky, Granzow (eds) A practical approach to microarray data analysis. Norwell, MA: Kluwer, 2003, pp.91–109.

23. Lin C-J. On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Trans Neural Network 2007; 18: 1589–1596.

24. Zhang, Fang, Liu. Total variation norm-based non-negative matrix factorization for identifying discriminant representation of image patterns. Neurocomputing 2008; 71: 1824–1831.

25. Lee, Seung. Learning the parts of objects by non-negative matrix factorization. Nature 1999; 401: 788–791.

26. Borg, Groenen. Modern multidimensional scaling: theory and applications. 2nd ed. New York: Springer-Verlag, 2005, pp.207–212.

27. Bronstein, Kimmel. Generalized multidimensional scaling: a framework for isometry-invariant partial surface matching. Proc Natl Acad Sci USA 2006; 103: 1168–1172.

28. Roweis, Saul. Nonlinear dimensionality reduction by locally linear embedding. Science 2000; 290: 2323–2326.

29. Tenenbaum, de Silva, Langford. A global geometric framework for nonlinear dimensionality reduction. Science 2000; 290: 2319–2323.

30. Zhang, Zha. Principal manifolds and nonlinear dimension reduction via local tangent space alignment. SIAM J Sci Comput 2005; 26: 313–338.

31. Feng, Liu, Xiao. A novel CBIR system with WLLTSA and ULRGA. Neurocomputing 2015; 147: 509–522.

32. Hotelling. Analysis of a complex of statistical variables into principal components. J Educ Psychol 1933; 24: 417–441.

33. Zhang, Wang. Nearest manifold approach for face recognition. In: The 6th IEEE international conference on automatic face and gesture recognition, Seoul, Korea, 17–19 May 2004, pp.223–228. New York: IEEE.

34. Zhang, Wang. Manifold learning and applications in recognition. In: Tan, Yap, Wang (eds) Intelligent multimedia processing with soft computing. Heidelberg: Springer-Verlag, 2004, pp.281–300.

35. Scholkopf, Smola, Muller. Nonlinear component analysis as a kernel Eigen value problem. Neural Comput 1998; 10: 1299–1319.

36. Zass, Shashua. Non-negative sparse PCA. In: Advances in neural information processing systems (NIPS), Vancouver, BC, Canada, 4–7 December 2006, pp.1561–1568. Cambridge, MA: MIT Press.

37. Tibshirani. Regression shrinkage and selection via the lasso. J Roy Stat Soc B Meth 1996; 58: 267–288.

38. Zhang, Zhou, Chen. Non-negative matrix factorization on kernels. Lect Notes Comput Sci 2006; 4099: 404–412.

39. Ngom. A new kernel non-negative matrix factorization and its application in microarray data analysis. In: Computational intelligence in bioinformatics and computational biology (CIBCB), San Diego, CA, 9–12 May 2012, pp.371–378. Piscataway, NJ: IEEE CIS Society.

Application of the multi-kernel non-negative matrix factorization on the mechanical fault diagnosis
