Sage Journals: Discover world-class research

Abstract

Fault diagnosis of equipment is a key issue in the industrial field, and it is essential to keep abreast of equipment status. However, previous studies either considered fault data at a single moment or gave the same weight to all data over a period of time. To address these problems, a fault diagnosis method based on time domain weighted data aggregation and information fusion is proposed in this article. First, the monitored data from the sensors mounted on the equipment are aggregated using linear decaying weights. Then, Gaussian models of each fault type under different fault features are established from the aggregated data, and the basic probability assignments are generated by matching aggregated testing samples with the constructed Gaussian models. Finally, the basic probability assignments generated under each fault feature are fused by the Dempster combination rule. The proposed method is verified, and the results show that the total fault recognition rate can reach 97.5%, an increase of 1.9% over the method in which the Gaussian model is constructed from the original data.

Keywords

Fault diagnosis, linear decaying weights, data aggregation, information fusion, Dempster combination rule

Introduction

With the increasing demand for applications and the continuous development of modern technology, the complexity of industrial equipment is gradually increasing, and it plays an important role in industrial production.^1,2 However, the industrial equipment often faces a complicated working environment during its operation, which may affect its performance and even cause equipment failure.

At present, domestic and foreign scholars have proposed many methods for equipment fault diagnosis, including qualitative analysis–based methods,^3,4 model-based methods,^5,6 data-driven methods,^7–10 and so on. The qualitative analysis–based method, which analyzes the causality and development law of equipment failure, relies on qualitative analysis tools combined with expert knowledge and system knowledge to carry out knowledge reasoning; examples include graph theory methods,¹¹ expert systems,^12–14 and qualitative simulation.¹⁵ This method has some disadvantages. For instance, it cannot produce quantitative analysis results for equipment failure, the complexity of the graph model increases sharply as the device becomes more complicated, and knowledge acquisition may be difficult. The key to the model-based approach, which includes the state estimation method,¹⁶ parameter estimation method,¹⁷ and parity space method,¹⁸ is to establish a mathematical model that is consistent with the running process of the equipment and to judge the running state and fault type of the equipment through the residual signal between the mathematical model and the observable measurement. This method can diagnose equipment failure well. However, an accurate model may be hard to build when the equipment is complicated. The data-driven technique does not require a physical model of the device; it uses the data monitored during operation to diagnose the fault type of the equipment, for example, through machine learning,^19–22 multivariate statistical analysis,²³ signal processing,²⁴ rough sets,^25,26 fuzzy sets,²⁷ and multi-sensor or multi-source information fusion.^28–31

The multi-sensor information fusion–based method, in which the data of multiple sensors (or sources) are fused, reflects the diversity, redundancy, and complementarity of multiple information. Therefore, this method can obtain more reliable diagnostic results than single-source information. A great deal of research based on this approach has been carried out. Wu et al.³² proposed a framework of fault diagnosis based on Bayesian network (BN) for nuclear power plants. Within the framework, the data from multiple sensors were improved by fuzzy theory and data fusion. Zhang et al.³³ analyzed tunnel-induced pipeline damage based on fuzzy BN, and the proposed method was demonstrated on the construction of the Wuhan Yangtze River Tunnel. Huo et al.³⁴ proposed a new bearing fault diagnosis method based on weighted Dempster–Shafer (D-S) evidence theory combined with Genetic Algorithm (GA). Banerjee and Das³⁵ presented a new hybrid method based on information fusion for fault diagnosis, which combined the support vector machine (SVM) and short-time Fourier transform (STFT) techniques. Owing to the complexity of the working environment of the equipment and the limitations of the sensors mounted on the equipment, the observation data often have certain ambiguity and uncertainty. The relationship between the fault characteristics and the fault type may be complex. Hence, the fault diagnosis method should deal with this uncertainty well. Compared with other methods, the D-S evidence theory provides the basic probability assignment (BPA), which can effectively represent the uncertainty and complex relationships.^36–39 Besides, it also provides the Dempster combination rule to fuse multiple information.

Many studies based on D-S evidence theory for fault diagnosis have been carried out. Wang and Xiao⁴⁰ and Xiao⁴¹ proposed an improved multi-sensor data fusion method that combined Euclidean distance with belief entropy to fuse sensor data for fault diagnosis. Jiang et al.⁴² proposed a novel fuzzy evidential method to analyze failure mode and effects. Gong et al.⁴³ built a triangle fuzzy function according to historical data of the symptoms and utilized the BPA functions based on D-S evidence theory to diagnose the fault of a nuclear power plant. Dong et al.⁴⁴ proposed a fault diagnosis method that generates weights from the diversity degree of sensor reports to combine multi-sensor information. Xiao⁴⁵ translated the credibility degree of evidence, calculated from the support degree of all evidence, into a weight, and then adjusted the weight by the information volume of the evidence. Chen et al.⁴⁶ proposed a weighted fault diagnosis method, which generated weights using evidence distance and uncertainty.

As far as we know, the above studies treated historical data equally and did not consider the impact of data acquisition time on the diagnosis results. In fact, the farther the acquisition time of the monitored data is from the current diagnosis time, the smaller its impact on the diagnosed fault type. Based on this, a fault diagnosis method based on time domain weighted data aggregation and information fusion is proposed in this article. First, data from multiple sensors over a period of time are aggregated by a set of linear decaying weights, which ensure that the farther the data are from the current time, the smaller the weight. Next, a fault Gaussian model is constructed from the aggregated data. Compared with the fault Gaussian model constructed from the original data, the new fault model distinguishes better among fault features. Then, the intersections between the aggregated testing data and the fault Gaussian model are transformed into BPAs. Finally, these BPAs are fused with discount coefficients based on D-S evidence theory, and the final BPA is transformed into a pignistic probability to decide the fault type. Because the proposed method takes the influence of historical data into account, it can reflect the uncertainty of fault data, and the fusion results can also reflect the time effect. In addition, the original fault data are aggregated into a series of data considering different lengths of time. Thus, the effects of certain extreme points on the diagnosis results can be avoided, which increases the robustness of the proposed method.

The rest of this article is organized as follows: in the “Preliminaries” section, the preliminaries about data aggregation and D-S evidence theory are introduced. In the “The proposed method” section, fault diagnosis method based on time domain weighted data aggregation and information fusion is proposed. In the “Illustrative example and discussion” section, verification of the proposed method on the motor rotor is elaborated and analyzed. The conclusion is made in section “Conclusion.”

Preliminaries

Data aggregation based on linear decaying weights

The data aggregation method based on linear decaying weights, proposed by Yager,⁴⁷ is an effective tool for data aggregation that can reflect the impact of data acquisition time. The approach for generating the linear decaying weights is shown in Figure 1.

Figure 1.

Generated method for linear decaying weights.

Suppose that the data of a sensor of the device during a period of time are

$\{x_{1}, x_{2}, \dots, x_{t}, \dots, x_{n}\}$ (1)

where $x_{t}$ indicates the sensor data for the $t$th second and $x_{n}$ indicates the sensor data for the $n$th second. In Figure 1,

$T = \sum_{t = 1}^{m} t = \frac{m (m + 1)}{2}$ (2)

where $m$ refers to the length of time to aggregate data, $1 \leq m \leq n$ . And then the linear decaying weights are

$\left\{ \begin{matrix} ω_{t} = \frac{t - (n - m)}{T}, & n - m + 1 \leq t \leq n \\ ω_{t} = 0, & 0 \leq t \leq n - m \end{matrix} \right.$ (3)

Using the above weight generation method, the aggregated data are

$\hat{X} (x_{1}, x_{2}, \dots, x_{n}) = \frac{1}{T} \sum_{t = 1}^{m} t x_{n - m + t}$ (4)

When the length of time $m$ is different, the aggregation result of sensor data is different. As can be seen from Figure 1, when the acquisition time of the sensor data is closer to the current time, the weight given to the corresponding data is also larger.
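The weights of equation (3) and the aggregation of equation (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `linear_decaying_weights` and `aggregate` and the sample readings are hypothetical and do not come from the article.

```python
def linear_decaying_weights(n, m):
    """Weights of Eq. (3): zero for the oldest n - m samples, then 1/T, 2/T, ..., m/T."""
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    total = m * (m + 1) // 2  # T in Eq. (2)
    return [0.0] * (n - m) + [t / total for t in range(1, m + 1)]


def aggregate(samples, m):
    """Weighted sum of Eq. (4): the newest sample receives the largest weight."""
    weights = linear_decaying_weights(len(samples), m)
    return sum(w * x for w, x in zip(weights, samples))


readings = [1.0, 2.0, 3.0, 4.0]        # hypothetical sensor values, oldest first
print(linear_decaying_weights(4, 3))   # the oldest sample gets weight zero
print(aggregate(readings, 3))          # (1*2 + 2*3 + 3*4) / 6
```

With $m = 1$ the aggregate reduces to the latest sample, and with $m = n$ every sample contributes.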

The Gaussian model

There is some uncertainty in the monitored fault data of the equipment, and an error may occur when the state of the equipment is judged from a single observation value. Therefore, it is necessary to use a simple and accurate mathematical model of the fault data that can effectively extract the essential characteristics of the data and then identify the fault type of the engine. Practice has shown that many of the random variables produced in everyday production and scientific experiments can be approximated by a Gaussian distribution. Hence, the Gaussian model is utilized to model the fault data. The detailed construction process is described in the following.

Suppose there are $n$ observations, $\{x_{1}, x_{2}, \dots, x_{n}\}$ ; the mean value $μ$ and standard deviation $σ$ are calculated by

$μ = \frac{1}{n} (x_{1} + x_{2} + \dots + x_{n})$ (5)

$σ = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} {(x_{i} - μ)}^{2}}$ (6)

When calculating $σ$ , dividing by $n - 1$ rather than by $n$ gives an unbiased estimator of the variance $σ^{2}$ . Then, the Gaussian model can be represented as

$f (x) = \frac{1}{\sqrt{2 π} σ} \exp \left( - \frac{{(x - μ)}^{2}}{2 σ^{2}} \right)$ (7)

It should be noted that the extreme values of the generated Gaussian model may be different due to differences in fault data dimensions. In order to better establish the Gaussian fault model of the device, the generated Gaussian model is normalized in this article, described in the section “The proposed method.”

D-S evidence theory

D-S evidence theory is a mathematical theory of multi-source information proposed by Dempster⁴⁸ and expanded by Shafer.⁴⁹ It extends the theory of probability and can effectively represent uncertainty due to inaccuracy and completely unknown uncertainty. This theory is widely used in the field of fault diagnosis,⁵⁰ multiple criteria decision,^51–53 game theory,^54,55 complex network,⁵⁶ and so on. In this part, a few concepts about D-S evidence theory are given.

Assume the device has $s$ types of faults, expressed as $F_{1}, F_{2}, \dots, F_{s}$ ; then all fault types constitute a frame of discernment (FOD)

$Θ = \{F_{1}, F_{2}, \dots, F_{s}\}$ (8)

And the power set of $Θ$ is

$2^{Θ} = \{\{F_{1}\}, \{F_{2}\}, \dots, \{F_{s}\}, \{F_{1}, F_{2}\}, \{F_{1}, F_{3}\}, \dots, Θ, Ø\}$ (9)

When using evidence theory for fault diagnosis, every proposition is a subset of the FOD, that is, an element of the power set. The reliability of a proposition is determined by the BPA function. The concept of BPA is defined as follows:

Definition 1

Let $Θ$ be the FOD, whose power set constitutes a set of propositions. If the function $m$ satisfies the following formula

$\left\{ \begin{matrix} m (Ø) = 0 \\ \sum_{A \in 2^{Θ}} m (A) = 1 \end{matrix} \right.$ (10)

then the mass function $m$ is a BPA. In this definition, $m (A)$ is the BPA of proposition $A$ and indicates the reliability assigned to $A$ .

In addition, D-S evidence theory provides the Dempster combination rule to fuse multi-source information. Let the two BPAs be $m_{1}$ and $m_{2}$ , and let the propositions $A = \{F_{1}, F_{2}, \dots, F_{k}\}$ and $B = \{F_{1}, F_{2}, \dots, F_{l}\}$ be elements of the power set of the FOD. Then, the Dempster⁴⁸ combination rule is

$\left\{ \begin{matrix} m (Ø) = 0 \\ m (C) = \frac{\sum_{A \cap B = C} m_{1} (A) m_{2} (B)}{1 - K} \end{matrix} \right.$ (11)

In this equation, $K = \sum_{A \cap B = Ø} m_{1} (A) m_{2} (B)$ , which is utilized to measure the conflict between BPAs. The larger the $K$ , the greater the conflict.
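As a minimal sketch of equation (11), a BPA can be stored as a Python dictionary that maps each focal set (a `frozenset` of fault labels) to its mass. The function name `dempster_combine`, the dictionary representation, and the two example BPAs are illustrative assumptions, not part of the article.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs (dicts mapping frozenset -> mass) with Eq. (11)."""
    fused = {}
    conflict = 0.0  # K in the text
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            fused[common] = fused.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict (K = 1): the rule is undefined")
    normalised = {focal: mass / (1.0 - conflict) for focal, mass in fused.items()}
    return normalised, conflict


theta = frozenset({"F1", "F2", "F3"})
m1 = {frozenset({"F1"}): 0.6, frozenset({"F2"}): 0.1, theta: 0.3}   # hypothetical
m2 = {frozenset({"F1"}): 0.5, frozenset({"F1", "F2"}): 0.2, theta: 0.3}
fused, K = dempster_combine(m1, m2)
print(K)
print({"".join(sorted(focal)): round(mass, 4) for focal, mass in fused.items()})
```

Here the only conflicting pair is $\{F_{2}\}$ from the first BPA against $\{F_{1}\}$ from the second, so $K = 0.1 \times 0.5$ , and the remaining products are renormalised by $1 - K$ .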

The proposed method

In this article, a new method for engine fault diagnosis based on data aggregation, which considers the time of data acquisition, is proposed. The detailed description of the proposed method is shown in Figure 2, and the procedures are elaborated step by step in the following text.

Step 1. Sensor data aggregation. First, collect sensor data of the engine under different fault features during a period of time. Then, the collected data are aggregated with a chain of linear decaying weights that considers different lengths of time, for each fault type under each fault feature. Suppose that there are $s$ fault types with $z$ fault features, represented as $F = \{F_{1}, F_{2}, \dots, F_{s}\}$ and $C = \{C_{1}, C_{2}, \dots, C_{z}\}$ , respectively. Taking the fault feature $C_{1}$ of fault type $F_{1}$ (represented as $F_{1 C_{1}}$ ) as an example, the original data in time period $n$ are

$F_{1 C_{1}} = \{x_{1}, x_{2}, \dots, x_{n}\}$ (12)

Then, the data aggregated with the aforementioned linear decaying weights for the different lengths of time $(m = 1, 2, \dots, n)$ are

${\hat{F}}_{1 C_{1}} = \left\{ x_{n}, \frac{2}{3} x_{n} + \frac{1}{3} x_{n - 1}, \dots, \frac{2}{n (n + 1)} \sum_{j = 1}^{n} j \cdot x_{j} \right\}$ (13)

Step 2. Gaussian model generation for aggregated data of known faults under different fault features. Owing to the complexity of the working environment of the engine and the differences in sensor monitoring time, the monitoring data generally fall within a certain range; in other words, the data have certain ambiguity and fluctuate around a certain value. The aggregated data have the same characteristics, which the Gaussian distribution can characterize well. Hence, we take the Gaussian membership function as the fault model. First, the Gaussian membership function for the fault feature $C_{j}$ of fault type $F_{i}$ is defined as

$μ (x) : X \to [0, 1], x \in X$ (14)

where $X$ indicates the set of aggregated data of fault feature $C_{j}$ for fault type $F_{i}$ . The specific generation process of Gaussian model for aggregated data of known fault under different features is as follows:

For the fault feature $C_{j}$ of fault type $F_{i}$ , calculate the mean value ${\bar{F}}_{i C_{j}}$ and standard deviation value $σ_{i C_{j}}$ of all the aggregated data belonging to the fault type on the fault feature. The mean value and the standard deviation value can be obtained by

${\bar{F}}_{i C_{j}} = \frac{1}{n} \sum_{l = 1}^{n} x_{i C_{j}}^{l}$ (15)

$σ_{i C_{j}} = \sqrt{\frac{1}{n - 1} \sum_{l = 1}^{n} {(x_{i C_{j}}^{l} - {\bar{F}}_{i C_{j}})}^{2}}$ (16)

Based on the obtained mean and standard deviation, generate a Gaussian model of fault type $F_{i}$ on fault characteristics $C_{j}$ as follows

$μ (x) = \exp \left( - \frac{{(x - {\bar{F}}_{i C_{j}})}^{2}}{2 σ_{i C_{j}}^{2}} \right)$ (17)
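A short sketch of step 2 follows, using only the Python standard library. It fits the mean and the sample standard deviation of equations (15) and (16) and returns a membership function with peak value 1, as in equation (17). The name `fit_gaussian_membership` and the three sample values are hypothetical.

```python
import math
import statistics


def fit_gaussian_membership(aggregated):
    """Return (mu, mean, std), where mu(x) follows Eq. (17)."""
    mean = statistics.mean(aggregated)      # Eq. (15)
    std = statistics.stdev(aggregated)      # Eq. (16), divides by n - 1
    if std == 0.0:
        raise ValueError("all samples are identical, so the spread is zero")

    def membership(x):
        # Normalised Gaussian: equals 1 at the mean and decays on both sides.
        return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

    return membership, mean, std


mu, mean, std = fit_gaussian_membership([0.10, 0.12, 0.14])  # hypothetical data
print(mean, std)
print(mu(0.12), mu(0.14))   # membership at the mean and one deviation away
```

Because the normalising factor $1 / (\sqrt{2 π} σ)$ is dropped, the returned values lie in $[0, 1]$ and can be compared across features with different scales.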

Step 3. BPA generation under different fault features. If a single measurement value is matched with the Gaussian model generated in step 2, the diagnosis result may not cover the fuzziness of the testing data. Therefore, in this step, the testing data are aggregated using the linear decaying weights method as in step 1. Then, the aggregated testing data are matched with the established Gaussian model to generate the BPAs, which is a significant part of the proposed fault diagnosis method. The detailed process is shown below.

Figure 2.

The flow-process diagnosis of the proposed method.

Suppose the Gaussian model generated in step 2 for fault feature $C_{j}$ is as shown in Figure 3. It contains three fault types under this fault feature. The aggregated testing value for the fault feature $C_{j}$ is $p$ , shown by the pink dotted line in the figure. Then, the degree of matching and the BPAs can be determined by the intersections between the test sample and the Gaussian model.

When the test sample intersects the Gaussian model of a single fault type, the ordinate of the intersection point is the probability that the test sample belongs to that fault type.

If the test sample intersects the Gaussian models of multiple fault types, the ordinate of the higher intersection point represents the support for a single fault type, and the ordinate of the lower point represents the support for the proposition containing multiple fault types. For example, there are two intersections in Figure 3: the point $A$ represents support for fault type $F_{2}$ , and the point $B$ represents support for proposition $\{F_{1}, F_{2}\}$ .

In addition, the sum of all BPAs in D-S evidence theory is equal to 1. Therefore, if the sum of the reliability values of the generated BPAs is greater than 1, normalization is performed; if it is less than 1, the remaining reliability is assigned to the complete set $Θ$ .
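The matching rule of step 3 can be sketched as follows. The text describes the cases of one and two intersections; extending the same nesting to three or more intersections (the $k$th highest ordinate supports the set of the $k$ best-matching fault types) is an assumption of this sketch, as are the function name `bpa_from_memberships` and the membership values in the example.

```python
def bpa_from_memberships(memberships, frame):
    """Turn membership degrees {fault: mu(p)} into a BPA (dict frozenset -> mass)."""
    ranked = sorted(
        ((fault, value) for fault, value in memberships.items() if value > 0.0),
        key=lambda item: item[1],
        reverse=True,
    )
    bpa = {}
    focal = set()
    for fault, value in ranked:
        focal.add(fault)                    # nested sets: {best}, {best, second}, ...
        bpa[frozenset(focal)] = value
    total = sum(bpa.values())
    if total > 1.0:
        # Too much support: rescale so that the masses sum to 1.
        bpa = {key: value / total for key, value in bpa.items()}
    elif total < 1.0:
        # Too little support: the remainder goes to the complete set.
        full = frozenset(frame)
        bpa[full] = bpa.get(full, 0.0) + (1.0 - total)
    return bpa


frame = {"F1", "F2", "F3"}
bpa = bpa_from_memberships({"F1": 0.0, "F2": 0.4976, "F3": 0.2946}, frame)
print({"".join(sorted(key)): round(value, 4) for key, value in bpa.items()})
```

In this example the highest ordinate supports $\{F_{2}\}$ , the lower one supports $\{F_{2}, F_{3}\}$ , and the mass that is left over is assigned to $Θ$ .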

Step 4. Fusion of the BPAs generated in step 3 and decision making. In the actual environment, while the sensor monitors the working state of the engine, the relationship between the fault feature and the fault type may be complex. Different fault features discriminate between fault types to different degrees. Thus, fusing the BPAs generated under different features with the same weight may result in large errors in fault diagnosis. In this article, the discount fusion method based on D-S evidence theory is utilized to fuse the generated BPAs. The method of generating weights for the fault features is described first.

Figure 3.

The generated Gaussian model under fault feature $C_{j}$ .

Assume there are two fault types. If the aggregated test sample is located in the intersection area of the Gaussian models of the two fault types under a fault feature, it is impossible to distinguish which fault type the aggregated test sample belongs to. Thus, the indistinguishability of this fault feature for the two fault types can be expressed as

$P = \frac{\text{Intersecting area}}{\text{All area}}$ (18)

Therefore, the weight of the fault feature is

$ω = 1 - P$ (19)

Similarly, we can get the weights of the other fault features. After obtaining the weight of each fault feature, the BPAs generated in step 3 under the different fault features are multiplied by the corresponding weight to correct the reliability for each fault type. Then, the Dempster combination rule, mentioned in the “Preliminaries” section, is used to fuse these corrected BPAs.

Finally, determine the fault type of the test sample. After the evidence is combined, we wish the resulting information to be as reasonable and reliable as possible, with as little uncertainty as possible. However, the BPA after fusion may still contain a certain degree of uncertainty, which is not conducive to making a decision. Therefore, in this article, the Pignistic probability conversion method is utilized to convert the BPA into a probability, from which the fault type of the test sample is judged. The Pignistic probability conversion is defined as follows.

Assuming that $m$ is a BPA on the FOD $Θ$ , its Pignistic probability conversion $BetP_{m}$ is expressed as^57,58

$BetP_{m} (X) = \sum_{\substack{Y \in 2^{Θ} \\ Y \neq Ø}} \frac{| X \cap Y |}{| Y |} \cdot \frac{m (Y)}{1 - m (Ø)}$ (20)

where $2^{Θ}$ represents the power set of $Θ$ , and $| Y |$ refers to the cardinality of set $Y$ , that is, the number of elements contained in $Y$ .
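For singleton propositions, equation (20) splits the mass of each focal set equally among its elements. A minimal sketch, with a hypothetical function name `pignistic` and a hypothetical BPA:

```python
def pignistic(bpa):
    """BetP of Eq. (20) for every single fault that appears in a focal set."""
    empty_mass = bpa.get(frozenset(), 0.0)
    bet = {}
    for focal, mass in bpa.items():
        if not focal:
            continue                      # the empty set is excluded from the sum
        share = mass / (len(focal) * (1.0 - empty_mass))
        for fault in focal:
            bet[fault] = bet.get(fault, 0.0) + share
    return bet


example = {                               # hypothetical fused BPA
    frozenset({"F1"}): 0.5,
    frozenset({"F1", "F2"}): 0.3,
    frozenset({"F1", "F2", "F3"}): 0.2,
}
bet = pignistic(example)
print({fault: round(value, 4) for fault, value in sorted(bet.items())})
```

The fault type with the largest pignistic probability is then taken as the diagnosis.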

Illustrative example and discussion

In order to verify the effectiveness of the proposed method, motor rotor fault diagnosis is used as an example in this section. The equipment is a multi-functional flexible rotor test-bed, and the fault data are from rotor vibration signals collected and extracted by the displacement sensor and acceleration sensor. For the equipment, three fault types are configured, including rotor imbalance, rotor misalignment, and support base loosening, represented by $F_{1}$ , $F_{2}$ , and $F_{3}$ , respectively. When the rotor is working normally, the amplitude of each vibration frequency does not exceed 0.1 m/s. However, if a fault occurs, the fault source that causes abnormal vibration will increase the vibration amplitude of a certain frequency component, possibly a single frequency, a set of frequencies, or a certain frequency band. Different types of faults increase the vibration amplitude differently.

The vibration energy of the three faults is mostly concentrated on the basic frequency $1X$ , double frequency $2X$ , and triple frequency $3X$ , but it is difficult to determine which fault has occurred from the amplitude of a single frequency. Hence, the vibration amplitudes of $1X$ – $3X$ and the average amplitude of the time domain vibration displacement are taken as the fault features to make a comprehensive decision in this article.

The fault data utilized in this article are from Wen and Xu.² The detailed fault diagnosis is verified in the following text.

Step 1: aggregate the fault data

For each fault feature under each fault type, 40 observations were continuously collected in $Δ t = 16 s$ . For instance, for fault feature $C_{1}$ of fault type $F_{1}$ , the fault data collected are

$\begin{matrix} F_{1 C_{1}} = [0.1540 0.1518 0.1537 0.1548 0.1542 0.1538 0.1545 \\ 0.1537 0.1571 0.1560 0.1584 0.1552 0.1586 0.1574 \\ 0.1569 0.1565 0.1551 0.1585 0.1585 0.1593 0.1548 \\ 0.1558 0.1547 0.1593 0.1532 0.1632 0.1575 0.1590 \\ 0.1594 0.1541 0.1650 0.1674 0.1651 0.1604 0.1787 \\ 0.1818 0.1820 0.1656 0.1658 0.1644] \end{matrix}$

Then, aggregate the above fault data using the linear decaying weights. Taking the length of the historical data as $m = 1, 2, \dots, 40$ in turn, the aggregated data are

$\begin{matrix} {\hat{F}}_{1 C_{1}} = [0.1644 0.1649 0.1651 0.1668 0.1685 0.1698 0.1702 \\ 0.1702 0.1702 0.1701 0.1698 0.1694 0.1691 0.1687 \\ 0.1683 0.1680 0.1676 0.1673 0.1669 0.1666 0.1663 \\ 0.1660 0.1657 0.1654 0.1652 0.1650 0.1647 0.1645 \\ 0.1643 0.1641 0.1640 0.1638 0.1636 0.1634 0.1633 \\ 0.1631 0.1630 0.1628 0.1627 0.1625] \end{matrix}$

Comparing the fault data before and after the aggregation, it can be seen that the aggregated data are closer to the value at the diagnosis point and are more effective for fault diagnosis at the current time. It can be inferred that diagnosing the equipment failure type from aggregated fault data may yield a higher accuracy rate, which will be verified below.
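The first few aggregated values can be checked by hand from the tail of the raw series, because the aggregate for window length $m$ only uses the $m$ most recent observations. The sketch below applies equation (13) to the last five raw observations of $F_{1 C_{1}}$ listed above; the function name `aggregated_series` is hypothetical.

```python
def aggregated_series(samples):
    """Aggregate with every window length m = 1..n, as in Eq. (13)."""
    n = len(samples)
    series = []
    for m in range(1, n + 1):
        total = m * (m + 1) / 2                 # T in Eq. (2)
        window = samples[n - m:]                # the m most recent samples, oldest first
        weighted = sum(t * x for t, x in enumerate(window, start=1))
        series.append(weighted / total)
    return series


# Last five raw observations of F_{1C_1}, oldest first.
tail = [0.1818, 0.1820, 0.1656, 0.1658, 0.1644]
for value in aggregated_series(tail):
    print(f"{value:.4f}")
```

Rounded to four decimals, the output agrees with the first five aggregated values listed above (0.1644, 0.1649, 0.1651, 0.1668, 0.1685).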

Step 2: generate Gaussian model for aggregated fault data

According to the “The proposed method” section, the mean values and standard deviation values for all fault features are calculated and shown in Table 1. And the generated Gaussian model based on these mean values and standard deviation values is shown in Figure 4. Correspondingly, the generated Gaussian model for original data is shown in Figure 5.

Table 1.

The mean value and standard deviation value for fault features.

Fault features		$F_{1}$	$F_{2}$	$F_{3}$
$C_{1}$	Mean	0.1597	0.1777	0.3306
	Standard deviation	0.0003	0.0003	0.0009
$C_{2}$	Mean	0.1472	0.3254	0.3467
	Standard deviation	0.0004	0.0019	0.0007
$C_{3}$	Mean	0.1122	0.2471	0.1338
	Standard deviation	0.0004	0.0019	0.0009
$C_{4}$	Mean	4.3278	4.6687	9.8187
	Standard deviation	0.0066	0.0176	0.0147

Figure 4.

The generated Gaussian model for aggregated data under (a) fault feature $C_{1}$ , (b) fault feature $C_{2}$ , (c) fault feature $C_{3}$ , and (d) fault feature $C_{4}$ .

Figure 5.

The generated Gaussian model for original data under (a) fault feature $C_{1}$ , (b) fault feature $C_{2}$ , (c) fault feature $C_{3}$ , and (d) fault feature $C_{4}$ .

Taking the Gaussian model of fault feature $C_{3}$ as an example and comparing Figure 4 with Figure 5, the intersection area between fault types $F_{2}$ and $F_{3}$ in Figure 4(c) is almost 0. Besides, the intersection area between fault types $F_{1}$ and $F_{2}$ in Figure 4(c) is also significantly smaller than that in Figure 5(c). The same situation also occurs in the other parts of Figures 4 and 5. It can be seen that the intersection area for all fault features is reduced for the Gaussian model generated from the aggregated fault data, which indicates that the distinguishing ability of each fault feature is improved.

In addition, the discriminability of different fault features is verified by statistical methods⁵⁹

$J = \frac{tr (S_{w})}{tr (S_{b})}$ (21)

where $tr$ is the trace of a matrix, $S_{w}$ and $S_{b}$ are the within-types scatter matrix and between-types scatter matrix, respectively

$\left\{ \begin{matrix} S_{w} = \sum_{i = 1}^{C} P (C_{i}) E \left[ \left( X - \frac{1}{N_{i}} \sum_{X \in C_{i}} X \right) {\left( X - \frac{1}{N_{i}} \sum_{X \in C_{i}} X \right)}^{T} \right] \\ S_{b} = \sum_{i = 1}^{C} P (C_{i}) \left( \frac{1}{N_{i}} \sum_{X \in C_{i}} X - M \right) {\left( \frac{1}{N_{i}} \sum_{X \in C_{i}} X - M \right)}^{T} \end{matrix} \right.$ (22)

where $X$ is a feature vector of a sample and $M$ is the mean of all fault types’ centroids

$M = \frac{1}{C} \sum_{i = 1}^{C} (\frac{1}{N_{i}} \sum_{X \in C_{i}} X)$ (23)

The smaller the value of $J$ , the better the discriminability of the corresponding fault feature.
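For a single scalar fault feature, the scatter matrices in equation (22) reduce to numbers, so $J$ is simply the ratio of within-type to between-type scatter. The sketch below assumes that the prior $P (C_{i})$ is estimated as $N_{i} / N$ and that the expectation is the plain average over the samples of a type; the article does not state either choice. The function name `discriminability` and the sample values are hypothetical.

```python
def discriminability(classes):
    """J = tr(S_w) / tr(S_b) for one scalar feature; `classes` is a list of sample lists."""
    total = sum(len(samples) for samples in classes)
    centroids = [sum(samples) / len(samples) for samples in classes]
    grand = sum(centroids) / len(centroids)          # M in Eq. (23)
    s_w = 0.0
    s_b = 0.0
    for samples, centroid in zip(classes, centroids):
        prior = len(samples) / total                 # assumed P(C_i) = N_i / N
        scatter = sum((x - centroid) ** 2 for x in samples) / len(samples)
        s_w += prior * scatter
        s_b += prior * (centroid - grand) ** 2
    return s_w / s_b


tight = [[0.9, 1.0, 1.1], [1.9, 2.0, 2.1]]   # hypothetical, well-separated types
loose = [[0.5, 1.0, 1.5], [1.5, 2.0, 2.5]]   # same centroids, wider spread
print(discriminability(tight), discriminability(loose))
```

The tightly clustered feature gives the smaller $J$ , in line with the interpretation above.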

Based on the above, we can obtain the $J$ values of all fault features for the aggregated fault data and the original fault data, as shown in Table 2. Comparing the two columns of Table 2, the $J$ value of every fault feature is smaller for the aggregated data, which shows that the distinguishing ability of the fault features is increased after aggregating the fault data.

Table 2.

The $J$ value for fault features.

Fault features	Aggregated data	Original data
$C_{1}$	0.0320	0.1442
$C_{2}$	0.0928	0.3249
$C_{3}$	0.5536	1.2898
$C_{4}$	10.7019	66.7239

Step 3: generate the BPAs for aggregated testing data

First, the testing data are aggregated using a set of linear decaying weights. Then, aggregated testing data are matched with the generated Gaussian model to generate BPAs using the method mentioned in the “The proposed method” section.

For instance, one aggregated testing sample of fault type $F_{2}$ is $[\begin{matrix} 0.1945 & 0.3344 & 0.2122 & 5.186 \end{matrix}]$ , that is to say, the values for the fault features $C_{1}$ – $C_{4}$ are 0.1945, 0.3344, 0.2122, and 5.1860, respectively. The corresponding fault values are sequentially matched with the generated Gaussian model, and the generated BPAs are shown in Table 3.

Table 3.

The BPAs of the aggregated testing sample for fault features.

Fault features	Generated BPAs
$C_{1}$	$m (\{F_{2}\}) = 0.0566, m (\{F_{1}, F_{2}, F_{3}\}) = 0.9434$
$C_{2}$	$m (\{F_{2}\}) = 0.4976, m (\{F_{2}, F_{3}\}) = 0.2946, m (\{F_{1}, F_{2}, F_{3}\}) = 0.2078$
$C_{3}$	$m (\{F_{2}\}) = 0.3460, m (\{F_{1}, F_{2}, F_{3}\}) = 0.6540$
$C_{4}$	$m (\{F_{2}\}) = 0.0237, m (\{F_{1}, F_{2}, F_{3}\}) = 0.9763$

BPA: basic probability assignment.

Step 4: fuse the generated BPAs and make decision

In this step, first, calculate the weights (discrimination) of all fault features using the method mentioned in the “The proposed method” section. The result is shown in Table 4.

Table 4.

The weights of the fault features.

Fault features	Weights (discrimination)
$C_{1}$	0.9725
$C_{2}$	0.9300
$C_{3}$	0.9930
$C_{4}$	0.9271

Then, multiply the BPA under each fault feature by the corresponding weight to discount the BPAs generated in the “Step 3: generate the BPAs for aggregated testing data” section, and fuse these BPAs using the Dempster combination rule. The final BPA is

$\begin{matrix} m (\{F_{2}\}) = 0.6741, m (\{F_{2}, F_{3}\}) = 0.1662, \\ m (\{F_{1}, F_{2}, F_{3}\}) = 0.1597 \end{matrix}$

Finally, the final BPA is converted into a probability by the Pignistic probability conversion method to make a decision

$\left\{ \begin{matrix} BetP (F_{1}) = 0.0532 \\ BetP (F_{2}) = 0.8104 \\ BetP (F_{3}) = 0.1363 \end{matrix} \right.$

Based on the above, the probability of fault type $F_{2}$ is the largest, which is 0.8104. From this result, we can draw a conclusion that the fault type of test sample is $F_{2}$ , which is consistent with the true fault type.
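The whole of step 4 for this test sample can be retraced with a short script that takes the BPAs of Table 3 and the weights of Table 4 as inputs. One point is an assumption of this sketch: after each mass is multiplied by its weight, the mass that is freed is moved to the complete set $Θ$ (classical discounting), which the text does not spell out. All function names are hypothetical.

```python
from functools import reduce

THETA = frozenset({"F1", "F2", "F3"})
F2, F23 = frozenset({"F2"}), frozenset({"F2", "F3"})


def discount(bpa, weight):
    """Scale every mass by `weight`; the freed mass goes to the complete set (assumed)."""
    out = {focal: weight * mass for focal, mass in bpa.items() if focal != THETA}
    out[THETA] = 1.0 - sum(out.values())
    return out


def combine(m1, m2):
    """Dempster combination rule, Eq. (11)."""
    fused, conflict = {}, 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                fused[common] = fused.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    return {focal: mass / (1.0 - conflict) for focal, mass in fused.items()}


def pignistic(bpa):
    """Pignistic probability, Eq. (20), with m(empty set) = 0."""
    bet = dict.fromkeys(sorted(THETA), 0.0)
    for focal, mass in bpa.items():
        for fault in focal:
            bet[fault] += mass / len(focal)
    return bet


table3 = [  # BPAs of the aggregated testing sample for C1..C4
    {F2: 0.0566, THETA: 0.9434},
    {F2: 0.4976, F23: 0.2946, THETA: 0.2078},
    {F2: 0.3460, THETA: 0.6540},
    {F2: 0.0237, THETA: 0.9763},
]
table4 = [0.9725, 0.9300, 0.9930, 0.9271]  # weights of C1..C4

final = reduce(combine, (discount(m, w) for m, w in zip(table3, table4)))
bet = pignistic(final)
print({"".join(sorted(focal)): round(mass, 4) for focal, mass in final.items()})
print({fault: round(value, 4) for fault, value in bet.items()})
```

Under this discounting assumption the script gives masses close to 0.6741 for $\{F_{2}\}$ , 0.1662 for $\{F_{2}, F_{3}\}$ , and 0.1597 for $Θ$ , and pignistic probabilities that agree with the values reported above to about four decimals.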

To further illustrate the significance of the proposed method, the diagnosis results are compared with those of the method in Jiang et al.,²⁸ as shown in Table 5. In Table 5, all fault types are correctly recognized with high reliability both by the proposed method and by the method in Jiang et al.²⁸

Table 5.

The comparison of fusion results of our method and another method.

	Real fault type	Combined BPA	Diagnosis results
Proposed method (Ours)	$F_{1}$	$m (\{F_{1}\}) = 0.9994, m (\{F_{1}, F_{2}\}) = 4.9020 \times 10^{- 4}, m (\{F_{1}, F_{2}, F_{3}\}) = 9.4701 \times 10^{- 5}$	$F_{1}$
	$F_{2}$	$m (\{F_{2}\}) = 0.9799, m (\{F_{2}, F_{3}\}) = 3.8336 \times 10^{- 4}, m (\{F_{1}, F_{2}, F_{3}\}) = 0.0197$	$F_{2}$
	$F_{3}$	$m (\{F_{3}\}) = 0.9803, m (\{F_{2}, F_{3}\}) = 0.0089, m (\{F_{1}, F_{2}, F_{3}\}) = 0.0109$	$F_{3}$
	$F_{3}$	$\begin{matrix} m (\{F_{2}\}) = 0.9452, m (\{F_{1}, F_{3}\}) = 3.7792 \times 10^{- 4}, \\ m (\{F_{2}, F_{3}\}) = 0.0033, m (\{F_{1}, F_{2}, F_{3}\}) = 0.0511 \end{matrix}$	$F_{3}$
	$F_{2}$	$\begin{matrix} m (\{F_{2}\}) = 0.9990, m (\{F_{1}, F_{2}\}) = 3.2478 \times 10^{- 5}, \\ m (\{F_{2}, F_{3}\}) = 6.8830 \times 10^{- 4}, m (\{F_{1}, F_{2}, F_{3}\}) = 2.5738 \times 10^{- 4} \end{matrix}$	$F_{2}$
Method in Jiang et al.²⁸	$F_{1}$	$m (\{F_{1}\}) = 0.9791, m (\{F_{2}\}) = 0.0199, m (\{F_{1}, F_{2}\}) = 0.0010$	$F_{1}$
	$F_{2}$	$m (\{F_{1}\}) = 0.0681, m (\{F_{2}\}) = 0.9302, m (\{F_{3}\}) = 0.0008, m (\{F_{1}, F_{2}\}) = 0.0009$	$F_{2}$
	$F_{3}$	$m (\{F_{2}\}) = 0.0002, m (\{F_{3}\}) = 0.9998$	$F_{3}$
	$F_{3}$	$m (\{F_{2}\}) = 0.0001, m (\{F_{3}\}) = 0.9999$	$F_{3}$
	$F_{2}$	$m (\{F_{1}\}) = 0.0124, m (\{F_{2}\}) = 0.9863, m (\{F_{3}\}) = 0.0005, m (\{F_{1}, F_{2}\}) = 0.0008$	$F_{2}$

BPA: basic probability assignment.

Finally, following the above steps, the fault identification results of all test samples are obtained and shown in Table 6. From Table 6, it can be seen that the recognition rates of fault types $F_{1}$ and $F_{2}$ are improved, and the total fault recognition rate is increased by 1.9% after the test samples are aggregated.

Table 6.

The recognition rate for different fault types.

	Aggregated testing samples (%)	Original testing samples (%)
$F_{1}$	100	98.5
$F_{2}$	91.5	87.5
$F_{3}$	100	100
Total recognition rate	97.2	95.3

The above results show that the proposed method can be effectively utilized for fault diagnosis of engine. In practical applications, sensors loaded on the equipment need to be monitored for a long time. Then, the monitored data are transmitted to the processing software. The proposed method is applied to analyze the monitoring data on the computer to obtain the current state of the engine. In addition, in order to improve the accuracy of fault identification, the monitored data are analyzed multiple times, and a certain threshold is set according to the domain knowledge. When the results of multiple analyses exceed the threshold, the fault type can be confirmed.

Conclusion

In this article, a fault diagnosis method based on time domain weighted data aggregation and information fusion has been proposed. First, fault data are aggregated over time windows of different lengths, which is more effective for fault diagnosis than single-point diagnosis. Second, Gaussian models are constructed from the aggregated fault data for each fault feature. Third, BPAs are generated from the intersection between the aggregated testing sample and the constructed Gaussian models. Finally, the BPAs are fused, with different weights for different features, based on D-S evidence theory. The method is verified and analyzed on motor rotor fault data. Compared with the existing method, the proposed method identifies faults correctly with higher reliability. The diagnosis results for fault testing data before and after aggregation are also compared: the Gaussian fault models generated by the proposed method distinguish the fault types better, and the total recognition rate improves by 1.9 percentage points. This approach has potential in areas such as risk assessment. In future research, more factors influencing fault diagnosis will be considered.
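
For readers who want to experiment with the fusion step, the following is a minimal Python sketch of the plain Dempster combination rule for two BPAs. It does not include the per-feature weighting mentioned above. The input masses are hypothetical illustrative values, not data from the article, and the function name `dempster_combine` is an assumption.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # product mass falling on the empty set
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


F1, F2, F3 = frozenset({"F1"}), frozenset({"F2"}), frozenset({"F3"})
theta = F1 | F2 | F3  # frame of discernment

# Hypothetical BPAs from two fault features (illustrative values only).
m_a = {F2: 0.6, F2 | F3: 0.1, theta: 0.3}
m_b = {F2: 0.5, F1: 0.2, theta: 0.3}

fused = dempster_combine(m_a, m_b)
decision = max(fused, key=fused.get)  # focal element with the largest fused mass
print(sorted(decision), round(fused[decision], 4))
```

Focal elements are stored as `frozenset` keys so that set intersection directly gives the focal element of each product term. Picking the focal element with the largest mass is a simple decision rule chosen for this sketch only.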

Footnotes

The authors are grateful to the anonymous reviewers for their useful comments and suggestions on improving this article.

Handling Editor: Masayoshi Aritsugi

Author contributions

Y.Z. wrote this article, and discussed and analyzed the numerical results. W.J. and X.D. reviewed and improved this article.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the National Natural Science Foundation of China (Program Nos 61671384, 61703338) and the Shaanxi Provincial Natural Science Basic Research Program (Program No. 2018JQ6085).

ORCID iD

Wen Jiang

References

1. Huang. Uncertainty in fault diagnosis and knowledge acquisition. Beijing, China: Science Press, 2018.

2. Wen. Multi-source uncertain information fusion theory and application: fault diagnosis and reliability assessment. Beijing, China: Science Press, 2012.

3. Venkatasubramanian, Rengaswamy, Kavuri. A review of process fault detection and diagnosis part 2: qualitative models and search strategies. Comp Chem Eng 2003; 27(3): 313–326.

4. C-C, Lee. Fault diagnosis based on qualitative/quantitative process knowledge. Aiche J 1991; 37(4): 617–628.

5. Frank. Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy: a survey and some new results. Automatica 1990; 26(3): 459–474.

6. Prakash, Samantaray, Bhattacharyya. Model-based diagnosis of multiple faults in hybrid dynamical systems with dynamically updated parameters. IEEE T Syst Man Cy-S 2019; 49(6): 1053–1072.

7. Zhang. Improved on-line process fault diagnosis through information fusion in multiple neural networks. Comp Chem Eng 2006; 30(3): 558–571.

8. Roozbeh, Ehsan, Maryam, et al. Information fusion and semi-supervised deep learning scheme for diagnosing gear faults in induction machine systems. IEEE T Ind Electron 2019; 66(8): 6331–6342.

9. Zhang, Zhu, et al. Perceiving safety risk of buildings adjacent to tunneling excavation: an information fusion approach. Automat Constr 2017; 73: 88–101.

10. Zhang, Mahadevan. Aircraft re-routing optimization and performance assessment under uncertainty. Decis Support Syst 2017; 96: 67–82.

11. Kabir. An overview of fault tree analysis and its application in model based dependability analysis. Expert Syst Appl 2017; 77: 114–135.

12. Yang, Liu, Wang, et al. Belief rule-base inference methodology using the evidential reasoning approach-RIMER. IEEE T Syst Man Cy A 2006; 36(2): 266–285.

13. Liu, Yang, et al. Inference and learning methodology of belief-rule-based expert system for pipeline leak detection. Expert Syst Appl 2007; 32(1): 103–113.

14. Ramesh, Shum, Davis. A structures framework for efficient problem solving in diagnostic expert systems. Comp Chem Eng 1988; 12(9–10): 891–902.

15. Vinson, Ungar. Dynamic process monitoring and fault-diagnosis with qualitative models. IEEE T Syst Man Cyb 1995; 25(1): 181–189.

16. Wang, et al. Network-based fault detection for discrete-time state-delay systems: a new measurement model. Int J Adapt Control 2008; 22(5): 510–528.

17. Bagheri, Khaloozaded, Abbaszadeh. Stator fault detection in induction machines by parameter estimation, using adaptive kalman filter. In: 2007 Mediterranean conference on control and automation, Athens, 27–29 June 2007, vols. 1–4, pp.951–956. New York: IEEE.

18. Izadi, Shah, Chen. Parity space fault detection based on irregularly sampled data. In: Proceedings of the American control conference, Seattle, WA, 11–13 June 2008, vols. 1–12, pp.2798+. New York: IEEE.

19. Zhang, Ding. Understanding and improving deep learning-based rolling bearing fault diagnosis with attention mechanism. Signal Process 2019; 161: 136–154.

20. Wang, Liu, Zhu. Bearing fault diagnosis based on a hybrid classifier ensemble approach and the improved Dempster-Shafer theory. Sensors 2019; 19(9): E2097.

21. Geng, Zhou, et al. Saliency-guided deep neural networks for SAR image change detection. IEEE T Geosci Remote 2019; 99: 1–13.

22. Zhang, Mahadevan. Ensemble machine learning models for aviation incident risk prediction. Decis Support Syst 2019; 116: 48–63.

23. Wang, Kruger, Irwin, et al. Nonlinear PCA with the local approach for diesel engine fault detection and diagnosis. IEEE T Contr Syst T 2008; 16(1): 122–129.

24. Chai, et al. An algorithm for sensor fault diagnosis with EEMD-SVM. T I Meas Control 2018; 40(6): 1746–1756.

25. Muralidharan, Sugumaran. Rough set based rule learning and fuzzy classification of wavelet features for fault diagnosis of monoblock centrifugal pump. Measurement 2013; 46(9): 3057–3063.

26. Ghimire, Zhang, Pattipati. A rough set-theory-based fault-diagnosis method for an electric power-steering system. IEEE-Asme T Mech 2018; 23(5): 2042–2053.

27. Jiang. A correlation coefficient for belief functions. Int J Approx Reason 2018; 103: 94–106.

28. Jiang, Xie. A new engine fault diagnosis method based on multi-sensor data fusion. Appl Sci-Basel 2017; 7(3): 2–18.

29. Jiang, Cao, Deng. A novel z-network model based on Bayesian network and z-number. IEEE T Fuzzy Syst 2019; 99: 1.

30. Zhang, Deng. Engine fault diagnosis based on sensor data fusion considering information quality and evidence theory. Adv Mech Eng 2018; 10(11): 1–10.

31. Shi, et al. Research on the fusion of dependent evidence based on mutual information. IEEE Access 2018; 6: 71839–71845.

32. Tong, Zhang, et al. Framework for fault diagnosis with multi-source sensor nodes in nuclear power plants based on a Bayesian network. Ann Nucl Energy 2018; 122: 297–308.

33. Zhang, Qin, et al. Towards a fuzzy Bayesian network based approach for safety risk analysis of tunnel-induced pipeline damage. Risk Anal 2016; 36(2): 278–301.

34. Huo, Zhang, Shuea. Bearing fault diagnosis using multi-sensor fusion based on weighted D-S evidence theory. In: Proceedings of the 18th international conference on Mechatronics - Mechatronika (ME), Brno, 5–7 December 2018, pp.403–408. New York: IEEE.

35. Banerjee, Das. Multi-sensor data fusion using support vector machine for motor fault detection. Inform Sciences 2012; 217: 96–107.

36. Guo, Xinglong, Chaojie, et al. Multi-sensor information fusion method and its applications on fault detection of diesel engine. In: Proceedings of the international conference on computer science and network technology (ICCSNT), Harbin, China, 24–26 December 2011, vols. 1–4, pp.2551–2555. New York: IEEE.

37. Luo, Yang, et al. Agent oriented intelligent fault diagnosis system using evidence theory. Expert Syst Appl 2012; 39(3): 2524–2531.

38. Fei, Deng. A new divergence measure for basic probability assignment and its applications in extremely uncertain environments. Int J Intell Syst 2019; 34(4): 584–600.

39. Cui, Liu, Zhang, et al. An improved Deng entropy and its application in pattern recognition. IEEE Access 2019; 7: 18284–18292.

40. Wang, Xiao. An improved multisensor data fusion method and its application in fault diagnosis. IEEE Access 2019; 7: 3928–3937.

41. Xiao. Multi-sensor data fusion based on the belief divergence measure of evidences and the belief entropy. Inform Fusion 2019; 46: 23–32.

42. Jiang, Xie, Zhuang, et al. Failure mode and effects analysis based on a novel fuzzy evidential method. Appl Soft Comput 2017; 57: 672–683.

43. Gong, Qian, et al. Research on fault diagnosis methods for the reactor coolant system of nuclear power plant based on D-S evidence theory. Ann Nucl Energy 2018; 112: 395–399.

44. Dong, Zhang, et al. Combination of evidential sensor reports with distance function and belief entropy in fault diagnosis. Int J Comput Commun 2019; 14(3): 329–343.

45. Xiao. Multi-sensor data fusion based on the belief divergence measure of evidences and the belief entropy. Inform Fusion 2019; 46: 23–32.

46. Chen, Diao, Sang. A novel weighted evidence combination rule based on improved entropy function with a diagnosis application. Int J Distrib Sens N 2019; 15(1): 155014771882399.

47. Yager. Time series smoothing and OWA aggregation. IEEE T Fuzzy Syst 2008; 16(4): 994–1007.

48. Dempster. Upper and lower probabilities induced by a multivalued mapping. Ann Math Stat 1967; 38(2): 325–339.

49. Shafer. A mathematical theory of evidence. Princeton, NJ: Princeton University Press, 1976.

50. Kang, Zhang, Gao, et al. Environmental assessment under uncertainty using Dempster–Shafer theory and z-numbers. J Amb Intel Hum Comp. Epub ahead of print 6 February 2019. DOI: 10.1007/s12652-019-01228-y.

51. Jiang. An evidential dynamical model to predict the interference effect of categorization on decision making. Knowl-Based Syst 2018; 150: 139–149.

52. Chang, Xue, et al. Multiple criteria group decision making with belief distributions and distributed preference relations. Eur J Oper Res 2019; 273(2): 623–633.

53. Xue. Determining attribute weights for multiple attribute decision analysis with discriminating power in belief distributions. Knowl-Based Syst 2018; 143: 127–141.

54. Deng, Han, Dezert, et al. Evidence combination from an evolutionary game theory perspective. IEEE T Cybernetics 2016; 46(9): 2070–2082.

55. Deng, Jiang, Wang. Zero-sum polymatrix games with link uncertainty: a Dempster-Shafer theory solution. Appl Math Comput 2019; 340: 101–112.

56. Zhang, Deng. Evidential identification of influential nodes in network of networks. Chaos Sol Fract 2018; 117: 283–296.

57. Smets, Kennes. The transferable belief model. Artifi Int 1994; 66(2): 191–234.

58. Qian, et al. A new rule to combine dependent bodies of evidence. Soft Comput. Epub ahead of print 1 February 2019. DOI: 10.1007/s00500-019-03804-y.

59. Xinyang, Fuyuan, Yong. An improved distance-based total uncertainty measure in belief function theory. Appl Intell 2017; 46: 898–915.
