Introduction
Direction of arrival (DOA) estimation is widely used in many research fields, such as passive location, sonar array direction finding, seismic and geological resource detection, and mobile communication. Traditional DOA estimation algorithms include the maximum likelihood method, 1 the propagator method, 2 the multiple signal classification (MUSIC) algorithm, 3 the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm, 4 and other algorithms.5,6
Traditional spatial spectrum estimation algorithms usually assume that the source lies in the far field of the array, that is, the source is far enough from the array that the spherical wavefront it radiates can be approximated as a plane wave at the receiving array. When the source is closer to the array, however, the curvature of the wavefront across the array aperture cannot be ignored, and the location of a near-field source must be described jointly by its DOA and range. High-resolution direction-finding algorithms based on the far-field assumption therefore cannot be applied directly to the near-field case. Near-field source parameter estimation has become an active research topic because of its broad range of engineering applications, and many DOA estimation algorithms suitable for near-field scenarios have been proposed.7–9 Among MUSIC-type algorithms, Huang and Barkat 10 extended the traditional far-field MUSIC method to near-field sources; this method requires singular value decomposition of a multidimensional matrix and a 2-dimensional (2-D) spectral peak search, so it is very computationally intensive. Starer and Nehorai 11 proposed an improved MUSIC algorithm based on the path-following method, which converts the 2-D search of the near-field MUSIC algorithm into a 1-dimensional (1-D) search and estimates the two source parameters iteratively. Challa and Shamsunder 12 proposed a method based on high-order cumulants for near-field source localization, which showed superior performance; however, methods of this kind must compute the cumulants. Zhang et al. 13 proposed a reduced-dimension MUSIC algorithm based on splitting the direction matrix; the estimation is converted into the optimization of a reduced-dimension spectrum function, so that the spectral search is carried out only in the angle domain. The traditional MUSIC algorithm is easy to implement and offers high resolution at a high signal-to-noise ratio (SNR), but it is computationally intensive and performs poorly at a low SNR.
In recent years, intelligent algorithms have been developed, such as neural network algorithms,14–16 the support vector machine (SVM) algorithm,17,18 and other artificial intelligence algorithms.19,20 The SVM is a machine learning method developed by Vapnik, 21 built on the Vapnik–Chervonenkis (VC) dimension theory of statistical learning and on the structural risk minimization principle. It has been successfully applied to the design of spread spectrum receivers, speech recognition, image processing, regression problems, and other fields. Machine learning establishes the relationship between the input and output of the model from training data; because array errors are already present in the training data, the learned mapping is robust to them. So far, few studies on applying the SVM to near-field sources have been reported, and compressed sensing and related methods adopt step-by-step estimation because of the computational complexity.
The near-field source signal received by a linear array depends on two parameters, range and angle, so direct joint parameter estimation is computationally expensive. Most subspace-based near-field parameter estimation methods need to decouple range and angle under a specific array arrangement and certain approximations. Support vector regression (SVR) parameter estimation is usually a single-output regression. When multiple near-field source signals impinge on the receiving array simultaneously, SVM regression becomes the regression of multiple 2-D parameters, and the algorithm becomes more complex; this is why research on SVM regression for near-field sources is rarely reported. The estimation accuracy and the generalization performance of the algorithm depend heavily on the training process: improving them requires a very large training data set, which greatly increases the amount of data to be processed during training. A way to reduce the amount of computation is therefore needed. Principal component analysis (PCA) is a natural candidate. As a basic mathematical analysis method, PCA is often used in face recognition, image compression, feature extraction, and other fields. Its advantages are data compression, dimensionality reduction of multidimensional data, and noise reduction, which reduce redundancy and overfitting; it is also simple to apply and has no parameters to tune. In this article, PCA is introduced into the SVM regression algorithm for near-field parameter estimation, where it is used to reduce the number of input features of the SVM regression algorithm and to suppress noise.
This article uses an SVR method to model the signal. In most of the literature, the upper triangular elements of the covariance matrix of the received signal are used as the input to the SVR machine. However, the number of upper triangular elements is large, which leads to a high input feature dimension, long training and testing times, and high algorithm complexity. In this article, a method of near-field acoustic source localization based on PCA and SVR is presented. The upper triangular elements of the covariance matrix of the received data are extracted first; PCA then performs feature extraction and dimension reduction, yielding fewer features than the original set. The dimension-reduced features are used as the input of the SVR, the incident angle and range are used as the outputs for training, and the mapping between input and output is thereby obtained. Simulation results show that this method has high estimation accuracy and practical computational speed, adapts well to low SNR, and predicts more accurately than the traditional method.
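The feature construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `covariance_features`, the sample-covariance estimate, and the choice to stack real and imaginary parts into one real vector are all assumptions made here so that a real-valued regressor can consume the features.

```python
import numpy as np

def covariance_features(snapshots):
    """Map an (M, N) complex snapshot matrix to a real feature vector.

    M is the number of sensors and N the number of snapshots.
    """
    num_sensors, num_snapshots = snapshots.shape
    # Sample covariance matrix R = X X^H / N, an M x M Hermitian matrix.
    cov = snapshots @ snapshots.conj().T / num_snapshots
    # Upper triangle including the diagonal: M (M + 1) / 2 complex entries.
    rows, cols = np.triu_indices(num_sensors)
    upper = cov[rows, cols]
    # Assumption: stack real and imaginary parts to obtain real-valued inputs.
    return np.concatenate([upper.real, upper.imag])

# Hypothetical data: 8 sensors, 200 snapshots of complex noise.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 200)) + 1j * rng.standard_normal((8, 200))
features = covariance_features(x)
print(features.shape)
```

For an eight-element array the upper triangle holds 36 complex entries, so this representation gives 72 real features per sample, which is the dimension that PCA is later asked to reduce.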
Array structure and receiving data model
In this article, a scalar sound pressure sensor array is adopted as the receiving array for the near-field sound sources. The array elements are omnidirectional sensors uniformly distributed along the x-axis. The number of array elements is

Figure 1. Schematic diagram of array structure.
Suppose that
At the sampling time
where
In the near-field case, there exists an approximate relationship
After Fresnel approximation is adopted,
Thus, the received signal on the
where
Equation (4) can be written as matrix
where
The signal is sampled, and the number of snapshots is
where
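As an illustration of the data model in this section, the sketch below generates snapshots for a uniform linear array under the Fresnel approximation. It uses the common textbook form of the near-field phase, with a term linear in the sensor index and a term quadratic in it; the symbols `gamma` and `phi`, the sign convention, the choice of element 0 as phase reference, and the unit-modulus random-phase source signals are assumptions of this sketch and may differ from the article's own equations.

```python
import numpy as np

def near_field_snapshots(doas_deg, ranges, num_sensors, num_snapshots,
                         spacing, wavelength, snr_db, rng):
    """Simulate near-field ULA snapshots (Fresnel approximation).

    Ranges, spacing and wavelength share one length unit.
    Returns the (M, N) data matrix and the (M, K) steering matrix.
    """
    theta = np.deg2rad(np.asarray(doas_deg, dtype=float))
    r = np.asarray(ranges, dtype=float)
    m = np.arange(num_sensors)[:, None]  # sensor index, element 0 is the reference
    # Assumed Fresnel phase terms: linear (angle) and quadratic (angle and range).
    gamma = -2.0 * np.pi * spacing / wavelength * np.sin(theta)
    phi = np.pi * spacing**2 / (wavelength * r) * np.cos(theta) ** 2
    steering = np.exp(1j * (gamma * m + phi * m**2))
    # Unit-power, non-Gaussian sources: constant modulus with random phase.
    num_sources = theta.size
    signals = np.exp(2j * np.pi * rng.random((num_sources, num_snapshots)))
    # Complex white Gaussian noise scaled to the requested per-source SNR.
    noise_std = 10.0 ** (-snr_db / 20.0)
    noise = noise_std * (rng.standard_normal((num_sensors, num_snapshots))
                         + 1j * rng.standard_normal((num_sensors, num_snapshots))) / np.sqrt(2.0)
    return steering @ signals + noise, steering

rng = np.random.default_rng(1)
data, steering = near_field_snapshots(
    doas_deg=[10.0, 30.0], ranges=[2.0, 3.0], num_sensors=8,
    num_snapshots=100, spacing=0.25, wavelength=1.0, snr_db=10.0, rng=rng)
```

The quadratic term `phi` is what carries the range information; letting the range grow makes it vanish and recovers the far-field plane-wave model.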
Principle of PCA-MSVR algorithm
Multi-output support vector regression algorithm
To solve the regression estimation problem for multiple output variables, Pérez-Cruz et al. 22 proposed multi-output support vector regression (MSVR), a generalization of standard SVR. This article gives only a brief introduction to MSVR; for more details, please refer to the literature.23–25
In this article, the upper triangular element of the covariance matrix
The
where
Optimization problems are solved using an iterative process, each depending on the previous solution (
The first-order Taylor expansion of the target function (7) is shown as follows
where
Furthermore, the second-order Taylor expansion is obtained from equation (7)
where
Equations (11) and (12) can be written as matrix
where
The inner product kernel function
where
The IRWLS procedure can be summarized as the following steps
Initialization: Set
Compute the solution to equation (14), and label them as
The search step size
Compute
The convergence proof of the above algorithm is given in Sanchez-Fernandez et al. 24
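The iterative reweighted least squares (IRWLS) idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the article's exact procedure: the bias term is omitted, the loss is the quadratic epsilon-insensitive loss applied to the norm of the multi-output error, the line search simply halves the step, and the names `rbf_kernel`, `msvr_loss`, and `msvr_fit` as well as all hyperparameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def msvr_loss(kmat, y, beta, c, eps):
    """Regularizer plus quadratic epsilon-insensitive loss on the error norms."""
    err = np.linalg.norm(y - kmat @ beta, axis=1)
    excess = np.maximum(err - eps, 0.0)
    loss = 0.5 * np.sum(beta * (kmat @ beta)) + c * np.sum(excess**2)
    return loss, err

def msvr_fit(kmat, y, c=10.0, eps=0.01, max_iter=50, tol=1e-9):
    """Bias-free MSVR trained by IRWLS with a step-halving line search."""
    beta = np.zeros_like(y, dtype=float)
    loss, err = msvr_loss(kmat, y, beta, c, eps)
    for _ in range(max_iter):
        active = err > eps  # samples outside the insensitive zone
        if not active.any():
            break
        # IRWLS weights from the quadratic approximation of the loss.
        weights = 2.0 * c * (err[active] - eps) / err[active]
        system = kmat[np.ix_(active, active)] + np.diag(1.0 / weights)
        target = np.zeros_like(beta)
        target[active] = np.linalg.solve(system, y[active])
        step = target - beta
        eta = 1.0
        new_loss, new_err = msvr_loss(kmat, y, beta + eta * step, c, eps)
        while new_loss >= loss and eta > 1e-6:
            eta *= 0.5  # shrink the step until the loss decreases
            new_loss, new_err = msvr_loss(kmat, y, beta + eta * step, c, eps)
        if new_loss >= loss:
            break
        beta = beta + eta * step
        decrease = loss - new_loss
        loss, err = new_loss, new_err
        if decrease < tol:
            break
    return beta, loss

# Toy two-output regression problem standing in for (angle, range) targets.
x_train = np.linspace(0.0, 1.0, 30)[:, None]
y_train = np.column_stack([np.sin(2 * np.pi * x_train[:, 0]),
                           np.cos(2 * np.pi * x_train[:, 0])])
k_train = rbf_kernel(x_train, x_train, gamma=10.0)
initial_loss, _ = msvr_loss(k_train, y_train, np.zeros_like(y_train), 10.0, 0.01)
beta, final_loss = msvr_fit(k_train, y_train)
max_err = np.abs(k_train @ beta - y_train).max()
```

A new input is predicted as `rbf_kernel(x_new, x_train, gamma) @ beta`, so all outputs share one kernel expansion and one set of support vectors, which is the point of treating the outputs jointly rather than fitting one SVR per output.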
For each new vector
Since the covariance matrix of the signal
PCA algorithm
PCA is a statistical method that replaces a set of correlated original indicators with a new, smaller set of mutually uncorrelated composite indicators obtained by linearly recombining the originals. 26 The main idea is to establish a feature mapping from the high-dimensional space to a low-dimensional space, reducing the original complex features to a few main features that retain as much of the original information as possible and are uncorrelated with each other. These linearly uncorrelated features, obtained by an orthogonal transformation, are called principal components. After PCA reduces the dimension of the upper triangular elements of the signal covariance matrix, redundant information is discarded, which increases the sampling density of the samples. In addition, when the data are affected by noise, the eigenvectors corresponding to the smallest eigenvalues are often related to noise, so discarding them removes noise to a certain extent. 27
In recent years, PCA has been widely used in various fields with good results. In this article, PCA is applied to the dimension reduction and noise reduction of the SVR input features. The principal components of the input feature variables are extracted to reduce data redundancy, imperfections, and overfitting, and thus the dimension and computational complexity of the regression model matrix of
1. The covariance matrix of
2. After the signal, eigenmatrix
3. The covariance matrix
where
4. Find the percentage of eigenvalues, and select a larger eigenvalue, that is
In the above expression,
5. The eigenvectors corresponding to eigenvalues are constructed into a matrix
Data preparation. The covariance matrix of each training sample signal is computed, and its upper triangular elements are extracted to form the feature matrix of the signal.
Dimensionality reduction. PCA is used to reduce the dimension of the signal's feature matrix.
Model training. The MSVR is trained on the dimension-reduced training data to obtain a predictive model of the signal.
Performance estimation. The trained model is applied to the test data, and its estimation performance is evaluated.
With almost the same estimation accuracy, the PCA-MSVR algorithm retains the characteristic information of the signal with as little data as possible. Once trained, the PCA-MSVR model also needs neither eigen-decomposition nor spectral peak search at prediction time, so it can be evaluated quickly.
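The PCA steps of this section can be sketched as below. It is a minimal illustration assuming real-valued feature rows (for example, stacked real and imaginary covariance entries); the function name `pca_fit`, the 95% target, and the synthetic data are assumptions of this sketch rather than the article's implementation.

```python
import numpy as np

def pca_fit(features, target_rate=0.95):
    """Fit PCA on an (n_samples, n_features) real matrix.

    Returns the feature mean, the projection matrix, the contribution rate
    of each component, and the cumulative contribution rate.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / (features.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    rate = eigvals / eigvals.sum()
    cumulative = np.cumsum(rate)
    # Smallest number of components whose cumulative rate reaches the target.
    k = min(int(np.searchsorted(cumulative, target_rate)) + 1, eigvals.size)
    return mean, eigvecs[:, :k], rate, cumulative

# Synthetic data: three strong directions embedded in 20 dimensions plus weak noise.
rng = np.random.default_rng(2)
basis, _ = np.linalg.qr(rng.standard_normal((20, 3)))
latent = rng.standard_normal((200, 3)) * np.array([5.0, 3.0, 2.0])
samples = latent @ basis.T + 0.05 * rng.standard_normal((200, 20))

mean, components, rate, cumulative = pca_fit(samples)
reduced = (samples - mean) @ components  # dimension-reduced features
k = components.shape[1]
```

The same `mean` and `components` fitted on the training set should be reused to project test samples, so that training and test features live in the same reduced space.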
Simulation results and performance analysis
Two near-field, narrow-band, non-Gaussian, stationary sound source signals are incident on the uniform linear sensor array shown in Figure 1. The receiving array is composed of eight array elements. The inter-element spacing is

Figure 2. Contribution rate of each principal component.

Figure 3. Cumulative contribution rate of principal components.

Figure 4. PCA-MSVR DOA estimation.

Figure 5. PCA-MSVR range estimation.

Figure 6. RMSE of DOA at different SNR.

Figure 7. Training time comparison of different algorithms.
Figure 2 shows the contribution rate of each principal component to the signal characteristics. It can be seen from Figure 2 that the first few principal components have the highest contribution rates. Figure 3 shows the cumulative contribution rate of the principal components: the first eight principal components together account for 99%. In this article, the required cumulative contribution rate is set to 95%, which the first eight principal components exceed; therefore, the first eight principal components are selected.
Figures 4 and 5 show the DOA and range estimated by the PCA-MSVR method against the true values for different test sample sizes. It can be seen from Figures 4 and 5 that the estimated angle and range of the near-field sources fit the actual values well, that is, the proposed PCA-MSVR algorithm accurately estimates both parameters. These simulation results show the effectiveness of the proposed algorithm.
Figure 6 plots the root mean square error (RMSE) of the DOA estimates obtained by the two-step MUSIC, back propagation (BP), MSVR, general regression neural network (GRNN), and proposed PCA-MSVR algorithms at various SNR levels. As can be seen from Figure 6, over the tested SNR range (at or above −10 dB), the RMSE of the proposed PCA-MSVR algorithm is nearly the same as that of the MSVR algorithm, and the DOA estimation accuracy of both is higher than that of the BP and two-step MUSIC algorithms. When the SNR is within the range of −5 to 20 dB, the RMSE of the PCA-MSVR algorithm is nearly the same as that of the MSVR and GRNN algorithms, and all three are more accurate than the BP and two-step MUSIC algorithms. Although the accuracy is comparable, the computational load of the proposed PCA-MSVR algorithm is significantly lower than that of the MSVR and GRNN algorithms.
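For reference, an RMSE of the kind plotted in Figure 6 can be computed as in the sketch below. The exact averaging used in the article (over trials, over sources, or both) is not stated in this section, so pooling all trials and sources into one mean is an assumption here, and the numbers in the example are made up for illustration.

```python
import numpy as np

def rmse(estimates, truth):
    """Root mean square error pooled over all trials and all sources."""
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((estimates - truth) ** 2)))

# Hypothetical DOA estimates (degrees) from two trials with two sources.
doa_estimates = [[10.5, 29.5],
                 [9.5, 30.5]]
doa_truth = [10.0, 30.0]
doa_rmse = rmse(doa_estimates, doa_truth)
print(doa_rmse)  # every error is +/-0.5 degrees, so the RMSE is 0.5
```

Repeating this at each SNR level with independent noise realizations gives one curve per algorithm.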
The complexity of the training process is determined by the convergence behavior of the algorithm, so it is difficult to give a closed-form quantitative analysis of the complexity. Instead, the training time of the PCA-MSVR algorithm is compared with that of the MSVR algorithm in Figure 7: the training time of PCA-MSVR is 0.06 s, whereas that of MSVR is about 0.08 s, a reduction of roughly 25%. Compared with the MSVR algorithm, the complexity of the PCA-MSVR algorithm is therefore reduced.
For further explanation, the frequency condition in simulation experiment is reset as
Figures 8 and 9 are scatter diagrams of the DOA and range estimates for two coherent near-field signals. In both figures, the estimated DOA and range values fit the actual ones well, which shows that the proposed algorithm can estimate the parameters of coherent near-field sources.

Figure 8. Scatter diagram of DOA estimation.

Figure 9. Scatter diagram of range estimation.
Conclusion
In this article, an SVR method that jointly estimates the DOA and range of near-field sources is implemented with PCA dimensionality reduction. The estimation performance is maintained without increasing the computational complexity. First, the upper triangular elements of the covariance matrix of the received signal are extracted from the sample data, and their dimension is reduced through PCA. Second, the dimension-reduced matrix is taken as the input feature of the MSVR machine. Finally, the multi-output SVR algorithm is used for modeling to obtain the parameter model for near-field estimation. PCA greatly reduces the dimension of the input features of the SVM, which reduces the complexity of data processing and shortens the training time accordingly. At the same time, noise is suppressed without significant loss of the original data information, so the SNR is improved and, as a result, the estimation accuracy is improved. The proposed method outperforms the BP and GRNN algorithms at low SNR. The method has no special requirements for the array structure and is suitable for both uniform and non-uniform linear arrays. Because the model parameters are obtained through data training, array errors do not affect the accuracy of parameter estimation. Simulation results show that the proposed method based on PCA dimension reduction and multi-output SVR has high estimation accuracy.
