Abstract
Introduction
Over the past decades, array-based beamforming has become an indispensable acoustic source identification technology in industries such as aviation,1–4 rail transport,5 wind turbines,6–9 mining,10 and automobiles.11–13 Conventional beamforming (CB) offers fast measurement, high computational efficiency, and suitability for medium-to-long distance measurement. However, its spatial resolution is insufficient and it does not attenuate spurious sources, so its acoustic source identification accuracy is low.11,14–17 To overcome these disadvantages, many deconvolution algorithms have been developed, such as the deconvolution approach for the mapping of acoustic sources (DAMAS),18,19 non-negative least squares (NNLS),20 and Richardson–Lucy (RL).20 The core idea of these algorithms is to recover the true source distribution by deconvolution, because the output of CB can be approximated as the spatial convolution of the sound source distribution with the point spread function (PSF). However, these algorithms are time-consuming and converge slowly, which limits their practical application. To improve their computational efficiency and convergence, the spatial convolution is transformed into a product in the wavenumber domain using the fast Fourier transform (FFT). The corresponding algorithms are named DAMAS2,21 FFT-NNLS,20 and FFT-RL,20 respectively. All of these Fourier-based deconvolution algorithms assume a shift-invariant PSF, which results in poor identification of acoustic sources located away from the center of the calculation plane. Xenaki et al.22 proposed a coordinate transformation that suppresses the spatial shift variance of the PSF and thereby expands the effective angular range of source identification for Fourier-based deconvolution algorithms.
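The step from spatial convolution to a wavenumber-domain product can be illustrated with a short numerical check. The following is a minimal sketch, assuming a shift-invariant PSF, periodic boundaries, and a PSF stored with its peak at index [0, 0]; the Gaussian PSF, the grid size, and the function names are illustrative assumptions, not quantities from this paper.

```python
import numpy as np

def circular_convolve_direct(q, psf):
    """Direct 2-D circular convolution: O(N^4) work for an N x N grid."""
    out = np.zeros_like(q, dtype=float)
    for i in range(q.shape[0]):
        for j in range(q.shape[1]):
            if q[i, j] != 0.0:
                # Each source contributes a shifted copy of the PSF
                # (np.roll wraps around, i.e. periodic boundaries).
                out += q[i, j] * np.roll(psf, shift=(i, j), axis=(0, 1))
    return out

def circular_convolve_fft(q, psf):
    """Same convolution as a pointwise product in the wavenumber domain."""
    return np.real(np.fft.ifft2(np.fft.fft2(q) * np.fft.fft2(psf)))

# Illustrative (hypothetical) setup: Gaussian PSF centred at index [0, 0].
n = 16
offsets = np.fft.fftfreq(n, d=1.0 / n)  # signed pixel offsets 0, 1, ..., -1
dx, dy = np.meshgrid(offsets, offsets, indexing="ij")
psf = np.exp(-(dx**2 + dy**2) / 4.0)

q = np.zeros((n, n))
q[4, 4] = 1.0    # two point sources of different strength
q[4, 12] = 0.5

cb_direct = circular_convolve_direct(q, psf)
cb_fft = circular_convolve_fft(q, psf)
print(np.allclose(cb_direct, cb_fft))
```

The FFT route costs on the order of N² log N operations per convolution instead of N⁴, which is where the Fourier-based variants gain their speed; the price is the shift-invariance assumption discussed above.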
Ehrenfried and Koop 20 and Chu and Yang 23 compared the computational efficiency and acoustic source identification accuracy of several Fourier-based deconvolution algorithms. The results show that FFT-NNLS has excellent comprehensive performance.
Recently, Lylloff et al.24 proposed a Fourier-based fast iterative shrinkage thresholding algorithm (FFT-FISTA). Compared with FFT-NNLS, FFT-FISTA offers higher computational efficiency and faster convergence. As with other Fourier-based deconvolution algorithms, using an irregular focus point grid allows FFT-FISTA to accurately identify sound sources far from the center of the calculation plane.25 FFT-FISTA is derived from FISTA, which was originally proposed to solve linear inverse problems in image processing.26 Bhotto et al.27 introduced a positive definite weight matrix into the gradient-based minimization step of FISTA and proposed an improved FISTA (IFISTA) that enhances both the convergence and the image reconstruction accuracy of FISTA. Inspired by Lylloff et al.,24 Chu et al.,25 Beck and Teboulle,26 and Bhotto et al.,27 this paper proposes a Fourier-based IFISTA (FFT-IFISTA) for acoustic source identification to further improve the computational efficiency and convergence of FFT-FISTA.
In addition, the irregular focus grid is applied to achieve better identification performance. Simulation and experimental results indicate that the proposed FFT-IFISTA achieves better acoustic source identification, higher computational efficiency, and better convergence than FFT-NNLS and FFT-FISTA.
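The iteration behind a Fourier-based FISTA with a non-negativity constraint can be sketched as follows. This is a minimal sketch, not the algorithm as defined by this paper's equations: the function name `fft_fista` and the parameter `weight_order` are hypothetical, and the optional weight used here (a polynomial in the PSF power spectrum that reduces to plain FISTA when `weight_order=1`) is only an illustrative stand-in for a positive definite weight matrix. The weight matrix actually used by IFISTA and FFT-IFISTA may differ.

```python
import numpy as np

def fft_fista(b, psf, n_iter=200, weight_order=1):
    """Non-negative deconvolution of map b with a shift-invariant PSF.

    Assumes periodic boundaries and a PSF whose peak sits at index [0, 0].
    weight_order is a hypothetical knob: 1 gives the plain FISTA step.
    """
    H = np.fft.fft2(psf)
    B = np.fft.fft2(b)
    power = np.abs(H) ** 2
    lipschitz = power.max()  # largest eigenvalue of A^T A
    # ASSUMPTION: illustrative positive definite weight, applied as a
    # wavenumber-domain multiplier; it is precomputed once, so its order
    # does not change the cost per iteration.
    weight = sum((1.0 - power / lipschitz) ** i for i in range(weight_order))

    q = np.zeros_like(b, dtype=float)
    y = q.copy()
    t = 1.0
    for _ in range(n_iter):
        # Gradient of 0.5 * ||b - psf (*) y||^2, evaluated with FFTs.
        grad_hat = np.conj(H) * (H * np.fft.fft2(y) - B)
        step = np.real(np.fft.ifft2(weight * grad_hat)) / lipschitz
        q_new = np.maximum(y - step, 0.0)  # projection onto q >= 0
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = q_new + ((t - 1.0) / t_new) * (q_new - q)  # momentum step
        q, t = q_new, t_new
    return q

# Illustrative (hypothetical) test problem: two unit sources, Gaussian PSF.
n = 32
offsets = np.fft.fftfreq(n, d=1.0 / n)
dx, dy = np.meshgrid(offsets, offsets, indexing="ij")
psf = np.exp(-(dx**2 + dy**2) / (2.0 * 2.0**2))
q_true = np.zeros((n, n))
q_true[8, 8] = 1.0
q_true[8, 24] = 1.0
H = np.fft.fft2(psf)
b = np.real(np.fft.ifft2(H * np.fft.fft2(q_true)))

q_plain = fft_fista(b, psf, n_iter=200, weight_order=1)
q_weighted = fft_fista(b, psf, n_iter=200, weight_order=3)
residual = np.linalg.norm(b - np.real(np.fft.ifft2(H * np.fft.fft2(q_plain))))
```

Because the weight is a fixed multiplier in the wavenumber domain, each iteration still costs two forward FFTs and one inverse FFT regardless of `weight_order`.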
Methods
Figure 1 shows the layout of beamforming measurement. Vector

Layout of beamforming measurement.
Furthermore, the error function
To obtain
Due to the introduction of positive definite weight matrix
For PSF matrix
Matrix
Substituting equation (9) into equation (6)
Substituting equations (7) and (15) into equation (5)
Under the periodic boundary condition, the Fourier transform of equation (17) can be expressed as
Hence, FFT-IFISTA converts equation (3) into a Fourier-based minimization equation
Results and discussions
Simulations
As shown in Figure 1, the 36-channel Brüel & Kjær sector microphone array with 0.65 m diameter is used in the simulation. Two monopole sources are located at (−0.2, 0.2, 1) m and (0.2, 0.2, 1) m, with a sound pressure contribution of 100 dB. As shown in Figure 2(a), the calculation plane is

Focus grid. (a) Regular mesh planar and (b) irregular mesh planar.
Figure 3(a) and (b) shows the results of CB at 2000 and 6000 Hz, respectively. The outputs are normalized based on the maximum value and the display dynamic range of the image is 15 dB. The cross “

CB results at different frequencies. (a) 2000 Hz and (b) 6000 Hz.
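The normalization and display dynamic range used for these maps can be sketched as below. This is a minimal sketch that assumes the map holds power-like quantities (so levels use 10 log10); the function name `to_db_map` and the sample values are hypothetical.

```python
import numpy as np

def to_db_map(power_map, dynamic_range_db=15.0):
    """Normalize a map to its maximum and hide values below the range."""
    power_map = np.asarray(power_map, dtype=float)
    peak = power_map.max()
    with np.errstate(divide="ignore", invalid="ignore"):
        level = 10.0 * np.log10(power_map / peak)  # 0 dB at the peak
    # Values more than dynamic_range_db below the peak are not displayed.
    return np.where(level >= -dynamic_range_db, level, np.nan)

# Hypothetical map: a main source and a sidelobe 17 dB below it.
sample = np.array([1.0, 10.0 ** (-17.0 / 10.0), 0.0])
shown_15 = to_db_map(sample, dynamic_range_db=15.0)
shown_40 = to_db_map(sample, dynamic_range_db=40.0)
```

With a 15 dB range the 17 dB sidelobe is hidden, while a 40 dB range reveals it, so the absence of a spurious source in a map depends on the chosen display range.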
Figure 4 shows the imaging results of FFT-NNLS, FFT-FISTA, and FFT-IFISTA with weight coefficients from 1 to 6 at 2000 Hz after 1000 iterations. Compared with Figure 3(a), each deconvolution algorithm effectively attenuates the sidelobes, narrows the mainlobe, and separates the fused acoustic sources. The identified positions of the acoustic sources are consistent with the true positions. However, these simulations are all based on the regular focus grid shown in Figure 2(a), for which the shift variance of the PSF is significant, and this degrades the identification performance of the Fourier-based deconvolution algorithms. That is, the PSF used in the deconvolution differs from the theoretical PSF at the acoustic source position, so the identified sources have irregular shapes and spurious sources appear near the mainlobe. In short, the identification performance is not good enough.

Comparison of different deconvolution algorithms with regular focus grid at 2000 Hz. (a) FFT-NNLS; (b) FFT-FISTA; (c) FFT-IFISTA,
To suppress the shift variance of the PSF, the irregular focus grid introduced by Xenaki et al.22 and Chu et al.25 and shown in Figure 2(b) is applied. For ease of comparison, the imaging region of the irregular focus grid is the same size as that of the regular focus grid.
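One common way to build such an irregular grid is to sample the direction cosines uniformly and project them onto the calculation plane. The sketch below assumes that construction; the function name `irregular_focus_grid`, the half-width of 0.5 m, and the number of points are hypothetical, and the exact grid used in this paper may differ.

```python
import numpy as np

def irregular_focus_grid(half_width, distance, n_points):
    """Focus points on the plane z = distance, uniform in direction cosines.

    u = x / r and v = y / r with r = sqrt(x^2 + y^2 + distance^2).
    """
    if half_width >= distance:
        raise ValueError("this sketch requires half_width < distance")
    # Largest direction cosine so that the grid edge on each axis matches
    # the edge of a regular grid with the same half-width.
    u_max = half_width / np.hypot(half_width, distance)
    u = np.linspace(-u_max, u_max, n_points)
    uu, vv = np.meshgrid(u, u, indexing="ij")
    cos_theta = np.sqrt(1.0 - uu**2 - vv**2)
    x = distance * uu / cos_theta  # project back onto the plane
    y = distance * vv / cos_theta
    return x, y

# Hypothetical plane: 1 m x 1 m region at 1 m from the array, 41 x 41 points.
x, y = irregular_focus_grid(half_width=0.5, distance=1.0, n_points=41)
center_spacing = x[21, 20] - x[20, 20]
edge_spacing = x[40, 20] - x[39, 20]
```

The resulting points are denser near the center of the plane and sparser toward the edges. Note that in this sketch only the edges on the coordinate axes coincide with the regular grid; the corners extend slightly beyond it.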
Figure 5 shows the identification results of the deconvolution algorithms with the irregular focus grid at 2000 Hz after 1000 iterations. Compared with the results for the regular focus grid, the deconvolution algorithms with the irregular focus grid narrow the mainlobe, improve the spatial resolution, eliminate the spurious sources, and produce rounder source shapes. Figure 5 shows that FFT-NNLS has the widest mainlobe. The mainlobe width of FFT-IFISTA with a weight coefficient of 1 is similar to that of FFT-FISTA. As the weight coefficient increases, the mainlobe width of FFT-IFISTA gradually decreases. This means that increasing the weight coefficient makes the identification performance of FFT-IFISTA better than that of the other two algorithms. However, for the array used in this paper, the further decrease in mainlobe width is not obvious once the weight coefficient exceeds 4.

Comparison of different deconvolution algorithms with irregular focus grid at 2000 Hz. (a) FFT-NNLS; (b) FFT-FISTA; (c) FFT-IFISTA,
Figure 6 shows the simulated acoustic source maps of the deconvolution algorithms with the irregular focus grid at 6000 Hz after 1000 iterations. Compared with Figure 3(b), the deconvolution algorithms effectively attenuate the spurious sources of CB. After 1000 iterations, a few spurious sources remain in the results of FFT-NNLS and FFT-FISTA, whereas FFT-IFISTA with weight coefficients of 1 and 2 eliminates the spurious sources and achieves better acoustic source identification. When the weight coefficient is 3, spurious sources appear in Figure 6(e). Figure 6(e) to (h) shows that the larger the weight coefficient, the more spurious sources appear, which degrades the acoustic source identification of FFT-IFISTA. Therefore, a smaller weight coefficient is recommended to avoid spurious sources at high frequency. Based on the identification results at different frequencies, the weight coefficient for the array used in this paper should not be larger than 3.

Comparison of different deconvolution algorithms with irregular focus grid at 6000 Hz. (a) FFT-NNLS; (b) FFT-FISTA; (c) FFT-IFISTA,
Figures 7 and 8 show the results of the above deconvolution algorithms with a signal-to-noise ratio (SNR) of 10 dB at 2000 and 6000 Hz, respectively, after 1000 iterations. Compared with Figure 5, the acoustic sources in Figure 7 are deformed by the noise, but they are still accurately identified by these deconvolution algorithms. Comparing Figure 8(a) and (b) with Figure 6(a) and (b), FFT-NNLS and FFT-FISTA produce more spurious sources. A similar increase occurs for FFT-IFISTA in Figure 8(f) to (h), where the weight coefficient is greater than 3. However, compared with Figure 6(c) to (e), no additional spurious sources appear in Figure 8(c) to (e), which indicates that FFT-IFISTA effectively suppresses noise interference when the weight coefficient is not greater than 3. To sum up, FFT-IFISTA with a smaller weight coefficient is more robust to noise than FFT-NNLS and FFT-FISTA.
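A noise condition of this kind can be emulated by adding complex Gaussian noise to the simulated microphone pressures. The sketch below assumes that SNR is defined as the ratio of average signal power to average noise power over all channels; the function name `add_noise`, the snapshot count, and the random test signal are hypothetical, and the paper's own noise model is not specified in the text.

```python
import numpy as np

def add_noise(pressure, snr_db, rng):
    """Add circular complex Gaussian noise at the requested SNR (in dB)."""
    signal_power = np.mean(np.abs(pressure) ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    noise = rng.standard_normal(pressure.shape) \
        + 1j * rng.standard_normal(pressure.shape)
    # Real and imaginary parts each carry half of the noise power.
    noise *= np.sqrt(noise_power / 2.0)
    return pressure + noise

# Hypothetical data: 36 channels, 2000 snapshots of unit-amplitude phasors.
rng = np.random.default_rng(0)
phase = rng.uniform(0.0, 2.0 * np.pi, size=(36, 2000))
clean = np.exp(1j * phase)
noisy = add_noise(clean, snr_db=10.0, rng=rng)

measured_snr_db = 10.0 * np.log10(
    np.mean(np.abs(clean) ** 2) / np.mean(np.abs(noisy - clean) ** 2)
)
```

The cross-spectral matrix computed from `noisy` would then replace the noise-free one as input to the beamformer and the deconvolution.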

Comparison of different deconvolution algorithms with irregular focus grid and 10 dB SNR at 2000 Hz. (a) FFT-NNLS; (b) FFT-FISTA; (c) FFT-IFISTA,

Comparison of different deconvolution algorithms with irregular focus grid and 10 dB SNR at 6000 Hz. (a) FFT-NNLS; (b) FFT-FISTA; (c) FFT-IFISTA,
To analyze the causes of sidelobes in FFT-IFISTA when the weight coefficient is large, a fixed spurious sound source is selected in the figure and the sidelobe level comparison of FFT-IFISTA with the weight coefficients from 3 to 6 at 6000 Hz is given in Figure 9. The display dynamic range of the image in this figure is 40 dB. Comparing Figure 9(a) with Figure 6(e), the selected spurious source in Figure 9(a) is −17.11 dB, which does not appear in Figure 6(e) of 15 dB display dynamic range. With the increase of the weight coefficient, the intensity of the selected spurious source in Figure 9(a) to (d) increases from −17.11 to −8.48 dB gradually. The reason is that due to the increase of the weight coefficient

Sidelobe level comparison of FFT-IFISTA with different weight coefficient at 6000 Hz. (a) FFT-IFISTA,
Furthermore, the convergence and computational efficiency of the above deconvolution algorithms are compared. The convergence can be quantified by the standard deviation function between the theoretical acoustic source contribution and the reconstructed one obtained by iteration;20 the standard deviation function is
Figure 10 shows the convergence performance of FFT-NNLS, FFT-FISTA, and FFT-IFISTA at 2000 Hz. The black dash-dotted line and the blue dotted line represent the convergence curves of FFT-NNLS and FFT-FISTA, respectively. The pink dash-dotted line, black solid line, and red solid line represent the convergence curves of FFT-IFISTA with weight coefficients of 1, 3, and 5, respectively. In Figure 10(a), the standard deviation of FFT-NNLS descends fastest at the beginning of the iteration and stabilizes after about 100 iterations, while the curves of FFT-IFISTA and FFT-FISTA are still decreasing. Comparing FFT-FISTA with FFT-IFISTA at different weight coefficients shows that FFT-IFISTA with a weight coefficient of 1 descends slightly faster than FFT-FISTA, and that the larger the weight coefficient, the faster the standard deviation of FFT-IFISTA decreases. Comparing the stable standard deviations of the three algorithms shows that the larger the weight coefficient of FFT-IFISTA, the fewer iterations are needed to reach a stable standard deviation and the smaller that stable value is. It can therefore be concluded that FFT-IFISTA converges best, followed by FFT-FISTA and then FFT-NNLS, and that increasing the weight coefficient appropriately improves the convergence of FFT-IFISTA. Figure 10(b) shows a cross-section of the deconvolution map through the mainlobes after 1000 iterations. The source power estimated by FFT-NNLS is the smallest, followed by that of FFT-FISTA. The larger the weight coefficient, the more accurate the source strength estimate of FFT-IFISTA.
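A convergence curve of the kind described above can be produced by evaluating a deviation measure between the true and reconstructed source maps after each iteration. The sketch below uses a normalized absolute deviation as a stand-in; it is an assumption and not necessarily the standard deviation function defined in this paper, and the name `relative_deviation` is hypothetical.

```python
import numpy as np

def relative_deviation(q_true, q_est):
    """Sum of absolute errors, normalized by the total true source strength."""
    q_true = np.asarray(q_true, dtype=float)
    q_est = np.asarray(q_est, dtype=float)
    return np.sum(np.abs(q_est - q_true)) / np.sum(np.abs(q_true))

# Hypothetical 1-D example with two unit sources.
q_true = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
blurred = np.array([0.2, 0.6, 0.4, 0.6, 0.2])  # energy smeared by the PSF
sharper = np.array([0.0, 0.9, 0.1, 0.9, 0.1])  # closer to the true map

dev_blurred = relative_deviation(q_true, blurred)
dev_sharper = relative_deviation(q_true, sharper)
```

A measure of this type is zero for a perfect reconstruction and decreases as the deconvolved map concentrates energy at the true source positions, which is the behavior the convergence curves track.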

Convergence comparison of three deconvolution algorithms. (a) The relation between standard deviation and iterative times and (b) cross-section of maps at y = 0.2 after 1000 iterations. FFT: fast Fourier transform; FFT-FISTA: Fourier-based fast iterative shrinkage thresholding algorithm; FFT-IFISTA: Fourier-based improved fast iterative shrinkage thresholding algorithm; NNLS: non-negative least square.
To compare the computational efficiency of FFT-NNLS, FFT-FISTA, and FFT-IFISTA, the computation time of the three algorithms for different numbers of iterations is given in Figure 11. Figure 11(a) shows the relation between computation time and the number of iterations. For FFT-IFISTA, the value of the weight coefficient does not affect the computational efficiency, so the figure only gives the result for a weight coefficient of 6. Within the first five iterations, the computational efficiency of FFT-IFISTA is slightly lower than that of FFT-FISTA; after that, it is higher than that of the other two algorithms. At 5000 iterations, the computation time is 3.11 s for FFT-IFISTA, 8.35 s for FFT-FISTA, and 12.71 s for FFT-NNLS. Figure 11(b) shows the relation between standard deviation and time. Because the standard deviation of FFT-NNLS converges to a large value and then remains stable, as can be seen from Figure 10(a), Figure 11(b) does not include FFT-NNLS. Figure 11(b) shows that FFT-IFISTA takes less computation time than FFT-FISTA to reach the same standard deviation, and the larger the weight coefficient, the less time it takes. The relation between standard deviation and computation time thus shows more intuitively that FFT-IFISTA converges faster than FFT-FISTA.

Computational efficiency comparison of three deconvolution algorithms. (a) The relation between time consuming and iterative times and (b) the relation between standard deviation and time. FFT: fast Fourier transform; FFT-FISTA: Fourier-based fast iterative shrinkage thresholding algorithm; FFT-IFISTA: Fourier-based improved fast iterative shrinkage thresholding algorithm; NNLS: non-negative least square.
In summary, FFT-IFISTA has the highest computation efficiency and it takes less time to achieve a satisfactory acoustic source identification performance than FFT-NNLS and FFT-FISTA.
Experiments
Figure 12 shows the experimental configuration. The 36-channel Brüel & Kjær sector microphone array with 0.65 m diameter is used in the experiment and the distance between array and loudspeaker source plane is 1 m. The loudspeakers are located near (−0.2, 0.2, 1) m and (0.2, 0.2, 1) m.

Experimental configuration.
Acoustic source mapping of CB at 2000 and 6000 Hz is shown in Figure 13(a) and (b), respectively. Source mainlobes fuse together due to the poor spatial resolution at 2000 Hz, and many spurious sources appear in the mapping due to the high sidelobe level at 6000 Hz. The experimental results are consistent with the simulation results.

CB experimental results. (a) 2000 Hz and (b) 6000 Hz.
Figures 14 and 15 show the acoustic source maps of the three deconvolution algorithms using an irregular focus grid at 2000 and 6000 Hz, respectively. The number of iterations is 1000. Following the recommendation from the simulations, weight coefficients of 1 and 3 are used for FFT-IFISTA in the experimental results below.

Comparison of different deconvolution algorithms with irregular focus grid at 2000 Hz. (a) FFT-NNLS; (b) FFT-FISTA; (c) FFT-IFISTA,

Comparison of different deconvolution algorithms with irregular focus grid at 6000 Hz. (a) FFT-NNLS; (b) FFT-FISTA; (c) FFT-IFISTA,
Comparing Figures 14 and 15 with Figure 13, all three deconvolution algorithms significantly narrow the mainlobe, improve the spatial resolution, and attenuate the spurious sources. A comparison of the maps in Figure 14 shows that the mainlobe of FFT-IFISTA with a weight coefficient of 3 is the narrowest, followed by FFT-IFISTA with a weight coefficient of 1, then FFT-FISTA, and finally FFT-NNLS. A comparison of the maps in Figure 15 shows that FFT-IFISTA produces no spurious sources when the weight coefficient is 1. When the weight coefficient is 3, a few spurious sources appear, resulting in slightly poorer acoustic source identification, which is nevertheless still better than that of the other two algorithms.
To sum up, when the weight coefficient does not exceed 3, the identification performance of FFT-IFISTA is better than that of FFT-NNLS and FFT-FISTA, which is consistent with the simulation results.
Conclusions
This paper proposes an FFT-IFISTA-based deconvolution algorithm for acoustic source identification. As with other deconvolution algorithms such as FFT-NNLS and FFT-FISTA, FFT-IFISTA achieves better acoustic source identification performance with the irregular focus grid than with the regular focus grid. Compared with FFT-NNLS and FFT-FISTA, FFT-IFISTA has higher computational efficiency and better convergence. With an appropriately selected weight coefficient, FFT-IFISTA achieves a narrow mainlobe and strong attenuation of spurious sources. Taking all aspects of performance into account, the recommended weight coefficient for the array used in this paper is 3.
