Abstract
Introduction
Traditional imaging devices can capture only a limited band of the spectral emission or radiation of the target scene, which may contain unpredictable objects or materials. Compound imaging and advanced signal processing are widely studied to enhance imaging capability and adjustability. Heterogeneous image fusion is a typical image-enhancing technique that combines at least two distinct source or sensor images into a single composite image. The fused image is more informative, although it may be degraded under low signal-to-noise ratio conditions. Hence, many applications benefit from the computational theory of multi-source image fusion, especially social security surveillance, network security monitoring, network video analysis, and military observation.1–3
Because objects emit heat radiation, infrared rays can be used to detect hidden heat sources in complex environments. To capture the heat radiation of specific targets, the infrared image sensor is designed to be sensitive only to wavelengths between 780 and 2000 nm, at the cost of losing the visible-light information. By contrast, visible images contain richer texture of the surrounding environment, which may also be important for effective target observation.4,5 In this article, we discuss the visible and infrared image fusion problem via contrast and directional features optimization (CDFO).
Image fusion can usually be divided into three levels: pixel level, feature level, and decision level. Traditional feature-level fusion methods include the weight-averaging method, the Bayesian estimation method, and the cluster analysis method. 6 These methods combine the features extracted from multi-source information on different sensors (including shape, edge, region, contour, texture, and corner). Feature extraction and comprehensive analysis can be processed well even across a number of different sensor images; however, hierarchical information fusion at different resolutions remains an open problem. Decision-level fusion methods 7 find feature correspondences via classification and identification of each image, after which a further fusion process is carried out by global optimization. Simple decision-level fusion cannot produce a good result with multi-resolution images because of the non-uniform resolution problem. In recent years, pixel level–based fusion methods8,9 have shown more advantages on cases with complex images. For example, Kannan and Perumal 10 proposed a pixel-level image fusion method based on the discrete wavelet transform (DWT), in which an image is decomposed into a sequence of images at different spatial resolutions; the fusion process can then be applied on different image scales. Li and Dong 11 and He et al. 12 presented a contrast pyramid (CP) fusion method: the original image is resolved into a series of decomposition layers with different resolutions and frequencies, and the fusion process is carried out separately on the components of each spatial frequency layer. To obtain better direction information, Jin et al.13,14 suggested fusing satellite infrared images with visible images by introducing directional filter banks. This method extracts the directional sub-graphs of the infrared image, so that the directional information is well preserved during the fusion process.
Signal processing in the human visual system is performed on different channels at different scales, 15 so we introduce the CP for pre-processing. To fully exploit direction information and the edge features of high-frequency information, directional filter banks are used to extract more directional features. First, the CP is used to decompose the image and adjust the contrast at different scales. The input multi-channel images are decomposed into different frequency bands, and the multi-resolution CP sequences form a pyramid-like structure; each scale of the CP corresponds to different spatial frequency characteristics. Second, we use directional filter banks to process the decomposed images on the basis of the CP. A filter bank provides an efficient analysis and synthesis of discrete signal structure, and a specific filter bank structure can yield a continuous multi-resolution basis. Hence, we can obtain abundant direction information and multi-scale information. During the fusion process, traditional methods determine the fusion coefficient weights according to human experience, which often fails to achieve a good fusion effect. Therefore, the whale optimization algorithm (WOA) is introduced to search for the optimal fusion coefficients adaptively. Compared with the majority of traditional methods, the proposed CDFO-based method provides an accurate image fusion result that captures the most significant features from the multiple sensor images. The overall technical route of image fusion based on CDFO is shown in Figure 1.

Overall technical route of CDFO image fusion.
Related work
Simple pixel-level fusion can be treated as a linear summation of pixels from the distinct source images, 16 as shown in equation (1)
where

Pixel-level image fusion mechanism.
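As an illustration of the linear pixel-level combination described above, the following minimal sketch fuses two same-size grayscale images by a weighted sum. The function name and the fixed equal weights are illustrative assumptions, not taken from the article, which later optimizes such coefficients with the WOA.

```python
import numpy as np

def pixel_fusion(img_a, img_b, w_a=0.5, w_b=0.5):
    """Weighted pixel-level fusion: F(x, y) = w_a * A(x, y) + w_b * B(x, y).

    img_a, img_b: same-size grayscale images.
    w_a, w_b: fusion weights (illustrative fixed values here).
    """
    return w_a * np.asarray(img_a, dtype=float) + w_b * np.asarray(img_b, dtype=float)

# Toy 2x2 "visible" and "infrared" patches
vis = np.array([[100.0, 120.0], [80.0, 90.0]])
ir = np.array([[200.0, 60.0], [40.0, 150.0]])
fused = pixel_fusion(vis, ir)  # equal weights -> per-pixel average
```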
CP decomposition
In order to obtain multi-resolution-level image information, the original visible and infrared image should be pre-processed. The principles of CP decomposition are as follows.
Establishing Gaussian decomposition
CP is based on Gaussian pyramid. First, we use
where 0 <
The above constraints ensure the low-pass property and the smoothness of the image after it is reduced and expanded, so that no seam effect occurs. Thus, we construct
Building the CP
In equation (2), we can calculate
The window function has low-pass filter properties, thus
where
Rebuilding the original image from the CP
Transforming equation (4), we can get a reconstruction equation of CP
According to equation (5), we can iteratively start from the top layer,
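To make the decomposition and reconstruction steps concrete, here is a minimal sketch of a ratio-of-low-pass (contrast) pyramid. The 2x2 block-average REDUCE and nearest-neighbor EXPAND operators are simplifying stand-ins for the window-function filtering defined above, so the exact layer values differ from the article's; the build/reconstruct structure is what matters.

```python
import numpy as np

def reduce_(img):
    """One REDUCE step: 2x2 block average then 2:1 downsampling
    (a stand-in for low-pass filtering with the window function)."""
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def expand(img):
    """One EXPAND step: nearest-neighbor 1:2 upsampling."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def build_contrast_pyramid(img, levels):
    """Contrast layers C_l = G_l / EXPAND(G_{l+1}) - 1; top Gaussian layer kept as-is."""
    g = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        g.append(reduce_(g[-1]))
    c = [g[l] / expand(g[l + 1]) - 1.0 for l in range(levels)]
    c.append(g[-1])  # top of the pyramid
    return c

def reconstruct(c):
    """Invert the decomposition iteratively from the top layer:
    G_l = (C_l + 1) * EXPAND(G_{l+1})."""
    g = c[-1]
    for layer in reversed(c[:-1]):
        g = (layer + 1.0) * expand(g)
    return g

img = np.arange(16, dtype=float).reshape(4, 4) + 1.0  # positive values avoid division by zero
cp = build_contrast_pyramid(img, levels=2)
rec = reconstruct(cp)
```

With these matched REDUCE/EXPAND operators the reconstruction is exact, mirroring the iterative top-down recovery described by equation (5).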
Construction of directional features
Considering the multi-sensor image characteristics, we use directional filter banks19,20 to capture the directional features. Traditional directional filter banks lack multi-scale analysis capability, so in this article, we design a new directional filter bank suitable for multi-sensor visible and infrared images.
It consists of a two-channel diamond filter bank and a parallelogram filter bank that are cascaded through the tree structure and through the

Directional filter bank spectral segmentation: (a) spectrum division diagram and (b) decomposition part.
In Figure 3(b),
Figure 3(b) shows the three-stage cascaded decomposition of the directional filter bank (the reconstructed part is the inverse of the decomposed part). The wedge-shaped directional spectrum partitioning makes it suitable for extracting the edge contours with spatial localization in the image.
The WOA
Traditional fusion uses the maximum-value fusion rule or experience to determine the fusion coefficients, so the fusion effect can deteriorate greatly under unreasonable coefficient selection. In this article, we adopt a heuristic intelligent optimization algorithm to determine the optimal fusion coefficients.
Some brain regions of the whale are similar to brain structures of human beings. 21 Hunting is one of the most characteristic behaviors of the whale. By studying the hunting behaviors of whale groups in the natural environment, researchers have proposed a meta-heuristic optimization algorithm called the WOA. Mirjalili and Lewis 22 proposed this method by simulating the hierarchy of the whale population and their hunting behaviors.
The proposed CDFO method
Contrast and directional features
The traditional CP decomposition has good physical meaning but ignores direction information, which affects the fusion effect. The directional filter bank adds direction information on top of the CP decomposition; it not only maintains the physical meaning of contrast but also detects direction information well, providing more feature information for the fused image.
In directional filter banks, we define a sampling of multi-dimensional space
First, we give a proposition:
In this article, we use the decomposition of two-dimensional, two-channel ladder structure filter and quincunx sampling network. Two matrices are used to represent the quincunx sub-network
In directional filter banks, the following four basic unimodular matrices are used to achieve the invariance of rotation operation
Here, we use Smith’s decomposition, which can diagonalize any integer matrix
where
In the first two layers of directional filter banks, quincunx filter banks (QFBs) will be used on each layer.
We can simplify the total sampling matrix of the QFB with the quincunx matrix of Smith’s form. We can calculate the sampling matrix using the following equation
According to the proposition presented above, we can get the following equation
The above equation shows the sampling network of QFB, which uses the
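As a small numerical illustration of this sampling structure, the sketch below uses one common choice of quincunx matrix (which may differ from the article's convention) to verify that cascading two quincunx stages yields ordinary 2x downsampling in each dimension, and builds the quincunx keep-mask on a pixel grid.

```python
import numpy as np

# One common quincunx sampling matrix; |det Q| = 2, so each
# quincunx stage halves the number of retained samples.
Q = np.array([[1, 1],
              [1, -1]])

# Two cascaded quincunx stages: total sampling matrix Q @ Q = 2I,
# i.e. ordinary downsampling by 2 in each dimension.
total = Q @ Q

def quincunx_keep_mask(h, w):
    """Boolean mask of the quincunx lattice: keep pixels where x + y is even."""
    ys, xs = np.mgrid[0:h, 0:w]
    return (xs + ys) % 2 == 0

mask = quincunx_keep_mask(4, 4)  # checkerboard pattern, half the pixels kept
```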
Fusion coefficient optimization with the WOA
The traditional fusion coefficient is selected through human experience. In this article, the WOA is introduced to optimize the fusion coefficients adaptively, so that the image contrast information and direction information are maximally retained and the fusion effect is clearly improved.
The WOA builds a global search mechanism by simulating the hunting behavior of whales, in particular the bubble-net attacking mechanism of humpback whales. The details of the WOA are described as follows.
Encircling prey
Before hunting, humpback whales identify the location of prey and then encircle it. In the search space, the optimal position of the target prey is unknown at the beginning, so the WOA assumes that the current best solution is the target prey or close to the optimal position. Once the best search agent is ascertained, the other search agents update their own positions relative to it. Mirjalili and Lewis 22 described this behavior with the following mathematical model
where
here, the value of
The two-dimensional feedback behavior is shown in Figure 4. In Figure 4,

Predicted possible locations of position vectors in 2D space.
The bubble-net searching mechanism has two patterns: the first is the shrinking encircling pattern and the second is the spiral position update.
where

Shrinking encircling mechanism.
In the process of optimization, we assume that if
where

Mechanism of bubble-net searching with spiral pattern.
Search for prey (exploration phase)
Unlike the exploitation phase, in the exploration phase the search agent updates its location according to a randomly selected agent; thus, the random value of
where
On the whole, the WOA is a global optimization algorithm, because it has both exploration and exploitation capabilities. The pseudo code of the WOA is given as follows.
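A minimal sketch of the WOA as described above (encircling, shrinking encircling, spiral update, and random-agent exploration). The population size, iteration count, bounds, and spiral constant b are illustrative assumptions, not the article's settings, and the objective here is a toy sphere function rather than the fusion-coefficient objective.

```python
import numpy as np

def woa(objective, dim, n_agents=20, iters=50, lb=-10.0, ub=10.0, seed=0):
    """Minimal WOA sketch: minimize `objective` over a box-bounded search space.

    Returns (best_position, best_score). Parameter values are illustrative.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n_agents, dim))
    best = min(X, key=objective).copy()
    best_score = objective(best)
    b = 1.0  # spiral shape constant
    for t in range(iters):
        a = 2.0 - 2.0 * t / iters  # a decreases linearly from 2 to 0
        for i in range(n_agents):
            r = rng.random(dim)
            A = 2.0 * a * r - a
            C = 2.0 * rng.random(dim)
            if rng.random() < 0.5:
                if np.all(np.abs(A) < 1.0):
                    # Exploitation: shrink the circle around the best agent
                    D = np.abs(C * best - X[i])
                    X[i] = best - A * D
                else:
                    # Exploration: move relative to a randomly chosen agent
                    rand = X[rng.integers(n_agents)]
                    D = np.abs(C * rand - X[i])
                    X[i] = rand - A * D
            else:
                # Spiral position update around the current best
                l = rng.uniform(-1.0, 1.0)
                D = np.abs(best - X[i])
                X[i] = D * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best
            X[i] = np.clip(X[i], lb, ub)
            score = objective(X[i])
            if score < best_score:
                best, best_score = X[i].copy(), score
    return best, best_score

pos, val = woa(lambda x: float(np.sum(x ** 2)), dim=2)
```

In the article's setting, the objective would score a candidate set of fusion coefficients by the quality of the resulting fused image instead of the toy sphere function used here.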
To generalize the proposed algorithm, our fusion can be summarized by the following steps:
Initialize parameters. We adopt window function,
Decompose CP on each original image separately and obtain the decomposition images
Use the directional filter banks built according to section “Fusion coefficient optimization with the WOA,” and decompose
Take the filtering images
In each final obtained image, add information
Conduct the low-pass filtering of decomposition results
Use the WOA proposed in this article to optimize the coefficients
Consider
Conduct contrast reconstruction and obtain the final fusion result
Computational complexity analysis
The computational complexity of the algorithm in this article consists of two parts: the computational complexity of the algorithm in section “Contrast and directional features” and that of the algorithm in section “Fusion coefficient optimization with the WOA.”
In section “Contrast and directional features,” CP decomposes the original images first before entering the directional filter bank. After down-sampling and filtering, the amount of data is reduced greatly, thereby improving the efficiency. Suppose the size of both images to be fused is
where
In the optimization of the fusion coefficient in section “Fusion coefficient optimization with the WOA,” if the maximum number of iterations is
As a whole, the computational complexity of the algorithm in this article is
Experiment and analysis
Numerical indexes
At present, there is no unified standard to evaluate the fusion effect for all kinds of images. In this article, we adopt evaluation metrics from the relevant references to evaluate the fusion results objectively, namely information entropy, average gradient, standard deviation, and spatial frequency. The information entropy (IE) of the image reflects the amount of information the image contains; the larger the value, the more information in the image. The average gradient (AG) reflects the ability of the image to express contrast in tiny details. The standard deviation (SD) measures the degree of dispersion of the pixel values relative to the pixel average, that is, the contrast information of the image; the larger the value, the greater the contrast. The spatial frequency (SF) effectively reflects the details of the image; the larger the value, the clearer the image. The edge strength (ES) describes the visually important edge intensity and direction information of the fusion result. The execution time (ET) reflects the efficiency of the different fusion methods.
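Several of these metrics can be sketched directly. The implementations below follow common textbook definitions (histogram entropy in bits, mean gradient magnitude, and row/column spatial frequency); we assume, but cannot confirm, that they match the exact formulas used in the article's tables.

```python
import numpy as np

def information_entropy(img, bins=256):
    """IE: Shannon entropy of the grey-level histogram, in bits."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def average_gradient(img):
    """AG: mean magnitude of horizontal/vertical intensity differences."""
    gx = np.diff(np.asarray(img, dtype=float), axis=1)[:-1, :]
    gy = np.diff(np.asarray(img, dtype=float), axis=0)[:, :-1]
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def spatial_frequency(img):
    """SF: sqrt(RF^2 + CF^2) from row- and column-wise differences.
    (SD is simply np.std of the image.)"""
    f = np.asarray(img, dtype=float)
    rf = np.sqrt(np.mean(np.diff(f, axis=1) ** 2))
    cf = np.sqrt(np.mean(np.diff(f, axis=0) ** 2))
    return float(np.sqrt(rf ** 2 + cf ** 2))

flat = np.full((8, 8), 128.0)  # constant image: no information, no detail
noisy = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)
```

A constant image scores zero on all three metrics, while a detailed image scores higher, matching the "larger is better" reading of the tables.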
Experimental analysis
In order to verify the advantages of the proposed algorithm, we compare it with several fusion algorithms: the fast filtering image fusion (FFIF) method, 24 the CP-based fusion method,11,12 the DWT-based fusion method, 10 and the principal component analysis (PCA)-based fusion method. 25 As shown in Figure 7, the first dataset is a section (256 × 256 pixels) of visible and infrared images without smoke shielding, acquired by a Sony Camcorder and a long-wave infrared (LWIR) sensor. The second group of visible and infrared images shows street signs and a man in a doorway. The third group of images (256 × 256 pixels) shows crossroads at night. The fourth group of images (256 × 256 pixels) shows three soldiers and a jeep, and the last group of images (256 × 256 pixels) shows two men in front of a house. The fusion results are shown in Figure 8.

Five pairs of visible and infrared images: (a) building images without smoke shielding, (b) man in the doorway, (c) crossroads at night, (d) soldiers with jeep, and (e) two men in front of a house.

Fusion results by different methods: (a) FFIF-based, (b) CP-based, (c) DWT-based, (d) PCA-based, and (e) CDFO-based.
In the experiments, fusion performance is evaluated by qualitative analysis and objective indicators (as shown in Tables 1–5). Qualitative analysis considers the visual effect and spectral fidelity, including spatial resolution, clarity, and fine details, which indicate the extent to which the original spectral signal or characteristics are preserved.
Numerical results of the first line of Figure 8.
IE: information entropy; AG: average gradient; SD: standard deviation; SF: spatial frequency; ES: edge strength; ET: execution time; IR: original infrared image; FFIF: fast filtering image fusion; CP: contrast pyramid; DWT: discrete wavelet transform; PCA: principal component analysis; CDFO: contrast and directional features optimization; VS: original visible image.
Bold values indicate the optimal value obtained among the different fusion methods for the corresponding evaluation index.
Numerical results of the second line of Figure 8.
Numerical results of the third line of Figure 8.
Numerical results of the fourth line of Figure 8.
Numerical results of the fifth line of Figure 8.
In terms of visual effects (Figure 8), the overall profile of the FFIF fusion result is very rough and the saturation is distorted, especially in the second group of experiments. The CP-based fusion results have higher contrast, but the edge contours are not clear enough. The DWT-based fusion results lack spectral characteristics and sufficient sharpness. The PCA-based fusion images have excessive brightness, rough edge contours, loss of a large amount of detail information, and severe saturation distortion. The proposed CDFO method achieves better fusion results than the four compared methods: rich structural features and clear edge contours better highlight the target area of interest.
According to the numerical results (Tables 1–5), the evaluation indexes of the proposed CDFO method are the highest in most cases, which indicates that its fused images have the highest sharpness and the most abundant details. However, in terms of the SD indicator, the SD values of the CP-based fusion results in the second, fourth, and fifth groups are higher; the reason is that the contrast of the CP-based fusion results is too large, and their visual effect is particularly poor. Also, the time cost of the CDFO method is almost always lower, which is very important in practical applications.
The convergence curves of the fusion coefficient optimization of the WOA on the five groups of images are shown in Figure 9. The second and third groups converge after 4 iterations, after which the fusion result no longer changes. The first and fifth groups reach a steady state after 6 iterations, and the fourth group converges when the number of iterations reaches 8. Therefore, the figure shows that the WOA-based coefficient optimization has a fast convergence speed and good convergence stability, which effectively saves optimization time for the fusion coefficients.

Convergence curves of CDFO method applied in five different sets of images.
In summary, the above five sets of experiments show that, in terms of both subjective visual quality and quantitative numerical indicators, and also taking the time cost into account, the proposed CDFO method has obvious advantages.
Conclusion
In this article, a CDFO image fusion algorithm is proposed to take advantage of both infrared and visible images and better reconstruct the target scene. The method preserves more structural features and color spectrum information. At the same time, considering high-frequency information at different scales, we construct directional filter banks to capture directional features. Finally, the WOA is applied to optimize the fusion coefficients. The proposed method has clear advantages in both qualitative and quantitative terms. Tests on real visible and infrared image datasets show that the proposed algorithm preserves more scene details and information, which is valuable for video monitoring and security surveillance in low-visibility environments.
