Abstract
Keywords
1. Introduction
An Unmanned Aircraft System (UAS) [1] consists of an aircraft and a ground station, where the aircraft is either remotely controlled by an operator or capable of flying autonomously under the guidance of pre-programmed GPS waypoint flight plans or more complex on-board intelligent systems. UAS aircraft have recently found extensive applications in military reconnaissance and surveillance, homeland security, precision agriculture, wildlife conservation, fire monitoring and analysis, and other kinds of aid during disasters. Different UAS missions can be conducted using the surveillance videos captured by a UAS digital imaging payload over the areas of interest. However, the analysis of UAS videos is frequently limited by motion blur resulting from frame-to-frame movement induced by aircraft roll and wind gusts, by less than ideal atmospheric conditions, and by the noise inherent in the image sensors. Therefore, the super-resolution mosaicing of low-resolution UAS surveillance video frames has become a critical requirement for UAS video processing and a preliminary step for further effective image understanding.
Given multiple images of a particular scene, multi-frame super-resolution reconstructs a high-resolution image with a resolution above the limits of the camera [2–4]. The super-resolved image should contain more detail than any of the low-resolution images. Mosaicing is the alignment or stitching of two or more images into a single, more informative composition representing a 3D scene [5, 6]. Generally speaking, mosaicing creates a panorama that cannot be visualized with only one video frame.
Super-resolution mosaicing combines multi-frame super-resolution and mosaicing and has a number of applications when applied to surveillance video from a UAS or a satellite. One clear application is the surveillance of certain areas, even at night, with the use of an infrared (IR) imaging system. The UAS can fly over areas of interest and generate super-resolved mosaics that can be analysed at the ground control station. Other important applications involve the supervision of high voltage transmission lines, oil pipes, highway systems, etc. NASA also uses super-resolution mosaics to study the surface of Mars, the Moon and other planets.
Super-resolution mosaicing has been studied by several researchers. Zomet and Peleg [7] exploited the overlapping areas within a sequence of video frames to create a super-resolved mosaic. In their method, the SR reconstruction technique proposed in [8] is applied to a strip rather than to a whole image, so the resolution of each strip is enhanced by fusing all the frames that contain that particular strip. The disadvantage is that this method is computationally expensive. Ready and Taylor [9] introduced a Kalman filter to compute the super-resolved mosaic and added unobserved data to the mosaic using Dellaert's method. They constructed a matrix for the observed pixels to estimate pixel values, built from a homography matrix and the point spread function (PSF). Because this matrix is extremely large, they used a Kalman filter and diagonalization of the covariance matrix to reduce the amount of storage and computation. The drawback of this algorithm is its reliance on a large matrix; its best results on synthetic data reach a PSNR of 31.6 dB. Smolic and Wiegand [10] developed a method based on image warping. In this method, each pixel from every frame is mapped into the SR mosaic and its grey level value is assigned to the corresponding pixel in the SR mosaic within a range of ±0.2 pixel units. The drawback of this method is that it requires highly accurate motion vectors and homographies, which are very difficult to obtain for real surveillance videos from UAS.
Wang, Fevig and Schultz [11] used the overlapped area within five consecutive frames from a video sequence. Sparse matrices were then applied to model the relationship between the LR and SR frames, and the model was solved using maximum a posteriori estimation. To deal with the ill-posed nature of the super-resolution model, they adopted hybrid regularization. The drawback of this method is that several sparse matrices have to be built for every five frames. Therefore, it is not appropriate for processing, in real time, a real video sequence that contains thousands of frames. Pickering and Ye [12] proposed an interesting model for mosaicing and super-resolution of video sequences, where the regularization factor is based on the Laplacian operator. The problem with the Laplacian factor is that it forces spatial smoothness, so both noise and edge pixels are removed in the regularization process. Arican and Frossard [13] used the Levenberg-Marquardt (LM) algorithm to compute the SR of omnidirectional images. Chung [14] proposed a nonlinear least-squares solution based on the Gauss-Newton method. The disadvantage of this approach is that it only works for small images.
Our method combines ideas from most of these techniques, but it handles super-resolution mosaicing in a different manner that does not require the construction of sparse matrices. It is therefore feasible to apply the algorithm to a relatively long image sequence and obtain a video mosaic. In addition, we adopt Huber regularization, which preserves high-frequency content and hence sharp edges. Furthermore, we model the super-resolution mosaicing problem in a convex framework [4], which guarantees the convergence of the proposed algorithm.
2. Mathematical modelling
2.1 Observation Model
Assuming that there are
Here,
where ‖ ‖ denotes the Euclidean norm. As the SR reconstruction is an ill-posed inverse problem, we need to add another term for regularization, which must contain prior information for the SR mosaicing. This regularization term helps to convert the ill-posed problem into a well-posed solvable problem. Here we adopt the Huber regularization:
The Huber function is defined as:
2.2 Super-resolution Mosaicing Using Steepest Descent Method
Based on the gradient descent algorithm for minimizing (3), the robust iterative update for
where G is the gradient operator over the cliques [8, 18] and
Furthermore, the derivative of the Huber function is given as:
The gradient operator G has the advantage over the Total Variation (TV). The Huber function and its gradient with respect to
The spatial interactions are adopted in our proposed method. The clique structure determines the spatial interactions, where the activity is computed with finite difference approximations to the second-order directional derivatives (vertical, horizontal and two diagonal directions) in each super-resolution mosaic
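The Huber prior and the steepest descent update discussed in this subsection can be illustrated on a one-dimensional signal. The Python sketch below is not the authors' implementation. It assumes one common form of the Huber function (quadratic up to a threshold `alpha`, linear beyond it), uses a single second-order difference in place of the four directional cliques, and takes a fixed step size. The names `lam`, `alpha` and `step` are hypothetical parameters chosen for the example.

```python
def huber(t, alpha):
    """Assumed Huber form: quadratic for |t| <= alpha, linear beyond it."""
    return t * t if abs(t) <= alpha else 2.0 * alpha * abs(t) - alpha * alpha


def huber_grad(t, alpha):
    """Derivative of the assumed Huber function."""
    if abs(t) <= alpha:
        return 2.0 * t
    return 2.0 * alpha if t > 0 else -2.0 * alpha


def second_diff(x):
    """Second-order finite differences x[i-1] - 2 x[i] + x[i+1] (1-D cliques)."""
    return [x[i - 1] - 2.0 * x[i] + x[i + 1] for i in range(1, len(x) - 1)]


def objective(x, y, lam, alpha):
    """Data fidelity plus Huber prior on the second differences."""
    data = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    prior = sum(huber(d, alpha) for d in second_diff(x))
    return data + lam * prior


def gradient(x, y, lam, alpha):
    """Gradient of the objective; the prior part applies the transposed stencil."""
    g = [2.0 * (xi - yi) for xi, yi in zip(x, y)]
    for k, d in enumerate(second_diff(x)):
        w = lam * huber_grad(d, alpha)
        i = k + 1  # centre sample of this clique
        g[i - 1] += w
        g[i] -= 2.0 * w
        g[i + 1] += w
    return g


def steepest_descent(y, lam=0.5, alpha=0.1, step=0.05, iters=200):
    """Fixed-step steepest descent, initialised with the observation itself."""
    x = list(y)
    for _ in range(iters):
        g = gradient(x, y, lam, alpha)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    noisy_edge = [0.0, 0.05, -0.04, 0.03, 1.0, 0.96, 1.05, 1.0]
    print(steepest_descent(noisy_edge))
```

Because the Huber function grows only linearly for large differences, the jump in `noisy_edge` is penalized less than it would be under a purely quadratic prior, which is the edge-preserving behaviour the regularization is chosen for. The step size is kept below the reciprocal of a bound on the gradient's Lipschitz constant for these parameter values, so the objective decreases at every iteration.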
2.3 Super-resolution Mosaicing Using Conjugate Gradient Method
The solution of (3) can be estimated using conjugate gradient as:
where
The gradient vector, ∇f(
The gradient operator G is the same as that in the steepest descent method.
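To make the conjugate gradient iteration concrete, the following Python sketch minimizes a small convex objective containing a Huber term. It is an illustration under stated assumptions rather than the update used in this paper. The Polak-Ribière formula with a non-negativity clamp, the restart rule, the backtracking (Armijo) line search and the two-variable toy objective are all choices made for the example, and the names `conjugate_gradient`, `toy_f` and `toy_grad` are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(f, grad, x0, iters=200, tol=1e-10):
    """Nonlinear conjugate gradient (Polak-Ribiere+, assumed variant)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]
    for _ in range(iters):
        gg = dot(g, g)
        if gg < tol:
            break
        if dot(g, d) >= 0.0:
            # Not a descent direction: restart with steepest descent.
            d = [-gi for gi in g]
        # Backtracking line search along d (Armijo sufficient decrease).
        t, fx, slope = 1.0, f(x), dot(g, d)
        while t > 1e-12 and f([xi + t * di for xi, di in zip(x, d)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        x = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad(x)
        # Polak-Ribiere coefficient, clamped at zero.
        beta = max(0.0, dot(g_new, [a - b for a, b in zip(g_new, g)]) / gg)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x


# Toy objective (an assumption for illustration): two quadratic data terms
# plus a Huber penalty on the difference of the two unknowns.
ALPHA, LAM = 0.5, 1.0


def huber(t):
    return t * t if abs(t) <= ALPHA else 2.0 * ALPHA * abs(t) - ALPHA * ALPHA


def huber_grad(t):
    if abs(t) <= ALPHA:
        return 2.0 * t
    return 2.0 * ALPHA if t > 0 else -2.0 * ALPHA


def toy_f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2 + LAM * huber(x[0] - x[1])


def toy_grad(x):
    h = LAM * huber_grad(x[0] - x[1])
    return [2.0 * (x[0] - 1.0) + h, 2.0 * (x[1] + 2.0) - h]


if __name__ == "__main__":
    print(conjugate_gradient(toy_f, toy_grad, [0.0, 0.0]))
```

For this toy problem the minimizer lies where the Huber term is in its linear regime, so setting the gradient to zero gives x = (1 − LAM·ALPHA, −2 + LAM·ALPHA) = (0.5, −1.5), which can be used to check the result. In the mosaicing setting, `f` and `grad` would be replaced by the objective in (3) and its gradient.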
2.4 Super-resolution Mosaicing Using Levenberg-Marquardt Method
Similar to the gradient methods, the Levenberg-Marquardt method [20] can converge from an initial guess, which may be outside of the convergence region of other methods. In order to minimize (3), we define
where J(
The Levenberg-Marquardt method is an iterative process. Initiating at the starting point
where
where
where
After
Here c is the Levenberg-Marquardt damping term that determines the behaviour of the gradient in each iteration. If c is close to zero, then the algorithm behaves like a Gauss-Newton (GN) method, but if c → ∞, then the algorithm behaves like the steepest descent (SD) algorithm. The values of c during the iterative process are chosen in the following way. At the beginning of the iterations, c is set to a large value, so that the LM method integrates the robustness of SD and the initial guess of the solution to (3) can be chosen with less caution. It is necessary to save the errors for each iteration and carry out the comparison between two consecutive errors. If error(k) < error(k−1), c is decreased by a certain amount so that LM behaves like the Gauss-Newton method and it speeds up convergence. Otherwise, c is increased to a larger value, the searching area is then extended, which means that LM behaves like SD. The error(k) is defined as:
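The damping schedule described in this subsection can be sketched in Python on a small curve-fitting problem. This is a hedged illustration, not the implementation used in the experiments. The exponential model, the sum of squared residuals used as the error, the factor of 10 for adjusting c and the rejection of steps that increase the error are all assumptions made for the example.

```python
import math


def residuals(p, ts, ys):
    """Residuals of the assumed model y = a * exp(b * t)."""
    a, b = p
    return [a * math.exp(b * t) - y for t, y in zip(ts, ys)]


def jacobian(p, ts):
    """Partial derivatives of each residual with respect to a and b."""
    a, b = p
    return [[math.exp(b * t), a * t * math.exp(b * t)] for t in ts]


def levenberg_marquardt(p0, ts, ys, c=100.0, iters=100, factor=10.0):
    """LM iteration with a large initial damping term c (assumed schedule)."""
    p = list(p0)
    r = residuals(p, ts, ys)
    err = sum(ri * ri for ri in r)
    for _ in range(iters):
        J = jacobian(p, ts)
        # Damped normal equations (J^T J + c I) delta = -J^T r, 2x2 case.
        a11 = sum(row[0] * row[0] for row in J) + c
        a12 = sum(row[0] * row[1] for row in J)
        a22 = sum(row[1] * row[1] for row in J) + c
        b1 = -sum(row[0] * ri for row, ri in zip(J, r))
        b2 = -sum(row[1] * ri for row, ri in zip(J, r))
        det = a11 * a22 - a12 * a12
        delta = [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]
        trial = [p[0] + delta[0], p[1] + delta[1]]
        r_trial = residuals(trial, ts, ys)
        err_trial = sum(ri * ri for ri in r_trial)
        if err_trial < err:
            # Error decreased: accept the step and move towards Gauss-Newton.
            p, r, err = trial, r_trial, err_trial
            c /= factor
        else:
            # Error did not decrease: reject the step (assumed safeguard)
            # and move towards steepest descent by increasing c.
            c *= factor
    return p, err


if __name__ == "__main__":
    ts = [0.5 * i for i in range(9)]
    ys = [2.0 * math.exp(0.5 * t) for t in ts]
    print(levenberg_marquardt([1.0, 0.3], ts, ys))
```

With a large c the damped system is dominated by the c·I term, so the step is a short move along the negative gradient. As c shrinks the step approaches the Gauss-Newton step, which is why accepted steps are followed by a decrease in c.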
3. Experimental results
The experimental tests are based on three sets of data: one synthetic data set and two real UAS data sets, of which one contains grey-level images and the other colour images. We created synthetic LR frames from a single high-resolution image. These LR frames were first produced using different translations (18 to 95 pixels), rotations (5° to 10°) and scales (1 to 1.5), and were then blurred with a Gaussian kernel. The real grey-level video data were captured by an experimental small UAS operated by Lockheed Martin Corporation, carrying a custom-built electro-optical (EO) imager and an uncooled thermal infrared (IR) imager. The time series of images was extracted from the UAS videos at a low resolution of 60 × 80. The colour image data were collected with a regular camera mounted in a UAS by Cloud Cap Technology.
We ran the three proposed algorithms for super-resolution mosaicing on both synthetic data and real data and then compared their performance. The mosaics constructed from the low-resolution input images are used as the initializations for the proposed algorithms. The comparisons are based on PSNR (Peak Signal to Noise Ratio), running time and iteration error for the synthetic data sets, and on running time and iteration error only for the real data from UAS videos, because no ground truth is available to compute the PSNR for real data. Figures 1, 2 and 3 show the super-resolution mosaics produced by the three algorithms on the synthetic test data and the two sets of real video data. Tables 1, 2 and 3 list the corresponding quantitative comparisons. From Figures 1, 2 and 3 and Tables 1, 2 and 3, it can be seen that all the methods improve the resolution of the LR mosaic and that all of them improve the colour, details and sharpness. However, when the image is grey (IR images), the Levenberg-Marquardt method produces some artefacts, since the linear system it solves, (13), is close to being singular. The final error for the steepest descent and conjugate gradient algorithms decreases with every iteration, which means that they move closer to the optimal solution at every step. In contrast, the error from the Levenberg-Marquardt algorithm can decrease or increase due to the use of the damping factor, c, which accelerates the search for the optimal solution. The Levenberg-Marquardt method, interpolating between the Gauss-Newton method and the gradient descent method, avoids the time-consuming computation of the inverse of the pseudo-Hessian matrix in regular singular value decomposition (SVD).
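The PSNR used for the synthetic comparison can be computed as in the following Python sketch. It assumes the usual definition, 10·log10(peak² / MSE), with a peak value of 255 for 8-bit images. The function name `psnr` and the nested-list image representation are choices made for this example and not part of the experimental code.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as lists of rows of pixel values; `peak` is the
    maximum possible pixel value (255 assumed for 8-bit data).
    """
    squared_error = 0.0
    count = 0
    for row_ref, row_est in zip(reference, estimate):
        for r, e in zip(row_ref, row_est):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    ground_truth = [[0, 255], [128, 64]]
    sr_mosaic = [[10, 245], [138, 54]]
    print(round(psnr(ground_truth, sr_mosaic), 2))
```

A higher PSNR means the super-resolved mosaic is closer to the ground truth HR mosaic, which is why the measure is only available for the synthetic data set.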
Table 1. Comparison of the three proposed algorithms to compute super-resolution mosaics for synthetic colour images.
Table 2. Comparison of the three proposed algorithms to compute super-resolution mosaics for real video IR frames captured by UAS.
Table 3. Comparison of the three proposed algorithms to compute super-resolution mosaics for real colour video frames captured by UAS.

Figure 1. Test on synthetic images. Comparison of the three proposed algorithms: steepest descent, conjugate gradient and Levenberg-Marquardt. (a) LR mosaic. (b) Ground truth HR mosaic. (c) SR mosaic using steepest descent method. (d) SR mosaic using conjugate gradient method. (e) SR mosaic using Levenberg-Marquardt method.

Figure 2. Test on real IR video images captured from UAS. Comparison of the three proposed algorithms: steepest descent, conjugate gradient and Levenberg-Marquardt. (a) LR mosaic. (b) SR mosaic using steepest descent method. (c) SR mosaic using conjugate gradient method. (d) SR mosaic using Levenberg-Marquardt method.

Figure 3. Test on real colour images captured from UAS. Comparison of the three proposed algorithms: steepest descent, conjugate gradient and Levenberg-Marquardt. The images belong to the first set of colour video frames captured from UAS. (a) LR mosaic. (b) SR mosaic using steepest descent. (c) SR mosaic using conjugate gradient. (d) SR mosaic using Levenberg-Marquardt.
Based on the test results on synthetic data and real video data captured from UAS, the conjugate gradient method produces the best super-resolution mosaicing results in terms of visual performance. There is almost no difference in the visual performance of the super-resolved mosaics between the Levenberg-Marquardt method and the steepest descent method. However, the experimental outcomes show that the steepest descent method took the least time of the three approaches to converge and is therefore the most efficient method.
4. Conclusions
Three optimization methods, the steepest descent method, the conjugate gradient method and the Levenberg-Marquardt method, are applied to the super-resolution of mosaic images. Their running efficiency and visual performance are compared on synthetic test data and on physical test data collected from UAS. Experimentally, the conjugate gradient method gives the best super-resolution mosaics in terms of visual performance, while the steepest descent method converges most efficiently. There is no large difference in visual performance between the super-resolution mosaics from the Levenberg-Marquardt method and those from the steepest descent method.
5. Acknowledgments
This research was supported in part by the Defense Experimental Program to Stimulate Competitive Research (DEPSCoR) and Army Research Office grant number 50441-CI-DPS, Computing and Information Sciences Division, “Real-Time Super-Resolution ATR of UAV-Based Reconnaissance and Surveillance Imagery” (Richard R. Schultz, Principal Investigator). This research was also supported in part by the Joint Unmanned Aircraft Systems Center of Excellence, contract number FA4861-06-C-C006, “Unmanned Aerial System Remote Sensing and Avoidance System and Advanced Payload Analysis and Investigation,” as well as the North Dakota Department of Commerce grant, “UND Center of Excellence for UAV and Simulation Applications”. Additionally, the authors would like to acknowledge the contributions of the Unmanned Aircraft Systems Engineering (UASE) Laboratory team at the University of North Dakota. This research was also supported by Fincyt (Perú) under the SuperRIVAM project.
