Abstract
Keywords
Introduction
Face recognition (FR), a popular area of research in computer vision and machine learning, has been widely studied for decades.1,2 Automatic FR systems, including those using representation classification (RC) techniques such as sparse RC (SRC),3 have already achieved very impressive performance on large-scale image sets in constrained environments (cooperative users with controlled indoor illumination).4,5 It remains a challenge to enhance the performance of SRC-based FR systems in an unconstrained environment. As SRC is a non-deterministic polynomial (NP)-hard combinatorial problem, it is usually relaxed into an
It has been shown that SRC with a nonnegative coefficient constraint is more effective,6,8 mainly because this approach can avoid overfitting. The algorithms proposed in the Bioucas-Dias and Figueiredo study 6 can be expanded into a finite number of terms, which collectively resemble typical neural network layers. Consequently, a lengthy sequence of iterations can be treated as a deep learning (DL) network with shared layer weights, 9 while the rectified linear unit (ReLU) corresponds to the nonnegative coefficient constraint. As is well known, however, DL is computationally very expensive.
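The correspondence between iterations and layers can be illustrated with a minimal sketch. It assumes a generic nonnegative iterative shrinkage scheme for the problem of minimizing 0.5·||y − Da||² + λ·Σa subject to a ≥ 0; this is not necessarily the algorithm of the cited study. The function name, the toy dictionary `D`, the step size, and `lam` are all illustrative assumptions. Each iteration applies the same affine map followed by a ReLU, so unrolling the loop gives a network whose layers share weights.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def relu(v):
    # ReLU enforces the nonnegative coefficient constraint.
    return [max(0.0, x) for x in v]

def unrolled_nonneg_ista(D, y, lam=1e-3, step=0.5, n_layers=100):
    """Run n_layers identical 'layers': a <- ReLU(S a + W y - step * lam)."""
    n = len(D[0])
    Dt = [list(col) for col in zip(*D)]
    W = [[step * w for w in row] for row in Dt]            # input weights: step * D^T
    DtD = matmul(Dt, D)
    S = [[(1.0 if i == j else 0.0) - step * DtD[i][j]      # recurrent weights: I - step * D^T D
          for j in range(n)] for i in range(n)]
    Wy = matvec(W, y)
    bias = step * lam                                      # shrinkage acts as a negative bias
    a = [0.0] * n
    for _ in range(n_layers):                              # every layer shares S, W and bias
        a = relu([s + w - bias for s, w in zip(matvec(S, a), Wy)])
    return a

# Toy example (hypothetical data): y is generated by a sparse nonnegative code.
D = [[1.0, 0.2, 0.0],
     [0.0, 1.0, 0.2],
     [0.0, 0.0, 1.0]]
a_true = [1.0, 0.0, 2.0]
y = matvec(D, a_true)
a_hat = unrolled_nonneg_ista(D, y)
```

The step size must stay below the reciprocal of the largest eigenvalue of DᵀD for the iteration to converge; the value 0.5 satisfies this for the toy dictionary above.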
It should be pointed out that approaches using representation-based classification techniques cannot be directly used for FR in an unconstrained environment. An algorithm for face alignment by sparse and low-rank matrix decomposition (SLMD) was proposed in Wu et al. 10 However, it is not applicable to large-scale image data sets. The conventional modeling of face alignment is based on an affine transform with two-dimensional (2D) model face detection and registration.11,12 Recently, a face frontalization framework with a three-dimensional (3D) reference model was proposed in Hassner et al., 13 which is more effective in FR and gender estimation. A multi-view FR method based on tensor subspace analysis was proposed by Gao and Tian, 14 but it is very difficult to apply in an uncontrolled environment. The mirror face image information was adopted in Xu et al. 15 to improve the recognition performance; this improvement is, however, very limited as the multi-view face is not frontalized. A robust sparse coding (RSC) method was proposed by Yang et al., 16 which is the maximum likelihood estimation (MLE) solution of the sparse coding problem. However, the sparse coding algorithm for RSC, which involves iterative re-weighting, is costly for a large-scale data set.
In this article, we propose a 3D model–based robust local nonnegative sparse representation classification (RLNSRC), which is intended to deal with FR in uncontrolled scenarios. The main contributions in this article are given as follows:
Instead of using the raw face image, a 3D frontalization based on the aligned downsampling local binary pattern (ADLBP) feature is adopted. This allows us to deal with the uncontrolled environments effectively.
A compressive sensing–based RLNSRC (CS-RLNSRC) scheme is proposed for FR. An algorithm is derived for designing the optimal projection matrix that is used to compress the ADLBP feature signals. Such a system is intended to reduce the computational complexity and to prevent overfitting.
The article is organized as follows. The “Related works and problem formulation” section reviews existing FR systems based on representation-based classification that are closely related to the one presented in this article. Our main contribution is given in the “A novel framework for FR” section, in which the CS-RLNSRC framework is proposed. Experiments are carried out in the “Experiment results” section to examine the performance of the proposed system. Concluding remarks are given in the “Conclusion” section.
Related works and problem formulation
In this section, we will review some existing representation-based techniques developed for classification and formulate the problem we investigate in this article.
Suppose we have
In the linear representation-based classification approach, any sample
where
where
For a given query face, represented by the feature vector
In the next subsections, we will specify some of the existing representation-based classifiers.
Linear regression classification
In the linear regression classification (LRC), the coefficient vectors are obtained from the following minimization
As
The corresponding classification deviations
Sparse representation–based classification
The SRC assumes that
where
It can be shown 17 that if certain conditions are satisfied, the solution of
Similarly, the corresponding classification deviations
Robust sparse coding
In Yang et al., 16 a robust representation–based classification approach was proposed. Denote the residual of signal representation as
Assume that
The optimized sparse vector can be solved by
As it is a non-convex problem, Yang et al. 16 proposed an iterative method which is described as follows.
Define
where
and hence
Replace
where
and
To reduce the complexity in updating
where
Nonnegative coefficient sparse representation–based classification
The nonnegative coefficient sparse representation–based classification is a constrained SRC, formulated with
This is a non-convex problem and is hard to solve. However, when the nonnegative sparse solution of equation (16) exists and is unique, the problem is equivalent to nonnegative linear regression. 19 Therefore, equation (16) is replaced by
which can be solved by the nonnegative least squares function
The classification deviations
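The classifiers reviewed in this section share a common decision rule: the query is assigned to the class whose dictionary atoms reconstruct it with the smallest deviation. The sketch below illustrates that rule only. It assumes the coefficient vector `a` has already been computed by some solver (least squares, sparse coding, or nonnegative least squares); the function names and the toy data are hypothetical.

```python
import math

def class_residuals(D, labels, y, a):
    """Reconstruction deviation of y for each class.

    D is a list of rows (features x atoms), labels[j] is the class of atom j,
    and a is a coefficient vector over all atoms.
    """
    residuals = {}
    for c in sorted(set(labels)):
        # Reconstruct y using only the coefficients that belong to class c.
        recon = [sum(row[j] * a[j] for j in range(len(a)) if labels[j] == c)
                 for row in D]
        residuals[c] = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, recon)))
    return residuals

def classify(D, labels, y, a):
    """Return the class with the smallest deviation, plus all deviations."""
    res = class_residuals(D, labels, y, a)
    return min(res, key=res.get), res

# Toy example (hypothetical data): two classes with two atoms each.
D = [[1.0, 0.9, 0.0, 0.1],
     [0.0, 0.1, 1.0, 0.9]]
labels = ["A", "A", "B", "B"]
y = [0.1, 1.0]
a = [0.0, 0.0, 0.6, 0.45]   # assumed output of a nonnegative solver
predicted, residuals = classify(D, labels, y, a)
```

In this toy case the coefficients concentrate on the atoms of class "B", so that class yields the smaller deviation and is selected.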
Problem formulation
It should be pointed out that all the representation-based classification approaches listed above work well only in a constrained environment. Real-life environments are usually unconstrained, and hence a pre-processing step such as frontalization is needed. Also, the RSC algorithm presented above is very slow due to the step of updating the weighting
A novel framework for FR
Figure 1 depicts the proposed FR scheme. Such a system consists of four main modules that will be described in the following subsections.

Figure 1. The proposed CS-RLNSRC face recognition system.
Image pre-process
In this stage, we concentrate on face frontalization. The method used here is similar to that proposed by Hassner et al. 13 The basic idea is to extract facial features that are matched to a 3D face model and then to frontalize the matched model to obtain the frontal face. Figures 2 and 3 show six original faces of the same person and their frontalized versions, respectively. In traditional representation-based classification, one would put many different unfrontalized face images into the dictionary in order to deal with the unconstrained environment. In contrast, the frontalization process not only handles complicated environments but also avoids increasing the size of the dictionary, as the dictionary contains frontal face images only.

Figure 2. Six images from the LFW data set.

Figure 3. The six frontalized face images.
Feature extraction
After frontalization, the images are processed using a downsampling LBP algorithm to reduce the effects of illumination and misalignment. 21
Let
where
The augmented Lagrange multiplier method can be used in the SLMD process. 22
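To make the feature extraction step more concrete, the following sketch computes a basic 3×3 local binary pattern (LBP) code image and then downsamples it. It is a minimal illustration under stated assumptions: the neighbor ordering, the `>=` comparison, and the downsampling factor are choices made here, not taken from the article, and the SLMD alignment that turns the downsampled LBP into the ADLBP feature is not included.

```python
def lbp_image(img):
    """Basic 8-neighbor LBP codes for the interior pixels of a gray image.

    img is a list of rows of intensities; the result is (h-2) x (w-2).
    """
    h, w = len(img), len(img[0])
    # Clockwise neighbor offsets starting at the top-left pixel (an assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        out.append(row)
    return out

def downsample(img, factor=2):
    """Keep every factor-th row and column."""
    return [row[::factor] for row in img[::factor]]

# Toy example (hypothetical 5x5 image) and a uniformly brighter copy of it.
img = [[10, 20, 30, 40, 50],
       [15, 25, 35, 45, 55],
       [90, 80, 70, 60, 50],
       [12, 22, 32, 42, 52],
       [ 5, 15, 25, 35, 45]]
brighter = [[p + 50 for p in row] for row in img]
features = downsample(lbp_image(img), factor=2)
```

Because each code depends only on the ordering of a pixel and its neighbors, adding a constant to every pixel leaves the codes unchanged, which is one reason LBP features are less sensitive to illumination.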
Dimension reduction
The obtained feature matrix
where
The recognition can then be carried out much more efficiently using
Theorem 1
Let
where
Then, the false acceptance rate (FAR) probability bound of this estimation is given by
where
In the representation-based classification framework, the ADLBP feature signal
where
where
In the classical SRC-based FR framework, the signals are the original (front) face images, while in the proposed FR scheme the signals are the compressed version
The compressive sensing theory suggests that an optimized projection/sensing matrix can outperform significantly the random one in terms of keeping the information contained in the high-dimensional signals and enhancing the recovery accuracy.24,25 The problem of optimizing projection matrix was first proposed in Elad 24 and a large class of existing algorithms is of the following formulation 25
where
Denote
where
In our proposed system, the target Gram
where
with
where
The motivation for such a choice of target Gram is explained below. We note that with the same partition as
With
be a singular value decomposition of
the diagonal matrix with
where
When
With
An improved RSC
Recall the RSC in the “Robust sparse coding” section. As mentioned before, the procedure used in that algorithm for updating the weighting factors is time-consuming, which prevents the algorithm from being used in real-time applications.
In the proposed CS-RLNSRC FR system, the coefficient vector for the
which can be solved by the nonnegative least squares function
The classical hypothesis is that the noise follows a Gaussian distribution, but this does not always hold in practice. Human observers usually focus on the majority of facial feature points that match and ignore those that do not. Based on this observation, we propose a much simplified procedure to update the weighting matrix
Denoting
The query face image is then classified with
We use
Remark 3.1
As seen from equation (29), the weighting matrix can be easily obtained without using an iterative procedure or statistical information about the residuals. Experiments show that such a weighting strategy is very effective for enhancing recognition performance.
We note that the choice of weighting factors given by equation (29) is intended to keep the
Clearly, equation (29) is equivalent to the above for the case where
In many situations, a query face image may not belong to any of the
Since
Suppose that we have
Furthermore, let
Obviously, both
Remark 3.2
Lack of training samples is an important factor affecting the performance of representation-based classification techniques. It has been noted that in face images, the facial features have symmetrical attributes. This means that if we have one face image, we can get another with its mirror face image, achieved using MATLAB command
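The mirror-image augmentation described in Remark 3.2 can be sketched as follows. This is a minimal illustration in Python rather than MATLAB; the helper names are hypothetical, and an image is represented as a plain list of pixel rows.

```python
def mirror(img):
    """Left-right mirror of an image given as a list of pixel rows."""
    return [row[::-1] for row in img]

def augment_with_mirrors(samples, labels):
    """Double the training set by adding the mirror image of every sample.

    Each mirror image keeps the label of the image it was derived from.
    """
    aug_samples, aug_labels = [], []
    for img, lab in zip(samples, labels):
        aug_samples.extend([img, mirror(img)])
        aug_labels.extend([lab, lab])
    return aug_samples, aug_labels

# Toy example (hypothetical 2x3 images for two subjects).
samples = [[[1, 2, 3],
            [4, 5, 6]],
           [[7, 8, 9],
            [0, 1, 2]]]
labels = ["subject_1", "subject_2"]
aug_samples, aug_labels = augment_with_mirrors(samples, labels)
```

The augmentation is applied to the training images before feature extraction, so each subject contributes twice as many atoms to the dictionary.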
Experiment results
In this section, we examine the performance of the proposed CS-RLNSRC system for FR and compare it with some existing FR systems. The section has two portions. In the first, several existing FR systems that share the structure of the proposed one and use representation-based classification are implemented for comparison; in the second, three FR systems implemented with different strategies are used for comparison.
The Labeled Faces in the Wild (LFW) data set is used in our experiments. It contains 13,233 web images of 5749 celebrities. A subset of the LFW data set, having
Portion I—RC-based FR systems
It should be pointed out that in this subsection, all the systems have the same structure, containing four modules, of which the first three are the same as those in our proposed system. This means that one system differs from another only in how the classification is done. We will consider four representation-based classification algorithms:
Six images of a person are displayed in Figure 2. These images are pre-processed for frontalization, and the resultant images
Figure 4 displays the evolution of the first frontalized image in Figure 3 from frontal face to ADLBP featured image which is

Figure 4. ADLBP extraction process: (a) frontalized image, (b) LBP, (c) DLBP, and (d) ADLBP.
Set
The effect of parameter
We note that in designing the sensing matrix, one has to choose the parameter

The parameter
As shown in Figure 5, the recognition rate can be increased with a proper choice of
Effect of the number of training samples
Table 1 displays the effect of
The number
The values in bold are the best ones in each column of the table.
As displayed in Table 1, increasing the number of training samples can enhance the recognition rate. This is consistent with equation (21). The results also indicate that our proposed algorithm outperforms the others.
Effect of data compression ratio M/N
We fix

Relationship between
One observes that for each algorithm, the recognition rate increases with
Effect of feature extraction
Let
Feature selection versus recognition rate
One can see that for each algorithm, the ADLBP feature always yields a higher recognition rate than any of the others. The price paid for this is an increase in computational complexity.
Open-universe experiment
We add 8909 face images as distractors to the set of testing images, and then run the four FR systems. For a given

Relationship between the actual recognition rate
As seen, the proposed
Portion II—comparison with differently structured systems
In the previous subsection, we compared our proposed FR system with three systems that share the same structure as ours; the main difference is that each system uses a different representation-based classification algorithm.
In this subsection, we will compare our proposed system with three other FR systems which use significantly different strategies from ours. Here, we use
As indicated in Table 3, the 3D-based frontalization algorithm, used in
Recognition rate
FR: face recognition.
Conclusion
In this article, a novel framework has been proposed, in which a 3D-based frontalization strategy is adopted as a pre-processing step and the ADLBP features of the frontalized images are employed for recognition. In addition, an optimized projection matrix is designed to reduce the implementation complexity, and an improved RSC algorithm has been derived for classification using the lower-dimensional measurements. Experimental results in open- and closed-universe settings on the LFW data set demonstrate the effectiveness of the proposed approaches and show that our proposed system outperforms the other RC-based FR systems as well as the three FR systems that have a significantly different structure from ours.
To make the proposed FR system usable in real-time applications, a more efficient algorithm should be developed for ADLBP feature extraction. The key to the success of the proposed FR system is preventing overfitting by applying the nonnegative coefficient constraint. A similar phenomenon has also been noted in DL.32,33 DL has become a powerful tool and has been used in many areas. How to embed our approach into the DL context will be another direction for future investigation.
