1. Introduction
Wireless sensor networks (WSNs) have emerged as a research hot spot in recent years. In a WSN, the resources of each sensor node are limited, so reducing energy consumption and extending node lifetime is a major challenge. The energy cost of transmitting images remains a main factor affecting the lifetime of a sensor node. To reduce the bandwidth and energy consumed by image transmission, a more effective image compression method is needed.
Current image compression algorithms in WSNs are sensitive to random changes in image content, and it is unrealistic to describe the variety of real-world images with a single image model. To address this issue, neural networks have been adopted in WSNs to compress images. As an important branch of neural networks, Deep Learning [1, 2] offers many computational models, and the Restricted Boltzmann Machine (RBM) [3, 4] is one of its principal models. When a multilayer RBM based network is used to compress images, the quality of the compressed images depends strongly on the likelihood each RBM layer assigns to the training data. Moreover, the training complexity of the RBM has a great effect on the energy consumption of image compression coding.
Most current RBM training algorithms rely on large quantities of samples drawn with the Markov Chain Monte Carlo (MCMC) method. The average joint probability between visible and hidden units is estimated from these samples without calculating the normalizing parameter. However, when MCMC sampling is conducted, enough state transitions are required to ensure that the acquired samples follow the target distribution, and large quantities of samples are needed to improve the accuracy of the estimate, which makes RBM training difficult. To address these problems, an alternative iteration algorithm is applied to the RBM training process.
We adopt the alternative iteration algorithm in the RBM training process. In this algorithm, the normalizing parameter is treated as another unknown parameter, so the likelihood function can be split into two subfunctions: one concerning the normalizing parameter and the other concerning the model distribution parameter. The model distribution parameter to be estimated is computed alternately with the normalizing parameter and is eventually obtained through an efficient, low-complexity training process. This algorithm improves the likelihood of the RBM for the training data.
Furthermore, we apply the improved RBM training process to image compression in WSNs and present a multilayer improved-RBM based image compression method. This method extracts more abstract data for coding based on image features and achieves a better compression effect. In the simulations, the reconstructed image quality of the multilayer RBM network is superior to that of a comparison image compression method under the same compression ratio, as detailed in Section 5. At the same time, the proposed method reduces the energy consumption of the image data transmission process.
The rest of the paper is organized as follows. Section 2 discusses related work on image compression and RBM training algorithms. Section 3 presents the basic idea of the multilayer RBM network based image compression method. The RBM model and the improved RBM algorithm based on alternative iteration are described in Section 4. The performance of the proposed algorithm is compared with some typical algorithms in Section 5. Finally, conclusions and future work are presented in Section 6.
2. Related Work
Typical image compression algorithms include space-time correlation based compression, wavelet transform based compression, distributed compression, and improved versions of traditional compression algorithms.
The space-time correlation based data compression algorithm mainly includes prediction coding and linear fitting of time series. A prediction coding method is proposed in [5]; it can effectively estimate the source data based on its temporal correlation, but it does not handle the transmission of large amounts of image data. Reference [6] proposes a curve-fitting based data stream compression method that compresses data collected at each node and restores it at the base station; however, the method is very complex and does not consider the transmission delay at each sensor node. Reference [7] presents a space-time data compression technique based on a simple linear regression model, which can eliminate data redundancy at single nodes and collector nodes, respectively; however, only data that satisfies the error requirement is considered, and abnormal data is not handled.
Wavelet transform is a time-frequency analysis method which is superior to traditional signal analysis methods. Reference [8] considers the existence of stream data in the data transmission of sensor networks. It compresses data by using wavelet transform based on the data aggregation and the DC routing algorithm. In [9, 10], a ring model based distributed time-space data compression algorithm and a wavelet based self-fitting data compression algorithm are proposed. Storage efficient two-dimension and three-dimension continuous wavelet data compression methods are proposed in [11]. They are based on the ring model of fitting sensor network wavelet transform and the overlapping cluster partition model, respectively. They are storage efficient and can save the transmission energy consumption in networks.
The distributed data compression algorithm builds on the fact that both centralized and decentralized information services can be implemented. Its characteristic feature is reducing the amount of data through cooperative work among different sensor nodes. A chain-model based distributed data compression algorithm is proposed in [12]; it designs a chain model suitable for wavelet transform and supports wavelet functions of arbitrary length.
Traditional lossless data compression methods mainly include Run Length Encoding, Huffman coding, dictionary-based compression, and arithmetic coding. These methods are mainly used on powerful computers or workstations, whereas in sensor networks the processing capacity and memory of each node are limited, so the traditional compression algorithms must be optimized. In [13], the difference between two consecutive sensed data items is encoded with a self-adaptive Huffman coding algorithm. Reference [14] proposes a region of interest (ROI) based lossy-lossless image compression method that applies different coding schemes to the small, important region and the remaining large region; in this way, the compression ratio is improved while sensitive information is preserved.
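To make the simplest of the traditional lossless methods above concrete, the following is a minimal sketch of Run Length Encoding (the function names are ours, not taken from the cited works):

```python
def rle_encode(data):
    """Run Length Encoding: collapse each run of equal values into a
    (value, run_length) pair."""
    encoded, i = [], 0
    while i < len(data):
        run_length = 1
        while i + run_length < len(data) and data[i + run_length] == data[i]:
            run_length += 1
        encoded.append((data[i], run_length))
        i += run_length
    return encoded

def rle_decode(pairs):
    """Inverse of rle_encode: expand (value, run_length) pairs."""
    return [value for value, run_length in pairs for _ in range(run_length)]
```

RLE pays off only when the data contains long runs, which is why sensor-network variants such as [13] first encode differences between consecutive readings.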
In recent years, Deep Learning (DL) has been widely used in WSNs for image compression. DL extracts features of the data from low layers to high layers by modeling the layered analysis performed in the human brain. However, the effect of DL-based image compression depends on the likelihood of the RBM for the training data and on the training complexity of the RBM. Therefore, an improved RBM training algorithm is also proposed in this paper.
Researchers have conducted many studies on RBM training algorithms. In 2002, Hinton proposed a fast RBM learning algorithm, Contrastive Divergence (CD) [15]. It is a highly efficient approximate learning algorithm, but the RBM model obtained by CD is not a maximum entropy model and does not achieve a high likelihood for the training data [16].
In 2008, Tijmen Tieleman proposed the Persistent Contrastive Divergence (PCD) algorithm [17], which remedies the deficiency of CD: it is as efficient as CD and does not violate maximum likelihood learning. In addition, the RBM obtained by PCD training has a more powerful pattern generation capacity. In 2009, Tieleman and Hinton further improved PCD [18] and proposed the Fast Persistent Contrastive Divergence (FPCD) algorithm, in which a group of auxiliary parameters is introduced to improve the mixing rate of the Markov chain in PCD. Another group of parameters, which are called Fast Weights and denoted by
RBM learning algorithms based on tempered MCMC sampling have also appeared in recent years. A parallel tempering algorithm for RBMs is introduced in [19]. This algorithm maintains a state for each distribution at a given temperature; during state transitions, the low-temperature state can be swapped with a high-temperature state. In this way, the low-temperature chain has a good chance of reaching a remote mode, so the whole distribution can be sampled. In 2014, Xu et al. proposed a tempering-based MCMC method, Tempered Transition [20], to learn the RBM model. Its main idea is to maintain the current state in the target distribution; when a new state is proposed, state transitions are carried out step by step from low to high temperature, which weakens the pull of the current mode on the state, and a final group of transitions from high back to low temperature is conducted until the original temperature is restored. In essence, both algorithms improve RBM training efficiency by adopting tempering-based MCMC sampling [21].
3. Image Compression Using Multilayer RBM Network
In a WSN, the data transmission process can be divided into two parts: the data compression encoding process and the data decoding process. The image sending and receiving process in a WSN is shown in Figure 1.

The image sending and receiving process.
The basic idea of the multilayer based RBM network data compression encoding method is as follows: firstly, an image whose pixel is
Then, the image matrix

Basic idea of image compression using multilayer RBM.
The bottom input layer consists of
Image decoding is the inverse of the image compression coding process. The compressed image is input into the topmost layer and decoded layer by layer; finally, the bottom layer outputs the reconstructed image.
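The layer-by-layer coding and decoding described above can be sketched as follows. This is a simplified, deterministic illustration with hypothetical layer sizes, mean-field probabilities in place of stochastic units, and untrained weights; it shows only the shape of the data flow, not the paper's trained network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class StackedRBMCodec:
    """Each layer holds a weight matrix W plus hidden/visible biases.
    Layer sizes shrink toward the top, whose activations form the
    compressed code; decoding runs the layers in reverse."""
    def __init__(self, layer_sizes, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.layers = [
            (rng.normal(0.0, 0.01, (n_in, n_out)),  # W
             np.zeros(n_out),                       # hidden bias
             np.zeros(n_in))                        # visible bias
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
        ]

    def encode(self, v):
        # Forward pass: image vector -> progressively smaller codes.
        for W, b_h, _ in self.layers:
            v = sigmoid(v @ W + b_h)
        return v

    def decode(self, h):
        # Inverse pass: top-layer code -> reconstruction at the bottom.
        for W, _, b_v in reversed(self.layers):
            h = sigmoid(h @ W.T + b_v)
        return h

codec = StackedRBMCodec([64, 32, 16])   # hypothetical sizes
code = codec.encode(np.random.default_rng(1).random(64))
recon = codec.decode(code)              # same length as the input: 64
```

The 64-to-16 shrinkage here plays the role of the compression ratio discussed in Section 5.2.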
The main part of the improved image compression coding method is RBM. It is essential to improve the likelihood of RBM for image data so as to ensure high similarity between the original image and the image after decoding. Therefore, we have improved the training method of RBM.
4. An Improved RBM Algorithm Based on Alternative Iteration Algorithm
4.1. The RBM Model
An RBM can be viewed as an undirected graph model [22]. As shown in Figure 3,

Graph of RBM model.
Assume that there are
The object of RBM learning process is to determine
The stochastic gradient ascent is usually used to get the optimum parameter
From current research, the approximate value of the joint distribution can be obtained by sampling methods such as Gibbs sampling [24] and CD. However, most of these methods make RBM training very complex because of frequent state transitions and large quantities of sampling. We propose applying the alternative iteration algorithm to RBM training: when the model parameter cannot be calculated directly because of other unknown parameters, alternative iteration can obtain the maximum likelihood estimates of these parameters through an iterative strategy.
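For reference, the sampling-based estimation discussed above relies on block Gibbs transitions between the visible and hidden layers. A minimal sketch for binary units follows; the variable names and parameter values are ours, chosen only for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b_h, b_v, rng):
    """One block Gibbs transition for a binary RBM: sample h ~ P(h|v),
    then a new visible state v' ~ P(v|h). Chaining these steps is how
    MCMC-based trainers draw the model-side samples for the gradient."""
    p_h = sigmoid(v @ W + b_h)                  # P(h_j = 1 | v)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b_v)                # P(v_i = 1 | h)
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (6, 3))                # 6 visible, 3 hidden units
v = (rng.random(6) < 0.5).astype(float)
v, h = gibbs_step(v, W, np.zeros(3), np.zeros(6), rng)
```

The cost the paper objects to is exactly the need to repeat this transition many times per gradient update.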
4.2. The Alternative Iteration
The alternative iteration algorithm is a common method to solve optimization problems. For instance, there is a maximization problem
The alternative iteration algorithm is adopted in this paper to solve the problem of RBM training. The likelihood function
The traditional way of getting the value of
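The alternating strategy can be illustrated on a toy maximization problem: each variable is updated in turn while the other is held fixed. This is only a generic coordinate-ascent sketch with a made-up objective, not the paper's specific subfunctions for the normalizing and model parameters:

```python
def alternating_maximize(grad_x, grad_y, x, y, lr=0.1, iters=500):
    """Coordinate ascent: climb in x with y held fixed, then climb in y
    with the updated x held fixed, and repeat."""
    for _ in range(iters):
        x = x + lr * grad_x(x, y)   # subproblem 1: y fixed
        y = y + lr * grad_y(x, y)   # subproblem 2: new x fixed
    return x, y

# Toy objective f(x, y) = -(x - 1)^2 - (y - 2)^2, maximized at (1, 2).
gx = lambda x, y: -2.0 * (x - 1.0)
gy = lambda x, y: -2.0 * (y - 2.0)
x, y = alternating_maximize(gx, gy, 0.0, 0.0)
```

In the paper's setting, x plays the role of the normalizing parameter and y the model distribution parameter, with each subproblem solved while the other quantity is treated as known.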
4.3. The Process of Calculating RBM Parameters by Alternative Iteration
Assume that there is a group of training samples
When the algorithm begins, the model parameter is firstly initialized by
When sample
Multiply the denominator and numerator of the right fraction in (6) by
We can deduce from the Jensen inequality and the property concave function the following equation:
Equation (8) is true if and only if
According to
When
The normalized parameter can be estimated and we get a value
At this time, we need to choose the equation to calculate
We can get the marginal distribution of the joint probability distribution
Then, we keep
However, the initial value
The improved RBM algorithm is described in Algorithm 1.
Setting the convergence threshold
Input: the value of the model parameter obtained by pretraining; the number of iterations
Output: the final value of the model parameter
For (outer loop over alternations)
    For (inner loop over iterations)
        Computing (the estimate of the normalizing parameter)
        Computing (the update of the model parameter)
        Judging whether the reconstruction error of the model reaches the threshold
    End for
    Judging whether the difference of the likelihoods of the two RBMs defined by the adjacent parameters is within the threshold
End for
5. Simulation Experiments and Results Analysis
The experiment consists of three parts: the performance analysis of RBM; the analysis of the compression performance of the proposed image compression method and the evaluation of reconstructed image quality; the analysis of energy consumption in WSNs when multilayer RBM network image compression method is used. MATLAB 2013a is used to carry out the simulations.
5.1. Performance Analysis of the Improved RBM
The datasets of our experiment are the well-known MNIST handwritten digit database [25] and a toy dataset. The MNIST dataset consists of 50,000 groups of training samples and 10,000 groups of testing samples. Each group of samples consists of a grayscale image whose resolution is

Part of samples of MNIST database.
Compared with the MNIST dataset, the toy dataset is simpler and lower dimensional. It consists of 10,000 images. Each image has
We compare the proposed algorithm with PCD algorithm, parallel tempering algorithm (PT-K) [27], and parallel tempering with equienergy moves (PTEE) [28] in the experiments. In PT-K,
We evaluate the quality of each algorithm by the likelihood of the RBM for the training data, using two methods: the reconstruction error and exact computation by enumerating the states of the hidden units.
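The reconstruction error used in the first evaluation method can be computed as below. This sketch assumes the common mean-field definition (propagate the data up to the hidden layer and back down once, then average the squared difference); the paper may use a slightly different variant:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruction_error(V, W, b_h, b_v):
    """Mean-field reconstruction error over a batch V (rows are samples):
    map the data to hidden probabilities, map those back to visible
    probabilities, and average the squared difference from the input."""
    P_h = sigmoid(V @ W + b_h)
    V_recon = sigmoid(P_h @ W.T + b_v)
    return float(np.mean((V - V_recon) ** 2))

# With all parameters zero the reconstruction is uniformly 0.5,
# so binary data gives an error of exactly 0.25.
V = np.array([[0.0, 1.0], [1.0, 0.0]])
err = reconstruction_error(V, np.zeros((2, 3)), np.zeros(3), np.zeros(2))
```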
Firstly, we compare the reconstruction errors of the four algorithms with different numbers of hidden units on the MNIST dataset and the toy dataset. The first 30,000 groups of samples in MNIST are divided into three parts, each containing 10,000 groups of samples. The number of hidden units is set to 10, 15, 20, 25, 30, 50, 100, 200, 250, 300, and 350. The number of iterations on each part ranges from 1 to 45, and the average reconstruction errors of the three parts after 15 and 30 iterations are shown, respectively, below. The experiments on the toy dataset are then executed with the number of hidden units set to 10, 15, 20, 25, 30, 50, 100, 150, 200, 250, and 300. Results obtained by PT-10, PTEE-10, PCD, and the proposed algorithm are shown in Figures 5–8.

The reconstruction errors of the four algorithms after 15 times of iterations on the MNIST dataset.

The reconstruction errors of the four algorithms after 30 times of iterations on the MNIST dataset.

The reconstruction errors of the four algorithms after 15 times of iterations on the toy dataset.

The reconstruction errors of the four algorithms after 30 times of iterations on the toy dataset.
From Figures 5 and 6, we can see that the average reconstruction error of the proposed algorithm is always smaller than the other three algorithms on the MNIST dataset. And we can get similar results from Figures 7 and 8.
Figures 5–8 show that the reconstruction errors of all four algorithms decrease as the number of hidden units increases. With few hidden units, the reconstruction error of the proposed algorithm is close to that of the other three algorithms; as the number of hidden units grows, however, the superiority of the proposed algorithm gradually appears. We can also compare how much PT-10, PTEE-10, and the proposed algorithm reduce the average reconstruction error relative to PCD with the same number of hidden units on the MNIST and toy datasets. When the number of hidden units is 350, after 30 iterations, the reconstruction error of the proposed algorithm is 26.60% lower than that of PCD on the MNIST dataset; under the same conditions, PT-10 is 8.20% lower and PTEE-10 is 16.64% lower than PCD.
Next, a small-scale experiment with 15 hidden units is conducted on the MNIST dataset. The log-likelihood can be obtained exactly by enumerating the states of the hidden units, so high accuracy can be achieved. Figure 9 shows the log-likelihood averaged over training each model 5 times.
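With as few as 15 hidden units, the partition function, and hence the exact log-likelihood, is tractable by summing over all hidden configurations while marginalizing the visible units analytically. A sketch of that computation for a binary RBM (our own formulation, using hypothetical small sizes for the demo):

```python
import numpy as np
from itertools import product

def log_Z_by_hidden_enumeration(W, b_v, b_h):
    """Exact log partition function of a binary RBM, summing over all
    2^n_h hidden configurations. Feasible only for small hidden layers."""
    log_terms = []
    for bits in product([0.0, 1.0], repeat=len(b_h)):
        h = np.array(bits)
        # log of e^{b_h . h} * prod_i (1 + e^{b_v_i + (W h)_i})
        log_terms.append(b_h @ h + np.sum(np.logaddexp(0.0, b_v + W @ h)))
    log_terms = np.array(log_terms)
    m = log_terms.max()
    return m + np.log(np.sum(np.exp(log_terms - m)))  # stable log-sum-exp

def avg_log_likelihood(V, W, b_v, b_h):
    """Average exact log p(v) over the rows of V."""
    log_Z = log_Z_by_hidden_enumeration(W, b_v, b_h)
    # Negative free energy: b_v . v + sum_j log(1 + e^{b_h_j + (v W)_j})
    neg_free_energy = V @ b_v + np.sum(np.logaddexp(0.0, b_h + V @ W), axis=1)
    return float(np.mean(neg_free_energy - log_Z))

rng = np.random.default_rng(0)
n_v, n_h = 6, 4
W = rng.normal(0.0, 0.1, (n_v, n_h))
V = (rng.random((5, n_v)) < 0.5).astype(float)
ll = avg_log_likelihood(V, W, np.zeros(n_v), np.zeros(n_h))
```

A useful sanity check: with all parameters zero the model is uniform over visible states, so the average log-likelihood equals -n_v log 2 exactly.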

The average likelihood of the four algorithms when there are 15 hidden units on the MNIST dataset.
Figure 9 shows that within the first 10,000 parameter updates the likelihood of the proposed algorithm is not as good as that of the other three algorithms. However, as the number of updates increases, the likelihood of the proposed algorithm for the data also increases and eventually exceeds that of PTEE-10. When the number of updates reaches 30,000, the likelihood of PCD peaks and then decreases, because the number of Gibbs transitions increases while the model distribution becomes steeper and steeper. The PT-10 algorithm is a tempering-based Monte Carlo method: its distribution becomes more even as the temperature rises, so it can overcome steep distributions by conducting state transitions from low to high temperatures, and it therefore performs better than PCD. PTEE-10 introduces a new type of move, the equienergy move, which improves the swap rates between neighboring chains to some extent. But after

The average likelihood of the two algorithms when there are 10 hidden units on the toy dataset.
Moreover, we also recorded the running time of different algorithms on one epoch and on one training sample when they are applied to the toy dataset and the MNIST dataset. All experiments were conducted on a Windows operating system machine with Intel® Core
The running time (in seconds) of different algorithms when they are applied to toy dataset and MNIST dataset.
Based on the results in Table 1, the running time of our proposed algorithm is less than that of PT-10 and PTEE-10.
All the simulation results above show that the reconstruction error of the proposed algorithm is better than that of PCD, PT-10, and PTEE-10. Under the same conditions, PT-10 and PTEE-10 perform better than PCD but take roughly ten times as long. The experimental results obtained on the MNIST dataset show that the proposed algorithm outperforms the other three algorithms, which is also validated on the toy dataset. Moreover, the proposed algorithm takes only 2 to 3 times as long as PCD.
5.2. Performance Analysis of the Multilayer RBM Network Image Compression Method
In this section, the
In this experiment, ROI image compression method [14] is compared with the proposed algorithm. The compression ratio of the ROI compression method is calculated based on the region of interest:
In the proposed algorithm, the compression ratio is determined by the number of neural units in hidden layer:
During the experiment, the number of hidden layer units
The SNR and PSNR of Lena image when using multilayer RBM network compression algorithm and interest based compression algorithm.
The compression ratio is in inverse proportion to the number of hidden units
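For reference, the SNR and PSNR reported in Table 2 are presumably the standard definitions, which can be computed as follows for 8-bit images (our sketch, not code from the paper):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak = 255 for 8-bit images)."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def snr(original, reconstructed):
    """Signal-to-noise ratio in dB: signal power over error power."""
    err = original.astype(float) - reconstructed.astype(float)
    return 10.0 * np.log10(np.sum(original.astype(float) ** 2)
                           / np.sum(err ** 2))

orig = np.full((4, 4), 100.0)
recon_demo = orig + 10.0      # uniform error of 10 gray levels
p = psnr(orig, recon_demo)    # about 28.1 dB
s = snr(orig, recon_demo)     # exactly 20 dB
```

Note that PSNR normalizes by the fixed peak value while SNR normalizes by the actual signal power, so the two metrics can rank reconstructions differently on dark versus bright images.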
From Table 2, we can also conclude that the reconstructed image quality of multilayer RBM networks is superior to that of ROI compression method under the same compression ratio. The ROI compression algorithm can compress the interest region and the background region, respectively, and therefore it can get high compression ratio. But the overall reconstructed image quality is not good because of the high compression ratio of background region. In multilayer RBM networks, the compression ratio can be improved by setting the number of neural units in each layer of RBM. In addition, the training process in a multilayer RBM network is layered. The data from the bottom input layer to the bottom hidden layer is the first compression. The data from the first hidden layer to the second hidden layer is the second compression. The second compression is based on the first compression. RBM in each layer will compress the image and greatly remove the redundancy of the original image.
5.3. The Energy Consumption Analysis of Wireless Sensor Network
In this section, the energy consumption of a WSN is analyzed in the aspect of image transmitting. The energy consumed during the transmitting process can be calculated using the formula below:
In the process of simulation, when cluster head nodes receive images, they transmit these images to coding nodes to carry out compression coding. Then, the compressed image is assigned to the transmitting node. In the experiment, we calculate the energy consumption of every transmitting node. We compare the proposed algorithm with the ROI lossy-lossless image compression method under three conditions: (1) no image compression methods are used; (2) only the multilayer RBM network compression method is used; (3) only the ROI compression method is used. We compare the energy consumption of transmitting nodes under the three conditions. In conditions (2) and (3), we compare the energy consumption of the two algorithms under the same SNR.
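The transmission-energy formula itself is elided above; WSN energy analyses commonly assume the first-order radio model, sketched below with typical parameter values from the literature (an assumption on our part, not necessarily the paper's exact formula or settings):

```python
def transmit_energy(k_bits, d, E_elec=50e-9, eps_fs=10e-12,
                    eps_mp=0.0013e-12, d0=87.0):
    """First-order radio model: a per-bit electronics cost plus an
    amplifier cost growing as d^2 below the crossover distance d0
    (free-space model) and as d^4 above it (multipath model).
    Energies in joules, d in meters."""
    if d < d0:
        return k_bits * E_elec + k_bits * eps_fs * d ** 2
    return k_bits * E_elec + k_bits * eps_mp * d ** 4
```

Under this model, transmission energy is linear in the number of bits sent, which is why a higher compression ratio translates directly into energy savings at the transmitting nodes.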
The parameter settings are as follows:

The energy consumption of transmitting nodes.
Figure 11 shows that more energy is consumed as the transmitting distance increases. At the same transmitting distance, the energy consumption of transmitting nodes using the multilayer RBM network is clearly smaller than that using the ROI compression method. Although the ROI compression method can code the interest region and background region separately and obtain a high compression ratio, it cannot ensure a high-quality reconstructed image. The multilayer RBM network compression method, in contrast, reduces data redundancy in every layer and therefore achieves a high compression ratio while preserving reconstruction quality.
We continue to find out the relationship between image compression performance and compression energy consumption. The image compression energy consumption can be calculated using the formula below:
We continue to test the total energy consumption in WSN when the three image compression algorithms are used, respectively, and Figure 12 shows the results.

Total energy consumption in WSN.
Figure 12 shows that although RBM training process extends
6. Conclusions and Future Work
Image compression is an important research field in WSNs. It is difficult to find a comprehensive image compression method because of the complex features of sensor networks. A multilayer RBM network based image compression method is proposed in this paper, and an improved RBM training algorithm based on alternative iteration is presented to improve the likelihood of the RBM. However, many problems remain when using a multilayer RBM network to compress images. The multilayer RBM network can affect the delay in the sensor network, and a more suitable normalizing parameter function should be sought for the RBM training process. Besides, the problem of finding routing paths should also be considered. We therefore endeavor to develop a more integrated image compression method so as to accelerate the application of WSNs in real life.
