Abstract
Introduction
With the continuous development of computer network technology, more and more data, especially the multimedia data (images, audios, and videos) captured by the sensors, are uploaded to cloud platforms by users to alleviate storing and computing burdens. 1 However, some unexpected results may happen under this circumstance, such as some illegal individuals or groups infringe copyright, disclose privacy, steal commercial confidential information, and so on. Therefore, how to manage users’ data in the complex network environment with a privacy-preserved way has become an urgent problem for the enterprises. 2
Digital image is one of the most typical multimedia. An important feature of digital images is data redundancy, which means that strong correlations always exist in the spatial domain of natural images. This is also the reason why images can be compressed effectively. In addition, because the information entropy of the image is generally far from the maximum value, data hiding in the host image is possible. Data hiding has been considered as an effective way to manage the host media in terms of data authentication, protection, classification, retrieval, and so on.3–5 Generally, there are two criterions to evaluate the performance of data hiding. One is data embedding capacity, which can be measured by bit per pixel (bpp). The other is the quality of stego-image, which can be measured by peak signal-to-noise ratio (PSNR).2,6
With the aim of preserving privacy, a growing number of original data is encrypted before uploading it onto the cloud.7–10 Therefore, data hider needs to embed additional information into encrypted images directly without knowing any information about the original data, which is called data hiding in encrypted image.2,6,11–16 Compared with traditional data hiding in plaintext image, few redundancy spaces can be used to embed additional data, since the information entropy of the host media is close to the maximum value after encryption. In other words, encryption brings new challenge to data hiding. 17 To perform data hiding in encrypted images effectively, additional space should be vacated in the host image. According to the sequence of creating additional space and encryption, existing embedding mechanisms in encrypted image can be classified into reserving room before encryption (RRBE)11,15,18 and vacating room after encryption (VRAE).12–14,19,20 However, both of the two schemes have disadvantages. RRBE-based embedding mechanism demands extra pre-processing operation for the content owner before image encryption, and VRAE-based embedding mechanism has greater computational complexity due to the maximized entropy after encryption. In contrast, the proposed scheme aims to encrypt image and reserve room simultaneously for effective data hiding.
Image encryption techniques have been greatly advanced recently. 21 One of the most attractive signal-processing techniques today is compressive sensing (CS), which is used for efficient signal acquiring and reconstructing. The principle of CS is that a sparse signal can be recovered by optimization from far fewer samples than the required according to the Nyquist–Shannon sampling theorem. 22 CS can also be designed with cryptosystem property to achieve signal compression and encryption simultaneously,23,24 and it is especially suitable for the resource restricted wireless sensors network (WSN). 25 However, data hiding in CS domain is difficult because the information entropy of the signal obtained by CS has increased greatly. Many solutions combining CS with data hiding have been proposed.26–31 Some of them are based on the segmentation or pre-treatment of the host signal,27–30 while others processed the measurements directly.26,31 As stated before, reversible data hiding can only be achieved under the condition that the information entropy of host media has not reached the maximum value. In other words, additional space must be vacated in the encrypted domain before data hiding, and it is better to achieve CS-based encryption and vacating room simultaneously.
Motivated by the requirements of the privacy and fidelity of the host image and high embedding capacity of the additional data, this work proposes a novel privacy-preserved data-hiding scheme. The original image is pre-processed and divided into multiple classes through fuzzy C-means (FCM) clustering algorithm by image owner, and then all classes are classified into two major categories according to the given threshold. The category where the absolute differences of each class are greater than the threshold is encrypted via exclusive-or operation. The other category is encrypted and compressed synchronously by CS technology, where the measurement matrix is served as encryption key. The whole encrypted and compressed image generated will be transmitted to data hider to embed additional information with data-hiding key. After obtaining the marked, encrypted image, the receiver can extract additional information exactly and recover original image with satisfactory visual quality. The proposed scheme can achieve secure transmission of secret data, protection and management of host images by applying data hiding in the CS domain.
The contributions of our scheme are summarized as follows: (1) The original image is divided into multiple classes, but only partial classes whose the absolute difference less than the threshold are compressed to vacate room for additional data. Thus, the visual quality of recovered image can be well guaranteed. (2) The combination of FCM clustering algorithm and CS technology makes full use of the sparsity of original image to achieve higher embedding capacity. (3) By combining CS with cryptosystem, data privacy can be protected effectively while making room for embedding additional information. (4) The result of classification is determined by class number and threshold, and the size of vacated capacity is controlled by CS sampling rate; therefore, the visual quality of the recovered image and the embedding rate of the host image can be adjusted according to specific requirements.
The remaining article is arranged as following. Relevant knowledge is shown in section “Relevant knowledge.” The details of our scheme are introduced elaborately in section “Proposed scheme.” Experimental results and performance comparisons with other related works are given in section “Experimental results and performance analysis.” Finally, section “Conclusion” concludes our work.
Relevant knowledge
Tent-logistic chaotic system
Chaos system is extremely sensitive to initial state and regulatory parameters, so any slight change of initial state can be exponentially amplified. 32 Tent-logistic chaotic system (TLCS) is a combination of two common one-dimensional (1D) chaotic systems, that is, tent map and logistic map, which are defined by equations (1) and (2) respectively
where parameters
The chaos range of TLCS is much larger than single logistic or tent system, and it can realize more unpredictability chaotic performance.
CS technology
CS is a novel signal sampling technology described in detail by Donoho. 33 The signal can be sampled randomly by applying the sparse character of nature signal in certain transform domain under the condition that the sampling rate is much lower than Nyquist. Inversely, the original signal can be recovered through the nonlinear reconstruction algorithm accurately or with high probability.
Assume that the original signal
where
The process can also be shown in Figure 1.

CS sampling process.
Due to the number of measurements M≪N, the random measurement process of a signal is regarded as compression process. We can get the approximate value
where
FCM clustering algorithm
FCM is a fuzzy clustering algorithm based on objective function for data clustering analysis, the core idea of which is the objects divided into the same cluster having the largest similarity but different clusters possessing the smallest similarity. In addition, FCM as an improvement of the ordinary C-means algorithm is a flexible fuzzy division. 34 The explicit process about implementation of FCM is described as follows.
Suppose the sample set
which also can be written as
Equation (9) means the value of
where parameter
The implementation of the whole FCM clustering process is illustrated in Figure 2, which is also described step by step as follows.

The flow diagram of FCM clustering algorithm.
Proposed scheme
The proposed scheme mainly consists of image encryption, data hiding, data extraction, and image recovery. The overall framework of our scheme is illustrated in Figure 3. In the phase of image encryption, the original image is encrypted and compressed synchronously via traditional stream cipher and CS technology. As for data hiding, after receiving the whole processed image, data hider can embed additional information directly with data-hiding key to facilitate management and operation of the host data. In the end, the receiver can extract additional information exactly and recovery original image with satisfactory visual quality from the marked, encrypted image.

Overall framework of our scheme.
Image encryption
During this stage, the original image
The classification results of Lena with

Classification results through FCM clustering algorithm with Lena image. (a) Experimental result with parameters
In the process of image encryption, all pixels in the same class belonging to categories
where
Meanwhile, each class
where
Here, we set
where

Flow diagram of encryption process.
Data embedding in encrypted domain
After receiving the cipher image
where

Illustration of image partition and space reservation.
Data extraction and image recovery
In this section, there are three cases to handle data extraction and image recovery by receiver. (1) The decrypted operation can be accomplished only when encryption key is available. (2) Additional data can be extracted precisely only when data-hiding key is available. (3) Not only additional data can be obtained accurately, but also the decryption can be implemented when both encryption key and data-hiding key are available. The flowchart of this section is illustrated in Figure 7.

Process of data extraction and image recovery.
Only encryption key is available
In this stage, five steps should be executed to accomplish image decryption.
where
Only data-hiding key is available
In this situation, we can just extract all additional data perfectly with data-hiding key without knowing any information about original content. For example, in Figure 6, we can directly extract all information in domain
Both encryption key and data-hiding key are available
The receiver can not only decrypt cipher image and but also extract additional data with both encryption key and data-hiding key. The order of two operations is exchangeable and independent. That is, the receiver can decrypt cipher image with encryption key first by applying the procedure which is the same as section “Only encryption key is available,” and then extract additional data with data-hiding key as stated in section “Only data-hiding key is available,” and vice versa.
Experimental results and performance analysis
In this section, to show the effectiveness and superiority of our scheme, sufficient experimental results and comparisons are presented with a mass of standard images. Our experiment is conducted in a personal computer with a 3.20 GHz Inter i5 processor, 4.00 GB memory and MATLAB R2014a in windows 10 operating system. There are mainly two aspects of the experiment, that is, security analysis of cipher image, and performance analysis in terms of recovered image quality and embedding capacity.
Security of cipher image
Histogram analysis
Image histogram is an intuitive means to reflect the distribution status of pixels value in a gray image. In general, an effective encryption scheme should mask the pixel value distribution of original image and generate an absolutely uniform distribution. Histograms of multiple plain images and their corresponding cipher images are given in Figure 8 to show the security performance of our encryption scheme. Figure 8(a1)–(h1) represents the original images and the corresponding encrypted images with crowd, woman, peppers, and plane. Figure 8(a2)–(h2) are the histograms corresponding to Figure 8(a1)–(h1) with cluster number

Test mages and respective histograms. (a1) Crowd, (b1) encrypted crowd image, (c1) woman, (d1) encrypted woman image, (e1) peppers, (f1) encrypted peppers image, (g1) plane, (h1) encrypted plane image. (a2)–(h2) Histograms corresponding to (a1)–(h1).
Information entropy analysis
In 1949, Shannon proposed the concept of information entropy to solve information quantification and measurement problem. In general, information entropy is used for measuring the uncertainty by calculating the probability of a randomness variable appearing. The information entropy
where
Information entropy results.
Correlation analysis
An effective encryption scheme should obviously decrease the correlation between adjacent pixels of original image. 4000 pairs of adjacent pixels are selected randomly from original image and encrypted image to analyze the spatial correlations in three directions: horizontal, vertical and diagonal. They can be calculated according to equations (30)–(32)
where
Correlation coefficients with plain images and cipher images in three directions.

Correlation analysis. (a) Plain image, (b)–(d) correction distribution of (a) in horizontal, vertical, and diagonal directions, respectively, (e) cipher image of (a), (f)–(h) correction distribution of (e) in horizontal, vertical, and diagonal directions, respectively.
Recovered image and embedding rate
In this section, two typical criterions are applied to evaluate quality of recovered images, which are PSNR and structural similarity index measurement (SSIM). We can calculate PSNR and SSIM by equations (33)–(38)
where
where
The SSIM value of two images
Quality of recovered image
Figure 10 shows the experimental results of original images and recovered images with some standard test images sized

Original test images and recovered images under parameters

Recovered images of Barbara under parameters

Recovered images of peppers under parameters

Recovered images of lake under parameters
Embedding capacity
Tables 3–6 list the embedding rates and PSNR values of recovered images with test images Lena, baboon, lake, and milkdrop under different parameter values. We set the values of parameter
Embedding rate of Lena image with parameter
Embedding rate of baboon image with parameter
Embedding rate of lake image with parameter
Embedding rate of Milkdrop image with parameter
Maximum embedding rate in proposed scheme and some related literatures with different images.
Conclusion
In this article, we proposed a privacy-preserved data-hiding scheme in encrypted domain by FCM clustering and CS technologies. Low-correlated classes of the original image are encrypted by traditional stream cipher, and high-correlated classes are encrypted and compressed by CS. Additional data can be embedded into the room vacated by CS in the high-correlated classes. The receiver can extract the embedded data and recover the encrypted image in a flexible way according to different keys. Vast experimental results and comparisons show that the proposed scheme has higher security performance, larger embedding rate, and better visual quality of recovered image than other related works. However, an obvious deficiency is the indistinctive improvement of recovered image quality caused by CS reconstruction even when the embedding capacity is lower. Thus, we expect especially that a more precise and appropriate CS reconstruction algorithm can be explored to optimize the visual quality of reconstructed image.
