Camouflage Generative Adversarial Network: Coverless Full-image-to-image Hiding

Image hiding, one of the most important data hiding techniques, is widely used to enhance cybersecurity when transmitting multimedia data. In recent years, deep learning-based image hiding algorithms have been designed to improve the embedding capacity whilst maintaining sufficient imperceptibility to malicious eavesdroppers. These methods can hide a full-size secret image in a cover image, thus allowing full-image-to-image hiding. However, they face a trade-off between the detectability of the container image and the recovery quality of the secret image. In this paper, we propose Camouflage Generative Adversarial Network (Cam-GAN), a novel two-stage coverless full-image-to-image hiding method, to tackle this problem. Our method hides the secret image through image synthesis, which avoids using a modified cover image as the container and thus enhances both imperceptibility and the recovery quality of secret images. Our experimental results demonstrate that Cam-GAN outperforms state-of-the-art full-image-to-image hiding algorithms on both aspects.


I. INTRODUCTION
Image hiding [1] is an important data hiding technology that conceals a secret message in images so that sensitive information can be transmitted over public communication channels, and many image hiding algorithms have been developed to enhance data security [2], [3]. The design of these algorithms focusses mainly on optimising a trade-off between increasing the hidden capacity to embed secret images and preserving their imperceptibility to a third party.
Conventional image hiding techniques, such as LSB [4], HUGO [5], WOW [6], and S-UNIWARD [7], embed a secret message in the less significant components of either the spatial or the frequency domain of an image to achieve imperceptibility. These conventional methods unavoidably leave traces of the modification, which may be detected by advanced steganalysis tools.
Coverless image hiding via texture synthesis has become one of the key mechanisms to solve this problem [8]-[11]. These methods design mapping functions based on convolutional neural networks to transfer secret images into a synthesised texture form. Consequently, the embedding stage of the classical image hiding pipeline is not required, making hidden messages much harder to detect in container images [8].
However, insufficient hidden capacity is one of the major issues of both conventional image hiding techniques and state-of-the-art coverless methods. None of these methods is capable of hiding a full-size secret image. This problem hinders further development of image hiding techniques to satisfy the increasing demand for protecting the privacy and security of large-size multimedia data.
Recently, several advanced deep learning-based methods have reached the capacity of a full-size secret image, thus supporting full-image-to-image hiding [12]-[15]. These approaches use convolutional neural networks to minimise both the error between the cover and container images and the reconstruction error of the secret image. However, an optimal balance between imperceptibility of the container image and reconstruction quality of the secret image is difficult to achieve in these end-to-end deep learning image hiding methods.
In this paper, we propose a two-stage coverless full-image-to-image hiding algorithm named Camouflage Generative Adversarial Network (Cam-GAN). It exploits the advantages of both coverless image hiding and deep learning-based techniques to achieve full-image-to-image hiding with insignificant reconstruction error of the secret image. The key contributions of Cam-GAN are as follows: (i) A coverless image hiding structure based on image synthesis is deployed to address the error trade-off for better imperceptibility and higher reconstruction quality of the secret image. (ii) A cycle consistent GAN is designed to ensure full-image size hidden capacity by synthesising realistic container images. To our knowledge, it is the first coverless image hiding algorithm that can achieve full-image size hidden capacity. (iii) Another refining GAN is introduced together with the cycle consistent GAN as the image hiding decoder to further improve the reconstruction quality of the secret image.
The remainder of the paper is organised as follows. In Section II, we explain the details of our proposed Cam-GAN method. Section III presents experimental results that demonstrate the superiority of our approach. Finally, conclusions are drawn in Section IV.

II. CAM-GAN NETWORK
As illustrated in Fig. 1, we propose a two-stage generative adversarial network (GAN) model for image hiding. The idea is mainly inspired by recent advancements of GANs [16], [17]. Our proposed framework comprises two components: an image hiding encoder and an image hiding decoder. The image encoder is composed of a generator $G_1$ in Cam-GAN Stage I, which synthesises a container image to hide secret images via a texture synthesis function. The image decoder is composed of the other generator, $G_2$, in the Cam-GAN Stage I network, and the generator $G_3$ in the Cam-GAN Stage II network to recover the secret image. In the following, we explain the details of our proposed architecture and its loss functions.

A. Cam-GAN Architecture
In Cam-GAN Stage I, an asymmetric cycle consistent GAN is trained to hide a secret image in a container image via an encoder generator $G_1$, and then recover the secret image with insignificant quality loss via a decoder generator $G_2$. Here, $G_2$ is an approximate inverse function of the generator $G_1$. The Arnold transformation [18] is applied to scramble the secret image before feeding it into $G_1$ in order to conceal the global structure of the secret image. The network architectures of $G_1$ and $G_2$ are illustrated in Fig. 2. They comprise four convolutional layers, nine residual network (ResNet) blocks and four deconvolutional layers. This architecture has been successfully deployed in various image transfer tasks [17]. Furthermore, a discriminator $D_1$ is used to make the container image hard to distinguish from the gallery images. This discriminator is a CNN with four convolutional layers and one fully connected layer, as illustrated in Fig. 3. The output of the discriminator is a probability value indicating whether the input image belongs to the image gallery or is a synthesised image. Training is adversarial: the generators learn a mapping function that synthesises increasingly realistic images so that these images can fool the discriminator.
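The scrambling step can be illustrated with a minimal sketch of the classical Arnold cat map on a square image. The function names, the list-of-rows image representation and the `iterations` parameter are assumptions made for illustration; the text does not state which map parameters or how many iterations are used.

```python
def arnold_scramble(img, iterations=1):
    """Scramble a square image (list of rows) with the Arnold cat map."""
    n = len(img)
    if any(len(row) != n for row in img):
        raise ValueError("the Arnold map requires a square image")
    for _ in range(iterations):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Forward map: (x, y) -> (x + y, x + 2y) mod n
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_unscramble(img, iterations=1):
    """Invert arnold_scramble using the inverse matrix [[2, -1], [-1, 1]]."""
    n = len(img)
    for _ in range(iterations):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Inverse map: (x', y') -> (2x' - y', y' - x') mod n
                out[(2 * x - y) % n][(y - x) % n] = img[x][y]
        img = out
    return img
```

Because the map is a permutation of pixel positions, it changes only where pixels are, not their values, which is why it hides global structure without losing information. The map is also periodic in the number of iterations, so an iteration count below the period for the chosen image size is needed for the output to differ from the input.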
In Cam-GAN Stage II, a refining GAN is trained to improve the quality of the recovered secret image. The generator $G_3$ learns the mapping between the recovered and original secret images to improve the recovery quality, while a discriminator $D_2$ is employed to further enhance the reconstruction quality.
As illustrated in Fig. 4, $G_3$ uses a U-Net architecture with nine convolutional layers, nine deconvolutional layers, and shortcut connections between the convolutional and deconvolutional layers. The shortcuts reuse the feature representations generated at different layers by various convolutional kernels, which improves recovery quality. The discriminator $D_2$ shares a similar architecture with the discriminator $D_1$ (Fig. 3), although the first layer has a different input size.

B. Cam-GAN Loss Function
In Cam-GAN Stage I, the loss function has two terms, a reconstruction loss term $L_1$ and a discriminator loss term $L_2$, whose weighted combination gives the overall Stage I loss (3). Here, $x$ represents the secret image, $y$ one of the gallery images, $G_1$ and $G_2$ denote the two generators that encode and decode the secret image, $D_1$ is the discriminator that makes the synthesised container image more realistic, and $\lambda_1$ is a weight that balances the reconstruction and discriminator terms. By iteratively updating $D_1$, $G_1$ and $G_2$, the encoder and decoder generators converge to hide secret images via texture synthesis.
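One plausible form of the Stage I objective, consistent with the symbols defined above, is sketched here. It assumes the standard adversarial log-loss and an $\ell_1$ cycle-reconstruction term, as is common for cycle consistent GANs [17]; the choice of norm, the expectations and the min-max convention are assumptions rather than the original equations.

```latex
\begin{align}
L_1 &= \mathbb{E}_{x}\big[\lVert G_2(G_1(x)) - x \rVert_1\big], \\
L_2 &= \mathbb{E}_{y}\big[\log D_1(y)\big]
     + \mathbb{E}_{x}\big[\log\big(1 - D_1(G_1(x))\big)\big], \\
\min_{G_1, G_2}\,\max_{D_1}\; L_{\mathrm{I}} &= L_2 + \lambda_1 L_1 .
\end{align}
```

Under this reading, only the secret-to-container-to-secret cycle is penalised, which matches the description of the network as asymmetric.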
In Cam-GAN Stage II, the loss function also has two terms, a reconstruction term $L_3$ and a discriminator term $L_4$, which are combined into the overall Stage II loss. Here, $x$ represents the secret image, $\hat{x}$ the recovered secret image from the first stage, $G_3$ is the generator that refines the quality of the secret image, $D_2$ is the discriminator that pushes the recovered secret image towards higher quality, and $\lambda_2$ is a weight that balances the reconstruction and discriminator terms.
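The Stage II objective may be sketched analogously. The $\ell_1$ reconstruction term and the log-loss form are again assumptions chosen to match the symbol definitions above, with $\hat{x}$ denoting the Stage I recovery of the secret image $x$.

```latex
\begin{align}
L_3 &= \mathbb{E}_{x}\big[\lVert G_3(\hat{x}) - x \rVert_1\big], \\
L_4 &= \mathbb{E}_{x}\big[\log D_2(x)\big]
     + \mathbb{E}_{x}\big[\log\big(1 - D_2(G_3(\hat{x}))\big)\big], \\
\min_{G_3}\,\max_{D_2}\; L_{\mathrm{II}} &= L_4 + \lambda_2 L_3 .
\end{align}
```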

C. Cam-GAN Implementation
For the implementation, the weight parameters in the loss functions, $\lambda_1$ and $\lambda_2$, are set to 10 and 100, respectively, and a batch size of 1 is used for training. The asymmetric cycle consistent GAN is trained for 300 to 400 epochs, while the refining GAN is trained for a fixed 300 epochs. Both learning rates are set to 0.0002 with a linear decay after 150 epochs. To further enhance security, an individual key is used for each secret image to randomly select one of the asymmetric cycle consistent GAN models saved after different numbers of epochs (between 300 and 400).
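The training schedule and the key-based model selection described above can be sketched as follows. The function names are hypothetical, decaying linearly all the way to zero at the final epoch is an assumption (the text only states "linear decay"), and Python's `random.Random` is used purely for illustration since it is not a cryptographically secure generator.

```python
import random

BASE_LR = 0.0002     # learning rate stated in the text
DECAY_START = 150    # epoch after which the linear decay begins


def learning_rate(epoch, total_epochs, base_lr=BASE_LR, decay_start=DECAY_START):
    """Constant rate up to decay_start, then linear decay to zero at total_epochs."""
    if total_epochs <= decay_start:
        raise ValueError("total_epochs must exceed decay_start")
    if epoch <= decay_start:
        return base_lr
    remaining = max(total_epochs - epoch, 0)
    return base_lr * remaining / (total_epochs - decay_start)


def select_checkpoint_epoch(key, first=300, last=400):
    """Map a key to one Stage I checkpoint saved after epochs first..last.

    The same key always yields the same epoch, so sender and receiver
    can agree on the model without transmitting the epoch number.
    """
    return random.Random(key).randint(first, last)
```

For example, `learning_rate(225, 300)` is half the base rate under this schedule, and `select_checkpoint_epoch("some-key")` returns the same value on every call with that key.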

III. EXPERIMENTAL RESULTS

A. Experimental Setup
In our experiments, we first evaluate the image quality of the encoder and decoder of Cam-GAN both subjectively and objectively. We then test the performance of Cam-GAN against steganalysis. Because both image quality and hiding undetectability are directly affected by the size of the hidden message, a fair comparison requires benchmark methods with full-image-to-image hidden capacity; we therefore use the image hiding algorithms of [12] and [13]. Further, we compare the hidden capacity of Cam-GAN with [8], [10], [11] to demonstrate its superiority over state-of-the-art coverless image hiding algorithms.
Our dataset includes 400 gallery images of paintings [17] and 1200 secret images to hide. The image size of both gallery and secret images is 256 × 256 pixels. The secret images contain 400 facades [19], 400 faces [20] and 400 aerial images [17]; 800 of them are used for Cam-GAN training, while the other 400 are used for testing.

B. Image Quality Evaluation
We first evaluate the performance of the Cam-GAN image hiding encoder and decoder visually, and then provide quantitative comparison results of the reconstruction performance against the benchmark methods in Table I. Note that the proposed method is a coverless image hiding technique, and thus no cover image is needed in the hiding process. Fig. 5 and Fig. 6 illustrate that our proposed Cam-GAN can generate realistic container images. In addition, it yields the best reconstruction quality of secret images. As shown in Fig. 5, the residual cover images of both [12] and [13] clearly show patterns from the secret images that can be easily detected by visual analysis. In contrast, the synthesised painting images from Cam-GAN are visually similar to real paintings. The significant advantage of our proposed method is that, because no cover image is modified as in the other methods, the container image can better fool both human perception and machine steganalysis. As illustrated in Fig. 6, the residuals between recovered and original secret images are much smaller for Cam-GAN than for the other algorithms, demonstrating that Cam-GAN outperforms them in terms of recovery quality of the secret image.
To further demonstrate the effectiveness of our proposed Cam-GAN, an objective image quality comparison is performed based on pixel errors, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM).
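PSNR is defined as
$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \left( I(i,j) - \hat{I}(i,j) \right)^2},$$
where $I$ is the original image, $\hat{I}$ the recovered image, $H$ and $W$ are the image dimensions, and $(i, j)$ indicate pixel coordinates. SSIM is calculated as
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},$$
where $\mu_x$, $\mu_y$ are the averages of $x$ and $y$, $\sigma_x^2$, $\sigma_y^2$ are their variances, and $\sigma_{xy}$ is their covariance. $C_1$ and $C_2$ are balancing constants.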
Lower pixel errors, a higher PSNR and a higher SSIM indicate better recovery quality when measured on the secret image, and better image hiding imperceptibility when measured on the cover image.
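The three quality measures can be sketched in plain Python for single-channel images stored as lists of rows. This is a minimal illustration under stated assumptions: a peak value of 255 (8-bit images), the conventional constants $C_1 = (0.01 \cdot 255)^2$ and $C_2 = (0.03 \cdot 255)^2$, and SSIM computed once over the whole image rather than averaged over local windows as most library implementations do. The function names are hypothetical.

```python
import math


def _flatten(img):
    """Turn a list-of-rows image into a flat list of floats."""
    return [float(v) for row in img for v in row]


def mean_pixel_error(original, recovered):
    """Mean absolute difference between corresponding pixels."""
    a, b = _flatten(original), _flatten(recovered)
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def psnr(original, recovered, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    a, b = _flatten(original), _flatten(recovered)
    mse = sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM from global image statistics (a single window over the image)."""
    a, b = _flatten(x), _flatten(y)
    n = len(a)
    mu_x, mu_y = sum(a) / n, sum(b) / n
    var_x = sum((p - mu_x) ** 2 for p in a) / n
    var_y = sum((q - mu_y) ** 2 for q in b) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(a, b)) / n
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

For colour images, one option is to apply these functions per channel and average the results, which corresponds to the per-channel pixel errors reported in the text.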
The obtained results are shown in Table I, from which we can see that the average per-channel pixel errors of the cover images are 9.83 and 8.92 for [12] and [13], respectively. Our proposed Cam-GAN completely avoids using a modified cover image by synthesising new container images, which ensures higher security. Moreover, the average pixel error of the recovered secret image is 2.85 for Cam-GAN, which is significantly lower than for the other two methods, while we also obtain better results in terms of both PSNR and SSIM, clearly demonstrating superior recovery quality.

C. Steganalysis
Avoiding detection by automatic steganalysis tools is another important performance measure of image hiding methods. Following [12], we use a publicly available steganalysis tool, StegExpose [21], for this purpose. For each of Cam-GAN, [12] and [13], we perform detection on 400 clean images from [17] and 400 container images generated by that method (i.e., 800 images for each method).
A hypothesis test is used to evaluate the efficacy of the proposed method for two reasons: (1) visual comparison of receiver operating characteristic (ROC) curves is subjective, and it is hard to judge which algorithm performs better if the ROC curves of two methods overlap in many regions; (2) the curves become unstable when the test sample size is relatively small, since the area under the curve (AUC) can change significantly when samples are added or removed. The p-value of a hypothesis test provides an objective comparison through a reliable statistical indicator. Our null hypothesis $H_0$ is that the ROC curve obtained for an algorithm does not differ significantly from that obtained by random guessing. A p-value above 0.05 means that $H_0$ cannot be rejected at the 5% significance level, and a higher p-value indicates better imperceptibility. The obtained p-values are 0.098 for Cam-GAN, 0.0056 for [12], and 0.033 for [13]. For the two benchmark methods, the p-values are below 0.05 and thus provide evidence against the null hypothesis. Cam-GAN achieves the best result with a p-value larger than 0.05, so the null hypothesis that detection of our container images is statistically no different from random guessing cannot be rejected, confirming the hiding imperceptibility of our proposed algorithm.
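The text does not name the specific test, so the following is only one common way to obtain such a p-value: treat the AUC as a Mann-Whitney statistic and compare it with 0.5 using the normal approximation under $H_0$. The function names, the two-sided alternative and the absence of a tie correction are all assumptions of this sketch.

```python
import math


def auc_from_scores(pos_scores, neg_scores):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all score pairs."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def p_value_vs_random(auc, n_pos, n_neg):
    """Two-sided p-value for H0: AUC = 0.5 (detector equals random guessing)."""
    # Standard error of the AUC under H0 (Mann-Whitney variance, no ties).
    se = math.sqrt((n_pos + n_neg + 1) / (12.0 * n_pos * n_neg))
    z = (auc - 0.5) / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Here `pos_scores` would be the detector scores for container images and `neg_scores` those for clean images, with 400 of each in the setup described above. An AUC close to 0.5 then gives a large p-value, meaning the detector cannot be distinguished from random guessing.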

D. Hidden Capacity
Finally, we compare Cam-GAN with three coverless image hiding algorithms based on image synthesis [8], [10], [11] in terms of hidden capacity, as shown in Table II. Cam-GAN enlarges the hidden capacity significantly compared with these state-of-the-art coverless image hiding algorithms, and only our approach yields a capacity of 24 bits per pixel (bpp), which enables full-image-to-image hiding.
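The 24 bpp figure can be checked with a short calculation, assuming a three-channel secret image with 8 bits per channel that is hidden in a container image of the same $256 \times 256$ size:

```latex
\[
\text{capacity}
  = \frac{\text{secret bits}}{\text{container pixels}}
  = \frac{256 \times 256 \times 3 \times 8}{256 \times 256}
  = 24 \ \text{bpp}.
\]
```

In other words, every container pixel accounts for one complete pixel of the secret image.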

IV. CONCLUSIONS
In this paper, we have proposed Cam-GAN, a novel coverless full-image-to-image hiding algorithm. To the best of our knowledge, it is the first coverless image hiding algorithm that can ensure full-size image hiding capacity. Extensive experimental results have demonstrated that our proposed method also outperforms state-of-the-art full-image-to-image hiding algorithms. On the one hand, Cam-GAN can generate realistic container images that are difficult to distinguish from gallery images in order to hide the secret image. In this manner, image hiding imperceptibility is significantly enhanced. On the other hand, recovery of the secret image is accompanied by only insignificant image quality loss via a two-stage adversarial learning network. In future work, we will investigate the use of the proposed method in real-world applications.

Fig. 5 (panels): original secret images; original cover images; modified cover (container) images of [12]; residual cover images (×5) of [12]; modified cover (container) images of [13]; residual cover images (×5) of [13]; synthesised container images of Cam-GAN.
Fig. 6 (panels): original secret images; recovered secret images of [12]; residual secret images (×5) of [12]; recovered secret images of [13]; residual secret images (×5) of [13]; recovered secret images of Cam-GAN; residual secret images (×5) of Cam-GAN.