
Adaptive least significant bit matching revisited with the help of error images


Abstract

State-of-the-art steganographic schemes, such as highly undetectable stego (HUGO) and its extended version, are least significant bit (LSB)-based approaches that embed up to 1 bpp and are more concerned with the undetectability of the stego image than with the peak signal-to-noise ratio. The complexity of such methods is also quite high. In this work, a steganographic scheme in the spatial domain is proposed that takes advantage of error images, obtained by applying an image quality factor (the same as the one used in JPEG compression), to find the pixels where a slight change can be made. The secret bits are then embedded adaptively in these pixels using LSB matching revisited. We show that the proposed method is less detectable than HUGO and almost as undetectable as the extended HUGO, while it runs faster. Copyright © 2014 John Wiley & Sons, Ltd.

1 Introduction

Steganography conceals the very existence of any secret information in terms of bits. It usually embeds secret data into any type of digital cover media, such as an image, video, audio, and so on. The manipulated image or video looks innocent, and the message cannot be detected with the human eye. On the other hand, exposing the existence of any hidden information in a cover image is what steganalysis does. Such steganalytic algorithms are able to estimate the probable existence of secret bits in different ways. If steganalysis detects the hidden information with a minimum probability of testing error, the steganographic scheme has been broken.

There are two factors that have to be considered in designing a modern steganographic scheme, namely embedding rate and undetectability, with a trade-off between them: the higher the embedding rate, the greater the detectability. Some approaches are more concerned with embedding capacity and imperceptibility, as measured by higher peak signal-to-noise ratio (PSNR) values, but many others aim to be less detectable rather than to achieve higher PSNR values. Least significant bit (LSB) replacement [1] embeds the information in the LSB of a pixel, independent of its value: the LSB is directly replaced by the secret bit. This introduces unwanted statistical artifacts by which the existence of secret bits can be exposed. The artifacts appear as pairs of values in the histogram of a stego image produced by LSB replacement, which makes detection easy for a chi-square attack [2]. LSB matching (LSBM) [3] differs only slightly from LSB replacement: if the secret bit does not match the pixel's LSB, it randomly increments or decrements the pixel value according to a pseudo-random number generator. It is also called non-adaptive ±1 embedding. Unlike LSB replacement and LSBM, which deal with the pixel values independently (non-adaptively), LSBM revisited (LSBMR) [4] modifies the LSBM algorithm in such a way that the choice of incrementing or decrementing the pixel value is no longer random. It performs the operation by using a pair of pixels as a unit. The first pixel value changes in such a way that the first secret bit is stored in its LSB and the second secret bit equals a function of the two modified pixel values. Both LSBM and LSBMR are undetectable with a chi-square attack because, statistically, an increment is as probable as a decrement, whether it is chosen randomly or by using a function.
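The two single-pixel rules described above can be sketched as follows. This is a minimal illustration in Python, not code from the cited papers; the function names and the handling of the saturated values 0 and 255 are assumptions made for the example.

```python
import random

def lsb_replace(pixel: int, bit: int) -> int:
    """LSB replacement: overwrite the least significant bit with the secret bit."""
    return (pixel & ~1) | bit

def lsb_match(pixel: int, bit: int, rng: random.Random) -> int:
    """LSB matching (+/-1 embedding): keep the pixel if its LSB already equals
    the secret bit; otherwise add or subtract 1 at random.
    Assumption: at 0 and 255 the only in-range direction is taken."""
    if (pixel & 1) == bit:
        return pixel
    if pixel == 0:
        return 1
    if pixel == 255:
        return 254
    return pixel + rng.choice((-1, 1))

rng = random.Random(0)

# Replacement only moves a value inside its pair (2k, 2k + 1) ...
replaced = {lsb_replace(p, b) for p in (100, 101) for b in (0, 1)}

# ... whereas matching can also leave the pair (100 -> 99 as well as 100 -> 101).
matched = {lsb_match(100, 1, rng) for _ in range(200)}

print(sorted(replaced))
print(sorted(matched))
```

Because replacement never leaves the pair (2k, 2k + 1), the histogram bins of each pair are pulled toward each other, which is the artifact a chi-square attack looks for; matching spreads the changes across neighboring pairs.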
Although LSBM and LSBMR almost completely avoid the asymmetry artifacts of LSB replacement, they can still be detected by stronger steganalytic attacks. These LSB-based approaches do not consider the difference between a pixel and its neighbors. Edge adaptive image steganography [5] embeds secret bits based on the LSBMR method. It embeds in the edge regions first, as far as possible, while leaving smooth areas as they are. The maximum embedding capacity of this approach is limited to 1 bpp, while the visual quality and security of its stego images prove to be better than those of LSB-based and edge-based methods. Another approach is to use high-dimensional image models to perform highly undetectable stego (HUGO) [6]. The source code is available at the Break Our Steganographic System (BOSS) website [7]. This method calculates the distortions corresponding to modifying each pixel by +1 and by −1 and sets the stego pixel to the value with the smaller distortion. The embedding order runs from the pixels with the highest embedding cost to those with the lowest, as ascertained by an additive distortion function. The default parameters of the distortion function were σ = 1 and γ = 1, with the switch –T 90, which means that the distortion function was computed with the threshold T = 90 used in the BOSS challenge [7]. The security of HUGO is evaluated by training support vector machine (SVM)-based steganalyzers on second-order subtractive pixel adjacency model (SPAM) features [8]. A filter suppresses the stego image content and exposes the added noise. Dependencies between neighboring pixels are modeled as a higher-order Markov chain, and the resulting sample transition probability matrix is used as the feature vector. A second-order Markov chain yields the second-order SPAM, which comprises 686 features for a typical stego image.
In this work, the undetectability of the mentioned methods is benchmarked using the second-order SPAM as input features to state-of-the-art ensemble classifiers [9], which have proved to perform better than SVM-based steganalyzers in terms of both time and accuracy. The classifier has to be trained on a database of images to detect hidden information accurately, so BOSS [7] version 1.01 was used to create sufficient stego images. The BOSS database consists of 10 000 8-bit grayscale images of 512 × 512 pixels. Kodovský et al. [10] use T = 255 to remove a weakness of HUGO with the original threshold T = 90, which makes the algorithm vulnerable to first-order attacks because of an artifact in the histogram of pixel differences. They compared the detection error for six payloads (0.05, 0.1, 0.2, 0.3, 0.4, and 0.5 bpp) and the two settings of HUGO using the histogram features (dim 4), the SQUARE features (dim 338) [11], and a combination of both (SQUARE+ hx), which amounts to 342 features. HUGO embeds in those regions of the cover image that are hard to model, which is why it is more secure and less detectable than ±1 embedding [11].
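To make the SPAM feature count concrete, the following simplified Python sketch computes second-order transition probabilities of pixel differences for a single scan direction (left to right). It is an illustration only, not the feature extractor of [8]: the full SPAM set averages several horizontal, vertical, and diagonal directions, and the truncation threshold T = 3 is assumed here because it yields 2(2T + 1)^3 = 686 features. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

def spam_dimension(T: int = 3) -> int:
    """Two averaged direction groups, each with (2T + 1)^3 probabilities."""
    return 2 * (2 * T + 1) ** 3

def second_order_transitions(image, T: int = 3):
    """Estimate Pr(d3 = u | d2 = v, d1 = w) for consecutive horizontal
    differences d = I[i][j] - I[i][j + 1], with u, v, w in [-T, T].
    Returns a flat list of (2T + 1)^3 values ordered by (w, v, u)."""
    triples = Counter()  # occurrences of (w, v, u) with all three in range
    pairs = Counter()    # occurrences of (w, v) followed by any third difference
    for row in image:
        d = [row[j] - row[j + 1] for j in range(len(row) - 1)]
        for w, v, u in zip(d, d[1:], d[2:]):
            if abs(w) <= T and abs(v) <= T:
                pairs[(w, v)] += 1
                if abs(u) <= T:
                    triples[(w, v, u)] += 1
    values = range(-T, T + 1)
    return [triples[(w, v, u)] / pairs[(w, v)] if pairs[(w, v)] else 0.0
            for w, v, u in product(values, repeat=3)]

# A tiny one-row "image" whose differences alternate between -1 and 0.
features = second_order_transitions([[10, 11, 11, 12, 12, 13]])
print(len(features), spam_dimension())
```

One direction contributes 343 values; the two averaged direction groups of the full feature set give the 686 dimensions quoted in the text.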

The current work embeds in the spatial domain because of its algorithmic simplicity and ease of mathematical analysis. Also, spatial-domain techniques such as LSB-based approaches can carry larger messages (a higher embedding rate) than transform-domain techniques such as discrete cosine transform (DCT)-based embedding [12]. The reason is that transform-domain techniques can only embed in nonzero coefficients, whereas all pixels can be utilized in the spatial domain. Modern steganographic schemes are supposed to be undetectable, rather than to stress the PSNR value, so the current scheme is also evaluated for undetectability. The detectability level is measured by ensemble classifiers using the SQUARE+ hx features (dim 342) and the second-order SPAM features (dim 686). The algorithm uses LSBMR for embedding the secret bits; however, unlike the LSBMR method, the target pixels are chosen adaptively based on a pre-processing phase. This paper is organized as follows. Section 2 describes the proposed method. The experimental results are compared and evaluated with respect to modern and classical steganographic schemes in Section 3. Finally, Section 4 highlights and discusses the conclusions, based on the results.

2 Proposed method

The proposed method pre-processes the cover image sized M × N pixels in order to create an error image. The required error image is computed by applying a suitable image quality factor (IQF), like the one suggested in JPEG compression, as follows:

  1. Partition the cover image into non-overlapping blocks sized 8 × 8 pixels.
  2. Apply DCT to every block and quantize the coefficients using the JPEG standard quantization table, scaled by a JPEG IQF. The IQF can be any real number between 0 (the least expected quality or full substitution of pixel values) and 100 (identical image without any change).
  3. Apply inverse DCT to the matrix of coefficients resulting from DCT of each block.
  4. The compressed image resembles the original image. It consists of some imperceptible added noise. The amount of noise has a direct relation to compression level and embedding rate. The larger the embedding rate, the more noise.
  5. The noise becomes greater if a smaller IQF is employed. In this regard, for every pixel (i, j) of the M × N cover image CoverImg, where i ≤ M and j ≤ N:
display math(1)
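The pre-processing steps above can be sketched for a single 8 × 8 block as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' MATLAB implementation: the IQF is mapped to a quantization table with the libjpeg quality-scaling convention, the error is taken as the absolute difference between the cover block and its reconstruction, and all function names are hypothetical.

```python
import math

N = 8
# Standard JPEG luminance quantization table (ITU-T T.81, Annex K).
Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]
# Orthonormal DCT-II basis, C[u][x]; equivalent to the JPEG forward DCT for N = 8.
C = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
     for u in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def quant_table(iqf: float):
    """Scale the base table by the IQF (assumption: libjpeg convention, iqf > 0)."""
    scale = 5000.0 / iqf if iqf < 50 else 200.0 - 2.0 * iqf
    return [[min(255, max(1, int((q * scale + 50) / 100))) for q in row]
            for row in Q50]

def error_block(block, iqf: float):
    """Steps 2 and 3 for one block, then the per-pixel absolute error."""
    qt = quant_table(iqf)
    shifted = [[p - 128 for p in row] for row in block]
    coeffs = matmul(matmul(C, shifted), transpose(C))            # forward DCT
    dequant = [[round(coeffs[u][v] / qt[u][v]) * qt[u][v] for v in range(N)]
               for u in range(N)]                                # quantize, dequantize
    recon = matmul(matmul(transpose(C), dequant), C)             # inverse DCT
    return [[abs(block[i][j] - min(255, max(0, round(recon[i][j] + 128))))
             for j in range(N)] for i in range(N)]

flat = [[100] * N for _ in range(N)]                              # smooth block
textured = [[(i * 37 + j * 91) % 256 for j in range(N)] for i in range(N)]
print(max(map(max, error_block(flat, 50))))      # smooth block: no error
print(max(map(max, error_block(textured, 10))))  # textured block: visible error
```

A perfectly smooth block survives quantization unchanged, so its error is zero, while a textured block produces nonzero errors; this is the behavior that steers embedding toward textured regions.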

According to the pixel values of the error image, multiple bases are calculated for every corresponding pixel. Base, a matrix of size M × N, holds these multiple bases:

display math(2)

The whole procedure of embedding is illustrated in Figure 1. Following the embedding order, which is line by line from top to bottom of the cover image, the algorithm reads the corresponding Base value from the Base matrix. Pixels with a Base value less than 2 are skipped and left as they are; only pixels with a Base value greater than or equal to 2 are used for embedding. The secret bits are partitioned into non-overlapping 2-bit blocks, as in the LSBMR method, and for every 2-bit block, LSBMR [4] is applied to a unit of two eligible pixels to minimize the embedding effect. Note that, unlike in the conventional scheme, the two pixels of a unit need not be adjacent: there might be skipped pixels between them to which no embedding is applied.
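The selection rule in this paragraph can be illustrated with a short Python sketch. It only shows how eligible pixels are found and grouped into two-pixel units in raster order; the Base values are taken as given, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Position = Tuple[int, int]

def embedding_units(base: Sequence[Sequence[int]]) -> List[Tuple[Position, Position]]:
    """Scan the Base matrix line by line, keep positions with Base >= 2,
    and group consecutive kept positions into two-pixel units."""
    eligible = [(i, j) for i, row in enumerate(base)
                for j, b in enumerate(row) if b >= 2]
    # A trailing unpaired position (odd count) is left unused.
    return [(eligible[k], eligible[k + 1]) for k in range(0, len(eligible) - 1, 2)]

def two_bit_blocks(bits: Sequence[int]) -> List[Tuple[int, int]]:
    """Partition the secret bits into non-overlapping 2-bit blocks."""
    return [(bits[k], bits[k + 1]) for k in range(0, len(bits) - 1, 2)]

# Base values of the six pixels used in the numeric illustration of Table 1.
units = embedding_units([[0, 3, 1, 0, 2, 0]])
blocks = two_bit_blocks([0, 0])
print(units)   # one unit: the pixels in columns 1 and 4
print(blocks)  # one 2-bit block for that unit
```

With the Base values 0, 3, 1, 0, 2, 0, only the second and fifth pixels are eligible, so they form a single unit even though two skipped pixels lie between them.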

Figure 1.

Process of calculating Base matrix for 4 × 4 pixels of a typical cover image.

The structure of the proposed scheme is modeled in Figure 2. The embedding is repeated until every 2-bit block of secret bits is processed. If all pixels from the cover image have been processed and some of the secret bits have not been embedded yet, IQF has to be decreased toward 0.01. If the secret bits are still partially embedded, the cover image cannot take more secret bits, and that would be the maximum payload of the current method (up to 1 bpp).
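The capacity check described above can be sketched as a simple loop. This is an illustration under assumptions: the starting IQF of 60 and the step of 5 are hypothetical values chosen for the example, `count_eligible` is a hypothetical callback that returns the number of pixels with Base ≥ 2 for a given IQF, and each two-pixel unit is assumed to carry 2 bits.

```python
from typing import Callable, Optional

def choose_iqf(n_secret_bits: int,
               count_eligible: Callable[[float], int],
               start_iqf: float = 60.0,
               step: float = 5.0,
               min_iqf: float = 0.01) -> Optional[float]:
    """Lower the IQF until the eligible pixels can hold the whole message.
    Returns the first IQF that fits, or None if even min_iqf is not enough
    (the message exceeds the maximum payload of the cover image)."""
    iqf = start_iqf
    while True:
        capacity_bits = 2 * (count_eligible(iqf) // 2)  # 2 bits per two-pixel unit
        if capacity_bits >= n_secret_bits:
            return iqf
        if iqf <= min_iqf:
            return None
        iqf = max(min_iqf, iqf - step)

# Toy stand-in for the pre-processing phase: more eligible pixels at lower IQF.
toy_count = lambda iqf: int(1000 * (100 - iqf) / 100)

print(choose_iqf(500, toy_count))   # fits once the IQF has dropped far enough
print(choose_iqf(2000, toy_count))  # never fits in this toy cover
```

In a real run the callback would recompute the error image and the Base matrix for each candidate IQF, so a coarser step trades fewer recomputations for a possibly lower IQF than strictly necessary.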

Figure 2.

Structure of the proposed embedding method.

The algorithm extracts the secret bits from the stego image pixels using the same IQF and Base matrix. The stego pixels with a corresponding Base value greater than or equal to 2 indicate where secret bits are embedded. Using this embedding map (the Base matrix), every pair of secret bits can be easily extracted with the LSBMR algorithm.

The embedding procedure is illustrated in Table 1. Four pixels with grayscale values 80, 101, 157, and 213 are excluded because their corresponding Base values are either 0 or 1. The pixels with values 87 and 165 form one two-pixel unit that is given to LSBMR [4] for embedding (both secret bits are zero). The pixel with a value of 165 keeps its original value, while 87 becomes 86. In this case, one of the two pixels is spared from being changed, yet the message bits are still extractable. Finally, the extraction row shows the recovered bit values. According to the LSBMR algorithm, in the worst case, only one of the two pixels is incremented or decremented.

Table 1. A numeric illustration of embedding and extracting.

| Phase | Secret bits: only two bits can be embedded | - | 0 | - | - | 0 | - |
|---|---|---|---|---|---|---|---|
| Pre-processing phase | Cover image pixel values | 80 | 87 | 101 | 157 | 165 | 213 |
| | Corresponding Base numbers | 0 | 3 | 1 | 0 | 2 | 0 |
| Embed using LSBMR | Stego image pixel values | No embedding | 86 | No embedding | No embedding | 165 | No embedding |
| Extraction | Using LSBMR [4] | No extracting | LSB(86) = 0 | No extracting | No extracting | LSB(⌊86/2⌋ + 165) = 0 | No extracting |

LSB, least significant bit; LSBMR, least significant bit matching revisited.
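The numeric illustration can be reproduced with a short Python sketch of LSBMR on one two-pixel unit. The binary function f(a, b) = LSB(⌊a/2⌋ + b) is the one used by LSBMR [4]; the function names are hypothetical, and saturated pixels (0 or 255) are deliberately rejected because handling them needs extra care that this sketch leaves out.

```python
import random

def f(a: int, b: int) -> int:
    """LSBMR binary function: LSB(floor(a / 2) + b)."""
    return ((a // 2) + b) & 1

def lsbmr_embed(x1: int, x2: int, m1: int, m2: int, rng=random):
    """Embed bits (m1, m2) in the unit (x1, x2), changing at most one pixel by 1."""
    if not (0 < x1 < 255 and 0 < x2 < 255):
        raise ValueError("saturated pixels are not handled in this sketch")
    if (x1 & 1) == m1:
        # First bit already matches; fix the second bit through x2 if needed.
        y1 = x1
        y2 = x2 if f(x1, x2) == m2 else x2 + rng.choice((-1, 1))
    else:
        # Move x1 in the direction that also makes f carry the second bit.
        y1 = x1 - 1 if f(x1 - 1, x2) == m2 else x1 + 1
        y2 = x2
    return y1, y2

def lsbmr_extract(y1: int, y2: int):
    """Recover (m1, m2) from a stego unit."""
    return y1 & 1, f(y1, y2)

stego = lsbmr_embed(87, 165, 0, 0)
print(stego)                  # (86, 165), as in Table 1
print(lsbmr_extract(*stego))  # (0, 0)
```

For the unit (87, 165) and the secret bits (0, 0), the first pixel must change because LSB(87) = 1, and moving it down to 86 also gives LSB(⌊86/2⌋ + 165) = LSB(208) = 0, so the second pixel stays untouched.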

3 Experiment and results

One of the most important aspects of any performance evaluation is the use of a standard data set with a variety of image textures. The proposed scheme is evaluated on the BOSS version 1.01 image database, which consists of 10 000 grayscale images of 512 × 512 pixels and is also used to evaluate modern steganographic schemes with embedding rates less than or equal to 1 bpp. The proposed method was implemented and executed in MATLAB R2012a (MathWorks, California Office, 970 West 190th Street, Suite 530, Torrance, CA 90502, USA) on an Intel Core i5-2500, 3.3–3.6 GHz, with 8 GB RAM.

Modern steganography is more concerned with undetectability than with imperceptibility, and the PSNR value is assumed to be high in any case. In this regard, HUGO is a modern steganography method in which the LSBM concept is applied to manipulate the LSB of the pixels. The undetectability level is expressed as the probability of error of ensemble classifiers using second-order SPAM features and SQUARE+ hx features (dim 342). The lower the probability of error, the greater the chance of detection. HUGO [6] has been shown to have a greater probability of error than the edge adaptive method under SPAM features [11].

Table 2 shows the average testing error over 10 splits of the BOSS image database, calculated by an ensemble classifier using second-order SPAM features (dim 686). It can be seen that HUGO with T = 90 [6] has a greater testing error than the proposed method under these features. Table 3 shows that HUGO [6] is more vulnerable to SQUARE+ hx features (dim 342): its detection errors are smaller than those of extended HUGO [10] at every payload and smaller than those of adaptive LSBMR (A-LSBMR) with the help of error images at every payload except 0.1 bpp. Thus, against these features, the proposed A-LSBMR method is generally less detectable (higher average testing error) than HUGO with T = 90. Meanwhile, it comes close to the undetectability of extended HUGO with T = 255, particularly at 0.4 bpp.

Table 2. Detectability comparison between the proposed method and the HUGO (T = 90) [6] approach.
BOSS 1.01 database (10 000 images). Average testing error over 10 splits (pe) using second-order SPAM features (dim 686).

| Payload (bpp) | Capacity (bits) | HUGO (T = 90) [6] | A-LSBMR (PSNR, IQF) |
|---|---|---|---|
| 0.05 | 13 101 | 0.5000 | 0.3782 (65.28 dB, 60.0) |
| 0.1 | 26 214 | 0.4844 | 0.3224 (62.27 dB, 55.0) |
| 0.2 | 52 428 | 0.4469 | 0.2665 (59.25 dB, 50.0) |
| 0.3 | 78 643 | 0.4010 | 0.1963 (57.49 dB, 45.0) |
| 0.4 | 104 857 | 0.3600 | 0.1870 (56.24 dB, 40.0) |

BOSS, Break Our Steganographic System; SPAM, subtractive pixel adjacency model; HUGO, highly undetectable stego; A-LSBMR, adaptive least significant bit matching revisited; PSNR, peak signal-to-noise ratio; IQF, image quality factor.

Table 3. Detectability comparison between the proposed method, HUGO (T = 90), and HUGO (T = 255).
BOSS 1.01 database (10 000 images). Average testing error over 10 splits (pe) using SQUARE+ hx features (dim 342).

| Payload (bpp) | Capacity (bits) | HUGO (T = 90) [6] | Extended HUGO (T = 255) [10] | A-LSBMR (PSNR, IQF) |
|---|---|---|---|---|
| 0.05 | 13 101 | 0.3233 | 0.4432 | 0.3399 (65.28 dB, 60.0) |
| 0.1 | 26 214 | 0.2911 | 0.3993 | 0.2841 (62.27 dB, 55.0) |
| 0.2 | 52 428 | 0.2254 | 0.3262 | 0.2282 (59.25 dB, 50.0) |
| 0.3 | 78 643 | 0.1648 | 0.2630 | 0.1937 (57.49 dB, 45.0) |
| 0.4 | 104 857 | 0.1284 | 0.2008 | 0.1844 (56.24 dB, 40.0) |

BOSS, Break Our Steganographic System; HUGO, highly undetectable stego; A-LSBMR, adaptive least significant bit matching revisited; PSNR, peak signal-to-noise ratio; IQF, image quality factor.

Table 4 shows the global pe, which is the minimum average classification error over the feature sets for every embedding method. Generally, SQUARE features give the lowest testing error for both HUGO settings and for A-LSBMR. The reason is that SQUARE features are derived from third-order residuals and detect more accurately than features from first-order and second-order residuals (SPAM) for every method [11].

Table 4. The global pe of three embedding methods resisting second-order SPAM features (dim 686) and SQUARE+ hx features (dim 342).
BOSS 1.01 database (10 000 images). Average testing error over 10 splits (pe) using second-order SPAM features (dim 686) and SQUARE+ hx features (dim 342).

| Payload (bpp) | Capacity (bits) | HUGO (T = 90) [6] | Extended HUGO (T = 255) [10] | A-LSBMR |
|---|---|---|---|---|
| 0.05 | 13 101 | 0.3233 | 0.4432 | 0.3399 |
| 0.1 | 26 214 | 0.2911 | 0.3993 | 0.2841 |
| 0.2 | 52 428 | 0.2254 | 0.3262 | 0.2282 |
| 0.3 | 78 643 | 0.1648 | 0.2630 | 0.1937 |
| 0.4 | 104 857 | 0.1284 | 0.2008 | 0.1844 |

BOSS, Break Our Steganographic System; SPAM, subtractive pixel adjacency model; HUGO, highly undetectable stego; A-LSBMR, adaptive least significant bit matching revisited.
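As a check on how the global pe is obtained, the following Python sketch takes the per-payload minimum of the testing errors reported in Tables 2 and 3 for the two methods that appear in both tables. The numbers are copied from those tables; the variable names are illustrative.

```python
payloads = [0.05, 0.1, 0.2, 0.3, 0.4]

# Average testing errors under second-order SPAM features (Table 2).
spam = {
    "HUGO (T = 90)": [0.5000, 0.4844, 0.4469, 0.4010, 0.3600],
    "A-LSBMR":       [0.3782, 0.3224, 0.2665, 0.1963, 0.1870],
}
# Average testing errors under SQUARE+ hx features (Table 3).
square = {
    "HUGO (T = 90)": [0.3233, 0.2911, 0.2254, 0.1648, 0.1284],
    "A-LSBMR":       [0.3399, 0.2841, 0.2282, 0.1937, 0.1844],
}

# Global pe: the smaller error of the two feature sets, per payload and method.
global_pe = {
    method: [min(a, b) for a, b in zip(spam[method], square[method])]
    for method in spam
}

for method, errors in global_pe.items():
    print(method, dict(zip(payloads, errors)))
```

For both methods the minimum is always the SQUARE+ hx value, which is why the corresponding columns of Table 4 repeat those of Table 3.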

Table 5 shows that the proposed algorithm embeds about twice as fast as the HUGO methods [6, 10], even though A-LSBMR was run as MATLAB code, which is generally slower than equivalent code in C. That is to say, the proposed method could run even faster if it were implemented in C. Furthermore, the complexity of the A-LSBMR code is similar to that of classical schemes, so it can be implemented more easily. HUGO with threshold T = 255 has to compute more features, which results in a longer computation time, around two times that of the A-LSBMR method.

Table 5. Embedding time (in seconds) of HUGO (T = 90), HUGO (T = 255), and the proposed method (A-LSBMR) with the 512 × 512 grayscale image Lena as the cover.

| Embedding rate (bpp) | 0.05 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 |
|---|---|---|---|---|---|---|
| HUGO [6] | 4.32 | 4.37 | 5.21 | 5.27 | 5.63 | 5.83 |
| Extended HUGO [10] | 4.75 | 4.80 | 5.72 | 5.79 | 6.19 | 6.41 |
| A-LSBMR | 2.02 | 2.04 | 2.43 | 2.72 | 2.92 | 3.02 |

HUGO, highly undetectable stego; A-LSBMR, adaptive least significant bit matching revisited.
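The "about two times" figure can be checked directly from the timings in Table 5. The short Python sketch below divides each HUGO timing by the A-LSBMR timing at the same embedding rate; the values are copied from the table, and the variable names are illustrative.

```python
rates = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5]

# Embedding times in seconds, from Table 5.
seconds = {
    "HUGO":          [4.32, 4.37, 5.21, 5.27, 5.63, 5.83],
    "Extended HUGO": [4.75, 4.80, 5.72, 5.79, 6.19, 6.41],
    "A-LSBMR":       [2.02, 2.04, 2.43, 2.72, 2.92, 3.02],
}

# Speed-up of A-LSBMR over each HUGO variant at every embedding rate.
speedup = {
    name: [round(t / a, 2) for t, a in zip(times, seconds["A-LSBMR"])]
    for name, times in seconds.items() if name != "A-LSBMR"
}

for name, ratios in speedup.items():
    print(name, dict(zip(rates, ratios)))
```

All ratios fall between roughly 1.9 and 2.4, with the larger values for extended HUGO and for the lower embedding rates.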

Figure 3 shows the Lena image, which is one of the most commonly known standard test images in steganography. In a simple experiment, we attempted to show that higher IQF values result in a less detectable stego image. Theoretically, PSNR values will be greater if a higher IQF is employed, and, as our experiments show, the detection error is directly related to the value of PSNR. Note that two error images appear in the proposed method. One is computed in the pre-processing phase of the algorithm. The other, shown in Figure 4, is calculated between the stego image and the original image to show where, and to what extent, embedding has had an impact. In Figure 4, a payload of 0.4 bpp is embedded using an IQF of 40. The Base matrix (Figure 4) marks where to embed: a white pixel denotes a Base value greater than or equal to 2, whereas black pixels have a Base value less than 2 and are not meant to hold any secret data. The white areas represent where more is embedded. As can be seen, the secret bits are scattered through the stego image (PSNR 56.33 dB), and the detailed texture of the image (Figure 3) can hardly be seen in the resulting error image, which is the difference between the stego image and the original image. The black area of the Base matrix has become wider in the error image because LSBMR leaves 50% of its pairs intact. The same holds for the white areas: the white pixels of the Base matrix are changed with the same probability of embedding change, so the number of white pixels is halved.
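The second error image and the PSNR mentioned above can be computed as in the following Python sketch. It uses the usual PSNR definition for 8-bit images, 10 log10(255² / MSE); the toy cover, the choice of changing 15 pixels out of 100 by one gray level, and the function names are assumptions made only for illustration.

```python
import math

def error_image(cover, stego):
    """Per-pixel absolute difference between the cover and the stego image."""
    return [[abs(c - s) for c, s in zip(rc, rs)] for rc, rs in zip(cover, stego)]

def psnr(cover, stego, peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    squared = [(c - s) ** 2 for rc, rs in zip(cover, stego) for c, s in zip(rc, rs)]
    mse = sum(squared) / len(squared)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Toy 10 x 10 cover; the stego image differs by +1 in 15 of its 100 pixels.
cover = [[100] * 10 for _ in range(10)]
stego = [row[:] for row in cover]
for k in range(15):
    stego[k // 10][k % 10] += 1

changed = sum(map(sum, error_image(cover, stego)))
print(changed, round(psnr(cover, stego), 2))
```

Because ±1 embedding changes each affected pixel by exactly one gray level, the MSE equals the fraction of changed pixels, so the PSNR depends only on how many pixels are modified; a 15% change rate already gives a PSNR of the same order as the values listed in Tables 2 and 3.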

Figure 3.

Third cover image from image database BOSS 1.01 called ‘3.pgm’.

Figure 4.

The error image (left) and the Base matrix (right) (IQF 40, payload 0.4 bpp, 56.33 dB).

4 Conclusion

The proposed method shows how ensemble classifiers are affected by the JPEG IQF and JPEG decompression artifacts, which make detection harder for steganalysis schemes. The results also show that the proposed method is less detectable than the HUGO (T = 90) method and close to the detectability level of the HUGO (T = 255) method because, unlike the conventional LSBMR method, the current algorithm distributes the secret bits through the cover image adaptively and selects suitable pixels to hold secret bits using a calculated Base matrix as a guide. The guide exploits the texture of the cover image so that the secret bits are distributed through the white target pixels using the LSBMR method. The execution time of the proposed method is about half that of the HUGO methods with the two settings. The proposed scheme requires the Base matrix to extract the secret information successfully. Hence, a way to obtain the same Base matrix from the stego image, so that the cover image is no longer necessary, could be investigated in future work.
