IH&MMSec '18: Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security


SESSION: Keynote 1

Covert and Deniable Communications

  •      Ross Anderson

At the first Information Hiding Workshop in 1996 we tried to clarify the models and assumptions behind information hiding. We agreed the terminology of cover text and stego text against a background of the game proposed by our keynote speaker Gus Simmons: that Alice and Bob are in jail and wish to hatch an escape plan without the fact of their communication coming to the attention of the warden, Willie. Since then there have been significant strides in developing technical mechanisms for steganography and steganalysis, with new techniques from machine learning providing ever more powerful tools for the analyst, such as the ensemble classifier. There have also been a number of conceptual advances, such as the square root law and effective key length. But there always remains the question of whether we are using the right security metrics for the application.

In this talk I plan to take a step backwards and look at the systems context. When can stegosystems actually be used? The deployment history is patchy, with one example being TrueCrypt's hidden volumes, inspired by the steganographic file system. Image forensics also finds some use, and may be helpful against some adversarial machine learning attacks (or at least help us understand them). But there are other contexts in which patterns of activity have to be hidden for that activity to be effective. I will discuss a number of examples, starting with deception mechanisms such as honeypots, Tor bridges and pluggable transports, which merely have to evade detection for a while; then moving on to the more challenging task of designing deniability mechanisms, from leaking secrets to a newspaper through bitcoin mixes, which have to withstand forensic examination once the participants come under suspicion.

We already know that, at the system level, anonymity is hard. However, the increasing quantity and richness of the data available to opponents may move a number of applications from the deception category to that of deniability. To pick up on our model of 20 years ago, Willie might not just put Alice and Bob in solitary confinement if he finds them communicating, but torture them or even execute them. Changing threat models have historically been one of the great disruptive forces in security engineering. This leads me to suspect that a useful research area may be the intersection of deception and forensics, and how information hiding systems can be designed in anticipation of richer and more complex threat models. The ever-more-aggressive censorship systems deployed in some parts of the world also raise the possibility of using information hiding techniques in censorship circumvention. As an example of recent practical work, I will discuss CovertMark, a toolkit for testing pluggable transports that was partly inspired by StirMark, a tool we presented at the second Information Hiding Workshop twenty years ago.

SESSION: Keynote 2

Deep Learning in Multimedia Forensics

  •      Luisa Verdoliva

With the widespread diffusion of powerful media editing tools, falsifying images and videos has become easier and easier in the last few years. Fake multimedia, often used to support fake news, represents a growing menace in many fields of life, notably in politics, journalism, and the judiciary. In response to this threat, the signal processing community has produced a major research effort. A large number of methods have been proposed for source identification, forgery detection and localization, relying on typical signal processing tools. The advent of deep learning, however, is changing the rules of the game. On one hand, new sophisticated methods based on deep learning have been proposed to accomplish manipulations that were previously unthinkable. On the other hand, deep learning also provides the analyst with powerful new forensic tools. Given a suitably large training set, deep learning architectures usually ensure a significant performance gain with respect to conventional methods, and much higher robustness to post-processing and evasion. In this talk, after reviewing the main approaches proposed in the literature to ensure media authenticity, the most promising solutions relying on Convolutional Neural Networks will be explored, with special attention to realistic scenarios, such as when manipulated images and videos are spread over social networks. In addition, an analysis of the efficacy of adversarial attacks on such methods will be presented.

SESSION: JPEG & H.264 Steganography

Defining Joint Distortion for JPEG Steganography

  •      Weixiang Li
  • Weiming Zhang
  • Kejiang Chen
  • Wenbo Zhou
  • Nenghai Yu

Recent studies have shown that the non-additive distortion model of Decomposing Joint Distortion ($DeJoin$) can work well for spatial image steganography by defining joint distortion with the principle of Synchronizing Modification Directions (SMD). However, no principle has yet been proposed to guide the definition of joint distortion for JPEG steganography. Experimental results indicate that SMD cannot be directly used for JPEG images, which means that simply clustering modification directions does not improve steganographic security. In this paper, we inspect the embedding changes from the perspective of the spatial domain and propose a principle of Block Boundary Continuity (BBC) for defining JPEG joint distortion, which aims to restrain blocking artifacts caused by inter-block adjacent modifications and thus effectively preserve the spatial continuity at block boundaries. According to BBC, whether inter-block adjacent modifications should be synchronized or desynchronized depends on the DCT mode and the adjacent direction of inter-block coefficients (horizontal or vertical). When built into $DeJoin$, experiments demonstrate that BBC improves state-of-the-art additive distortion schemes against modern JPEG steganalyzers, particularly at relatively large embedding payloads.
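The BBC rule itself is defined in the paper; the quantity it protects, spatial continuity across 8x8 block boundaries, can be illustrated with a small sketch. The following toy example (variable names and the choice of DCT mode are illustrative, not taken from the paper) decompresses two horizontally adjacent coefficient blocks and reports how the discontinuity across their shared boundary changes when the same DCT mode is modified in both blocks with equal or with opposite signs.

    # Toy sketch, not the authors' BBC rule: it only quantifies the spatial
    # discontinuity at a block boundary that BBC aims to keep small.
    import numpy as np
    from scipy.fft import idctn

    def boundary_discontinuity(left_dct, right_dct):
        """Mean absolute pixel difference across the shared vertical boundary of two 8x8 blocks."""
        left = idctn(left_dct, norm='ortho')
        right = idctn(right_dct, norm='ortho')
        return np.mean(np.abs(left[:, -1] - right[:, 0]))

    rng = np.random.default_rng(0)
    left = rng.normal(0.0, 5.0, (8, 8))
    right = rng.normal(0.0, 5.0, (8, 8))
    mode = (0, 1)                      # hypothetical DCT mode to modify
    base = boundary_discontinuity(left, right)

    for s_left in (+1, -1):            # sign of the modification in the left block
        for s_right in (+1, -1):       # sign of the modification in the right block
            l, r = left.copy(), right.copy()
            l[mode] += s_left
            r[mode] += s_right
            print(s_left, s_right, boundary_discontinuity(l, r) - base)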

Facing the Cover-Source Mismatch on JPHide using Training-Set Design

  •      Dirk Borghys
  • Patrick Bas
  • Helena Bruyninckx

This short paper investigates the influence of the image processing pipeline (IPP) on the cover-source mismatch (CSM) for the popular JPHide steganographic scheme. We propose to deal with CSM by combining a forensics approach with a steganalysis approach. A multi-classifier is first trained to identify the IPP, and a specific training set is then designed to train a targeted classifier for steganalysis. We show that the forensic step is immune to the steganographic embedding. The proposed IPP-informed steganalysis outperforms classical strategies based on training on a mixture of sources, and we show that it can provide results close to a detector specifically trained on the appropriate source.
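A minimal sketch of this two-step idea is given below; the classifiers and feature arrays are stand-ins (the paper uses dedicated forensic and steganalytic feature sets with ensemble classifiers), so treat it as an outline of the pipeline rather than the authors' implementation.

    # Step 1: identify the image processing pipeline (IPP) from forensic features.
    # Step 2: steganalyze with a detector trained only on covers/stegos from that IPP.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def train_ipp_informed(forensic_X, ipp_y, steg_X, steg_y):
        """forensic_X / steg_X: per-image feature arrays; ipp_y: IPP labels; steg_y: cover/stego labels."""
        ipp_clf = RandomForestClassifier(n_estimators=200).fit(forensic_X, ipp_y)
        detectors = {}
        for ipp in np.unique(ipp_y):
            idx = np.flatnonzero(ipp_y == ipp)
            detectors[ipp] = RandomForestClassifier(n_estimators=200).fit(steg_X[idx], steg_y[idx])
        return ipp_clf, detectors

    def classify(ipp_clf, detectors, forensic_x, steg_x):
        ipp = ipp_clf.predict(forensic_x.reshape(1, -1))[0]        # forensic step
        return detectors[ipp].predict(steg_x.reshape(1, -1))[0]    # targeted steganalysis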

Cover Block Decoupling for Content-Adaptive H.264 Steganography

  •      Yun Cao
  • Yu Wang
  • Xianfeng Zhao
  • Meineng Zhu
  • Zhoujun Xu

This paper makes the first attempt to achieve content-adaptive H.264 steganography with the quantised discrete cosine transform (QDCT) coefficients in intra-frames. Currently, state-of-the-art JPEG steganographic schemes embed their payload while minimizing a heuristically defined distortion. However, porting this concept to compressed-video schemes remains an unsolved challenge: because of H.264 intra prediction, the QDCT coefficient blocks are highly dependent on their adjacent encoded blocks, and modifying one coefficient block sets off a chain reaction in the following cover blocks. Based on a thorough investigation of this problem, we propose two embedding strategies for cover block decoupling to inhibit these embedding interactions. With this methodology, the latest achievements in the JPEG domain are expected to be incorporated to construct H.264 steganographic schemes with better performance.
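The chain reaction caused by predictive coding can be seen in a deliberately simplified toy model: if each block is predicted from the previously reconstructed one, changing a single coded residual perturbs every later reconstruction. The sketch below is a 1-D stand-in for intra prediction, not H.264 itself.

    # Toy illustration of the coupling between cover blocks under predictive coding.
    import numpy as np

    def reconstruct(residuals):
        reconstructed, prediction = [], 0.0
        for r in residuals:
            value = prediction + r     # reconstruction = prediction + coded residual
            reconstructed.append(value)
            prediction = value         # the next block is predicted from this reconstruction
        return np.array(reconstructed)

    residuals = np.array([10.0, 2.0, -3.0, 1.0, 4.0])
    modified = residuals.copy()
    modified[1] += 1                   # a single embedding change in block 1
    print(reconstruct(modified) - reconstruct(residuals))
    # -> [0. 1. 1. 1. 1.]  every subsequent block is affected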

SESSION: Biometrics and Forensics

Do EEG-Biometric Templates Threaten User Privacy?

  •      Yvonne Höller
  • Andreas Uhl

The electroencephalogram (EEG) was introduced as a method for the generation of biometric templates. So far, most research has focused on optimising enrolment and authentication, and it has been claimed that the EEG has many advantages. However, it has never been assessed whether the biometric templates obtained from the EEG contain sensitive information about the enrolled users. In this work we ask whether we can infer personal characteristics such as age, sex, or information about neurological disorders from these templates.

To this end, we extracted a set of 16 feature vectors from EEG epochs from a sample of 60 healthy subjects and neurological patients. One of these features was the classical power spectrum, while the other 15 were derived from a multivariate autoregressive model, which also considers interdependencies between EEG channels. We classified the sample by sex, neurological diagnosis, age, atrophy of the brain, and intake of neurological drugs.

We obtained classification accuracies of up to .70 for sex, .86 for the classification of epilepsy vs. other populations, .81 for the differentiation of young vs. old people's templates, and .82 for the intake of medication targeted at the central nervous system. This is privacy-sensitive information about the users, so our results emphasise the need to apply protective safeguards in the deployment of EEG biometric systems.
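A minimal sketch of this kind of attribute-leakage experiment is shown below. The classifier and cross-validation scheme are stand-ins and the names are hypothetical, but the structure follows the description above: predict a sensitive per-subject attribute from the template features, keeping all epochs of a subject in the same fold.

    # Estimate how well a sensitive attribute (sex, diagnosis, age group, medication)
    # can be predicted from EEG biometric template features.
    from sklearn.model_selection import GroupKFold, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def attribute_leakage(template_features, attribute_labels, subject_ids, folds=10):
        """Mean cross-validated accuracy; folds are grouped by subject to avoid identity leakage."""
        clf = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
        cv = GroupKFold(n_splits=folds)
        return cross_val_score(clf, template_features, attribute_labels,
                               groups=subject_ids, cv=cv).mean()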

Fake Faces Identification via Convolutional Neural Network

  •      Huaxiao Mo
  • Bolin Chen
  • Weiqi Luo

The Generative Adversarial Network (GAN) is a prominent generative model that is widely used in various applications. Recent studies have indicated that it is possible to obtain fake face images of high visual quality with this novel model. If such fake faces are abused in image tampering, they could cause moral, ethical and legal problems. In this paper, therefore, we first propose a Convolutional Neural Network (CNN) based method to identify fake face images generated by the current best method [20], and provide experimental evidence showing that the proposed method achieves satisfactory results, with an average accuracy over 99.4%. In addition, we provide comparative results evaluated on several variants of the proposed CNN architecture, including the high-pass filter, the number of layer groups and the activation function, to further verify the rationality of our method.
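The sketch below illustrates the general shape of such a detector: a fixed high-pass residual filter in front of a few convolution/batch-normalization/ReLU groups and a binary head. Layer sizes, the filter kernel and all names are illustrative assumptions, not the architecture evaluated in the paper.

    import torch
    import torch.nn as nn

    class FakeFaceCNN(nn.Module):
        def __init__(self):
            super().__init__()
            # fixed 3x3 high-pass (residual) filter, applied to each colour channel
            hp = torch.tensor([[-1., 2., -1.], [2., -4., 2.], [-1., 2., -1.]])
            self.highpass = nn.Conv2d(3, 3, 3, padding=1, groups=3, bias=False)
            self.highpass.weight.data = hp.repeat(3, 1, 1, 1)
            self.highpass.weight.requires_grad = False
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.AvgPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.AvgPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1))
            self.classifier = nn.Linear(64, 2)        # real face vs. GAN-generated face

        def forward(self, x):                          # x: (batch, 3, H, W)
            x = self.highpass(x)
            return self.classifier(self.features(x).flatten(1))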

Generalized Benford's Law for Blind Detection of Morphed Face Images

  •      Andrey Makrushin
  • Christian Kraetzer
  • Tom Neubert
  • Jana Dittmann

A morphed face image in a photo ID is a serious threat to image-based user verification, since it enables multiple persons to be matched with the same document. The use of machine-readable travel documents (MRTD) at automated border control (ABC) gates is an example of a verification scenario that is very sensitive to this kind of fraud. Detection of morphed face images prior to face matching is, therefore, indispensable for effective border security. We introduce a face morphing detection approach based on fitting a logarithmic curve to nine Benford features extracted from quantized DCT coefficients of JPEG-compressed original and morphed face images. We separately study the parameters of the logarithmic curve in face and background regions to establish the traces imposed by the morphing process. The evaluation results show that a single parameter of the logarithmic curve may be sufficient to clearly separate morphed and original images.
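The nine Benford features are the relative frequencies of the first digits 1 to 9 of the quantized DCT coefficients. A commonly used generalized Benford parametrization for JPEG DCT statistics is $p(d) = N \log_{10}(1 + 1/(s + d^q))$; the sketch below fits such a curve with SciPy. The parametrization and starting values are assumptions for illustration and may differ from the exact curve fitted in the paper.

    import numpy as np
    from scipy.optimize import curve_fit

    def generalized_benford(d, N, q, s):
        # generalized Benford model for the probability of first digit d
        return N * np.log10(1.0 + 1.0 / (s + d ** q))

    def benford_features(quantized_dct):
        """Relative frequencies of first digits 1..9 of the non-zero quantized DCT coefficients."""
        coeffs = np.abs(quantized_dct[quantized_dct != 0]).astype(int)
        first_digits = np.array([int(str(c)[0]) for c in coeffs])
        return np.array([(first_digits == d).mean() for d in range(1, 10)])

    def fit_log_curve(features):
        digits = np.arange(1, 10, dtype=float)
        params, _ = curve_fit(generalized_benford, digits, features,
                              p0=(1.4, 1.5, 0.05), maxfev=10000)
        return params        # (N, q, s): candidate morphing-detection features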

SESSION: Deep Learning for Steganography

CNN-based Steganalysis of MP3 Steganography in the Entropy Code Domain

  •      Yuntao Wang
  • Kun Yang
  • Xiaowei Yi
  • Xianfeng Zhao
  • Zhoujun Xu

This paper presents an effective steganalytic scheme based on a CNN for detecting MP3 steganography in the entropy code domain. Such steganographic methods hide secret messages in the compressed audio stream through Huffman code substitution, and usually achieve high capacity, good security and low computational complexity. First, unlike most previous CNN-based steganalytic methods, the quantified modified DCT (QMDCT) coefficient matrix is selected as the input data of the proposed network. Second, a high-pass filter is used to extract the residual signal and suppress the content itself, so that the network is more sensitive to the subtle alterations introduced by data hiding. Third, $ 1 \times 1 $ convolutional kernels and batch normalization layers are applied to decrease the danger of overfitting and accelerate the convergence of back-propagation. In addition, the performance of the network is optimized by fine-tuning the architecture. The experiments demonstrate that the proposed CNN performs far better than traditional handcrafted features. In particular, the network performs well in detecting the adaptive MP3 steganography algorithm based on equal length entropy codes substitution (EECS), which is hard to detect with conventional handcrafted features. The network can be applied seamlessly to various bitrates and relative payloads. Last but not least, a sliding window method is proposed to steganalyze audio clips of arbitrary size.
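The main ingredients listed above can be sketched as follows; kernel sizes, channel counts and the specific high-pass kernel are illustrative assumptions rather than the network evaluated in the paper.

    import torch
    import torch.nn as nn

    class MP3StegCNN(nn.Module):
        def __init__(self):
            super().__init__()
            # fixed high-pass filter applied to the QMDCT matrix to emphasise embedding residuals
            hp = torch.tensor([[[[-1., 2., -1.], [2., -4., 2.], [-1., 2., -1.]]]])
            self.highpass = nn.Conv2d(1, 1, 3, padding=1, bias=False)
            self.highpass.weight.data = hp
            self.highpass.weight.requires_grad = False
            self.body = nn.Sequential(
                nn.Conv2d(1, 8, 1), nn.BatchNorm2d(8), nn.ReLU(),            # 1x1 kernel + BN
                nn.Conv2d(8, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.AvgPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1))
            self.fc = nn.Linear(32, 2)                                       # cover vs. stego

        def forward(self, qmdct):        # qmdct: (batch, 1, frames, coefficients)
            x = self.highpass(qmdct)
            return self.fc(self.body(x).flatten(1))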

Adversarial Examples Against Deep Neural Network based Steganalysis

  •      Yiwei Zhang
  • Weiming Zhang
  • Kejiang Chen
  • Jiayang Liu
  • Yujia Liu
  • Nenghai Yu

Deep neural network based steganalysis has developed rapidly in recent years, which poses a challenge to the security of steganography. However, at present there is no steganographic method that can effectively resist neural-network-based steganalysis. In this paper, we propose a new strategy that constructs enhanced covers against neural networks using the technique of adversarial examples. The enhanced covers and their corresponding stegos are most likely to be judged as covers by the networks. Besides, we use both deep neural network based steganalysis and high-dimensional feature classifiers to evaluate the performance of steganography, and propose a new comprehensive security criterion. We also make a tradeoff between the two analysis systems to improve the comprehensive security. The effectiveness of the proposed scheme is verified by experiments on BOSSbase using the WOW steganography algorithm, popular rich-model steganalyzers and three state-of-the-art neural networks.
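The adversarial-example ingredient can be sketched as a single gradient step that nudges a cover toward the steganalyzer's 'cover' decision; the full scheme in the paper additionally controls where and how much to modify and then embeds into the enhanced cover. Variable names and the step size are illustrative.

    import torch
    import torch.nn.functional as F

    def enhance_cover(cover, steganalyzer, epsilon=1.0, cover_class=0):
        """cover: float tensor (1, 1, H, W) in [0, 255]; steganalyzer: CNN returning class logits."""
        x = cover.clone().requires_grad_(True)
        loss = F.cross_entropy(steganalyzer(x), torch.tensor([cover_class]))
        loss.backward()
        # step against the gradient of the cover-class loss, then re-quantize to valid pixels
        enhanced = (x - epsilon * x.grad.sign()).detach()
        return enhanced.round().clamp(0, 255)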

SESSION: CNN-based Media Forensics

Identification of Audio Processing Operations Based on Convolutional Neural Network

  •      Bolin Chen
  • Weiqi Luo
  • Da Luo

To reduce tampering artifacts and/or enhance audio quality, audio processing operations are often applied to the resulting tampered audio. As in image forensics, the detection of such post-processing operations has become very important for audio authentication. In this paper, we propose a convolutional neural network (CNN) to detect audio processing operations. In the proposed method, we carefully design the network architecture, with particular attention to the frequency representation of the audio input, the activation function and the depth of the network. In our experiments, we evaluate the proposed method on audio clips processed with 12 commonly used audio processing operations and of three different small sizes. The experimental results show that our method significantly outperforms related methods based on hand-crafted features and other CNN architectures, and achieves state-of-the-art results for both binary and multi-class classification.
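As an example of the kind of frequency representation such a network can take as input, the sketch below computes a log-magnitude spectrogram of a short clip; the window parameters are illustrative and not those used in the paper.

    import numpy as np
    from scipy.signal import spectrogram

    def log_spectrogram(samples, sample_rate, nperseg=256, noverlap=128):
        """Return a (frequency x time) log-magnitude matrix, usable as single-channel CNN input."""
        _, _, spec = spectrogram(samples, fs=sample_rate, nperseg=nperseg, noverlap=noverlap)
        return np.log1p(spec).astype(np.float32)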

Learning Unified Deep-Features for Multiple Forensic Tasks

  •      Owen Mayer
  • Belhassen Bayar
  • Matthew C. Stamm

Recently, deep learning researchers have developed a technique known as deep features in which feature extractors for a task are learned by a CNN. These features are then provided to another classifier, or even used to perform a different classification task. Research in deep learning suggests that in some cases, deep features generalize to seemingly unrelated tasks. In this paper, we develop techniques for learning deep features that can be used across multiple forensic tasks, namely image manipulation detection and camera model identification. To do this, we develop two approaches for building deep forensic features: a transfer learning approach and a multitask learning approach. We experimentally evaluate the performance of both approaches in several scenarios and find that: 1) features learned for camera model identification generalize well to manipulation detection tasks but manipulation detection features do not generalize well to camera model identification, suggesting a task asymmetry, 2) deeper features are more task specific while shallower features generalize well across tasks, suggesting a feature hierarchy, and 3) a single, unified feature extractor can be learned that is highly discriminative for multiple forensic tasks. Furthermore, we find that when there is limited training data, a unified feature extractor can significantly outperform a targeted CNN.
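The transfer-learning variant described above can be sketched as follows; the attribute name (.features) and the choice of a linear SVM on top are hypothetical stand-ins for whatever classifier is placed on the learned deep features.

    import torch
    import torch.nn as nn
    from sklearn.svm import LinearSVC

    def extract_deep_features(camera_model_cnn: nn.Module, images: torch.Tensor):
        """Reuse the convolutional feature extractor of a CNN trained for camera model identification."""
        camera_model_cnn.eval()
        with torch.no_grad():
            return camera_model_cnn.features(images).flatten(1).cpu().numpy()

    def train_manipulation_detector(camera_model_cnn, train_images, manipulation_labels):
        # deep features learned for one forensic task, reused for another
        feats = extract_deep_features(camera_model_cnn, train_images)
        return LinearSVC().fit(feats, manipulation_labels)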

Image Forgery Localization based on Multi-Scale Convolutional Neural Networks

  •      Yaqi Liu
  • Qingxiao Guan
  • Xianfeng Zhao
  • Yun Cao

In this paper, we propose to utilize Convolutional Neural Networks (CNNs) and segmentation-based multi-scale analysis to locate tampered areas in digital images. First, to deal with color input sliding windows of different scales, we adopt a unified CNN architecture. Then, we carefully design the training procedures of the CNNs on sampled training patches. With a set of CNN-based tampering detectors for different scales, a series of complementary tampering possibility maps can be generated. Last but not least, a segmentation-based method is proposed to fuse these maps and generate the final decision map. By exploiting the benefits of both small-scale and large-scale analyses, the segmentation-based multi-scale analysis leads to a leap in the forgery-localization performance of CNNs. Numerous experiments are conducted to demonstrate the effectiveness and efficiency of our method.
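A stripped-down sketch of the multi-scale part is shown below: each scale-specific detector produces a per-pixel possibility map from sliding windows, and the maps are then fused. The fusion here is a plain average purely for illustration; the paper uses a segmentation-based fusion instead.

    import numpy as np

    def possibility_map(image, detector, window, stride):
        """Slide a window over the image and accumulate per-pixel tampering possibilities."""
        h, w = image.shape[:2]
        scores = np.zeros((h, w))
        counts = np.zeros((h, w))
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                p = detector(image[y:y + window, x:x + window])   # tampering probability of the patch
                scores[y:y + window, x:x + window] += p
                counts[y:y + window, x:x + window] += 1
        return scores / np.maximum(counts, 1)

    def fuse_scales(image, detectors_by_scale, stride=16):
        maps = [possibility_map(image, det, win, stride) for win, det in detectors_by_scale.items()]
        return np.mean(maps, axis=0)    # decision map; the paper fuses via segmentation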

Densely Connected Convolutional Neural Network for Multi-purpose Image Forensics under Anti-forensic Attacks

  •      Yifang Chen
  • Xiangui Kang
  • Z. Jane Wang
  • Qiong Zhan

Multi-purpose forensics has been attracting increasing attention worldwide. However, most existing methods based on hand-crafted features require domain knowledge and expensive human labour, and their performance can be affected by factors such as image size and JPEG compression. Furthermore, many anti-forensic techniques have been applied in practice, making image authentication more difficult. It is therefore of great importance to develop methods that can automatically learn general and robust features for image operation detectors with the capability of countering anti-forensics. In this paper, we propose a new convolutional neural network (CNN) approach for multi-purpose detection of image manipulations under anti-forensic attacks. The dense connectivity pattern, which has better parameter efficiency than the traditional pattern, is explored to strengthen the propagation of general features related to image manipulation detection. When compared with three state-of-the-art methods, experiments demonstrate that the proposed CNN architecture achieves better performance (an 11% improvement in detection accuracy under anti-forensic attacks). The proposed method also achieves better robustness against JPEG compression, with a maximum improvement of 13% in accuracy under low-quality JPEG compression.
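The dense connectivity pattern referred to above feeds each layer the concatenation of all preceding feature maps. The sketch below shows one such dense block; the growth rate, depth and layer sizes are illustrative, not the paper's configuration.

    import torch
    import torch.nn as nn

    class DenseBlock(nn.Module):
        def __init__(self, in_channels, growth_rate=12, num_layers=4):
            super().__init__()
            self.layers = nn.ModuleList()
            channels = in_channels
            for _ in range(num_layers):
                self.layers.append(nn.Sequential(
                    nn.BatchNorm2d(channels), nn.ReLU(),
                    nn.Conv2d(channels, growth_rate, 3, padding=1)))
                channels += growth_rate          # each layer sees all earlier feature maps

        def forward(self, x):
            features = [x]
            for layer in self.layers:
                features.append(layer(torch.cat(features, dim=1)))
            return torch.cat(features, dim=1)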

SESSION: Embedding Impact in Steganography

Maintaining Rate-Distortion Optimization for IPM-Based Video Steganography by Constructing Isolated Channels in HEVC

  •      Yu Wang
  • Yun Cao
  • Xianfeng Zhao
  • Zhoujun Xu
  • Meineng Zhu

This paper proposes an effective intra-frame prediction mode (IPM)-based video steganography in HEVC that maintains rate-distortion optimization as well as improving empirical security. The unique aspect of this work, and one that distinguishes it from prior art, is that we use a distortion measure to capture the embedding impacts on neighboring prediction units, called inter prediction unit (inter-PU) embedding impacts, which are caused by the predictive coding widely employed in video coding standards. To prevent neighboring IPMs within the same channel from affecting each other, three-layered isolated channels are established based on the properties of IPM coding. Following a theoretical analysis of the embedding impacts on the current prediction unit, called intra prediction unit (intra-PU) embedding impacts, in terms of coding efficiency (both visual quality and compression efficiency), a novel distortion function is proposed to express these multi-level embedding impacts; it is purposely designed to discourage embedding changes that affect adjacent channels. Based on the defined distortion function, two-layered syndrome-trellis codes (STCs) are used alternately in the practical embedding implementation. Experimental results demonstrate that the proposed scheme outperforms existing IPM-based video steganography in terms of rate-distortion optimization and empirical security.

Exploring Non-Additive Distortion in Steganography

  •      Tomas Pevny
  • Andrew D. Ker

Leading steganography systems make use of the Syndrome-Trellis Code (STC) algorithm to minimize a distortion function while encoding the desired payload, but this constrains the distortion function to be additive. The Gibbs Embedding algorithm works for a certain class of non-additive distortion functions, but has its own limitations and is highly complex.

In this short paper we show that it is possible to modify the STC algorithm in a simple way, to minimize a non-additive distortion function suboptimally. We use it for two examples. First, applying it to the S-UNIWARD distortion function, we show that it does indeed reduce distortion, compared with minimizing the additive approximation currently used in image steganography, but that it makes the payload more -- not less -- detectable. This parallels research attempting to use Gibbs Embedding for the same task. Second, we apply it to distortion defined by the output of a specific detector, as a counter-move in the steganography game. However, unless the Warden is forced to move first (by fixing the detector) this is highly detectable.
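For reference (generic notation, not the paper's), the additive model that the standard STC algorithm minimizes assigns each element its own cost, whereas in the non-additive case the cost of changing an element may depend on the changes made to its neighbours:

    D_{\mathrm{add}}(\mathbf{x}, \mathbf{y}) = \sum_i \rho_i(\mathbf{x}, y_i)
    \qquad\text{vs.}\qquad
    D(\mathbf{x}, \mathbf{y}) = \sum_i \rho_i(\mathbf{x}, \mathbf{y}),

where $\mathbf{x}$ is the cover, $\mathbf{y}$ the stego object, and in the right-hand form each local term $\rho_i$ may depend on the whole modification pattern, for instance on whether neighbouring elements were also changed.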

On the Relationship Between Embedding Costs and Steganographic Capacity

  •      Andrew D. Ker

Contemporary steganography in digital media is dominated by the framework of additive distortion minimization: every possible change is given a cost, and the embedder minimizes total cost using some variant of the Syndrome-Trellis Code algorithm. One can derive the relationship between the cost of each change $c_i$ and the probability $\pi_i$ that it should be made, but the literature has not examined the relationship between the costs and the total capacity (secure payload size) of the cover. In this paper we attempt to uncover such a relationship, asymptotically, for a simple independent pixel model of covers. We consider a 'knowing' detector who is aware of the embedding costs, in which case $\sum_i \pi_i^2 c_i$ should be optimized. It is shown that the total of the inverse costs, $\sum_i c_i^{-1}$, along with the embedder's desired security against an optimal opponent, determines the asymptotic capacity. This result also recovers a Square Root Law. Some simple simulations confirm the relationship between costs and capacity in this ideal model.
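The cost-to-probability relationship mentioned above is the standard one from the additive distortion-minimization framework; for binary embedding it takes the following form, with $\lambda > 0$ chosen so that the changes carry the desired payload:

    \pi_i = \frac{e^{-\lambda c_i}}{1 + e^{-\lambda c_i}},
    \qquad
    \sum_i H(\pi_i) = m,
    \qquad
    H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p),

where $m$ is the payload in bits. The question studied in the paper is how the costs $c_i$ then determine the largest $m$ that remains secure against the knowing detector.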

SESSION: Encryption, Authentication, Anonymization

Real or Fake: Mobile Device Drug Packaging Authentication

  •      Rudolf Schraml
  • Luca Debiasi
  • Andreas Uhl

Within the member states of the European Union, a serialization-based anti-counterfeiting system for pharmaceutical products will shortly be introduced. This system requires a third party that enables tracking of serialized and enrolled instances of each product from the manufacturer to the consumer. An alternative to serialization is authentication of a product by classifying it as real or fake using intrinsic or extrinsic features of the product. One such approach is packaging material classification using images of the packaging textures. While the basic feasibility has been proven recently, it is not clear whether such an authentication system works with images captured with mobile devices. Thus, in this work mobile drug packaging authentication is investigated. The experimental evaluation provides results for single- and cross-sensor scenarios. The results indicate feasibility in principle and highlight open issues for a mobile device drug packaging authentication system.

Forensic Analysis and Anonymisation of Printed Documents

  •      Timo Richter
  • Stephan Escher
  • Dagmar Schönfeld
  • Thorsten Strufe

Contrary to popular belief, the paperless office has not yet established itself. Printer forensics therefore remains an important field today, both to protect the reliability of printed documents and to track criminals. An important task in this field is to identify the source device of a printed document. There are many forensic approaches that try to determine the source device automatically and with commercially available recording devices. However, it is difficult to find intrinsic signatures that are robust against the many influences of the printing process and at the same time can identify the specific source device; in most cases, the identification only reaches the level of the printer model. For this reason we reviewed document colour tracking dots, an extrinsic signature embedded by nearly all modern colour laser printers. We developed a refined and generic extraction algorithm, found a new tracking dot pattern and decoded the pattern information. We further propose to reuse document colour tracking dots in combination with passive printer forensic methods. From a privacy perspective, we additionally investigated anonymisation approaches to defeat arbitrary tracking. Finally, we present our toolkit deda, which implements the entire workflow of extracting, analysing and anonymising tracking dot patterns.

Applicability of No-Reference Visual Quality Indices for Visual Security Assessment

  •      Heinz Hofbauer
  • Andreas Uhl

From the literature it is known that full-reference visual quality indices are a poor fit for estimating the visual security of selective encryption. The question remains whether no-reference visual quality indices can perform where full-reference indices falter. Furthermore, no-reference visual quality indices frequently use machine learning to train a model of natural scene statistics. It would therefore be of interest to gauge how learning statistics from selectively encrypted images affects their performance as quality estimators for encryption. In the following we answer these two questions.