IH&MMSec '22: Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security


SESSION: Keynote Talks

Session details: Keynote Talks

  • B.S. Manjunath

Towards Generalization in Deepfake Detection

  • Luisa Verdoliva

In recent years there have been astonishing advances in AI-based synthetic media generation. Thanks to deep learning-based approaches it is now possible to generate data with a high level of realism. While this opens up new opportunities for the entertainment industry, it simultaneously undermines the reliability of multimedia content and supports the spread of false or manipulated information on the Internet. This is especially true for human faces: it is now easy to create new identities or to change specific attributes of a real face in a video, producing so-called deepfakes. In this context, it is important to develop automated tools to detect manipulated media in a reliable and timely manner. This talk will describe the most reliable deep learning-based approaches for detecting deepfakes, with a focus on those that enable domain generalization [1]. The results will be presented on challenging datasets [2,3] with reference to realistic scenarios, such as the dissemination of manipulated images and videos on social networks. Finally, new possible directions will be outlined.

Looking for Signals: A Systems Security Perspective

  • Christopher Kruegel

Over the last 20 years, my students and I have built systems that look for signals of malice in large datasets. These datasets include network traffic, program code, web transactions, and social media posts. For many of our detection systems, we used feature engineering to model properties of the data and then leveraged different types of machine learning to find outliers or to build classifiers that could recognize unwanted inputs. In this presentation, I will cover three recent works that go beyond that basic approach.

First, I will talk about cross-dataset analysis. The key idea is that we look at the same data from different vantage points. Instead of directly detecting malicious instances, the analysis compares the views across multiple angles and finds those cases where these views meaningfully differ.

Second, I will cover an approach to perform meta-analysis of the outputs (events) that a detection model might produce. Sometimes, looking at a single event is insufficient to determine whether it is malicious. In such cases, it is necessary to correlate multiple events. We have built a semi-supervised analysis that leverages the context of an event to determine whether it should be treated as malicious or not.

Third, I will discuss ways in which attackers might attempt to thwart our efforts to build detectors.

Specifically, I will talk about a fast and efficient clean-label dataset poisoning attack. In this attack, correctly labeled poison samples are injected into the training dataset. While these poison samples look legitimate to a human observer, they contain malicious characteristics that trigger a targeted misclassification during detection (inference).

Intellectual Property (IP) Protection for Deep Learning and Federated Learning Models

  • Farinaz Koushanfar

This talk focuses on end-to-end protection of present and emerging Deep Learning (DL) and Federated Learning (FL) models. On the one hand, DL and FL models are usually trained by allocating significant computational resources to process massive training data. The trained models are therefore considered the owner's IP and need to be protected. On the other hand, malicious attackers may take advantage of the models for illegal uses. IP protection needs to be considered during the design and training of DL models, before the owners make their models publicly available. The tremendous parameter space of DL models allows them to learn hidden features automatically.

We explore the 'over-parameterization' of DL models and demonstrate how to hide additional information within DL. In particular, we discuss several end-to-end automated frameworks we have developed over the past few years that leverage information hiding for IP protection, including: DeepSigns [5] and DeepMarks [2], the first DL watermarking and fingerprinting frameworks, which embed the owner's signature in the dynamic activations and output behaviors of the DL model; and DeepAttest [1], the first hardware-based attestation framework for verifying the legitimacy of the deployed model via on-device attestation. We also develop a multi-bit black-box DNN watermarking scheme [3] and demonstrate spread spectrum-based DL watermarking [4]. In the context of Federated Learning (FL), we show how these results can be leveraged to design a novel holistic covert communication framework that allows stealthy information sharing between local clients while preserving FL convergence. We conclude by outlining the open challenges and emerging directions.

SESSION: Session 1: Forensics

Session details: Session 1: Forensics

  • Rainer Böhme

FMFCC-V: An Asian Large-Scale Challenging Dataset for DeepFake Detection

  • Gen Li
  • Xianfeng Zhao
  • Yun Cao
  • Pengfei Pei
  • Jinchuan Li
  • Zeyu Zhang

The abuse of DeepFake techniques has raised enormous public concern in recent years. Existing DeepFake datasets suffer from several weaknesses: obvious visual artifacts, a minimal proportion of Asian subjects, outdated synthesis methods, and short video length. To address these weaknesses, we have constructed an Asian large-scale challenging DeepFake dataset to enable the training of DeepFake detection models, and organized the accompanying video track of the first Fake Media Forensics Challenge of the China Society of Image and Graphics (FMFCC-V). The FMFCC-V dataset is the first and by far the largest publicly available Asian dataset for DeepFake detection; it contains 38102 DeepFake videos and 44290 pristine videos, corresponding to more than 23 million frames. The source videos in the FMFCC-V dataset were carefully collected from 83 paid individuals, all of whom are Asian. The DeepFake videos were generated by four of the most popular face swapping methods. Extensive perturbations are applied to obtain a more challenging benchmark of higher diversity. The FMFCC-V dataset can lend powerful support to the development of more effective DeepFake detection methods. We contribute a comprehensive evaluation of six representative DeepFake detection methods to demonstrate the level of challenge posed by the FMFCC-V dataset. We also provide a detailed analysis of the top submissions from the FMFCC-V competition.

Know Your Library: How the libjpeg Version Influences Compression and Decompression Results

  • Martin Beneš
  • Nora Hofer
  • Rainer Böhme

Introduced in 1991, libjpeg has become a well-established library for processing JPEG images. Many libraries in high-level languages use libjpeg under the hood. So far, little attention has been paid to the fact that different versions of the library produce different outputs for the same input. This may have implications on security-related applications, such as image forensics or steganalysis, where evidence is generated by tracking small, imperceptible changes in JPEG-compressed signals. This paper systematically analyses all libjpeg versions since 1998, including the forked libjpeg-turbo (in its latest version). It compares the outputs of compression and decompression operations for a range of parameter settings. We identify up to three distinct behaviors for compression and up to six for decompression.
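
A quick way to observe such discrepancies in practice (a hedged illustration, not the paper's methodology) is to decode the same JPEG file with two packages that typically bundle different libjpeg/libjpeg-turbo builds and compare the pixel output; the input file name below is hypothetical.

    import numpy as np
    import cv2
    from PIL import Image

    path = "test.jpg"  # hypothetical input file

    # Pillow and OpenCV usually ship their own libjpeg/libjpeg-turbo builds,
    # so the two decompression paths may disagree on a handful of pixels.
    rgb_pillow = np.asarray(Image.open(path).convert("RGB"), dtype=np.int16)
    rgb_opencv = cv2.cvtColor(cv2.imread(path, cv2.IMREAD_COLOR),
                              cv2.COLOR_BGR2RGB).astype(np.int16)

    diff = np.abs(rgb_pillow - rgb_opencv)
    print("max per-pixel difference:", diff.max())
    print("fraction of differing pixels:", float(np.mean(diff.any(axis=-1))))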

Identity-Referenced Deepfake Detection with Contrastive Learning

  • Dongyao Shen
  • Youjian Zhao
  • Chengbin Quan

With current advancements in deep learning technology, it is becoming easier to create high-quality face forgery videos, causing concerns about the misuse of deepfake technology. In recent years, research on deepfake detection has become a popular topic. Many detection methods have been proposed, most of which focus on exploiting image artifacts or frequency domain features for detection. In this work, we propose using real images of the same identity as a reference to improve detection performance. Specifically, a real image of the same identity is used as a reference image and input into the model together with the image to be tested to learn the distinguishable identity representation, which is achieved by contrastive learning. Our method achieves superior performance on both FaceForensics++ and Celeb-DF with relatively little training data, and also achieves very competitive results on cross-manipulation and cross-dataset evaluations, demonstrating the effectiveness of our solution.
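
A minimal sketch of the pairing idea, under our own assumptions about the loss (the paper's exact architecture and objective may differ): a shared encoder embeds the reference real image and the test image of the same identity, and a margin-based contrastive loss pulls real/real pairs together while pushing real/fake pairs apart.

    import torch
    import torch.nn.functional as F

    def identity_contrastive_loss(ref_emb, test_emb, is_fake, margin=1.0):
        """ref_emb, test_emb: (B, D) embeddings of reference and test images;
        is_fake: (B,) tensor of 0/1 labels for the test image."""
        dist = F.pairwise_distance(ref_emb, test_emb)    # identity distance per pair
        pull = (1.0 - is_fake) * dist.pow(2)             # real test image: minimize distance
        push = is_fake * F.relu(margin - dist).pow(2)    # fake test image: enforce a margin
        return (pull + push).mean()

    # Usage with any shared encoder producing D-dimensional features:
    #   ref_emb  = F.normalize(encoder(reference_real_images), dim=1)
    #   test_emb = F.normalize(encoder(test_images), dim=1)
    #   loss = identity_contrastive_loss(ref_emb, test_emb, labels.float())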

SESSION: Session 2: Security of Machine Learning

Session details: Session 2: Security of Machine Learning

  • Yassine Yousfi

Sparse Trigger Pattern Guided Deep Learning Model Watermarking

  • Chun-Shien Lu

Watermarking neural networks (NNs) for ownership protection has received considerable attention recently. Resisting both model pruning and fine-tuning is commonly used to evaluate the robustness of a watermarked NN. However, the rationale behind such robustness remains relatively unexplored in the literature. In this paper, we study this problem and propose a sparse trigger pattern (STP) guided deep learning model watermarking method. We provide empirical evidence that trigger patterns make the distribution of model parameters compact, and thus exhibit interpretable resilience to model pruning and fine-tuning. We also find that the effect of the STP can be interpreted as dropout in the first layer. Extensive experiments demonstrate the robustness of our method.

BlindSpot: Watermarking Through Fairness

  • Sofiane Lounici
  • Melek Önen
  • Orhan Ermis
  • Slim Trabelsi

With the increasing deployment of machine learning models in daily business, a strong need for intellectual property protection has arisen. For this purpose, current works suggest leveraging backdoor techniques to embed a watermark into the model by overfitting to a set of specially crafted and secret input-output pairs called triggers. By sending verification queries containing triggers, the model owner can analyse the behavior of any suspect model on the queries to claim ownership. However, in scenarios where frequent monitoring is needed, the sheer volume of these verification queries makes backdoor-based watermarking too sensitive to outlier detection attacks and unable to guarantee the secrecy of the triggers.

To solve this issue, we introduce BlindSpot, a trigger-less approach that watermarks machine learning models through fairness. It is compatible with a high number of verification queries while being robust to outlier detection attacks. We show on the Fashion-MNIST and CIFAR-10 datasets that BlindSpot effectively watermarks models while remaining robust to outlier detection attacks, at an accuracy cost of 2%.

Hiding Needles in a Haystack: Towards Constructing Neural Networks that Evade Verification

  • Árpád Berta
  • Gábor Danner
  • István Hegedus
  • Mark Jelasity

Machine learning models are vulnerable to adversarial attacks, where a small, invisible, malicious perturbation of the input changes the predicted label. A large area of research is concerned with verification techniques that attempt to decide whether a given model has adversarial inputs close to a given benign input. Here, we show that current approaches to verification have a key vulnerability: we construct a model that is not robust but passes current verifiers. The idea is to insert artificial adversarial perturbations by adding a backdoor to a robust neural network model. In our construction, the adversarial input subspace that triggers the backdoor has a very small volume, and outside this subspace the gradient of the model is identical to that of the clean model. In other words, we seek to create a "needle in a haystack" search problem. For practical purposes, we also require that the adversarial samples be robust to JPEG compression. Large "needle in the haystack" problems are practically impossible to solve with any search algorithm. Formal verifiers can handle this in principle, but they do not scale up to real-world networks at the moment, and achieving this is a challenge because the verification problem is NP-complete. Our construction is based on training a hiding and a revealing network using deep steganography. Using the revealing network, we create a separate backdoor network and integrate it into the target network. We train our deep steganography networks over the CIFAR-10 dataset. We then evaluate our construction using state-of-the-art adversarial attacks and backdoor detectors over the CIFAR-10 and the ImageNet datasets. We made the code and models publicly available at https://github.com/szegedai/hiding-needles-in-a-haystack.

SESSION: Session 3: Security & Privacy I

Session details: Session 3: Security & Privacy I

  • B.S. Manjunath

Covert Communications through Imperfect Cancellation

  • Daniel Chew
  • Christine Nguyen
  • Samuel Berhanu
  • Chris Baumgart
  • A. Brinton Cooper

We propose a method for covert communications using an IEEE 802.11 OFDM/QAM packet as a carrier. We show how to hide the covert message so that the transmitted signal does not violate the spectral mask specified by the standard, and we determine its impact on the OFDM packet error rate (PER). We show conditions under which the hidden signal is not usable and those under which it can be retrieved with a usable bit error rate (BER). The hidden signal is extracted by cancellation of the OFDM signal in the covert receiver. We explore the effects of the hidden signal on OFDM parameter estimation and the covert signal BER. We test the detectability of the covert signal with and without cancellation. We conclude with an experiment where we inject the hidden signal into Over-The-Air (OTA) recordings of 802.11 packets and demonstrate the effectiveness of the technique using that real-world OTA data.
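
The cancellation step can be illustrated with a toy baseband model (our simplification; the paper works with real 802.11 OFDM waveforms and must estimate the host signal rather than know it exactly): a low-power covert BPSK signal rides underneath a much stronger host signal, and the covert receiver subtracts its estimate of the host before demodulating the residual.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4096

    host = rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)  # stand-in for the OFDM carrier
    covert_bits = rng.integers(0, 2, n)
    covert = 0.05 * (2.0 * covert_bits - 1.0)                            # hidden BPSK, well below host power
    noise = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

    received = host + covert + noise
    residual = received - host        # ideal cancellation; in practice the host must be estimated
    recovered = (residual.real > 0).astype(int)
    print("covert BER:", float(np.mean(recovered != covert_bits)))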

Covert Channels in Network Time Security

  • Kevin Lamshöft
  • Jana Dittmann

Network Time Security (NTS), specified in RFC 8915, is a mechanism that provides cryptographic security for clock synchronization using the Network Time Protocol (NTP) as its foundation. By using Transport Layer Security (TLS) and Authenticated Encryption with Associated Data (AEAD), NTS is able to ensure integrity and authenticity between servers and clients synchronizing time. However, it was shown in the past that time synchronization protocols such as NTP and the Precision Time Protocol (PTP) can be leveraged as carriers for covert channels, potentially infiltrating or exfiltrating information or serving as Command-and-Control channels in case of malware infections. By systematically analyzing the NTS specification, we identified 12 potential covert channels, which we describe and discuss in this paper. From these 12 channels, we selected a client-side approach using NTS random UIDs for an exemplary proof-of-concept implementation. Further, we analyze potential countermeasures and propose a design for an active warden capable of mitigating the covert channels described in this paper.
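
As a purely illustrative toy of why the Unique Identifier field is attractive as a carrier (our own construction, not the paper's proof of concept; the helper names are hypothetical): the field is expected to contain random bytes, so covert payload whitened with a shared key can take its place without standing out statistically.

    import os
    from hashlib import sha256

    def make_uid(covert_payload: bytes, shared_key: bytes, uid_len: int = 32) -> bytes:
        """Hide a short payload inside an NTS-style 32-byte 'random' unique identifier."""
        pad = sha256(shared_key).digest()
        hidden = bytes(p ^ k for p, k in zip(covert_payload, pad))
        return hidden + os.urandom(uid_len - len(hidden))   # fill the rest with real randomness

    def extract_payload(uid: bytes, shared_key: bytes, payload_len: int) -> bytes:
        pad = sha256(shared_key).digest()
        return bytes(c ^ k for c, k in zip(uid[:payload_len], pad))

    uid = make_uid(b"CMD1", b"shared-secret")
    print(extract_payload(uid, b"shared-secret", 4))   # b'CMD1'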

Collusion-resistant Fingerprinting of Parallel Content Channels

  • Basheer Joudeh
  • Boris Skoric

The fingerprinting game is analysed when the coalition size k is known to the tracer, but the colluders can distribute themselves across L TV channels. The collusion channel is introduced and the extra degrees of freedom for the coalition are made manifest in our formulation. We introduce a payoff functional that is analogous to the single TV channel case, and is conjectured to be closely related to the fingerprinting capacity. For the binary alphabet case under the marking assumption, and the restriction of access to one TV channel per person per segment, we derive the asymptotic behavior of the payoff functional. We find that the value of the maximin game for our payoff is asymptotically equal to L²/(2k² ln 2), with the optimal strategy for the tracer being the arcsine distribution, and for the coalition being the interleaving attack across all TV channels together with assigning an equal number of colluders to each of the L TV channels.
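
For reference, a hedged restatement in LaTeX of the two quantities named above, in our own notation: the asymptotic value of the maximin game and the density of the arcsine distribution used by the tracer.

    \[
      \text{maximin payoff} \;\sim\; \frac{L^2}{2 k^2 \ln 2},
      \qquad
      f_{\text{arcsine}}(p) \;=\; \frac{1}{\pi\sqrt{p\,(1-p)}}, \quad p \in (0,1).
    \]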

SESSION: Session 4: Steganography I

Session details: Session 4: Steganography I

  • Jessica Fridrich

Domain Adaptational Text Steganalysis Based on Transductive Learning

  • Yiming Xue
  • Boya Yang
  • Yaqian Deng
  • Wanli Peng
  • Juan Wen

Traditional text steganalysis methods rely on a large amount of labeled data and assume that the test data are independent and identically distributed (i.i.d.) with the training data. In practice, however, the large number of text types makes it difficult to satisfy the i.i.d. condition between the training set and the test set, which leads to domain mismatch and significantly reduces detection performance. In this paper, we draw on the ideas of domain adaptation and transductive learning to design a novel text steganalysis method. In this method, we design a distributed adaptation layer and adopt three loss functions to achieve domain adaptation, so that the model can learn domain-invariant text features. The experimental results show that the method achieves better steganalysis performance in the case of domain mismatch.
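
The abstract does not name the three losses, so the following is only a generic sketch of one standard ingredient of such adaptation layers (our assumption, not the authors' exact objective): a maximum mean discrepancy (MMD) penalty between source-domain and target-domain feature batches, added to the usual classification loss.

    import torch

    def mmd_rbf(source_feats, target_feats, sigma=1.0):
        """Biased RBF-kernel MMD^2 between feature batches of shape (n, d) and (m, d)."""
        def kernel(a, b):
            return torch.exp(-torch.cdist(a, b).pow(2) / (2.0 * sigma ** 2))
        return (kernel(source_feats, source_feats).mean()
                + kernel(target_feats, target_feats).mean()
                - 2.0 * kernel(source_feats, target_feats).mean())

    # total_loss = classification_loss + lambda_mmd * mmd_rbf(source_feats, target_feats)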

Few-shot Text Steganalysis Based on Attentional Meta-learner

  • Juan Wen
  • Ziwei Zhang
  • Yu Yang
  • Yiming Xue

Text steganalysis is a technique to distinguish steganographic text from normal text via statistical features. Current state-of-the-art text steganalysis models have two limitations. First, they need sufficient amounts of labeled data for training. Second, they lack generalization ability across different detection tasks. In this paper, we propose a meta-learning framework for text steganalysis in the few-shot scenario to ensure fast adaptation of the model between tasks. A general feature extractor based on BERT is applied to extract universal features among tasks, and a meta-learner based on an attentional Bi-LSTM is employed to learn task-specific representations. A classifier trained on the support set calculates the prediction loss on the query set with a few samples to update the meta-learner. Extensive experiments show that our model can adapt quickly to different steganalysis tasks from extremely few samples, significantly improving detection performance compared with state-of-the-art steganalysis models and other meta-learning methods.

Hidden in Plain Sight - Persistent Alternative Mass Storage Data Streams as a Means for Data Hiding With the Help of UEFI NVRAM and Implications for IT Forensics

  • Stefan Kiltz
  • Robert Altschaffel
  • Jana Dittmann

This article presents a first study on the possibility of hiding data using the UEFI NVRAM of today's computer systems as a storage channel. Embedding and extraction of executable data as well as media data are discussed and demonstrated as a proof of concept, which is successfully evaluated on 10 different systems. The paper further explores the implications of data hiding within UEFI NVRAM for computer forensic investigations and provides forensic measures to address this new challenge.
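
On a Linux system such a storage channel can be exercised through the efivarfs interface, as in the hedged sketch below (a generic illustration, not the authors' tooling; the variable name and GUID are hypothetical, writing requires root privileges, and an existing variable may first need its immutable flag cleared).

    import os
    import struct
    import uuid

    EFIVARS = "/sys/firmware/efi/efivars"
    ATTRS = 0x1 | 0x2 | 0x4          # non-volatile | boot-service access | runtime access
    name = "HiddenPayload"           # hypothetical, inconspicuous-looking variable name
    guid = uuid.uuid4()              # in practice a fixed vendor GUID would be reused
    path = os.path.join(EFIVARS, f"{name}-{guid}")

    payload = b"covert data"
    with open(path, "wb") as f:      # efivarfs layout: 4 attribute bytes followed by the data
        f.write(struct.pack("<I", ATTRS) + payload)

    with open(path, "rb") as f:      # read back, skipping the 4 attribute bytes
        print(f.read()[4:])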

Fighting the Reverse JPEG Compatibility Attack: Pick your Side

  • Jan Butora
  • Patrick Bas

In this work we aim to design a steganographic scheme undetectable by the Reverse JPEG Compatibility Attack (RJCA). The RJCA, while only effective for JPEG images compressed with quality factors 99 and 100, was shown to work mainly due to a change in the variance of the rounding errors after decompression of the DCT coefficients, which is induced by embedding changes incompatible with the JPEG format. One remedy that preserves the format is to utilize during embedding the rounding errors created during JPEG compression, but no steganographic method is known to be resilient to the RJCA without this knowledge. Inspecting the effect of embedding changes on the variance and the mean of the decompression rounding errors, we propose a steganographic method that resists the RJCA without any side-information. To resist the RJCA, we propose a distortion metric that makes all embedding changes within a DCT block dependent, resulting in a lattice-based embedding. It then turns out that it is enough to cleverly pick the side of the (binary) embedding changes, by inspecting their effect on the variance of the decompression rounding errors, and to simply use uniform costs in order to enforce their sparsity across DCT blocks. To increase security against detectors in the spatial (pixel) domain, we show an easy way of combining the proposed methodology with steganography designed for spatial domain security, further improving the undetectability for quality factor 99. The improvements over existing non-informed steganography are up to 40% in terms of the detector's accuracy.
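
The statistic the RJCA exploits can be reproduced in a few lines (an illustrative numpy/scipy toy, not the proposed embedding scheme: level shift, clamping and the actual quality-100 quantization table are omitted): incompatible ±1 changes to the quantized DCT coefficients visibly raise the variance of the decompression rounding errors.

    import numpy as np
    from scipy.fftpack import dct, idct

    def dct2(b):  return dct(dct(b, axis=0, norm="ortho"), axis=1, norm="ortho")
    def idct2(b): return idct(idct(b, axis=0, norm="ortho"), axis=1, norm="ortho")

    rng = np.random.default_rng(1)
    i, j = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")

    errors = {"cover": [], "stego": []}
    for _ in range(500):
        # smooth integer-valued 8x8 block: random offset plus a gentle ramp
        block = np.round(rng.uniform(40, 180) + rng.integers(1, 5) * i + rng.integers(1, 5) * j)
        coeffs = np.round(dct2(block))                 # QF ~100: quantization step 1
        changes = np.zeros((8, 8))                     # a few +/-1 embedding changes per block
        changes[rng.integers(0, 8, 4), rng.integers(0, 8, 4)] = rng.choice([-1.0, 1.0], 4)
        for label, c in (("cover", coeffs), ("stego", coeffs + changes)):
            pixels = idct2(c)
            errors[label].append(pixels - np.round(pixels))   # decompression rounding errors

    for label, e in errors.items():
        print(label, "rounding-error variance:", float(np.var(np.stack(e))))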

SESSION: Session 5: Security & Privacy II

Session details: Session 5: Security & Privacy II

  • Daniel Chew

A Nearest Neighbor Under-sampling Strategy for Vertical Federated Learning in Financial Domain

  • Denghao Li
  • Jianzong Wang
  • Lingwei Kong
  • Shijing Si
  • Zhangcheng Huang
  • Chenyu Huang
  • Jing Xiao

Machine learning techniques have been widely applied in modern financial activities, and participants in the field are aware of the importance of data privacy. Vertical federated learning (VFL) was proposed as a multi-party secure computation solution for machine learning that provides the large amounts of data required by the models while keeping the data holders' privacy. However, previous research has mostly analyzed the algorithms under ideal conditions, and data imbalance in VFL is still an open problem. In this paper, we propose a privacy-preserving sampling strategy for imbalanced VFL based on a federated graph embedding of the samples, without leaking any distribution information. The participants of the federation provide partial neighbor information for each sample during the intersection stage, and controversial negative samples are filtered out. Experiments were conducted on commonly used financial datasets and one real-world dataset. Our proposed approach obtained the leading F1 score on all tested datasets compared with baseline under-sampling strategies for VFL.
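
As a plain, non-federated analogue of the filtering idea (our simplification; in the paper the neighbor information is exchanged privately during the intersection stage), one can drop majority-class negatives whose local neighborhood is dominated by positives, i.e. the "controversial" negatives.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def undersample_negatives(X, y, k=5):
        """X: (n, d) features; y: (n,) labels with 1 = minority/positive, 0 = negative."""
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
        _, idx = nn.kneighbors(X)                        # idx[:, 0] is the sample itself
        neighbor_pos_rate = y[idx[:, 1:]].mean(axis=1)   # fraction of positive neighbors
        keep = (y == 1) | (neighbor_pos_rate < 0.5)      # keep positives and uncontroversial negatives
        return X[keep], y[keep]

    # X_bal, y_bal = undersample_negatives(X_train, y_train)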

Colmade: Collaborative Masking in Auditable Decryption for BFV-based Homomorphic Encryption

  • Alberto Ibarrondo
  • Hervé Chabanne
  • Vincent Despiegel
  • Melek Önen

This paper proposes a novel collaborative decryption protocol for the Brakerski-Fan-Vercauteren (BFV) homomorphic encryption scheme in a multiparty distributed setting, and puts it to use in designing a leakage-resilient biometric identification solution. Allowing the computation of standard homomorphic operations over encrypted data, our protocol reveals only one least significant bit (LSB) of a scalar/vectorized result resorting to a pool of N parties. By employing additively shared masking, our solution preserves the privacy of all the remaining bits in the result as long as one party remains honest. We formalize the protocol, prove it secure in several adversarial models, implement it on top of the open-source library Lattigo and showcase its applicability as part of a biometric access control scenario.
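
A toy integer analogue of the masking step (a deliberate oversimplification of the BFV protocol, only to illustrate why a single LSB survives): each party adds an even random mask modulo a power-of-two modulus, so the opened value keeps the parity of the result while its higher bits remain hidden as long as at least one mask stays secret.

    import secrets

    q = 2 ** 16                           # toy modulus (a power of two)
    result = 12345                        # value whose least significant bit should be revealed
    masks = [2 * secrets.randbelow(q // 2) for _ in range(4)]   # one even mask per party

    opened = (result + sum(masks)) % q
    print("opened value:", opened)
    print("LSB preserved:", opened % 2 == result % 2)   # True: only the parity is revealed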

SESSION: Session 6: Steganography II

Session details: Session 6: Steganography II

  • Jan Butora

AMR Steganalysis based on Adversarial Bi-GRU and Data Distillation

  • Zhijun Wu
  • Junjun Guo

Existing AMR (Adaptive Multi-Rate) steganalysis algorithms based on pitch delay have low detection accuracy on short samples or samples with a low embedding rate, and the models are fragile under attack by adversarial samples. To address this problem, we design an advanced AMR steganalysis method based on an adversarial Bi-GRU (Bi-directional Gated Recurrent Unit) and data distillation. First, Gaussian white noise is randomly added to part of the original speech to form an adversarial data set, and a small amount of voice data is manually annotated to train the model. Second, three transformations (1.5x speed, 0.5x speed, and mirror flipping) are applied to the remaining original voice data, which is then classified by the Bi-GRU; the final predicted labels, obtained by decision fusion, are assigned to the original data. Finally, all labeled data are put back into the Bi-GRU model for final training. Note that each batch of final training data includes both normal and adversarial samples. The method adopts semi-supervised learning, which greatly reduces the resources consumed by manual labeling, and the adversarial Bi-GRU enables bidirectional analysis of samples over long durations. Beyond improving detection accuracy, the security and robustness of the model are greatly improved. The experimental results show that, for normal and adversarial samples, the algorithm achieves accuracies of 96.73% and 95.6%, respectively.

Capacity Laws for Steganography in a Crowd

  • Andrew D. Ker

A steganographer is not only hiding a payload inside their cover, they are also hiding themselves amongst the non-steganographers. In this paper we study asymptotic rates of growth for steganographic data -- analogous to the classical Square-Root Law -- in the context of a 'crowd' of K actors, one of whom is a steganographer. This converts steganalysis from a binary to a K-class classification problem, and requires some new information-theoretic tools. Intuition suggests that larger K should enable the steganographer to hide a larger payload, since their stego signal is mixed in with larger amounts of cover noise from the other actors. We show that this is indeed the case, in a simple independent-pixel model, with payload growing at O(√(log K)) times the classical Square-Root capacity in the case of homogeneous actors. Further, examining the effects of heterogeneity reveals a subtle dependence on the detector's knowledge about the payload size, and the need for them to use negative as well as positive information to identify the steganographer.
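
In symbols (a hedged restatement in our own notation, with n the cover size available to each actor), the growth rate stated above reads as follows.

    \[
      M_{\text{crowd}}(n, K) \;=\; O\!\left(\sqrt{\log K}\right)\cdot M_{\text{SRL}}(n)
      \;=\; O\!\left(\sqrt{n\,\log K}\right),
      \qquad M_{\text{SRL}}(n) = O\!\left(\sqrt{n}\right).
    \]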

Detector-Informed Batch Steganography and Pooled Steganalysis

  • Yassine Yousfi
  • Eli Dworetzky
  • Jessica Fridrich

We study the problem of batch steganography when the senders use feedback from a steganography detector. This brings an additional level of complexity to the table due to the highly non-linear and non-Gaussian response of modern steganalysis detectors as well as the necessity to study the impact of the inevitable mismatch between senders' and Warden's detectors. Two payload spreaders are considered based on the oracle generating possible cover images. Three different pooling strategies are devised and studied for a more comprehensive assessment of security. Substantial security gains are observed with respect to previous art - the detector-agnostic image-merging sender. Close attention is paid to the impact of the information available to the Warden on security.