Digital Document Integrity

Graham Shaw

Signum Technologies Ltd.
6 Thorney Leys Business Park
Witney, Oxon., UK
+44 1993 776929

grahams@signumtech.com

ABSTRACT

The revolution in digital data processing has brought many benefits to the way we create, organise and manage information, whether in the form of images, documents, audio or video files. The conversion of such information into digital form provides us with capabilities such as online storage and retrieval, efficient search processes and worldwide data transmission. But digital data also brings with it a major problem, namely the ease with which such information can be copied and tampered with, without the forensic trail of evidence that we have relied on for analogue and hard-copy files. The need has therefore arisen for technologies which are able to safeguard the integrity and authenticity of digital records, particularly if such records are subject to legal and/or ethical scrutiny. The preservation of the integrity of digital records is particularly important where they are subject to concerted and possibly criminal attack.

1. THE NEED

At its simplest, the security of digital information is addressed at three levels:

In its broadest sense, security of data is concerned with ensuring that a digital record is what it purports to be, whether in respect of content or origination.

Technologies such as described above apply largely in purely digital applications and processes. A new dimension is added to the issue of security of data when digital records are used for the origination of printed material, since technologies protecting access or authentication of parties no longer apply. In product and image security applications, a number of differing requirements exist across the spectrum of applications:

These are just some examples of where data authentication provides major benefits to prospective users and adds a necessary additional layer to security and protection of assets. This need is now addressed by the use of digital watermarking - a relatively recent technological development which is now gaining momentum as an accepted method of data authentication.

2. THE TECHNOLOGY

Digital watermarking is a modern form of the ancient art of steganography which is, in essence, the ability to hide information inside other information. Earlier examples of steganography range from invisible ink on personal letters to the encoding of hidden messages into normal text files. In its modern form, digital watermarking enables data to be embedded imperceptibly and permanently into digital files. Such files include images, documents, audio and video data and include a wide range of formats. For example, digital watermarks can be embedded in black & white 1-bit documents as well as full-colour 32-bit images. Remarkably, the embedded watermark can be recovered not only from the digital data but also by rescanning its printed version. In this respect, digital watermarking provides a cross-over technology which spans both digital and hard-copy data.

Signum Technologies has developed a digital watermarking technology based on the use of secure permutation keys. This approach prevents removal of the watermark once applied and also, through control of the key, unauthorised application of watermarks to adulterated data. It is important to note that digital watermarking is not a form of data encryption - the original record remains fully accessible - but can be easily combined with encryption technology.

In essence, digital watermarks can be used in two respects:

These two elements can be used separately or combined to create permanently tagged 'digital masters'.

Signum has commercialised two forms of its digital watermarking: 'SureSign' for copyright ownership and 'VeriData' for data authentication. Both share the same essential features:

As stated above, the potential use of digital watermarks is very broad and customisable to a particular application. In summary, they provide a security layer that has been lacking to date and thereby reinforce other security and authentication methods.

3. THE VALIDATION METHOD

3.1 Security

Signum provides a complete version of the VeriData algorithm which includes Signum's own hashing, encryption and embedding algorithms. These operate at a high level of security and are more than adequate for most of our customers. However, some customers wish for the extra reassurance of algorithms which have gained global acceptance as possessing known levels of security. For these customers it is possible to replace the Signum algorithms with such publicly established algorithms whilst working within the framework which this software provides.

It is not essential to have high security at every possible stage of the process. One properly administered security feature will protect overall security. If an asymmetric algorithm such as RSA is used for encryption of the signature, then the reading software that confirms the authenticity of a file can be freely distributed without also providing the means to create a valid signature for a file.
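The asymmetric case can be illustrated with a toy example. The sketch below is not Signum's implementation: it uses textbook RSA with deliberately tiny, insecure parameters and no padding, and the function names are hypothetical. It is intended only to show that the reading side needs the public values E and N, while producing a signature needs the private exponent D.

```python
import hashlib

# Toy textbook-RSA parameters (insecure; for illustration only).
P, Q = 61, 53
N = P * Q                               # public modulus
E = 17                                  # public exponent, shipped with the reader
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent, kept by the signer (Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Reduce a SHA-1 digest to an integer below the toy modulus."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Signer side: requires the private exponent D."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Reader side: requires only the public values E and N."""
    return pow(signature, E, N) == digest_as_int(data)
```

A reader distributed with only `verify`, E and N can confirm a signature made with `sign`, but cannot compute a new one without D. A real deployment would use full-size keys and a standard padding scheme.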

3.2 Notes on Algorithm

3.2.1 Overview

The VeriData method consists essentially of two parts. The first part is the calculation of a "signature" or "digest" of the data in the file concerned. This is similar to the method used in several other applications. The second part of the method is the embedding of the signature back into the data file. This is carried out in such a way that the format of the file is not altered and the quality of data is not degraded. It is this second part which gives the uniqueness to the VeriData algorithm.

A) User Key

Each user has a unique key, typically 128 bits in length.

B) Embedding Sites

The encrypted signature is embedded within the file at selected sites. The selection of the sites is carried out by an algorithm that depends upon the key. VeriData uses an algorithm based upon permutations and it is possible to customise the permutations for any user.
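As an illustration of key-dependent site selection, the following minimal sketch derives a permutation of all candidate positions from the key and takes its first entries as the embedding sites. The construction (a SHA-256 seed feeding Python's `random.Random`) and the name `select_sites` are assumptions made for this example; the actual VeriData permutation algorithm is not described here, and `random.Random` is not a cryptographically secure generator.

```python
import hashlib
import random


def select_sites(key: bytes, n_positions: int, n_sites: int) -> list:
    """Return n_sites distinct positions chosen by a key-dependent permutation."""
    if n_sites > n_positions:
        raise ValueError("not enough positions for the requested number of sites")
    # Derive a deterministic seed from the key (assumed construction).
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    rng = random.Random(seed)
    positions = list(range(n_positions))
    rng.shuffle(positions)           # key-dependent permutation of all positions
    return positions[:n_sites]       # the first n_sites entries become the sites
```

The same key always yields the same sites, so a reader holding the key can locate the embedded bits, whereas a party without the key does not know which positions carry them.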

C) Hashing Algorithm

Signum provides a non-linear hashing algorithm for VeriData which can be rapidly executed. The SHA-1 and MD5 algorithms are included for selection by users who wish to use a well-established algorithm. Alternatively, the user can supply an algorithm of their own choice.
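The choice between a named standard hash and a user-supplied one might be exposed as in the sketch below. It uses Python's standard `hashlib` module; the function name `compute_signature` and its interface are hypothetical, and Signum's own non-linear hash is not reproduced.

```python
import hashlib
from typing import Callable, Union

HashFn = Callable[[bytes], bytes]


def compute_signature(data: bytes, algorithm: Union[str, HashFn] = "sha1") -> bytes:
    """Hash data with a named standard algorithm or a user-supplied callable."""
    if callable(algorithm):
        return algorithm(data)       # user-supplied hashing function
    if algorithm not in ("sha1", "md5"):
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return hashlib.new(algorithm, data).digest()
```

SHA-1 yields a 160-bit signature and MD5 a 128-bit one, which fixes the number of bits that must later be embedded.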

D) Encryption

The signature is encrypted before embedding into the file. Signum supplies a simple symmetric encryption algorithm for this purpose. RSA asymmetric encryption can be included as an alternative. Another alternative is for the user to supply their own encryption.
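A simple symmetric scheme of the general kind mentioned above can be sketched as an XOR with a key-derived keystream. This is a stand-in chosen for illustration, not Signum's algorithm; the keystream construction (SHA-256 over the key and a counter) and the function names are assumptions.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Expand the key into `length` bytes by hashing key || counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))
```

Because the operation is symmetric, any party able to decrypt the signature can also encrypt one, which is the limitation the asymmetric alternative removes.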

E) Data Embedding

The preceding steps yield a binary string of n bits and a set of n sites into which it may be embedded. The method of embedding is again customisable and provides another opportunity for adding security.
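One simple embedding method, assumed here purely for illustration, writes each bit into the least significant bit of the data value at its site. The sketch below treats the file's data as a sequence of 8-bit samples; the helper names are hypothetical. Since embedding alters the site values, the sketch also includes `mask_sites`, which reflects the further assumption that the signature is computed over the data with those bits cleared, so that the embedding step does not invalidate the signature.

```python
def bytes_to_bits(data: bytes) -> list:
    """Unpack bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def embed_bits(samples: bytearray, sites: list, bits: list) -> None:
    """Write each bit into the least significant bit of the sample at its site."""
    if len(sites) != len(bits):
        raise ValueError("need exactly one site per bit")
    for site, bit in zip(sites, bits):
        samples[site] = (samples[site] & 0xFE) | (bit & 1)


def extract_bits(samples: bytes, sites: list) -> list:
    """Read the embedded bits back from the same sites, in the same order."""
    return [samples[site] & 1 for site in sites]


def mask_sites(samples: bytes, sites: list) -> bytes:
    """Copy of samples with the site LSBs cleared, so hashing ignores the payload."""
    masked = bytearray(samples)
    for site in sites:
        masked[site] &= 0xFE
    return bytes(masked)
```

Each sample changes by at most one unit, and the file format and length are untouched, which is consistent with the requirement that the quality of the data is not degraded.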