Robust Object Watermarking Algorithm

Authors

Balamurgan Chirtsabesan (balamc@cs.arizona.edu)
Tapas Ranjan Sahoo (tapas@cs.arizona.edu)

Description

This is a static watermarking algorithm.

Robust Object Watermarking takes a different approach to watermarking. Instead of applying the watermarking scheme to the raw code directly, it first creates a vector representation of the code. The basic idea is that, instead of considering the overall structure of the code and its control flow, the code is viewed as a statistical object: the frequencies of groups of instructions across the entire code form the elements of the vector. Because the watermark is spread throughout the target code, it offers a large measure of security against both intentional and unintentional attacks.
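The statistical view described above can be illustrated with a minimal sketch that counts how often each instruction group occurs in a sequence of opcodes. The class name GroupFrequencySketch, its methods, and the choice of groups are hypothetical and serve only as illustration; they are not the algorithm's actual classes, and the real implementation works on bytecode rather than on lists of strings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not part of the actual implementation.
final class GroupFrequencySketch {

    // Counts how often `group` appears as a contiguous run of opcodes in `code`.
    static int countGroup(List<String> code, List<String> group) {
        if (group.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i + group.size() <= code.size(); i++) {
            if (code.subList(i, i + group.size()).equals(group)) {
                count++;
            }
        }
        return count;
    }

    // Builds the frequency vector: one element per instruction group.
    static int[] frequencyVector(List<String> code, List<List<String>> groups) {
        int[] vector = new int[groups.size()];
        for (int g = 0; g < groups.size(); g++) {
            vector[g] = countGroup(code, groups.get(g));
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "iload", "iload", "isub", "istore", "iload", "isub", "ifne");
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("iload", "isub"),
                Arrays.asList("isub", "istore"),
                Arrays.asList("iadd", "istore"));
        // Prints [2, 1, 0]
        System.out.println(Arrays.toString(frequencyVector(code, groups)));
    }
}
```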

The watermark value is passed as a string parameter, converted into a frequency vector representation, and embedded in the code. Each vector element represents an instruction group, which is identified from profiling information. Various embedding techniques, such as code substitution and instruction group insertion, are used to build the watermark into the code. The basic idea is to increase the frequency of those instruction groups in the code that form components of the watermark vector. The embedding module consults the 'CodeBook' to carry out the different types of embedding.

RESTRICTIONS:
The algorithm is generally applicable only to large applications, where the code is large enough to hold the watermark vector. Embedding new instructions in code is delicate, since the result must satisfy many criteria, such as maintaining the proper stack size and proper variable initialization and use. In very small applications, there might not be enough scope to embed the entire vector, especially with approaches other than code substitution. The algorithm uses a "decision procedure" to make sure that the implementation does not go into an infinite loop. If there is no further scope for embedding, the program exits with a log message saying that the watermark embedding was not completed.

Example

As a simple example, consider a substitution procedure for embedding a vector instruction group. Say iload; isub is a two-element instruction group that is a watermark vector component. The 'CodeBook' stores a substitution rule for it: A{ iload X, iload Y, if_icmpne -> Z } --> B{ iload X, iload Y, isub, ifne -> Z }. Both groups branch to Z exactly when X and Y differ, so finding an occurrence of A in the code and substituting the semantically equivalent group B increases the frequency of the vector component by one.
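The substitution of A by B can be sketched on a toy instruction list. This is a minimal illustration, assuming that instructions are plain strings of the form "opcode operand"; the class SubstitutionSketch and its methods are hypothetical and do not reflect the real 'CodeBook' interface. A real bytecode rewriter would also have to update branch offsets, since B is one byte longer than A.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: instructions are modelled as "opcode operand" strings.
final class SubstitutionSketch {

    static String opcode(String insn) {
        return insn.trim().split("\\s+")[0];
    }

    static String operand(String insn) {
        String[] parts = insn.trim().split("\\s+", 2);
        return parts.length > 1 ? parts[1] : "";
    }

    // Replaces the first occurrence of A{iload X, iload Y, if_icmpne Z}
    // with B{iload X, iload Y, isub, ifne Z}. Returns null if A does not occur.
    static List<String> substituteOnce(List<String> code) {
        for (int i = 0; i + 3 <= code.size(); i++) {
            if (opcode(code.get(i)).equals("iload")
                    && opcode(code.get(i + 1)).equals("iload")
                    && opcode(code.get(i + 2)).equals("if_icmpne")) {
                List<String> out = new ArrayList<>(code.subList(0, i + 2));
                out.add("isub");
                out.add("ifne " + operand(code.get(i + 2))); // same branch target Z
                out.addAll(code.subList(i + 3, code.size()));
                return out;
            }
        }
        return null;
    }

    // Frequency of the vector component "iload; isub" (operands ignored).
    static int countIloadIsub(List<String> code) {
        int count = 0;
        for (int i = 0; i + 2 <= code.size(); i++) {
            if (opcode(code.get(i)).equals("iload")
                    && opcode(code.get(i + 1)).equals("isub")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList(
                "iload 1", "iload 2", "if_icmpne L7", "iconst_0", "ireturn");
        List<String> after = substituteOnce(before);
        System.out.println(after);
        System.out.println(countIloadIsub(before) + " -> " + countIloadIsub(after));
    }
}
```

In this toy run the count of iload; isub goes from 0 to 1, while the branch to L7 is still taken exactly when the two loaded values differ.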

The recognition procedure follows a different approach. Instead of retrieving the watermark directly from the target code, the recognizer is given the original code, the watermarked code, and the watermark. It then answers 'yes' or 'no' according to whether that particular watermark exists in the code.

EMBEDDING THE WATERMARK:
Input the original jar file (e.g. A.jar). In this example the watermarked jar file is A_wm.jar. Enter the watermark value in the 'watermark' field as well as in the 'key' field. The watermark must be an integer with 8 or fewer digits.
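The restriction on the watermark value can be checked before embedding. The following is a minimal sketch, assuming that the value is a non-negative decimal integer without a sign; the class WatermarkInputSketch is hypothetical and not part of the tool.

```java
// Hypothetical illustration only: assumes an unsigned decimal integer.
final class WatermarkInputSketch {

    // True if `value` consists of 1 to 8 ASCII digits and nothing else.
    static boolean isValidWatermark(String value) {
        return value != null && value.matches("[0-9]{1,8}");
    }

    public static void main(String[] args) {
        System.out.println(isValidWatermark("12345678"));  // true
        System.out.println(isValidWatermark("123456789")); // false: 9 digits
        System.out.println(isValidWatermark("12a4"));      // false: not an integer
    }
}
```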

WATERMARK RECOGNITION:
Input the watermarked jar file (i.e. A_wm.jar) and the original jar file (i.e. A.jar). Enter the watermark that you wish to detect in the 'key' field. The recognizer then outputs "WATERMARK FOUND" or "WATERMARK NOT FOUND" depending on whether that watermark was detected in the target code (i.e. A_wm.jar). The recognizer matches the watermark to within a threshold value. The default 'recognition threshold' is 1; it can be changed in the 'myRecognitionThreshold' field of the 'Config' class.
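One way to picture a threshold-based yes/no decision is sketched below. It is an assumption, not the documented method: the sketch supposes that the recognizer compares the change in each frequency vector element (watermarked minus original) with the expected watermark vector, and accepts when every element is within the threshold. The class RecognitionSketch and its parameters are hypothetical; the source only states that a threshold exists and defaults to 1.

```java
// Hypothetical illustration only: the comparison rule is an assumption.
final class RecognitionSketch {

    // Returns true if (watermarked - original) matches `expected`
    // element by element, allowing a deviation of at most `threshold`.
    static boolean watermarkFound(int[] original, int[] watermarked,
                                  int[] expected, int threshold) {
        if (original.length != watermarked.length
                || original.length != expected.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        for (int i = 0; i < original.length; i++) {
            int observed = watermarked[i] - original[i];
            if (Math.abs(observed - expected[i]) > threshold) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] original = {5, 3, 0};
        int[] watermarked = {7, 3, 1};
        int[] expected = {2, 0, 2};
        System.out.println(watermarkFound(original, watermarked, expected, 1)
                ? "WATERMARK FOUND" : "WATERMARK NOT FOUND");
    }
}
```

With these sample vectors the last element is off by one, so the watermark is accepted at a threshold of 1 and rejected at a threshold of 0.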

Configuration

References