Using and extending the code

flanglet edited this page Feb 19, 2023 · 4 revisions

Compressing/Decompressing data

Here is how to compress/decompress a block to/from a file using RLT+TEXT as the transform, Huffman as the entropy codec, a block size of 1 MB, 4 jobs and a checksum (error handling omitted for readability):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import kanzi.io.CompressedInputStream;
import kanzi.io.CompressedOutputStream;

private static final ExecutorService pool = Executors.newFixedThreadPool(4); // not necessary if jobs = 1

public long compress(byte[] block, int length) throws IOException {
    // Create an OutputStream
    OutputStream os = new FileOutputStream("compressed.knz");

    // Create a CompressedOutputStream
    HashMap<String, Object> ctx = new HashMap<>();
    ctx.put("transform", "RLT+TEXT");
    ctx.put("codec", "HUFFMAN");
    ctx.put("blockSize", 1024 * 1024);
    ctx.put("checksum", true);
    ctx.put("pool", pool); // not necessary if jobs = 1
    ctx.put("jobs", 4);
    CompressedOutputStream cos = new CompressedOutputStream(os, ctx);

    // Compress block
    cos.write(block, 0, length);

    // Close streams
    cos.close();
    os.close();

    // Get number of bytes written
    return cos.getWritten();
}

public long decompress(byte[] block, int length) throws IOException {
    // Create an InputStream
    InputStream is = new FileInputStream("compressed.knz");

    // Create a CompressedInputStream
    HashMap<String, Object> ctx = new HashMap<>();
    ctx.put("pool", pool); // not necessary if jobs = 1
    ctx.put("jobs", 4);
    CompressedInputStream cis = new CompressedInputStream(is, ctx);

    // Decompress block
    cis.read(block, 0, length);

    // Close streams
    cis.close();
    is.close();

    // Get number of bytes read
    return cis.getRead();
}

The only tricky part is that an application-wide threadpool is required in multi-threaded mode to control the maximum number of threads. The threadpool must be created before the calls, with a size at least equal to the number of jobs provided on the command line. There is no need to create a threadpool in mono-threaded scenarios.
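The pool-sizing rule above can be sketched with plain java.util.concurrent code and no Kanzi dependency; the `jobs` value here is a stand-in for whatever the application puts into the context map:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolSetup {
    public static void main(String[] args) throws InterruptedException {
        final int jobs = 4; // the same value that would go into ctx.put("jobs", ...)

        // The pool size must be >= jobs, otherwise some block tasks queue
        // behind others and the requested parallelism is never reached.
        ExecutorService pool = Executors.newFixedThreadPool(jobs);

        // ... create the compressed streams with ctx.put("pool", pool)
        // and ctx.put("jobs", jobs), then do the compression work ...

        // Shut the pool down once all calls are finished; the streams
        // do not own the pool and will not shut it down for you.
        pool.shutdown();
        boolean done = pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("pool terminated: " + done);
    }
}
```

Sharing one pool across all compress/decompress calls (as the fields above do) keeps the total thread count bounded regardless of how many streams are open.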

Implementing a new transform

Here is how to implement and add a new transform to kanzi.

  • Step 1: write the transform code

For example:

 import java.util.Map;
 import kanzi.ByteTransform;
 import kanzi.SliceByteArray;

 class SuperDuperTransform implements ByteTransform
 {
     public SuperDuperTransform() {}
     public SuperDuperTransform(Map<String, Object> context) {}

     public boolean forward(SliceByteArray input, SliceByteArray output)
     {
         final int count = input.length;

         // Ensure enough room in the destination buffer
         if (output.length - output.index < getMaxEncodedLength(count))
             return false;

         final byte[] src = input.array;
         final byte[] dst = output.array;

         for (int i = 0; i < count; i++)
             dst[output.index + i] = (byte) (src[input.index + i] ^ 0xAA);

         input.index += count;
         output.index += count;
         return true;
     }

     public boolean inverse(SliceByteArray input, SliceByteArray output)
     {
         final int count = input.length;

         final byte[] src = input.array;
         final byte[] dst = output.array;

         for (int i = 0; i < count; i++)
             dst[output.index + i] = (byte) (src[input.index + i] ^ 0xAA);

         input.index += count;
         output.index += count;
         return true;
     }

     public int getMaxEncodedLength(int inputLen) { return inputLen; }
 }

Always provide a constructor taking a context map: the context contains all the application-wide information (such as block size, number of jobs, input & output names, etc.). Always implement ByteTransform and do not create more threads than the jobs value provided in the context. Implement the forward and inverse methods as well as getMaxEncodedLength(int). Do not write to System.out or System.err. Be aware that your code must be thread safe.
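The example transform above relies on XOR with 0xAA being its own inverse, which is what makes forward and inverse round-trip. A stripped-down, dependency-free check of that property (plain arrays standing in for Kanzi's SliceByteArray) looks like this:

```java
import java.util.Arrays;

public class XorRoundTrip {
    // Forward and inverse are the same operation: XOR each byte with a fixed mask.
    static byte[] xorMask(byte[] src) {
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++)
            dst[i] = (byte) (src[i] ^ 0xAA);
        return dst;
    }

    public static void main(String[] args) {
        byte[] input = "Hello Kanzi".getBytes();
        byte[] encoded = xorMask(input);   // forward
        byte[] decoded = xorMask(encoded); // inverse: applying the mask again restores the input
        System.out.println("round trip ok: " + Arrays.equals(input, decoded));
    }
}
```

Verifying the round trip like this on plain arrays is a quick sanity check before wiring a new transform into the factory, where index bookkeeping bugs are harder to isolate.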

  • Step 2: Register the transform in transform/TransformFactory.java

Add the type, say

  public static final short SUPERDUPER_TYPE = 63;

Let us say you use the name "SUPERDUPER" for the transform. Update the following methods:

 private long getTypeToken(String name)
 private static ByteTransform newFunctionToken(Map<String, Object> ctx, int functionType)
 private static String getNameToken(int functionType)
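The exact bodies of these methods vary between Kanzi versions, but the additions typically amount to one branch per method. The following is an illustrative sketch only (the case labels are assumptions, not copied from the real file):

```
// In transform/TransformFactory.java -- illustrative sketch only

public static final short SUPERDUPER_TYPE = 63;

// getTypeToken: map the transform name to its type constant
case "SUPERDUPER":
    return SUPERDUPER_TYPE;

// newFunctionToken: instantiate the transform from its type,
// passing the context so the transform can read block size, jobs, etc.
case SUPERDUPER_TYPE:
    return new SuperDuperTransform(ctx);

// getNameToken: map the type constant back to the name
case SUPERDUPER_TYPE:
    return "SUPERDUPER";
```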
  • Step 3: Update the help message in app/Kanzi.java

In Kanzi.printHelp, add the SUPERDUPER transform to the list in the -t option section.

  • That is it. For example, run
java -jar kanzi.jar -i foo.txt -f -t SUPERDUPER -e none -j 2 -v 4