SCZ - Simple Compression Utilities and Library

https://sourceforge.net/projects/scz-compress/                       November 26, 2008
SCZ is a simple set of compression routines for compressing and decompressing arbitrary data. The initial set of routines implements new lossless compression algorithms with perfect restoration (decompression). Recently, a set of lossy routines has been added for photo-image compression.

The library is called SCZ, for simple compression format. SCZ is intended to be called as subroutines from within your own applications, without legal or technical encumbrances. It was developed because the standard compression routines, such as gzip, Zlib, JPEG, GIF, etc., are fairly large and complex; are difficult to integrate, maintain, or understand; have external dependencies; or have had legal restrictions.

SCZ is intended to fill a niche: simple, lightweight, self-contained compress/decompress routines that can be included within other applications, and that let those applications compress or decompress data on the fly, during read-in or write-out, with simple calls. This niche applies to your application especially if the other compression libraries are as large as, or more complex than, your application itself. Other compression utilities do not appear to be intended for easy embedding within other applications, and often depend on multiple external libraries that may not be installed on a given system.

SCZ typically achieves 3:1 compression. On binary PPM diagram image files it often achieves 10:1 compression. On text files such as XML, it often compresses by 25:1. On difficult files, it may achieve less than 2:1 reduction. Although zip and gzip usually achieve slightly higher ratios, SCZ makes trade-offs for simplicity, memory footprint, and run-time speed (in that order), with consideration for diminishing returns. For example, when compressing a particular 10-MB file, gzip saved 8.2 MB, while SCZ saved 7.8 MB. Either way, that's a lot of space saved! Sure, we could go after that last 0.4 MB of compression, but that is where diminishing returns come in. Squeezing out that extra bit would more than double the complexity and run-time of SCZ.

SCZ's core compression and decompression routines are only 178 and 45 lines of code, respectively. The balance of the files provides convenient access methods to compress files, buffers, and streams. SCZ's core library is four files containing about 600 source lines. (In contrast, the commendable lightweight zlib has 3,360 source lines in 25 files.)
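To put the 10-MB comparison in the same terms as the ratios quoted above, the small C sketch below converts "space saved" into a compression ratio and a percentage. The function names are illustrative only and are not part of SCZ. For the figures given, gzip works out to roughly 5.6:1 (82% saved) and SCZ to roughly 4.5:1 (78% saved).

```c
/* Illustrative helpers only; these are not SCZ library functions. */

/* Compression ratio = original size / compressed size,
   where compressed size = original size - space saved. */
double compression_ratio(double original_mb, double saved_mb)
{
    double compressed_mb = original_mb - saved_mb;
    return original_mb / compressed_mb;
}

/* Percentage of the original file size that was removed. */
double percent_saved(double original_mb, double saved_mb)
{
    return 100.0 * saved_mb / original_mb;
}
```

For instance, compression_ratio(10.0, 7.8) divides 10 MB by the 2.2 MB that remain, which is where the approximate 4.5:1 figure comes from.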

Although the scz routines are intended for compiling (or linking) into your applications, the package also includes two self-contained example programs that are stand-alone compress/decompress utilities, along with Readme and header files for linking.

The application programs work similarly to gzip and gunzip. See the header comments. The application utilities also serve as examples of how to call the scz compression routines from your own applications, and are useful for testing and validation.

To use the scz routines in your programs, just include or link to the lib file(s). You can use SCZ either as command-line utilities or as direct calls. See the SCZ-API for a list of functions and their parameters.

See How It Works for information about the SCZ architecture and file format. You should find the SCZ commenting and code structures somewhat understandable.

The SCZ routines work interchangeably across all platforms. This makes portable and self-contained compression available to all applications.

A set of regression tests and scripts has been added as an optional SCZ download package, SCZ-Tests. It contains a generic test-data generator for testing, benchmarking, rating, or comparing compression methods. It can generate random binary data files with arbitrary sizes and with arbitrary amounts of compressibility. By testing SCZ routines with thousands of different files of various sizes, we gain confidence in SCZ's correctness and efficiency. The regression tests can be quickly re-run whenever any improvements to SCZ are proposed, to verify that it continues to work properly.
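As a rough illustration of what "arbitrary amounts of compressibility" can mean, here is a minimal C sketch of one way to produce such data. It is not the SCZ-Tests generator; the function names, the redundancy_percent parameter, and the choice of a xorshift pseudo-random generator are all assumptions made for this example. Each byte either repeats the previous byte (which compresses well) or is drawn pseudo-randomly (which does not), and a fixed seed makes the output reproducible for regression runs.

```c
#include <stddef.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32), so a given seed always
   reproduces the same test data. Hypothetical helper, not part of SCZ. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill buf with len bytes of test data.
   redundancy_percent (0..100) is the chance that a byte simply repeats
   the previous one: 0 gives near-random, hard-to-compress data, while
   100 gives a single long run of one byte value. */
void generate_test_data(unsigned char *buf, size_t len,
                        unsigned redundancy_percent, uint32_t seed)
{
    uint32_t state = seed ? seed : 1u;  /* xorshift must not start at zero */
    size_t i;

    for (i = 0; i < len; i++) {
        if (i > 0 && next_random(&state) % 100u < redundancy_percent)
            buf[i] = buf[i - 1];                            /* repeat */
        else
            buf[i] = (unsigned char)(next_random(&state) >> 24); /* fresh byte */
    }
}
```

Writing the buffer to a file and sweeping redundancy_percent and len over many values would give a family of inputs on which a compress-then-decompress round trip can be checked byte for byte.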

Photo-Image Compression
The primary SCZ routines are useful for text, XML, line-drawing diagram images, scans of text pages, or other types of binary computer data. However, they are unable to compress photographic images very much. Recently, a new pair of routines has been added for efficient photo-image compression. See Image-SCZ.

In the future, I would like to add a similar set of simple routines for lossy compression of other kinds of data, such as audio files.


Downloading SCZ


 
