Abstract:
Compression reduces the size of data to save storage space and/or to reduce the amount of data transmitted over a communication channel. Many designers ask about the compression performance of the CAST hardware implementation of a compression algorithm. This paper discusses the basic calculations and offers suggestions on how to evaluate compression performance.
Introduction:
Compression performance measures how much a compression algorithm reduces the size of the data. Other factors besides compression performance may determine a compression solution’s suitability for an application, such as throughput, latency, implementation size, and power consumption. Most designers who are looking for a hardware implementation of a compression algorithm are trying to achieve throughputs that are too high for an effective CPU implementation, or trying to reduce latency or power consumption in the system.
There are standard datasets for measuring compression performance. Examples include the well-known “Lena” picture for still image compression, the “DucksTakeOff” clip for video compression, and the Canterbury and Silesia corpora for lossless data compression. Using the same dataset to compare the compression performance of different algorithms and implementations allows an apples-to-apples comparison.
However, it is important to understand that compression performance is always specific to the data content, so the input data should reflect the characteristics of the customer’s application data. For example, a video of a soccer game containing lots of motion will compress very differently from security camera footage of an empty parking lot! If possible, the dataset should be a sample of actual data used in the application.

Figure 1 - Lena image
Compression Ratio:
Compression performance is expressed as the Compression Ratio (CR): the size of the input (uncompressed) data relative to the size of the output (compressed) data. A ratio of 3:1 means that the data is reduced to one third of its input size. This is easy to measure by comparing the input file size to the compressed file size. The result is a function of the compression algorithm, its implementation, and the content of the data.
CR = (uncompressed file size) / (compressed file size)
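For instance, the ratio can be computed directly from file sizes. A minimal Python sketch (the file names are placeholders):

import os

def compression_ratio(uncompressed_path, compressed_path):
    # CR = uncompressed file size / compressed file size
    return os.path.getsize(uncompressed_path) / os.path.getsize(compressed_path)

# Hypothetical usage: a result of 3.0 corresponds to a 3:1 ratio.
# print(compression_ratio("input.raw", "input.raw.gz"))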
Compression can also be measured by comparing the input and output bitrates. Let us look at how the input bitrate is calculated.
The input bitrate for images or video is a simple calculation of the image size (horizontal pixels x vertical pixels) times the number of bits per pixel times the frame rate (the number of frames per second).
(# pixels horizontal) x (# pixels vertical) x (# bits/pixel) x (# fps)
The number of bits per pixel is determined by the number of color components, the color subsampling format, and the number of bits per color sample. For 12-bit grayscale, the number of bits per pixel is 12. For H.264 Baseline Profile input in YUV 4:2:0 with 8 bits per sample, we also use 12 as the number of bits per pixel for this calculation. Wikipedia has a very good description of “chroma subsampling”.
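As an illustrative sketch, these bits-per-pixel figures follow from the average number of samples per pixel in each subsampling format. The helper below is hypothetical and assumes one luma sample per pixel:

def bits_per_pixel(subsampling, bits_per_sample=8):
    # Average samples per pixel: 4:4:4 -> 3, 4:2:2 -> 2, 4:2:0 -> 1.5, grayscale -> 1
    samples_per_pixel = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5, "grayscale": 1.0}
    return samples_per_pixel[subsampling] * bits_per_sample

print(bits_per_pixel("4:2:0", 8))       # 12.0, the H.264 Baseline example above
print(bits_per_pixel("grayscale", 12))  # 12.0, the 12-bit grayscale example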
For example, if the input picture is 1920x1080 at 60 fps with 12 bits per pixel, then the input rate is
1920 x 1080 x 60 fps x 12 bits/pixel = 1.49 Gbps
If, for example, the output has to be transmitted over a 1 Gbps Ethernet channel, then the compression ratio must be greater than or equal to:
1.49 Gbps / 1 Gbps = 1.49, or 1.49:1
However, if the output channel bandwidth is 100 Mbps, then the minimum compression ratio becomes:
1.49 Gbps / 100 Mbps = 14.9, or 14.9:1
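These calculations are easy to script; the Python sketch below reproduces the numbers above:

width, height, fps, bpp = 1920, 1080, 60, 12
input_bitrate = width * height * fps * bpp   # 1,492,992,000 b/s, i.e. ~1.49 Gbps

for channel_bps in (1e9, 100e6):             # 1 Gbps and 100 Mbps channels
    print("minimum CR = %.2f:1" % (input_bitrate / channel_bps))
# prints 1.49:1 and 14.93:1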
Lossless Compression:
The evaluation of lossy compression includes inspection and analysis of the artifacts and quality degradation introduced by the loss of data during compression. These quality issues are not present with lossless compression: the compressed data, when decompressed, is identical to the original input data. Note, however, that the CR cannot be guaranteed with lossless compression and is always data-dependent.
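Both properties are easy to demonstrate in software. The Python sketch below uses the standard zlib module (which implements Deflate) on a placeholder input file, checking that the round trip is bit-exact and measuring the data-dependent CR:

import zlib

data = open("sample.bin", "rb").read()        # placeholder input file
compressed = zlib.compress(data)
assert zlib.decompress(compressed) == data    # lossless: round trip is bit-exact
print("CR = %.2f:1" % (len(data) / len(compressed)))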
Popular algorithms for lossless image compression are Lossless JPEG, JPEG2000 and JPEG-LS. There are many papers available on the internet with comparisons of the performance of these algorithms, with JPEG-LS often providing the highest compression ratio.
A popular algorithm for lossless data compression is Deflate (the algorithm behind GZIP), which can be highly tuned to fit the type of input data and other application criteria (area, throughput, latency, and power). This flexibility allows GZIP implementations to provide high CR values or to trade CR for higher throughput.
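As a rough software analogy to this tuning, zlib’s compression level trades CR against speed (level 1 is fastest, level 9 compresses best); hardware cores expose analogous, though different, knobs:

import zlib

data = open("sample.bin", "rb").read()        # placeholder input file
for level in (1, 6, 9):
    out = zlib.compress(data, level)
    print("level %d: CR = %.2f:1" % (level, len(data) / len(out)))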
For lossless image compression, you can expect ratios on the order of 1:1 (no compression) up to about 3.5:1. In general, compression does not work well on data that has already been compressed.
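The last point is easy to verify: Deflate output looks nearly random, so running it through the compressor a second time typically yields a CR of about 1:1 or slightly worse. A quick sketch:

import zlib

data = open("sample.txt", "rb").read()        # placeholder text file
once = zlib.compress(data)
twice = zlib.compress(once)
print("first pass:  CR = %.2f:1" % (len(data) / len(once)))
print("second pass: CR = %.2f:1" % (len(once) / len(twice)))  # typically ~1:1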
Lossy Compression:

The amount that input data can be compressed using lossy algorithms is controlled by the user, with higher compression ratios degrading the accuracy or quality of the reconstructed (i.e., compressed and then decompressed) results. The amount of compression is determined by setting algorithmic parameters that affect quality. For example, JPEG compression (and quality) is mostly controlled by the Quantization Tables, H.264 is mostly controlled by the Quantization Parameter, and GZIP is mostly controlled by the Huffman table type (dynamic or static) and the history window, along with many other LZ77 search parameter settings. An example of image quality can be seen in the following images, where Lena is compressed at CR = 20:1 and CR = 51:1.

Figure 3 - Lena JPEG Quality Factor = 50, CR=20:1

Figure 4 - Lena JPEG with Quality Factor = 10, CR = 51:1
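You can reproduce this kind of comparison with any JPEG encoder. The sketch below assumes the Pillow Python library and an uncompressed source image (the file name is a placeholder):

import io
from PIL import Image

img = Image.open("lena.png").convert("RGB")   # placeholder uncompressed source
raw_size = img.width * img.height * 3         # 8-bit RGB: 3 bytes per pixel

for quality in (50, 10):                      # the two settings shown above
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    print("quality %d: CR = %.1f:1" % (quality, raw_size / buf.tell()))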
Lossy Compression Rate Control:
Compression is typically used to send data over a limited bandwidth channel. Rate Control is used to guarantee that the output bitrate will not exceed a certain level. The goal of Rate Control for an image is to guarantee that the output size of the compressed image will not be greater than a maximum threshold. Rate Control for video tries to average the bitrate over a period of time or number of frames. The CAST paper on Video Latency discusses video rate control in further detail. (https://www.cast-inc.com/blog/white-paper-reducing-video-latency/)
Rate Control is achieved by selectively eliminating bits in the output stream. Rate Control can only be used with lossy compression since lossless compression cannot tolerate any information loss. The rate control algorithm is not typically described by the compression standard and is usually the “secret sauce” of an encoder or compressor implementation. Therefore, the compressed output using rate control should be analyzed for performance quality acceptance in the target application.
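As a toy illustration of the idea (not any real encoder’s rate control algorithm), the sketch below steps the JPEG quality down until a frame fits within a byte budget, again assuming Pillow:

import io
from PIL import Image

def encode_within_budget(img, max_bytes):
    # Step quality down until the compressed frame fits the budget.
    for quality in range(95, 4, -5):
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=quality)
        if buf.tell() <= max_bytes:
            return buf.getvalue()
    return buf.getvalue()   # best effort at the lowest quality tried

Real rate control algorithms adjust quantization on the fly rather than re-encoding, but the trade-off is the same: quality is reduced until the rate constraint is met.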
Compression Performance Analysis:
You should use your representative input data with a compression software tool to get a feel for the type of compression performance you might obtain. For example, data files can easily be compressed with the gzip software tool, and the input and output file sizes compared to determine the CR. Of course, this only gives a rough estimate of performance using the default parameters of the algorithm; these parameters could be tuned in software or in a hardware implementation of the compression algorithm.
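The same experiment can be scripted with Python’s gzip module instead of the command-line tool (the file name is a placeholder):

import gzip, os

path = "representative_data.bin"
with open(path, "rb") as f, gzip.open(path + ".gz", "wb") as g:
    g.write(f.read())
print("CR = %.2f:1" % (os.path.getsize(path) / os.path.getsize(path + ".gz")))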
For hardware implementations, there are usually software models of the hardware core, or a hardware evaluation board, that can be used for evaluation. Simulation can also be used, but it is usually too slow to run these complex algorithms in a timely manner.
Software decoders and display programs can be used to visually inspect the quality of compressed image or video data files; free programs such as FFmpeg and VLC can be used for this.
Conclusions:
Compression performance is one of the important qualities that differentiate compression algorithms and their implementations. It is fairly straightforward to compute the CR of a software or hardware compressor, but remember: it is important that the data you use represents the type of data that will be used in your real-life application.