JPEG Image Compression Explained

Post by **Max** » Wed Sep 30, 2009 2:42 am

What is Compression?
Representing data (image, audio, video, speech or voice) with fewer bits than its uncompressed form requires.

Basic Need of Compression
Making effective use of transmission bandwidth, and getting the most out of the storage media.

Types of Compression:

Lossless: Compressing the data so that it is reproduced exactly as the original input data when decompressed.
Lossy: Compressing the data with some loss of information, chosen with the human visual and psychoacoustic systems in mind, e.g. neglecting the higher-frequency components to which the human visual system is much less sensitive.
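
To make the difference concrete, here is a minimal sketch in Python. The lossless half uses the standard zlib module; the lossy half is only a toy (it keeps one of 16 levels per sample and is not how JPEG itself discards information). The sample values and the names `samples`, `step` and `approx` are made up for illustration.

```python
import zlib

# Made-up 8-bit samples with a repeating pattern.
samples = bytes([12, 12, 13, 200, 201, 201, 90, 91] * 32)

# Lossless: the round trip gives back exactly the same bytes.
packed = zlib.compress(samples)
restored = zlib.decompress(packed)
lossless_exact = restored == samples

# Lossy (toy): store only value // step, then rebuild the middle of each step.
step = 16
coarse = bytes(v // step for v in samples)
approx = bytes(v * step + step // 2 for v in coarse)
max_error = max(abs(a - b) for a, b in zip(samples, approx))

print(lossless_exact, len(samples), len(packed), max_error)
```

The lossy reconstruction is close to the input but never guaranteed to match it; the error is bounded by half the step size.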

JPEG Image Compression standards.
In general, an image is nothing but a grid of pixels. Each pixel holds the brightness and colour information of the image at a particular coordinate.
Red, Green and Blue are the primary colour components of a colour image. By combining them we can get any colour we desire.

Step-by-step procedure for compressing the input image data.

Input: Read an MxN (in general 8x8) block of the input RGB image; each colour component is 8 bits.
RGB->YCbCr: Convert the MxN RGB samples to YCbCr (luma and chroma components).
DCT: Perform the Discrete Cosine Transform (DCT) on each of the MxN luma and chroma blocks.
Scaling: Quantize the coefficient matrix output by the DCT, which is the same size as the input to this block.
Scanning: Zig-zag scan the resulting MxN matrix into a one-dimensional array.
Huffman coding: Perform Huffman coding on the resulting 1-D array.
BitStream: Finally, the resulting bitstream is the JPEG encoder output.

In detail:
Read an 8x8 matrix of the Red, Green and Blue components of the input image. Convert each RGB sample to YCbCr (luma and chroma components) using the equations below:

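Y = 0.299R + 0.587G + 0.114B
Cb = -0.1687R - 0.3313G + 0.5B + 128
Cr = 0.5R - 0.4187G - 0.0813B + 128

(For 8-bit samples, Y needs no offset, and Cb and Cr are offset by 128 so that they stay in the range 0 to 255.)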
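
As a rough sketch, the conversion for one pixel could look like this in Python. It assumes the JFIF convention for 8-bit samples (no offset on Y, an offset of 128 on Cb and Cr) and clamps the result to 0..255; the function name `rgb_to_ycbcr` is just for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (JFIF-style, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b + 128

    def clamp(value):
        # Round to the nearest integer and keep it inside the 8-bit range.
        return max(0, min(255, int(round(value))))

    return clamp(y), clamp(cb), clamp(cr)

print(rgb_to_ycbcr(255, 255, 255))  # white
print(rgb_to_ycbcr(0, 0, 0))        # black
```

Grey pixels (R = G = B) always land on Cb = Cr = 128, which is why the chroma planes of a mostly grey image compress so well.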

Then perform a 2-D DCT on each of the 8x8 matrices of Y, Cb and Cr. Read more at https://robot.lk/viewtopic.php?f=71&t=435

The DCT gives an 8x8 coefficient matrix for each of Y, Cb and Cr, with most of the energy concentrated in the lower-frequency coefficients, which are the ones the human eye is more sensitive to.
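
Here is a direct, unoptimized sketch of the 8x8 forward DCT in plain Python. It assumes the usual JPEG baseline convention of subtracting 128 from each 8-bit sample before the transform; real encoders use fast factorizations instead of this four-level loop. The name `dct_2d` is illustrative only.

```python
import math

N = 8  # JPEG block size

def dct_2d(block):
    """Forward 2-D DCT-II of an N x N block of 8-bit samples (0..255)."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    # Level shift: centre the samples on zero before transforming.
    shifted = [[sample - 128 for sample in row] for row in block]

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (shifted[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = (2 / N) * scale(u) * scale(v) * total
    return coeffs

# A completely flat block has only a DC term; every AC coefficient is ~0.
flat = [[138] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))
```

For the flat block above, all the energy ends up in the top-left (DC) coefficient, which is the extreme case of the energy compaction described earlier.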

Then perform scalar quantization (scaling of the DCT output array) on each of the luma and chroma matrices.

Quantization is the major step in which information is actually discarded, and so where most of the compression comes from. It can also be the module that consumes the most cycles in JPEG compression.
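
A small sketch of the quantization step: each DCT coefficient is divided by the matching entry of a quantization table and rounded. The table below is the example luminance table from Annex K of the JPEG standard; encoders are free to use other tables, and chroma normally uses a different one. The coefficient values in the example are made up.

```python
# Example luminance quantization table (JPEG standard, Annex K).
LUMA_QUANT = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[int(round(c / q)) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply back. The rounding error is not recoverable."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

# Made-up coefficients: a DC term, one low-frequency and one high-frequency AC term.
example = [[0.0] * 8 for _ in range(8)]
example[0][0], example[0][1], example[7][7] = 80.0, -23.0, 30.0

levels = quantize(example, LUMA_QUANT)
rebuilt = dequantize(levels, LUMA_QUANT)
print(levels[0][0], levels[0][1], levels[7][7])
print(rebuilt[0][0], rebuilt[0][1], rebuilt[7][7])
```

Note how the high-frequency coefficient (30, divided by 99) becomes 0 and is lost, while the low-frequency ones survive with a small error. The large divisors in the bottom-right corner are what produce the long runs of zeros that the later stages exploit.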

Then perform zig-zag scanning on each of the resulting quantized arrays (Y, Cb and Cr).
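
One way to sketch the zig-zag scan in Python is to sort the (row, column) positions by anti-diagonal and alternate the direction on each diagonal. Encoders normally use a fixed lookup table instead; the helper names here are just for illustration.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd diagonals run downwards (row increasing),
        # even diagonals run upwards (column increasing).
        return (diagonal, row if diagonal % 2 else col)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_scan(block):
    """Flatten a square block into a 1-D list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

order = zigzag_order()
print(order[:6])

# Label each cell with its row-major index to see the scan path.
labelled = [[r * 8 + c for c in range(8)] for r in range(8)]
print(zigzag_scan(labelled)[:8])
```

The scan visits low-frequency coefficients first, so the zeros produced by quantization tend to end up together at the tail of the 1-D array.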

Perform run-length coding followed by Huffman coding on the 1-D array obtained from the zig-zag scan.
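
The sketch below shows the idea in simplified form. The AC coefficients are turned into (run of zeros, value) pairs, with an end-of-block marker once only zeros remain, and a generic Huffman code is then built from the symbol counts. This is an assumption-laden simplification: real JPEG codes (run, size) pairs plus extra amplitude bits, codes the DC coefficient separately as a difference from the previous block, and uses Huffman tables stored in the file header. All function names are hypothetical.

```python
import heapq
from collections import Counter

EOB = (0, 0)   # end of block: only zeros remain
ZRL = (15, 0)  # stands for a run of 16 zeros

def run_length_ac(zz):
    """Turn the AC part of a zig-zag array into (run, value) pairs."""
    symbols, run = [], 0
    for value in zz[1:]:          # zz[0] is the DC coefficient
        if value == 0:
            run += 1
            continue
        while run > 15:           # runs longer than 15 need ZRL symbols
            symbols.append(ZRL)
            run -= 16
        symbols.append((run, value))
        run = 0
    if run > 0:                   # trailing zeros collapse into one EOB
        symbols.append(EOB)
    return symbols

def huffman_code(symbols):
    """Build a generic Huffman code (symbol -> bit string) from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie-breaker, partial code table).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

zz = [5, -2, 0, 0, 3] + [0] * 59   # made-up quantized block in zig-zag order
symbols = run_length_ac(zz)
code = huffman_code([EOB] * 5 + [(0, 1)] * 2 + [(1, -1)])
print(symbols)
print(code)
```

Frequent symbols such as the end-of-block marker get the shortest codes, which is where the final saving in bits comes from.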

The result is the output bitstream of the JPEG encoder.