Image Coding

Audiovisual Processing CMP-6026A

Dr. David Greenwood

December 01 2021

Content

Lossy and lossless image compression.

Changing colour spaces and subsampling
DCT and quantisation
Run-length encoding
Entropy coding

Image Coding

How can we compress an image without destroying the image?

Data and information are not the same thing.
The goal is to identify and remove redundancy.

Lossless

Image can be reconstructed exactly.

Lossy

The decompressed image is an approximation of the original.
How much loss is acceptable?

Image Redundancy

Inter-pixel redundancy:

Neighbouring pixels are related to one another

Image Redundancy

Coding redundancy:

Not all pixel intensities are equally likely

Image Redundancy

Psycho-visual redundancy:

We are not visually sensitive to everything in the image

JPEG Compression

A framework for compressing images.
Many algorithms can be used in the framework.
Developed by the Joint Photographic Experts Group.
JPEG exploits the three forms of redundancy outlined.

JPEG Compression

\(Y C_b C_r\)

\[ \begin{aligned} Y &= 0.299R + 0.587G + 0.114B \\ C_b &= B-Y \\ C_r &= R-Y \end{aligned} \]

Luminance

\[Y = 0.299R + 0.587G + 0.114B\]

Humans are more sensitive to luminance…

Chrominance

\[ \begin{aligned} C_b &= B-Y \\ C_r &= R-Y \end{aligned} \]

Humans are less sensitive to chrominance…
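As a minimal sketch, the colour transform above can be written directly from the three equations. The function names are illustrative only. The sketch uses the unscaled colour differences exactly as given here; practical codecs typically scale and offset the chroma values so they share the range of Y, and that step is omitted.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel using the unscaled colour-difference form."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cb = b - y                              # blue-difference chrominance
    cr = r - y                              # red-difference chrominance
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert the transform: recover R and B directly, then solve Y for G."""
    r = cr + y
    b = cb + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


y, cb, cr = rgb_to_ycbcr(200, 120, 40)
print(y, cb, cr)
print(ycbcr_to_rgb(y, cb, cr))  # approximately (200, 120, 40)
```

A grey pixel (R = G = B) has both chrominance values at zero, which is why all the colour information sits in the two difference channels.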

\(Y C_b C_r\)

We can downsample the chrominance channels without affecting the image in a perceptible way.

Exploits psycho-visual redundancy.

JPEG Compression

Chroma Subsampling

The subsampling scheme is expressed as a ratio J:a:b, which describes a conceptual window, J pixels wide and 2 pixels high, on the chrominance channels.

Chroma Subsampling

J: horizontal sampling reference (the width of the window in pixels). Usually 4.
a: number of chroma samples in the first row of J pixels.
b: number of changes of chroma samples (Cr, Cb) between the first and second rows of J pixels.
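To make the ratio concrete, here is a small sketch of 4:2:0 subsampling: a = 2 chroma samples in the first row of four pixels, and b = 0 changes in the second row, so each 2x2 block of pixels shares one chroma sample. Averaging the block is an assumption made for this illustration (an encoder could equally keep one pixel from each block), and the function names are hypothetical.

```python
def subsample_420(channel):
    """Average each 2x2 block of a chroma channel (4:2:0).

    `channel` is a list of rows; even height and width are assumed
    to keep the sketch short.
    """
    out = []
    for r in range(0, len(channel), 2):
        row = []
        for c in range(0, len(channel[0]), 2):
            total = (channel[r][c] + channel[r][c + 1]
                     + channel[r + 1][c] + channel[r + 1][c + 1])
            row.append(total / 4)
        out.append(row)
    return out


def upsample_420(small):
    """Rebuild a full-size channel by repeating each sample over a 2x2 block."""
    out = []
    for row in small:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


cb = [[10, 12, 50, 52],
      [14, 16, 54, 56]]
small = subsample_420(cb)
print(small)                 # [[13.0, 53.0]]
print(upsample_420(small))   # [[13.0, 13.0, 53.0, 53.0], [13.0, 13.0, 53.0, 53.0]]
```

Each chroma channel shrinks to a quarter of its original size, while the luminance channel is left untouched.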

JPEG Compression

DCT

Transforms the image into the frequency domain.
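The transform can be sketched in a few lines of plain Python. The specific choices here are assumptions for illustration, not stated above: a type-II DCT with orthonormal scaling, applied to one 8x8 block. The direct quadruple loop is slow but mirrors the definition.

```python
import math

N = 8  # assumed block size


def dct_2d(block):
    """2D DCT-II of an N x N block with orthonormal scaling."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


flat = [[100] * N for _ in range(N)]   # a block with no detail at all
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))          # 800.0: all energy is in the top-left term
```

For a constant block, every coefficient except the top-left one is zero (up to floating-point error), which shows how smooth image regions concentrate their energy in the low frequencies.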

JPEG Compression

DCT Quantisation

Reduce the number of bits needed to store a value by reducing precision.

Decrease precision as we move away from the top left corner.
High frequency details usually contribute less to the image.

DCT Quantisation

Quantisation is performed as follows:

\[DCT_{q}(i, j) = \operatorname{round} \left( \frac{DCT(i, j)}{Q(i, j)} \right)\]

where \(Q\) is the quantisation matrix.
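The formula above amounts to an element-wise divide and round. In the sketch below, both the 4x4 coefficient block and the matrix `Q` are made up for illustration; `Q` simply grows with distance from the top-left corner and is not a standard quantisation table.

```python
def quantise(dct, q):
    """Divide each coefficient by its quantisation step and round."""
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return [[round(d / s) for d, s in zip(dct_row, q_row)]
            for dct_row, q_row in zip(dct, q)]


def dequantise(dct_q, q):
    """Approximate the original coefficients; the rounding error is lost."""
    return [[d * s for d, s in zip(dct_row, q_row)]
            for dct_row, q_row in zip(dct_q, q)]


N = 4
# Hypothetical quantisation matrix: coarser steps for higher frequencies.
Q = [[8 + 4 * (i + j) for j in range(N)] for i in range(N)]

# Hypothetical DCT coefficients for one block.
block = [[520.0, -61.0, 13.0, 3.0],
         [-45.0, 22.0, -7.0, 2.0],
         [11.0, -6.0, 4.0, -1.0],
         [5.0, 2.0, -1.0, 1.0]]

quantised = quantise(block, Q)
for row in quantised:
    print(row)
```

Most of the small high-frequency coefficients round to zero, and this step is where the loss in lossy compression occurs: `dequantise` can only return multiples of the step size.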

JPEG Compression

ZigZag Scan

\(65, -27, -2, 17, -3,\) \(19, 0, -3, 8, 0, ...\)

ZigZag Scan

Reads from low frequency coefficients to high frequency coefficients…

ZigZag Scan

Non-zero coefficients are more likely to be grouped together, followed by long runs of zeros…

This is beneficial for the next step…
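One way to generate the scan order is to walk the anti-diagonals of the matrix, alternating direction on each one. The sketch below is illustrative: the function names and the 4x4 quantised block are assumptions, chosen only to show the effect.

```python
def zigzag_indices(n):
    """Return the (row, col) pairs of an n x n matrix in zigzag order."""
    order = []
    for s in range(2 * n - 1):           # s = row + col identifies a diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def zigzag_scan(matrix):
    """Flatten a square matrix from low to high frequency."""
    return [matrix[r][c] for r, c in zigzag_indices(len(matrix))]


# Hypothetical quantised block: non-zero values cluster in the top-left.
block = [[65, -5, 1, 0],
         [-4, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 0]]
scanned = zigzag_scan(block)
print(scanned)  # [65, -5, -4, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The non-zero values come first and the zeros collect into a single trailing run.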

JPEG Compression

Run Length Encoding

Replaces a sequence of values with a series of (value, run length) pairs.

Exploits inter-pixel redundancy.

Run Length Encoding

65 -27 -2 17 -3 -3 1 1 1 -2 1 1 0 -1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

65 1 -27 1 -2 1 17 1 -3 2 1 3 -2 1 1 2 0 1 -1 1 1 1 0 19

Run Length Encoding

Exploits inter-pixel redundancy: the relationship between neighbouring “pixels” in the zigzag scan of the DCT coefficient matrix.
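The worked example can be reproduced with a short sketch built on `itertools.groupby`. The function names are illustrative, and the input is the example sequence, with its trailing run of 19 zeros written as `[0] * 19`.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal values into a (value, run length) pair."""
    return [(v, len(list(group))) for v, group in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run length) pairs back into the original sequence."""
    return [v for v, n in pairs for _ in range(n)]


sequence = [65, -27, -2, 17, -3, -3, 1, 1, 1, -2, 1, 1, 0, -1, 1] + [0] * 19
encoded = rle_encode(sequence)
print(encoded)
# [(65, 1), (-27, 1), (-2, 1), (17, 1), (-3, 2), (1, 3),
#  (-2, 1), (1, 2), (0, 1), (-1, 1), (1, 1), (0, 19)]
```

The 34 input values become 12 pairs, and decoding restores the sequence exactly, so this stage is lossless. Most of the saving comes from the final run of zeros.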

JPEG Compression

Entropy Coding

Information and data are not the same thing.

Claude Shannon, (1948). A Mathematical Theory of Communication.

Entropy Coding exploits coding redundancy

not every value is equally likely.

Entropy coding encodes a sequence with a variable-length code so that:

more probable values have fewer bits, and
less probable values have more bits.

The new code requires fewer bits per pixel on average.

How many bits do we need?

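Recall: the probability of an event is:

\[p_{i} = \frac{N_{i}}{N}\]

where \(N_{i}\) is the number of occurrences of value \(i\) and \(N\) is the total number of values.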

The information in bits is:

\[I_{i} = - \log_{2} p_{i}\]

The entropy, the smallest possible mean symbol length, is:

\[H = - \sum_{i} p_{i} \log_{2} p_{i}\]
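As a quick check of these formulas, the sketch below estimates the probabilities from symbol counts and sums the entropy. The function name and the toy sequence are illustrative only.

```python
import math
from collections import Counter


def entropy(values):
    """Entropy in bits per symbol, with p_i estimated as N_i / N."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


symbols = list("aaaabbcd")   # p = 1/2, 1/4, 1/8, 1/8
print(entropy(symbols))      # 1.75
```

A fixed-length code for four distinct symbols needs 2 bits per symbol, so for this sequence a variable-length code could save at most 0.25 bits per symbol on average.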

We can use these properties to develop a better coding for an image.

The stream must be decoded unambiguously.
No code can be the prefix of another.

Huffman Coding

Step 1:

Arrange values in order of decreasing probability.
Each forms a leaf in the Huffman tree.

Huffman Coding

Step 2:

Merge the two nodes with the smallest probabilities:
- add the probabilities,
- insert the new node into the sorted list,
- assign a 1/0 to each branch being merged.
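The merge step, repeated until a single root node remains, can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: it uses a heap (`heapq`) in place of the sorted list, works with counts instead of probabilities (the ordering is the same), and stores partial codes in dictionaries instead of building an explicit tree. All names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(values):
    """Build a prefix-free code by repeatedly merging the two rarest nodes."""
    counts = Counter(values)
    if len(counts) == 1:                      # degenerate case: one symbol
        return {sym: "0" for sym in counts}
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker ensures the dictionaries are never compared.
    heap = [(c, i, {sym: ""}) for i, (sym, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c0, _, left = heapq.heappop(heap)     # smallest count
        c1, _, right = heapq.heappop(heap)    # second smallest
        # Assign 0 to one branch and 1 to the other.
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (c0 + c1, tie, merged))
        tie += 1
    return heap[0][2]


symbols = list("aaaabbcd")
codes = huffman_codes(symbols)
print(codes)
mean_length = sum(len(codes[s]) for s in symbols) / len(symbols)
print(mean_length)  # 1.75 bits per symbol
```

For this sequence the code lengths are 1, 2, 3 and 3 bits, so the mean length of 1.75 bits per symbol matches the entropy of the same distribution. In general a Huffman code can approach the entropy but never go below it.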