Spatial image data (image samples or motion-compensated residual samples) are transformed into a different representation, the transform domain.
The two most widely used image compression transforms are the discrete cosine transform (DCT) and the discrete wavelet transform (DWT). The DCT is usually applied to small, regular blocks of image samples (e.g. 8 × 8 squares), and the DWT is usually applied to larger image sections ("tiles") or to complete images.
- DCT
The DCT has become the most popular transform for image and video coding. There are two main reasons for its popularity: first, it is effective at transforming image data into a form that is easy to compress, and second, it can be implemented efficiently in software and hardware.
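The forward DCT (FDCT) of an N × N sample block X is given by:
Y = A X A^T
and the inverse DCT (IDCT) by:
X = A^T Y A
where A is the N × N transform matrix, A^T is its transpose and Y is the block of transform coefficients.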
The transform matrix A for a 4 × 4 DCT is:
A =
0.5 0.5 0.5 0.5
0.653 0.271 −0.271 −0.653
0.5 −0.5 −0.5 0.5
0.271 −0.653 0.653 −0.271
The forward DCT (FDCT) transforms a set of image samples (the "spatial domain") into a set of transform coefficients (the "transform domain"). The transform is reversible: the inverse DCT (IDCT) transforms a set of coefficients into a set of image samples.
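As a minimal sketch of the two matrix equations, the Python below builds A from the orthonormal DCT-II definition (assumed here to be the variant intended), applies the FDCT to a 4 × 4 block and then the IDCT to recover it. The helper names (dct_matrix, matmul, fdct, idct) and the sample values are illustrative and do not come from the original text.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix A (assumed variant)."""
    a = []
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        a.append([scale * math.cos((2 * j + 1) * i * math.pi / (2 * n))
                  for j in range(n)])
    return a


def matmul(p, q):
    """Plain matrix product of two lists of lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def fdct(x):
    """Forward DCT: Y = A X A^T."""
    a = dct_matrix(len(x))
    return matmul(matmul(a, x), transpose(a))


def idct(y):
    """Inverse DCT: X = A^T Y A."""
    a = dct_matrix(len(y))
    return matmul(matmul(transpose(a), y), a)


# Illustrative 4 x 4 block of samples (made-up values).
block = [[5, 11, 8, 10],
         [9, 8, 4, 12],
         [1, 10, 11, 4],
         [19, 6, 15, 7]]

coeffs = fdct(block)      # spatial domain -> transform domain
restored = idct(coeffs)   # transform domain -> spatial domain

# coeffs[0][0] is the DC term: for a 4 x 4 block it equals sum(samples) / 4.
for row in coeffs:
    print([round(v, 3) for v in row])
for row in restored:
    print([round(v, 3) for v in row])
```

Up to floating-point rounding, the restored block matches the input, which is what "reversible" means in practice when no coefficients are quantised or discarded.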
The DCT has two useful properties for image and video compression: energy compaction (concentrating the image energy into a small number of coefficients) and decorrelation (minimising the interdependencies between coefficients).
Because of energy compaction, a reasonable approximation to the original image block can be reconstructed from just the few most significant coefficients.
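The effect of energy compaction can be sketched as follows. The example assumes an 8 × 8 block with a smooth brightness ramp (made-up data), keeps only the K largest-magnitude coefficients and reconstructs the block from those alone. The value K = 3 and all names are illustrative choices, not values from the original text.

```python
import math

N = 8   # block size (assumption)
K = 3   # number of coefficients to keep (assumption)


def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed variant)."""
    return [[(math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n))
             * math.cos((2 * j + 1) * i * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def transpose(m):
    return [list(row) for row in zip(*m)]


A = dct_matrix(N)
AT = transpose(A)

# Smooth synthetic block: brightness rises gently down and across.
block = [[100 + 10 * i + 5 * j for j in range(N)] for i in range(N)]
Y = matmul(matmul(A, block), AT)

# Rank coefficients by magnitude and keep only the K largest.
ranked = sorted(((abs(Y[i][j]), i, j) for i in range(N) for j in range(N)),
                reverse=True)
keep = {(i, j) for _, i, j in ranked[:K]}
Y_kept = [[Y[i][j] if (i, j) in keep else 0.0 for j in range(N)]
          for i in range(N)]

# Reconstruct from the kept coefficients only.
approx = matmul(matmul(AT, Y_kept), A)

total_energy = sum(v * v for row in Y for v in row)
kept_energy = sum(v * v for row in Y_kept for v in row)
rms_error = math.sqrt(sum((approx[i][j] - block[i][j]) ** 2
                          for i in range(N) for j in range(N)) / (N * N))

print("fraction of energy kept:", kept_energy / total_energy)
print("RMS reconstruction error:", rms_error)
```

A block this smooth is a favourable case; blocks with sharp edges or texture spread their energy over more coefficients and need a larger K for the same quality.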
The DCT becomes increasingly complex to calculate for larger block sizes.
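To put rough numbers on this, assume the FDCT is evaluated directly as two N × N matrix products (Y = A X A^T). Fast algorithms need fewer operations, so the figures below are a naive estimate rather than what a real codec spends. The function name is hypothetical.

```python
def multiplies_per_sample(n):
    """Multiplications per sample for a direct matrix-product FDCT.

    Each n x n matrix product costs n**3 multiplications and the
    transform needs two of them, spread over n * n samples.
    """
    per_block = 2 * n ** 3
    return per_block / (n * n)


for n in (4, 8, 16, 32):
    print(n, "x", n, "block:", multiplies_per_sample(n), "multiplies per sample")
```

Under this assumption the cost per sample grows linearly with the block size, which is one reason small blocks are attractive.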
- DWT
A single-stage wavelet transformation consists of a filtering operation that decomposes an image into four frequency bands. The top-left corner of the transformed image ("LL") is the original image, low-pass filtered and subsampled in the horizontal and vertical dimensions. The top-right corner ("HL") consists of residual vertical frequencies.
The bottom-left corner ("LH") contains residual horizontal frequencies.
The bottom-right corner ("HH") contains residual diagonal frequencies.
This decomposition process may be repeated for the "LL" component to produce another set of four components: a new "LL" component that is a further subsampled version of the original image, plus three more residual frequency components.
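The decomposition can be sketched with the simplest wavelet filter, the Haar filter (pairwise averages as the low-pass output and pairwise half-differences as the high-pass output). The text does not name a filter, so Haar is an assumption, as are the function names and the tiny test image. The band layout follows the description above: LL top-left, HL top-right, LH bottom-left, HH bottom-right.

```python
def haar_rows(img):
    """Filter every row: left half = pairwise averages (low-pass),
    right half = pairwise half-differences (high-pass)."""
    out = []
    for row in img:
        low = [(row[k] + row[k + 1]) / 2 for k in range(0, len(row), 2)]
        high = [(row[k] - row[k + 1]) / 2 for k in range(0, len(row), 2)]
        out.append(low + high)
    return out


def transpose(m):
    return [list(r) for r in zip(*m)]


def haar_stage(img):
    """One decomposition stage: filter horizontally, then vertically.
    Returns an array of the same size as the input."""
    horizontal = haar_rows(img)
    return transpose(haar_rows(transpose(horizontal)))


def split_bands(t):
    """Cut the transformed array into its four quadrants."""
    h, w = len(t) // 2, len(t[0]) // 2
    return {
        "LL": [r[:w] for r in t[:h]],   # top-left
        "HL": [r[w:] for r in t[:h]],   # top-right
        "LH": [r[:w] for r in t[h:]],   # bottom-left
        "HH": [r[w:] for r in t[h:]],   # bottom-right
    }


# Illustrative 4 x 4 image made of four flat 2 x 2 regions.
image = [[10, 10, 20, 20],
         [10, 10, 20, 20],
         [30, 30, 40, 40],
         [30, 30, 40, 40]]

stage1 = split_bands(haar_stage(image))
# Repeat the process on the LL band only.
stage2 = split_bands(haar_stage(stage1["LL"]))

for name in ("LL", "HL", "LH", "HH"):
    print("stage 1", name, stage1[name])
for name in ("LL", "HL", "LH", "HH"):
    print("stage 2", name, stage2[name])
```

The input dimensions must be even at every stage for this sketch to work; practical wavelet codecs use longer filters and handle image borders explicitly.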
The wavelet decomposition has some important properties. First, the number of wavelet "coefficients" (the spatial values that make up Figure 7.8) is the same as the number of pixels in the original image and so the transform is not inherently adding or removing information.
Second, many of the coefficients of the high-frequency components ("HH", "HL" and "LH" at each stage) are zero or insignificant. This reflects the fact that much of the important information in an image is low-frequency. Third, the decomposition is not restricted by block boundaries (unlike the DCT) and hence may be a more flexible way of decorrelating the image data (i.e. concentrating the significant components into a few coefficients) than the block-based DCT.
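The first two properties can be checked on a small example. The sketch below assumes a single-stage Haar decomposition (an assumption, since the text does not specify a filter) of a made-up 8 × 8 image containing one vertical edge, then counts the coefficients and how many high-frequency ones fall below an arbitrary threshold. All names and the threshold are illustrative.

```python
def haar_pairs(seq):
    """Pairwise averages followed by pairwise half-differences."""
    low = [(seq[k] + seq[k + 1]) / 2 for k in range(0, len(seq), 2)]
    high = [(seq[k] - seq[k + 1]) / 2 for k in range(0, len(seq), 2)]
    return low + high


def haar_stage(img):
    """One stage: filter the rows, then filter the columns."""
    rows = [haar_pairs(r) for r in img]
    cols = [haar_pairs(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


SIZE = 8
THRESHOLD = 1.0  # "insignificant" cut-off, chosen arbitrarily

# Dark region on the left, bright region on the right (one vertical edge).
image = [[50 if j < 3 else 200 for j in range(SIZE)] for i in range(SIZE)]
coeffs = haar_stage(image)

n_pixels = sum(len(r) for r in image)
n_coeffs = sum(len(r) for r in coeffs)

# Everything outside the top-left quadrant belongs to HL, LH or HH.
half = SIZE // 2
high_band = [coeffs[i][j] for i in range(SIZE) for j in range(SIZE)
             if i >= half or j >= half]
insignificant = sum(1 for v in high_band if abs(v) < THRESHOLD)

print("pixels:", n_pixels, "coefficients:", n_coeffs)
print("high-frequency coefficients below threshold:",
      insignificant, "of", len(high_band))
```

Only the coefficients that straddle the edge are significant here; natural images are less extreme, but the same tendency is what makes the high-frequency bands cheap to code.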
Wavelet-based compression performs well for still images (particularly in comparison with DCT-based compression) and can be implemented reasonably efficiently.