Video Encoding
In contrast to JPEG, the image preparation phase of MPEG exactly defines the format of an image. Each image consists of three components: one luminance component and two chrominance components. The luminance component has twice as many samples along both the horizontal and the vertical axis as each of the other two components; this is known as color subsampling. The resolution of the luminance component should not exceed 768 x 576 pixels; for each component, a pixel is coded with eight bits.
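To make the effect of the subsampling concrete, the following minimal sketch counts the samples per component and the resulting uncompressed frame size at the maximum luminance resolution. The function name and the returned dictionary keys are illustrative choices, not part of MPEG.

```python
def frame_sample_counts(luma_width, luma_height, bits_per_sample=8):
    """Return per-component sample counts and the raw frame size in bytes.

    Assumes each of the two other components has half the luminance
    resolution on both axes, as described in the text.
    """
    chroma_width = luma_width // 2
    chroma_height = luma_height // 2
    luma_samples = luma_width * luma_height
    chroma_samples = chroma_width * chroma_height
    total_bits = (luma_samples + 2 * chroma_samples) * bits_per_sample
    return {
        "luma_samples": luma_samples,
        "chroma_samples_each": chroma_samples,
        "raw_bytes": total_bits // 8,
    }


info = frame_sample_counts(768, 576)
print(info["luma_samples"])         # 442368
print(info["chroma_samples_each"])  # 110592
print(info["raw_bytes"])            # 663552
```

The two subsampled components together add only half as many samples as the luminance component, so a raw frame needs 1.5 bytes per pixel instead of 3.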
Due to the required frame rate, each image must be built up within a maximum of 41.7 milliseconds.
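As a quick check of this time budget, the sketch below converts a frame rate into a per-frame time limit. The rate of 24 frames per second is an inference from the 41.7 ms figure, not something the text states; the other rates are shown only for comparison.

```python
def frame_budget_ms(frames_per_second):
    """Return the time available per frame, in milliseconds."""
    return 1000.0 / frames_per_second


for fps in (24, 25, 30):
    print(fps, round(frame_budget_ms(fps), 1))
# 24 41.7
# 25 40.0
# 30 33.3
```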
MPEG distinguishes four types of image coding for processing. The reasons behind this are the contradictory demands for an efficient coding scheme and fast random access. The four types are I-frames (intra-coded images), P-frames (predictive-coded images), B-frames (bidirectionally predictive-coded images), and D-frames (DC-coded images); here, image is used as a synonym for still image or frame.
The picture shows a sequence of I-, P-, and B-frames. As examples, it shows the prediction for the first P-frame and the bidirectional prediction for a B-frame.
A P-frame to be displayed after the related B-frame must be decoded before the B-frame because its data is required for the decompression of the B-frame.
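The reordering this implies can be sketched as follows. This is a simplified illustration, assuming that every B-frame depends on the next I- or P-frame in display order; the function name and the (type, display index) representation are hypothetical.

```python
def decode_order(display_order):
    """Reorder frames from display order into a possible decoding order.

    display_order is a string of frame types such as "IBBPBBP".
    Returns a list of (frame_type, display_index) pairs.
    """
    ordered = []
    pending_b = []  # B-frames waiting for their following reference frame
    for index, frame_type in enumerate(display_order):
        if frame_type == "B":
            pending_b.append((frame_type, index))
        else:
            # An I- or P-frame is decoded first, then the B-frames
            # that are displayed before it.
            ordered.append((frame_type, index))
            ordered.extend(pending_b)
            pending_b = []
    ordered.extend(pending_b)  # trailing B-frames without a later reference
    return ordered


print(["%s%d" % frame for frame in decode_order("IBBPBBP")])
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

In the output, P3 is decoded before B1 and B2 even though it is displayed after them.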
The regularity of a sequence of I-, P-, and B-frames is determined by the MPEG application. For fast random access, the best resolution would be achieved by coding the whole data stream as I-frames. On the other hand, the highest degree of compression is attained by using as many B-frames as possible. For practical applications, the following sequence has proved to be useful: “IBBPBBPBBIBBPBBPBB...”. In this case, random access has a resolution of nine still images (i.e., about 330 milliseconds), while the sequence still provides a very good compression ratio.
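The random-access resolution of such a sequence can be estimated from the distance between consecutive I-frames. The sketch below does this for the pattern given above; the frame rates of 25 and 30 frames per second are assumptions for illustration, and the figure of about 330 milliseconds lies between the two results.

```python
def i_frame_distance(pattern):
    """Return the largest distance, in frames, between consecutive I-frames."""
    positions = [i for i, frame_type in enumerate(pattern) if frame_type == "I"]
    if len(positions) < 2:
        raise ValueError("pattern must contain at least two I-frames")
    return max(b - a for a, b in zip(positions, positions[1:]))


def random_access_ms(pattern, frames_per_second):
    """Return the random-access resolution in milliseconds."""
    return i_frame_distance(pattern) * 1000.0 / frames_per_second


pattern = "IBBPBBPBBIBBPBBPBB"
print(i_frame_distance(pattern))      # 9
print(random_access_ms(pattern, 25))  # 360.0
print(random_access_ms(pattern, 30))  # 300.0
```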
For audio coding, MPEG defines three layers of encoder and decoder complexity and performance. An implementation of a higher layer must be able to decode the MPEG audio signals of lower layers.
For each subband, the amplitude of the audio signal is calculated, and the noise level is determined. At a higher noise level, a coarser quantization is performed; at a lower noise level, a finer quantization is applied.
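The relationship between noise level and quantization step can be illustrated with a small sketch. It is not the bit allocation procedure of the MPEG standard: the rule of roughly 6 dB of signal-to-noise ratio per bit, the function names, and the 15-bit limit are all assumptions made for the example.

```python
import math

DB_PER_BIT = 6.02  # assumed rule of thumb: each bit adds about 6 dB of SNR


def bits_for_subband(signal_db, noise_db, max_bits=15):
    """Pick a word length so quantization noise stays below the noise level."""
    required_snr_db = signal_db - noise_db
    if required_snr_db <= 0:
        return 0  # signal lies below the noise level; nothing to code
    return min(max_bits, math.ceil(required_snr_db / DB_PER_BIT))


def quantize(sample, bits):
    """Uniformly quantize a sample from [-1.0, 1.0] with the given word length."""
    if bits == 0:
        return 0.0
    step = 2.0 / (2 ** bits)
    return round(sample / step) * step


coarse_bits = bits_for_subband(signal_db=60, noise_db=50)  # high noise level
fine_bits = bits_for_subband(signal_db=60, noise_db=20)    # low noise level
print(coarse_bits, quantize(0.3, coarse_bits))  # 2 0.5
print(fine_bits, quantize(0.3, fine_bits))      # 7 0.296875
```

With the higher noise level, two bits suffice and the sample is reproduced only roughly; with the lower noise level, seven bits are used and the quantization error is much smaller.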
The audio coding can be performed with a single channel, two independent channels or one stereo signal. In the definition of MPEG, there are two different stereo modes: two channels that are processed either independently or as joint stereo. In the case of joint stereo, MPEG exploits redundancy of both channels and achieves a higher compression ratio.
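One way to see how redundancy between two channels can be exploited is a mid/side transform, sketched below. The text does not say which joint stereo technique is meant, so treating it as mid/side coding is an assumption for illustration; the function names are hypothetical.

```python
def to_mid_side(left, right):
    """Convert left/right samples into mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Reconstruct left/right samples from mid/side samples."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [0.5, 0.25, -0.5]
right = [0.5, 0.25, -0.25]
mid, side = to_mid_side(left, right)
print(side)                      # [0.0, 0.0, -0.125]
print(from_mid_side(mid, side))  # ([0.5, 0.25, -0.5], [0.5, 0.25, -0.25])
```

When both channels are similar, the side signal stays close to zero and can be coded with few bits, which is where the higher compression ratio comes from.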
Each layer defines 14 fixed bit rates for the encoded audio data stream, which in MPEG are addressed by a bit rate index. The minimal value is always 32 Kbits/second. The layers support different maximal bit rates: layer 1 allows for a maximal bit rate of 448 Kbits/second, layer 2 for 384 Kbits/second, and layer 3 for 320 Kbits/second.
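A lookup by bit rate index might be sketched as follows. The table values are the commonly cited MPEG-1 audio bit rates and should be checked against the standard before being relied on; the assumption that the 14 fixed rates occupy indices 1 to 14 (with the remaining index values reserved for other purposes) and the function name are also not taken from the text.

```python
# Bit rates in Kbits/second per layer, for bit rate indices 1..14 (assumed).
BIT_RATES_KBITS = {
    1: [32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448],
    2: [32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384],
    3: [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320],
}


def bit_rate_kbits(layer, index):
    """Return the fixed bit rate for a layer and a bit rate index (1..14)."""
    if layer not in BIT_RATES_KBITS:
        raise ValueError("layer must be 1, 2 or 3")
    if not 1 <= index <= 14:
        raise ValueError("index must be between 1 and 14")
    return BIT_RATES_KBITS[layer][index - 1]


for layer in (1, 2, 3):
    print(layer, bit_rate_kbits(layer, 1), bit_rate_kbits(layer, 14))
# 1 32 448
# 2 32 384
# 3 32 320
```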