VideoCD Format
12. VideoCD Encoding
Source (VideoPack 4 Manual)
The encoded video of the play items conforms to the MPEG-1 standard and uses the Source Input Format (SIF). The technical specifications for picture size and picture rate are:
352 x 240 pixels, 29.97 Hz picture rate for NTSC
352 x 240 pixels, 23.976 Hz picture rate for movies
352 x 288 pixels, 25 Hz picture rate for PAL
Text in videos encoded for PAL should be placed within a text area in the middle of the screen (352 x 240 pixels), leaving 24 lines free at both the upper and the lower border of the screen.
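The picture formats and the PAL text area described above can be written down as a small lookup table. This is only an illustrative sketch: the names FORMATS, BORDER_LINES and pal_text_area are hypothetical and not taken from VideoPack, and line numbers are assumed to be counted from 0 at the top of the picture.

```python
# Picture formats quoted in the text: (width, height, picture rate in Hz).
# The dictionary keys are hypothetical labels chosen for this sketch.
FORMATS = {
    "NTSC": (352, 240, 29.97),
    "FILM": (352, 240, 23.976),
    "PAL": (352, 288, 25.0),
}

# Lines left free at the top and at the bottom of a PAL picture.
BORDER_LINES = 24


def pal_text_area():
    """Return (first_line, last_line, height) of the PAL text area.

    Lines are counted from 0 at the top of the picture (an assumption).
    """
    _, height, _ = FORMATS["PAL"]
    first = BORDER_LINES
    last = height - BORDER_LINES - 1
    return first, last, last - first + 1


print(pal_text_area())  # (24, 263, 240)
```

The resulting height of 240 lines matches the NTSC picture height, which is presumably why the text area is defined this way.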
MPEG-1 video uses interframe compression. A sophisticated system of different kinds of pictures eliminates redundant information not only within a frame but also between frames. To assist random access to the video sequence, groups of pictures (GOPs) are built. Each GOP begins with a so-called intra-coded picture (I-picture), which uses information only from itself, without reference to any other picture. I-pictures are reference pictures and serve as access points to the video sequence.
Predictive-coded pictures (P-pictures) and bidirectionally predictive-coded pictures (B-pictures) are both coded using information (motion-compensated prediction) from reference pictures. P-pictures are based on information from past reference pictures (I- or P-pictures) and are themselves reference pictures. B-pictures are located between reference pictures and use information from both past and future reference pictures.
For example, a group of pictures (GOP) may have a structure like IBBPBBPBBPBB or IBBBPBBBPBBB (see also figure XX). The appropriate GOP structure depends on the content of the video source material, for example whether the picture changes frequently or contains a lot of detail, and the choice affects the quality of the MPEG video. In general, at least one or two I-pictures per second are recommended. The Video CD standard allows a maximum distance of 2 seconds between two I-pictures.
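The timing rule for I-pictures can be checked with a few lines of arithmetic. The sketch below assumes a GOP pattern that contains exactly one I-picture at its start and repeats unchanged, so that the distance between I-pictures equals the pattern length divided by the picture rate. The function name gop_report and its result keys are hypothetical.

```python
def gop_report(pattern, picture_rate):
    """Summarise a repeating GOP pattern such as "IBBPBBPBBPBB".

    Assumes the pattern starts with its only I-picture and repeats
    unchanged, so the I-picture distance equals the pattern length.
    """
    if not pattern or pattern[0] != "I" or set(pattern) - set("IPB"):
        raise ValueError("pattern must start with I and contain only I, P, B")
    if pattern.count("I") != 1:
        raise ValueError("this sketch expects exactly one I-picture per pattern")
    seconds = len(pattern) / picture_rate
    return {
        "i_distance_s": seconds,          # time between two I-pictures
        "within_limit": seconds <= 2.0,   # Video CD maximum quoted in the text
        "recommended": seconds <= 1.0,    # at least one I-picture per second
    }


report = gop_report("IBBPBBPBBPBB", 25.0)
print(report["within_limit"])  # True
```

With the 12-picture pattern at 25 Hz an I-picture occurs roughly every half second, which satisfies both the recommendation and the 2-second limit.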
Associated audio must be encoded in conformance with the MPEG-1 standard, Layer II, at a sampling frequency of 44.1 kHz. The permitted modes are dual channel mode (for a dual-language program or karaoke), intensity stereo mode, and (joint) stereo mode, which exploits the interchannel dependencies of the two stereo channels for compression.
To maintain synchronization of video and audio, both data streams are interleaved in MPEG video and MPEG audio sectors. The data rate of Form 2 sectors (single speed) is about 1.4 Mbit/s. The video stream must be encoded at 1.15 Mbit/s and the audio stream at 224 kbit/s. Multiplexer software adds the information that tells an MPEG decoding system how to handle the interleaved data streams. Optional audio tracks are coded in conformance with the Red Book, at a sampling frequency of 44.1 kHz, 16 bit, in stereo mode.
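The rates quoted above can be cross-checked with simple arithmetic. The sketch below relies on two figures that are not stated in the text and are therefore assumptions here: a single-speed drive reads 75 sectors per second, and a Mode 2 Form 2 sector carries 2324 bytes of user data. All names are hypothetical.

```python
SECTORS_PER_SECOND = 75   # assumption: single-speed CD read rate
FORM2_USER_BYTES = 2324   # assumption: user data bytes per Form 2 sector

VIDEO_BPS = 1_150_000     # video rate from the text (1.15 Mbit/s)
AUDIO_BPS = 224_000       # audio rate from the text (224 kbit/s)


def mux_budget():
    """Return (channel, payload, remainder) in bit/s."""
    channel = SECTORS_PER_SECOND * FORM2_USER_BYTES * 8
    payload = VIDEO_BPS + AUDIO_BPS
    return channel, payload, channel - payload


print(mux_budget())  # (1394400, 1374000, 20400)
```

Under these assumptions the channel carries about 1.39 Mbit/s, video and audio together take about 1.37 Mbit/s, and the small remainder is what is left for the information added by the multiplexer.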
AV MPEG sequences in Segment Play Items in track #1 may be encoded in the same way as those in MPEG tracks. In contrast to MPEG tracks, however, the data rate may rise to 1.37 Mbit/s, depending on the rate of the associated audio, which may range from 0 to 384 kbit/s. There are three data rates for single channel mode (64, 96, and 192 kbit/s) and four rates for stereo, intensity stereo, and dual channel mode (128, 192, 224, and 384 kbit/s). A segment can also contain pure MPEG audio without video.
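The permitted audio rates for Segment Play Items lend themselves to a simple lookup. The following is a minimal sketch with hypothetical names (ALLOWED_AUDIO_KBPS, audio_rate_allowed). It treats a rate of 0 as "no associated audio", which is an interpretation of the 0 to 384 kbit/s range given above.

```python
# Rates shared by stereo, intensity stereo and dual channel mode (kbit/s).
STEREO_RATES = (128, 192, 224, 384)

# Mode names are hypothetical labels chosen for this sketch.
ALLOWED_AUDIO_KBPS = {
    "single_channel": (64, 96, 192),
    "stereo": STEREO_RATES,
    "intensity_stereo": STEREO_RATES,
    "dual_channel": STEREO_RATES,
}


def audio_rate_allowed(mode, kbps):
    """Return True if kbps is a listed rate for the given audio mode."""
    if kbps == 0:
        return True  # interpreted as a segment without associated audio
    if mode not in ALLOWED_AUDIO_KBPS:
        raise ValueError(f"unknown audio mode: {mode}")
    return kbps in ALLOWED_AUDIO_KBPS[mode]


print(audio_rate_allowed("stereo", 224))          # True
print(audio_rate_allowed("single_channel", 128))  # False
```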
Still pictures in track #1 must be encoded as MPEG intra pictures. Normal resolution is 352 x 240 pixels for NTSC and 352 x 288 pixels for PAL. High-resolution pictures are encoded in the same way, but the horizontal and vertical sizes are doubled, giving 704 x 480 pixels for NTSC and 704 x 576 pixels for PAL. There may also be sequences of still pictures, filled up with padding packets to obtain convenient time intervals. Even a mix of normal and high-resolution pictures is allowed. Each Segment Play Item starts at the beginning of a new segment. In keeping with the maximum number of segments, up to 1980 Segment Play Items may be defined.
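The relationship between normal and high-resolution still pictures is a plain doubling of both dimensions, as this short sketch shows. The names NORMAL_STILL_SIZE and still_size are hypothetical.

```python
# Normal-resolution still picture sizes from the text: (width, height).
NORMAL_STILL_SIZE = {"NTSC": (352, 240), "PAL": (352, 288)}

# Upper bound on Segment Play Items quoted in the text.
MAX_SEGMENT_PLAY_ITEMS = 1980


def still_size(system, high_resolution=False):
    """Return (width, height) of a still picture for "NTSC" or "PAL"."""
    width, height = NORMAL_STILL_SIZE[system]
    if high_resolution:
        return 2 * width, 2 * height
    return width, height


print(still_size("NTSC", high_resolution=True))  # (704, 480)
print(still_size("PAL", high_resolution=True))   # (704, 576)
```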