DICOM PS3.5 2024c - Data Structures and Encoding

8.2.5 MPEG2 Main Profile / Main Level Video Compression

DICOM provides a mechanism for supporting the use of MPEG2 Main Profile / Main Level Video Compression through the Encapsulated Format. Annex A defines Non-Fragmentable and Fragmentable Encapsulated Transfer Syntaxes that reference the MPEG2 Main Profile / Main Level Standard.


MPEG2 compression is inherently lossy. The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for MPEG2 Main Profile / Main Level are also beyond the scope of this Standard.

The use of the DICOM Encapsulated Format to support MPEG2 Main Profile / Main Level compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the MPEG2 Main Profile / Main Level bit stream shall be used to decode the compressed data stream.


These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.

When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
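As an illustration of letting the bit stream take precedence, the following sketch reads the picture size from the first MPEG2 sequence header and compares it with the Values of Rows and Columns. It assumes a bare elementary stream (no container), and that the size extension bits carried in the sequence extension are zero, which holds for the picture sizes permitted at Main Level. The function names are hypothetical and are not defined by DICOM or MPEG2.

```python
SEQUENCE_HEADER_START_CODE = b"\x00\x00\x01\xb3"


def read_sequence_header_size(stream: bytes) -> tuple[int, int]:
    """Return (columns, rows) from the first sequence header in the stream."""
    pos = stream.find(SEQUENCE_HEADER_START_CODE)
    if pos < 0 or pos + 7 > len(stream):
        raise ValueError("no complete sequence header found")
    b0, b1, b2 = stream[pos + 4 : pos + 7]
    horizontal_size = (b0 << 4) | (b1 >> 4)    # first 12 bits after the start code
    vertical_size = ((b1 & 0x0F) << 8) | b2    # next 12 bits
    return horizontal_size, vertical_size


def effective_size(stream: bytes, dicom_rows: int, dicom_columns: int) -> tuple[int, int, bool]:
    """Return (rows, columns, consistent); the stream values always win."""
    columns, rows = read_sequence_header_size(stream)
    consistent = (rows, columns) == (dicom_rows, dicom_columns)
    return rows, columns, consistent
```

A decoder built this way would size its output from the returned rows and columns, and could log a warning when the third element is False.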

The MPEG2 Main Profile / Main Level bit stream specifies whether or not a reversible or irreversible multi-component (color) transformation, if any, has been applied. If no multi-component transformation has been applied, then the components shall correspond to those specified by the DICOM Attribute Photometric Interpretation (0028,0004). MPEG2 Main Profile / Main Level applies an irreversible multi-component transformation, so DICOM Attribute Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420 in the case of multi-component data, and MONOCHROME2 in the case of single component data (even though the MPEG2 bit stream itself is always encoded as three components, one luminance and two chrominance).


  1. If MPEG2 Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_PARTIAL_420 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.

  2. MPEG2 defines several video formats, each used in a different market, including: ITU-R BT.470-2 System M for SD NTSC and ITU-R BT.470-2 System B/G for SD PAL/SECAM. A PAL based system should therefore use ITU-R BT.470-2 System B for each of Color Primaries, Transfer Characteristic (gamma) and matrix coefficients, each of which should take a value of 5 as defined in [ISO/IEC 13818-2].

The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the MPEG2 Main Profile / Main Level standard, hence it shall be set to 0.
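The constraints on Photometric Interpretation and Planar Configuration stated above can be sketched as a small consistency check. The function name and the keyword-style arguments are hypothetical. Treating Planar Configuration as optional is an assumption, made because that Attribute is conditional on Samples per Pixel in the general Image Pixel rules.

```python
def check_mpeg2_pixel_attributes(samples_per_pixel, photometric_interpretation,
                                 planar_configuration=None):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    # Single component data is MONOCHROME2, multi-component data is YBR_PARTIAL_420.
    expected = {1: "MONOCHROME2", 3: "YBR_PARTIAL_420"}.get(samples_per_pixel)
    if expected is None:
        problems.append(f"unexpected Samples per Pixel: {samples_per_pixel}")
    elif photometric_interpretation != expected:
        problems.append(
            f"Photometric Interpretation is {photometric_interpretation}, expected {expected}"
        )
    # Planar Configuration, when present, shall be 0.
    if planar_configuration is not None and planar_configuration != 0:
        problems.append(f"Planar Configuration is {planar_configuration}, expected 0")
    return problems
```

Returning a list of problems rather than raising on the first one lets a validator report every inconsistency in a Data Set in a single pass.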

In summary:

Table 8-1. MPEG2 Main Profile / Main Level Image Transfer Syntax Rows and Columns Attributes

| Video Type | Spatial resolution | Frame Rate (see Note 4) | Frame Time (see Note 5) | Maximum Rows | Maximum Columns |
|---|---|---|---|---|---|
| 525-line NTSC | | | 33.33 ms | | |
| 625-line PAL | | | 40.0 ms | | |




  1. Although different combinations of Values for Rows and Columns are possible while respecting the maximum values listed above, it is recommended that the typical 4:3 ratio of image width to height be maintained in order to avoid image deformation by MPEG2 decoders. A common way to maintain the ratio of width to height is to pad the image with black areas on either side.

  2. "Half" definition pictures (240x352 and 288x352 for NTSC and PAL, respectively) are always supported by decoders.

  3. Main Profile / Main Level allows for various display and pixel aspect ratios, including the use of square pixels, and the use of non-square pixels with display aspect ratios of 4:3 and 16:9. DICOM specifies no additional restrictions beyond what is provided for in Main Profile / Main Level. All permutations allowed by Main Profile / Main Level are valid and are required to be supported by all DICOM decoders.

  4. The actual frame rate for NTSC MPEG2 is approximately 29.97 frames/sec.

  5. The nominal Frame Time is supplied for the purpose of inclusion in the DICOM Cine Module Attributes, and should be calculated from the actual frame rate.
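The relationship between frame rate and Frame Time in Notes 4 and 5 is a simple reciprocal, sketched below. The exact NTSC rate is taken to be 30000/1001 frames/sec, which is an assumption; the text above gives only the approximation 29.97. The function and constant names are hypothetical.

```python
from fractions import Fraction

# Assumption: the exact NTSC rate behind "approximately 29.97 frames/sec".
NTSC_ACTUAL_FRAME_RATE = Fraction(30000, 1001)
PAL_FRAME_RATE = Fraction(25)


def frame_time_ms(frame_rate: Fraction) -> float:
    """Return the duration of one frame in milliseconds."""
    if frame_rate <= 0:
        raise ValueError("frame rate must be positive")
    return float(1000 / frame_rate)
```

Under this assumption, the NTSC value computed from the actual frame rate is about 33.37 ms rather than the nominal 33.33 ms listed in the table, while the PAL value is exactly 40.0 ms.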

For the Non-Fragmentable Encapsulated Transfer Syntax, one Fragment shall contain the whole MPEG2 stream.

For the Fragmentable Encapsulated Transfer Syntax, the stream may be segmented into multiple Fragments.


  1. If a video stream exceeds the maximum length of one fragment (2^32-2 bytes), it may be sent using a Fragmentable Encapsulated Transfer Syntax. Alternatively, it may be sent using a Non-Fragmentable Encapsulated Transfer Syntax as multiple SOP Instances, but each SOP Instance will contain an independent and playable bit stream that does not depend on the encoded bit stream in other (previous) instances. The manner in which such separate instances are related is not specified in the Standard, but mechanisms such as grouping into the same Series, and references to earlier instances using Referenced Image Sequence may be used.

  2. Fragmentable Encapsulated Transfer Syntaxes allow for streams of essentially unlimited length; the only limit imposed is the maximum Number of Frames (0028,0008), which is 2^31-1 frames (largest positive Value in an Integer String VR).

The Basic Offset Table shall be empty (present but zero length).
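The fragment rules above can be sketched as follows: an empty Basic Offset Table Item, then one or more Fragment Items, then a Sequence Delimitation Item. The function builds only the value field of Pixel Data in a little endian encoding, not the Data Element header. The function name, the configurable fragment limit (useful for testing), and padding an odd-length stream with a single zero byte are assumptions of this sketch.

```python
import struct

ITEM_TAG = b"\xfe\xff\x00\xe0"                # (FFFE,E000), little endian
SEQUENCE_DELIMITER_TAG = b"\xfe\xff\xdd\xe0"  # (FFFE,E0DD), little endian
MAX_FRAGMENT_LENGTH = 2**32 - 2               # largest even 32-bit length


def encapsulate(stream: bytes, fragmentable: bool,
                max_fragment_length: int = MAX_FRAGMENT_LENGTH) -> bytes:
    """Wrap an MPEG2 stream as Items with an empty Basic Offset Table."""
    if (max_fragment_length <= 0 or max_fragment_length % 2
            or max_fragment_length > MAX_FRAGMENT_LENGTH):
        raise ValueError("fragment limit must be even and at most 2**32 - 2")
    if not stream:
        raise ValueError("empty stream")
    if len(stream) % 2:
        stream += b"\x00"  # assumption: pad to even length with a zero byte
    if not fragmentable and len(stream) > max_fragment_length:
        raise ValueError("stream does not fit in a single Fragment")
    out = bytearray()
    out += ITEM_TAG + struct.pack("<I", 0)  # Basic Offset Table: present, zero length
    for start in range(0, len(stream), max_fragment_length):
        fragment = stream[start:start + max_fragment_length]
        out += ITEM_TAG + struct.pack("<I", len(fragment)) + fragment
    out += SEQUENCE_DELIMITER_TAG + struct.pack("<I", 0)
    return bytes(out)
```

With the default limit, a Non-Fragmentable call either produces exactly one Fragment or fails, which mirrors the choice between splitting into multiple SOP Instances and switching to a Fragmentable Transfer Syntax.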


The Basic Offset Table is not used because MPEG2 contains its own mechanism for describing navigation of frames. To enable decoding of only a part of the sequence, MPEG2 places a header in each group of pictures (GOP) that contains a time_code, a 25-bit field made up of the following: drop_frame_flag, time_code_hours, time_code_minutes, marker_bit, time_code_seconds and time_code_pictures.
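The layout of the 25-bit time_code can be sketched as a bit-field decode. The field widths used here (1, 5, 6, 1, 6 and 6 bits, in the order listed above) and the GOP header start code 0x000001B8 are taken from the MPEG2 video syntax. The function name is hypothetical, and the sketch assumes a bare elementary stream rather than a container.

```python
GOP_START_CODE = b"\x00\x00\x01\xb8"


def parse_gop_time_code(stream: bytes, start: int = 0) -> dict:
    """Decode the time_code of the first GOP header at or after `start`."""
    pos = stream.find(GOP_START_CODE, start)
    if pos < 0 or pos + 8 > len(stream):
        raise ValueError("no complete GOP header found")
    word = int.from_bytes(stream[pos + 4 : pos + 8], "big")
    time_code = word >> 7  # keep the top 25 of the 32 bits read
    return {
        "drop_frame_flag": (time_code >> 24) & 0x01,
        "time_code_hours": (time_code >> 19) & 0x1F,
        "time_code_minutes": (time_code >> 13) & 0x3F,
        "marker_bit": (time_code >> 12) & 0x01,
        "time_code_seconds": (time_code >> 6) & 0x3F,
        "time_code_pictures": time_code & 0x3F,
    }
```

Calling the function repeatedly with `start` set just past the previous match would walk the GOP headers of a stream, which is the kind of navigation the Basic Offset Table would otherwise provide.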

The container format for the video bit stream is not constrained. For example, it may be an MPEG-2 Transport Stream (MPEG-TS), MPEG-2 Program Stream (MPEG-PS), MPEG-2 Elementary Stream (MPEG-ES), MPEG-2 Packetized Elementary Stream (MPEG-PES) (see [ISO/IEC 13818-1]) or MPEG-4 (MP4) container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]).
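Because the container is not constrained, a receiver may need to identify it before decoding. The following is a heuristic sketch only, based on well-known leading byte patterns (the 0x47 sync byte every 188 bytes for a Transport Stream, the pack start code for a Program Stream, a video PES stream_id, the sequence header start code for an elementary stream, and an "ftyp" box for MP4). The function name and the returned labels are hypothetical, and real streams may begin differently.

```python
def guess_container(stream: bytes) -> str:
    """Guess the container of an encapsulated MPEG2 stream from its first bytes."""
    if stream[4:8] == b"ftyp":
        return "MP4"
    # Transport Stream: sync byte at the start of three consecutive 188-byte packets.
    if len(stream) >= 377 and stream[0] == stream[188] == stream[376] == 0x47:
        return "MPEG-TS"
    if stream.startswith(b"\x00\x00\x01\xba"):  # pack start code
        return "MPEG-PS"
    if stream.startswith(b"\x00\x00\x01\xb3"):  # sequence header start code
        return "MPEG-ES"
    # PES packet carrying video: stream_id in the range 0xE0 to 0xEF.
    if len(stream) > 3 and stream[:3] == b"\x00\x00\x01" and 0xE0 <= stream[3] <= 0xEF:
        return "MPEG-PES"
    return "unknown"
```

A production implementation would more likely hand the bytes to a demultiplexer and let it probe the format; the sketch only shows that the distinction can usually be made from the first few hundred bytes.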

Any audio components present within the MPEG bit stream shall comply with the following restrictions:


  1. MPEG-1 Layer III is standardized in Part 3 of the MPEG-1 standard (see [ISO/IEC 11172-3]).

  2. Although MPEG describes each channel as including up to 5 signals (e.g., for surround effects), it is recommended to limit each of the two channels to 2 signals (stereo).
