DICOM PS3.5 2024b - Data Structures and Encoding

8.2.8 MPEG-4 AVC/H.264 High Profile / Level 4.2 Video Compression

DICOM provides a mechanism for supporting the use of MPEG-4 AVC/H.264 Image Compression through the Encapsulated Format. Annex A defines Transfer Syntaxes that reference the MPEG-4 AVC/H.264 Standard.


MPEG-4 AVC/H.264 High Profile compression is inherently lossy. The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for MPEG-4 AVC/H.264 High Profile / Level 4.2 are also beyond the scope of this Standard.

The use of the DICOM Encapsulated Format to support MPEG-4 AVC/H.264 compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the MPEG-4 AVC/H.264 bit stream shall be used to decode the compressed data stream.


These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.

When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
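The precedence rule above can be illustrated with a minimal sketch. The function name, the dictionary representation, and the attribute keywords used as keys are assumptions made for illustration only; they do not correspond to any particular toolkit, and parsing the bit stream itself is out of scope here.

```python
# Hypothetical helper: the name and the dict-based representation are
# illustrative assumptions, not part of any DICOM toolkit.
def resolve_pixel_characteristics(dicom_elements, stream_characteristics):
    """Return (effective, mismatches).

    Values found in the compressed data stream take precedence over the
    DICOM Data Elements; every disagreement is recorded so that a caller
    can log it or correct the Data Set.
    """
    effective = dict(dicom_elements)
    mismatches = {}
    for name, stream_value in stream_characteristics.items():
        dicom_value = dicom_elements.get(name)
        if dicom_value is not None and dicom_value != stream_value:
            mismatches[name] = (dicom_value, stream_value)
        effective[name] = stream_value  # the bit stream controls decoding
    return effective, mismatches


# Example: the Data Set claims 1080 rows, the bit stream says 720.
dicom_elements = {"Rows": 1080, "Columns": 1920, "BitsAllocated": 8}
stream_characteristics = {"Rows": 720, "Columns": 1920}
effective, mismatches = resolve_pixel_characteristics(
    dicom_elements, stream_characteristics
)
print(effective)
print(mismatches)
```

In this sketch the inconsistent DICOM values are kept only in `mismatches`, which reflects their role as suggestions rather than as decoding parameters.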


If MPEG-4 AVC/H.264 compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_PARTIAL_420 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.

The requirements are:


  1. The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the MPEG-4 AVC/H.264 standard, hence it is set to 0.

  2. The frame rate of the acquiring camera for '30 Hz HD' MPEG-4 AVC/H.264 may be either 30 or 30/1.001 (approximately 29.97) frames/sec. Similarly, the frame rate in the case of 60 Hz may be either 60 or 60/1.001 (approximately 59.94) frames/sec. This may lead to small inconsistencies between the video timebase and real time. The relationship between frame rate and frame time is shown in Table 8-7.

  3. The Frame Time (0018,1063) may be calculated from the frame rate of the acquiring camera. A frame rate of 29.97 frames per second corresponds to a frame time of 33.367 ms.

  4. The value of chroma_format for this profile and level is defined by MPEG as 4:2:0.
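The relationship between frame rate and Frame Time described in the notes above is a simple reciprocal, which the following sketch works through. The function name `frame_time_ms` and the list of rates are illustrative assumptions; exact fractions are used so that the 1/1.001 rates are not affected by rounding before the final conversion.

```python
from fractions import Fraction


def frame_time_ms(frame_rate):
    """Frame time in milliseconds for a frame rate given in frames per second."""
    return float(Fraction(1000) / Fraction(frame_rate))


# Nominal rates and their 1/1.001 variants, expressed exactly.
RATES = [
    Fraction(30),
    Fraction(30000, 1001),  # 30/1.001, approximately 29.97
    Fraction(25),
    Fraction(60),
    Fraction(60000, 1001),  # 60/1.001, approximately 59.94
    Fraction(50),
]

for rate in RATES:
    print(f"{float(rate):7.3f} frames/s -> {frame_time_ms(rate):.3f} ms")
```

As an illustration of the small inconsistency with real time: 108000 frames nominally represent one hour at 30 frames/sec, but at 30/1.001 frames/sec they take 108000 × 1001/30000 = 3603.6 seconds, i.e., 3.6 seconds longer.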

Table 8-7. MPEG-4 AVC/H.264 High Profile / Level 4.2 Image Transfer Syntax Frame Rate Attributes

Video Type | Frame Rate (see Note 2) | Frame Time (see Note 3)
30 Hz HD   |                         | 33.33 ms
25 Hz HD   |                         | 40.00 ms
60 Hz HD   |                         | 16.67 ms
50 Hz HD   |                         | 20.00 ms

Stereo Pairs Present (0022,0028) shall be YES if stereoscopic pairs are present, otherwise shall be NO or absent.

Table 8-8. MPEG-4 AVC/H.264 High Profile / Level 4.2 Image Transfer Syntax Stereo Attributes

Transfer Syntax                                                    | Stereo Pairs Present | Stereo Frame Packing Format
MPEG-4 AVC/H.264 High Profile / Level 4.2 for 2D Image Compression | NO or absent         |
MPEG-4 AVC/H.264 High Profile / Level 4.2 for 3D Image Compression |                      |

For the Non-Fragmentable Encapsulated Transfer Syntax, one Fragment shall contain the whole MPEG-4 AVC/H.264 bit stream.

For the Fragmentable Encapsulated Transfer Syntax, the stream may be segmented into multiple Fragments.


  1. If a video stream exceeds the maximum length of one Fragment (2^32-2 bytes), it may be sent using a Fragmentable Encapsulated Transfer Syntax. Alternatively, it may be sent using a Non-Fragmentable Encapsulated Transfer Syntax as multiple SOP Instances, in which case each SOP Instance will contain an independent and playable bit stream that does not depend on the encoded bit stream in other (previous) instances. The manner in which such separate instances are related is not specified in the Standard, but mechanisms such as grouping into the same Series and references to earlier instances using Referenced Image Sequence may be used.

  2. Fragmentable Encapsulated Transfer Syntaxes allow for streams of essentially unlimited length; the only limit imposed is the maximum Number of Frames (0028,0008), which is 2^31-1 frames (largest positive Value in an Integer String VR).
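The segmentation of a bit stream into Fragments can be sketched as follows. This is a minimal illustration, not an encoder: the function name is hypothetical, the Item framing around each Fragment is omitted, and padding the last Fragment with a single zero byte to reach an even length is an assumption of this sketch rather than something stated in this section.

```python
MAX_FRAGMENT_LENGTH = 2**32 - 2  # maximum length of one Fragment, in bytes


def split_into_fragments(bit_stream, fragment_length=MAX_FRAGMENT_LENGTH):
    """Split a bit stream into consecutive Fragments of at most fragment_length bytes."""
    if (
        fragment_length <= 0
        or fragment_length % 2 != 0
        or fragment_length > MAX_FRAGMENT_LENGTH
    ):
        raise ValueError("fragment_length must be even and in 2..2^32-2")
    fragments = [
        bit_stream[start:start + fragment_length]
        for start in range(0, len(bit_stream), fragment_length)
    ]
    # Assumption of this sketch: pad the final Fragment to an even length.
    if fragments and len(fragments[-1]) % 2 != 0:
        fragments[-1] += b"\x00"
    return fragments


# A small fragment_length is used here only to keep the example readable.
fragments = split_into_fragments(b"abcdefg", fragment_length=4)
print(fragments)
```

With a Non-Fragmentable Encapsulated Transfer Syntax the same stream would have to fit into a single Fragment, which is why a stream longer than `MAX_FRAGMENT_LENGTH` needs either fragmentation or multiple SOP Instances.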

The container format for the video bit stream shall be MPEG-2 Transport Stream, a.k.a. MPEG-TS (see [ISO/IEC 13818-1]) or MPEG-4, a.k.a. MP4 container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]). The PTS/DTS of the transport stream shall be used in the MPEG coding.
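A receiver may want a quick plausibility check of which of the two containers it has been given. The sketch below is a heuristic only and rests on stated assumptions: it expects plain 188-byte MPEG-TS packets that each begin with the sync byte 0x47, and an MP4 file whose first box is the `ftyp` box (so the four bytes at offset 4 read `ftyp`). The function name and return strings are hypothetical.

```python
TS_PACKET_SIZE = 188  # size of a plain MPEG-2 Transport Stream packet
TS_SYNC_BYTE = 0x47   # first byte of every Transport Stream packet


def guess_container(data, packets_to_check=5):
    """Heuristically classify the start of a bit stream as 'MP4', 'MPEG-TS' or 'unknown'."""
    # MP4 (ISO base media file format): 4-byte box size, then the box type.
    if len(data) >= 8 and data[4:8] == b"ftyp":
        return "MP4"
    # MPEG-TS: a sync byte is expected at the start of each 188-byte packet.
    available = len(data) // TS_PACKET_SIZE
    if available >= 1:
        count = min(available, packets_to_check)
        if all(data[i * TS_PACKET_SIZE] == TS_SYNC_BYTE for i in range(count)):
            return "MPEG-TS"
    return "unknown"


mp4_like = b"\x00\x00\x00\x18ftypisom" + bytes(16)
ts_like = (bytes([TS_SYNC_BYTE]) + bytes(TS_PACKET_SIZE - 1)) * 3
print(guess_container(mp4_like), guess_container(ts_like), guess_container(bytes(400)))
```

Checking several consecutive packets reduces the chance that a stray 0x47 byte is mistaken for a Transport Stream; a conformance check would still need a real demultiplexer.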

Any audio components included in the data container shall follow the constraints detailed in Section 8.2.12 Constraints for Audio Data Integration in AVC and HEVC Compressed Bit Streams.
