DICOM PS3.5 2024e - Data Structures and Encoding

K.2 Example of Person Name Value Representation in the Chinese Language

Example K.2-1. Example of Person Name Value Representation in the Chinese Language

Person names in the Chinese language may be written in Pinyin (phonetic characters), in Hanzi (ideographic characters), or as an English name (alphabetic characters). The three component groups should be written in the order alphabetic (English name), ideographic, phonetic.

Specific Character Set:

  • (0008,0005) \ISO 2022 IR 58

Character String:

  • Zhang^XiaoDong=张^小东=

Encoded String:

  • Zhang^XiaoDong= ESC 02/04 02/09 04/01 张^ ESC 02/04 02/09 04/01 小东=

Character encoded representation (GB2312) is:

  • 0x5A 0x68 0x61 0x6E 0x67 0x5E 0x58 0x69 0x61 0x6F 0x44 0x6F 0x6E 0x67 0x3D 0x1B 0x24 0x29 0x41 0xD5 0xC5 0x5E 0x1B 0x24 0x29 0x41 0xD0 0xA1 0xB6 0xAB 0x3D 0x20
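
The byte sequence above can be reproduced with a short sketch. This is an illustration under stated assumptions, not a reference implementation: the function name `encode_pn_ir58` and the constant `ESC_IR58_G1` are hypothetical, Python's built-in `gb2312` codec is assumed to produce the G1 (high bit set) form of ISO-IR 58, and only the single escape sequence used in this example is handled.

```python
import re

# ESC 02/04 02/09 04/01: designates ISO-IR 58 (GB 2312) to G1.
ESC_IR58_G1 = b"\x1b\x24\x29\x41"


def encode_pn_ir58(value: str) -> bytes:
    """Encode a PN value using ISO 646 in G0 and ISO-IR 58 in G1.

    Hypothetical helper for illustration only.
    """
    out = bytearray()
    # Split on the PN delimiters (^ and =) but keep them in the result,
    # so that every name component is handled on its own.
    for part in re.split(r"([\^=])", value):
        if part.isascii():
            # Delimiters, empty pieces and pure ISO 646 components
            # need no escape sequence.
            out += part.encode("ascii")
        else:
            # A component containing Hanzi starts with the escape sequence;
            # the gb2312 codec emits ASCII as single bytes and Hanzi as
            # two bytes with the high bit set.
            out += ESC_IR58_G1 + part.encode("gb2312")
    if len(out) % 2:
        out += b"\x20"  # pad with a trailing space to an even length
    return bytes(out)


encoded = encode_pn_ir58("Zhang^XiaoDong=张^小东=")
print(" ".join(f"0x{b:02X}" for b in encoded))
```

In this sketch the escape sequence is emitted once per component that needs it, and the trailing 0x20 is the padding that brings the value to an even length.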

Note

  1. The byte codes 0xD5 0xC5, 0xD0 0xA1 and 0xB6 0xAB correspond to double-byte characters; each occurrence of 0x1B 0x24 0x29 0x41 is an escape sequence.

  2. After the initial escape sequence, the multi-byte character set (ISO-IR 58) and the single-byte character set [ISO 646] can be intermixed without any further escape sequence, up to the next delimiter (^ or =) or the end of the Value Field. Once [ISO 646] has been designated to G0 and ISO-IR 58 to G1, the two character sets occupy different code areas and can therefore be intermixed. The decoder checks the most significant bit of a character's first byte to determine whether it is a two-byte character in the G1 area (high bit one) or a one-byte character in the G0 area (high bit zero). No explicit escape is needed to invoke [ISO 646] into G0 at the end of the string before a delimiter (^ or =) or the end of the Value Field. However, ISO-IR 58 must be invoked anew in each name component in which it is used.
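
The decoding rule in Note 2 can be sketched as follows. This is a minimal, deliberately strict reader written for illustration: `decode_pn_ir58` is a hypothetical name, only the one escape sequence from this example is recognized, and Python's `gb2312` codec is assumed for the two-byte characters. Real decoders may be more lenient or support more character sets.

```python
# ESC 02/04 02/09 04/01: designates ISO-IR 58 (GB 2312) to G1.
ESC_IR58_G1 = b"\x1b\x24\x29\x41"
DELIMITERS = (0x5E, 0x3D)  # ^ and =


def decode_pn_ir58(raw: bytes) -> str:
    """Decode a PN value that uses ISO 646 in G0 and ISO-IR 58 in G1.

    Hypothetical helper for illustration only.
    """
    chars = []
    g1_is_ir58 = False  # becomes True once the escape sequence is seen
    i = 0
    while i < len(raw):
        b = raw[i]
        if raw.startswith(ESC_IR58_G1, i):
            g1_is_ir58 = True
            i += len(ESC_IR58_G1)
        elif b == 0x1B:
            raise ValueError(f"unsupported escape sequence at offset {i}")
        elif b < 0x80:
            # High bit zero: a one-byte character in the G0 area.
            if b in DELIMITERS:
                # The designation does not carry over a delimiter.
                g1_is_ir58 = False
            chars.append(chr(b))
            i += 1
        else:
            # High bit one: first byte of a two-byte character in G1.
            if not g1_is_ir58:
                raise ValueError(f"G1 byte without escape at offset {i}")
            if i + 2 > len(raw):
                raise ValueError("truncated two-byte character")
            chars.append(raw[i:i + 2].decode("gb2312"))
            i += 2
    # Trailing spaces are padding and are not significant.
    return "".join(chars).rstrip(" ")
```

Resetting `g1_is_ir58` at each delimiter is the design choice that enforces the last sentence of the note: a component that uses Hanzi without its own escape sequence is rejected rather than silently decoded.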

