DICOM PS3.5 2024e - Data Structures and Encoding

H Character Sets and Person Name Value Representation in the Japanese Language (Informative)

H.1 Character Sets for the Japanese Language

The purpose of this section is to explain the character sets for the Japanese language.

H.1.1 JIS X 0201

[JIS X 0201] has the following code elements:

  • ISO-IR 13 Japanese katakana (phonetic) characters (94 characters)

  • ISO-IR 14 Japanese romaji (alphanumeric) characters (94 characters)

[JIS X 0201] defines a 7-bit romaji code table (ISO-IR 14), a 7-bit katakana code table (ISO-IR 13), and the combination of romaji and katakana as an 8-bit code table (ISO-IR 14 as G0, ISO-IR 13 as G1).

The 7-bit romaji (ISO-IR 14) is identical to ASCII (ISO-IR 6) except that bit combination 05/12 represents a YEN SIGN and bit combination 07/14 represents an over-line. These are national Graphic Character allocations in [ISO 646].
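
The 8-bit code table described above can be illustrated with a small decoder. This is a minimal sketch, not part of the Standard: the function name `decode_jis_x0201` is hypothetical, the katakana bytes 0xA1 to 0xDF are assumed to map onto the Unicode halfwidth forms U+FF61 to U+FF9F, and all other positions in the upper half are treated as unassigned.

```python
def decode_jis_x0201(data: bytes) -> str:
    """Decode 8-bit JIS X 0201: ISO-IR 14 invoked in GL (G0), ISO-IR 13 in GR (G1).

    Illustrative sketch only; the Unicode mapping for katakana is an assumption.
    """
    out = []
    for b in data:
        if b == 0x5C:
            # Bit combination 05/12 is YEN SIGN in ISO-IR 14, not BACKSLASH.
            out.append("\u00a5")
        elif b == 0x7E:
            # Bit combination 07/14 is OVERLINE in ISO-IR 14, not TILDE.
            out.append("\u203e")
        elif b < 0x80:
            # Every other 7-bit position is identical to ASCII (ISO-IR 6).
            out.append(chr(b))
        elif 0xA1 <= b <= 0xDF:
            # Katakana in the upper half, mapped to Unicode halfwidth forms.
            out.append(chr(b + 0xFEC0))
        else:
            raise ValueError(f"byte 0x{b:02X} is not assigned in this sketch")
    return "".join(out)


if __name__ == "__main__":
    # The byte that ASCII readers expect to be a backslash decodes as a yen sign.
    print(decode_jis_x0201(b"C:\\DIR"))
    # Halfwidth katakana from the upper half of the 8-bit table.
    print(decode_jis_x0201(b"\xd4\xcf\xc0\xde"))
```

Keeping the two national allocations as explicit branches makes the difference from ASCII visible at a glance; a table-driven decoder would work equally well.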

The Escape Sequence for ISO/IEC 2022 is shown for reference in Table H.1-1 (for the Defined Terms, see PS3.3).

Table H.1-1. ISO/IEC 2022 Escape Sequence for ISO-IR 13 and ISO-IR 14

          ISO-IR 14          ISO-IR 13
G0 set    ESC 02/08 04/10    ESC 02/08 04/09
G1 set    ESC 02/09 04/10    ESC 02/09 04/09


Note

  1. Table H.1-1 omits the G2 and G3 sets, which are not used in DICOM. See Section 6.1.2.5.1.

  2. Defined Terms ISO_IR 13 and ISO 2022 IR 13 for the Value of the Specific Character Set (0008,0005) support the G0 set for ISO-IR 14 and G1 set for ISO-IR 13. See PS3.3.

  3. ISO-IR 14 cannot encode DOS-style paths that use a BACKSLASH as a file component separator, since the bit combination 05/12 represents a YEN SIGN and not a BACKSLASH symbol. Further, for some VRs (SH, LO, PN, and UC), 05/12 is a delimiter between Values, not part of the Value.
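
The escape sequences in Table H.1-1 are written in column/row notation, where `cc/rr` denotes the byte value 16 × cc + rr and ESC is bit combination 01/11. The following sketch converts that notation to raw bytes; the helper name `colrow_to_bytes` and the dictionary `TABLE_H_1_1` are hypothetical names introduced for illustration only.

```python
def colrow_to_bytes(notation: str) -> bytes:
    """Convert a sequence such as 'ESC 02/08 04/10' to its byte values."""
    out = bytearray()
    for token in notation.split():
        if token == "ESC":
            out.append(0x1B)  # ESC is bit combination 01/11
        else:
            column, row = token.split("/")
            out.append(int(column) * 16 + int(row))
    return bytes(out)


# The four entries of Table H.1-1, keyed by (code element, designated set).
TABLE_H_1_1 = {
    ("ISO-IR 14", "G0"): "ESC 02/08 04/10",
    ("ISO-IR 13", "G0"): "ESC 02/08 04/09",
    ("ISO-IR 14", "G1"): "ESC 02/09 04/10",
    ("ISO-IR 13", "G1"): "ESC 02/09 04/09",
}

if __name__ == "__main__":
    for key, notation in TABLE_H_1_1.items():
        print(key, notation, colrow_to_bytes(notation))
```

Seen as bytes, the intermediate byte 02/08 is `(` and 02/09 is `)`, and the final bytes 04/10 and 04/09 are `J` and `I`.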

H.1.2 JIS X 0208

[JIS X 0208] has the following code element:

  • ISO-IR 87: Japanese kanji (ideographic), hiragana (phonetic), and katakana (phonetic) characters (94² characters, 2-byte).
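
The size of a 2-byte 94-set follows from each byte taking one of the 94 values 02/01 to 07/14, giving 94 × 94 = 8836 code positions. The sketch below checks that arithmetic and splits a 2-byte code into its row and cell numbers. The function name `row_cell` is hypothetical, and the row/cell terminology is the customary way of indexing such tables rather than something defined in this annex.

```python
# Number of code positions in a 2-byte 94-set.
CODE_POSITIONS = 94 * 94  # 8836


def row_cell(first: int, second: int) -> tuple[int, int]:
    """Return the 1-based (row, cell) of a 2-byte code from a 94 x 94 set."""
    for b in (first, second):
        # Each byte must lie in the graphic range 02/01 to 07/14.
        if not 0x21 <= b <= 0x7E:
            raise ValueError(f"byte 0x{b:02X} is outside 0x21..0x7E")
    return first - 0x20, second - 0x20


if __name__ == "__main__":
    print(CODE_POSITIONS)
    # Bytes 03/11 03/03, the JIS X 0208 code for the kanji 山.
    print(row_cell(0x3B, 0x33))
```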

H.1.3 JIS X 0212

[JIS X 0212] has the following code element:

  • ISO-IR 159: Supplementary Japanese kanji (ideographic) characters (94² characters, 2-byte)

The Escape Sequences for ISO/IEC 2022 are shown for reference in Table H.1-2 (for the Defined Terms, see PS3.3).

Table H.1-2. ISO/IEC 2022 Escape Sequence for ISO-IR 87 and ISO-IR 159

          ISO-IR 87                ISO-IR 159
G0 set    ESC 02/04 04/02          ESC 02/04 02/08 04/04
G1 set    ESC 02/04 02/09 04/02    ESC 02/04 02/09 04/04


Note

  1. The Escape Sequence for the designation function G0-DESIGNATE 94-SET has first I byte 02/04 and second I byte 02/08. There is one exception: the second I byte 02/08 is omitted if the Final Byte is 04/00, 04/01 or 04/02. See ISO/IEC 2022.

  2. Table H.1-2 omits the G2 and G3 sets, which are not used in DICOM. See Section 6.1.2.5.1.

  3. Defined Term ISO 2022 IR 87 for the Value of the Specific Character Set (0008,0005) supports the G0 set for ISO-IR 87, and Defined Term ISO 2022 IR 159 supports the G0 set for ISO-IR 159. See PS3.3.
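
The rule in Note 1 can be sketched as a small builder that reproduces the four entries of Table H.1-2. The function name `designate_multibyte_94` and its parameters are hypothetical. The final lines use Python's built-in `iso2022_jp` codec only as a convenient cross-check of the ISO-IR 87 G0 sequence; that codec is not a DICOM decoder and does not follow the DICOM rules for Specific Character Set (0008,0005).

```python
ESC = 0x1B


def designate_multibyte_94(final: int, g: int = 0) -> bytes:
    """Build the escape sequence designating a multi-byte 94-set to G0 or G1."""
    if g == 0:
        # Exception from Note 1: for Final Bytes 04/00, 04/01 and 04/02
        # the second I byte 02/08 is omitted.
        if final in (0x40, 0x41, 0x42):
            return bytes([ESC, 0x24, final])
        return bytes([ESC, 0x24, 0x28, final])
    if g == 1:
        # G1 designation always carries the second I byte 02/09.
        return bytes([ESC, 0x24, 0x29, final])
    raise ValueError("only G0 and G1 are used in DICOM")


ISO_IR_87_G0 = designate_multibyte_94(0x42)        # ESC 02/04 04/02
ISO_IR_159_G0 = designate_multibyte_94(0x44)       # ESC 02/04 02/08 04/04
ISO_IR_87_G1 = designate_multibyte_94(0x42, g=1)   # ESC 02/04 02/09 04/02
ISO_IR_159_G1 = designate_multibyte_94(0x44, g=1)  # ESC 02/04 02/09 04/04

# Cross-check: two JIS X 0208 kanji, then ESC 02/08 04/02 to return to ASCII.
encoded = ISO_IR_87_G0 + b";3ED" + b"\x1b(B"
text = encoded.decode("iso2022_jp")

if __name__ == "__main__":
    print(ISO_IR_87_G0, ISO_IR_159_G0, ISO_IR_87_G1, ISO_IR_159_G1)
    print(text)
```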
