Unicode Transformation Format (UTF)

The Unicode Transformation Format (UTF) is a family of character encodings, each able to represent every code point in Unicode. The most widely used is UTF-8, a variable-length encoding with 8-bit code units that was designed for backward compatibility with ASCII.
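As a minimal sketch of what "variable-length" and "ASCII-compatible" mean in practice, the Python snippet below encodes a few sample characters with the built-in `str.encode` method. The helper name `utf8_length` and the sample characters are illustrative choices, not part of any standard.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))

# Pure ASCII text produces the same bytes in ASCII and in UTF-8.
ascii_text = "Hello"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")
print(same_bytes)  # True

# Characters outside ASCII take two, three, or four bytes.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji
lengths = [utf8_length(ch) for ch in samples]
for ch, n in zip(samples, lengths):
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```

The byte count grows with the size of the code point, which is why UTF-8 stays compact for mostly-ASCII text.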

The abbreviation is also expanded as UCS Transformation Format, and is sometimes given as Universal Transformation Format.

Unicode text is encoded with either a UTF or one of the older Universal Character Set (UCS) encodings, such as UCS-2 and UCS-4. Both families map Unicode code points to sequences of code units (also termed code values). The number in a UTF name indicates how many bits make up one code unit of that encoding. A code point, in turn, is the unique numeric identifier assigned to each character.
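The relationship between a code point and its code units can be sketched in Python. The helper `code_units` below is a hypothetical name introduced for illustration; it relies only on the built-in `str.encode` and `int.from_bytes`, and uses the big-endian codec variants so that no byte order mark is added.

```python
def code_units(text: str, encoding: str, unit_bytes: int) -> list[int]:
    """Split the encoded bytes into code units of unit_bytes bytes each."""
    data = text.encode(encoding)
    return [
        int.from_bytes(data[i:i + unit_bytes], "big")
        for i in range(0, len(data), unit_bytes)
    ]

euro = "\u20ac"  # the euro sign, code point U+20AC

print(hex(ord(euro)))                                      # 0x20ac
print([hex(u) for u in code_units(euro, "utf-8", 1)])      # ['0xe2', '0x82', '0xac']
print([hex(u) for u in code_units(euro, "utf-16-be", 2)])  # ['0x20ac']
print([hex(u) for u in code_units(euro, "utf-32-be", 4)])  # ['0x20ac']
```

The same code point becomes three 8-bit units in UTF-8 but a single unit in UTF-16 and UTF-32.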

Different kinds of UTF encodings include:

  • UTF-1 — Retired predecessor of UTF-8, no longer part of the Unicode Standard
  • UTF-7 — Represents Unicode text using only 7-bit ASCII characters; it was primarily used in email and is now considered obsolete
  • UTF-8 — Variable-width encoding with 8-bit code units (one to four per code point), designed to maximize compatibility with ASCII
  • UTF-16 — Variable-width encoding with 16-bit code units (one or two per code point)
  • UTF-32 — 32-bit fixed-width encoding
  • UTF-EBCDIC — Uses 8-bit code units and is designed to be compatible with Extended Binary Coded Decimal Interchange Code (EBCDIC)
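The difference between the variable-width and fixed-width encodings listed above can be sketched by counting code units per character. This is an illustrative Python example only: `unit_counts` is a hypothetical helper, and it covers just the encodings for which Python ships a built-in codec (UTF-1 and UTF-EBCDIC are not among them).

```python
def unit_counts(ch: str) -> dict[str, int]:
    """Count the code units each encoding needs for one character."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit units
    }

letter = unit_counts("A")           # U+0041
emoji = unit_counts("\U0001F600")   # U+1F600, outside the 16-bit range

print(letter)  # {'utf-8': 1, 'utf-16': 1, 'utf-32': 1}
print(emoji)   # {'utf-8': 4, 'utf-16': 2, 'utf-32': 1}

# UTF-7 output stays within the 7-bit ASCII range, even for non-ASCII input.
utf7_bytes = "\u20ac".encode("utf-7")
utf7_is_ascii = all(b < 0x80 for b in utf7_bytes)
print(utf7_is_ascii)  # True
```

UTF-32 always needs exactly one unit, while UTF-8 and UTF-16 need more units as the code point grows.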
