Supported Encodings
Ataccama products support Unicode and a number of other character encodings. The table below lists each supported encoding by its canonical Java names.
| Canonical name for java.nio API | Canonical name for java.io API and java.lang API | Description | 
|---|---|---|
| Big5 | Big5 | Big5, Traditional Chinese |
| Big5-HKSCS | Big5_HKSCS | Big5 with Hong Kong extensions, Traditional Chinese (incorporating 2001 revision) |
| CESU-8 | CESU8 | Compatibility Encoding Scheme for UTF-16: 8-Bit (CESU-8) |
| EUC-JP | EUC_JP | JISX 0201, 0208 and 0212, EUC encoding Japanese |
| EUC-KR | EUC_KR | KS C 5601, EUC encoding, Korean |
| GB18030 | GB18030 | Simplified Chinese, PRC standard |
| GB2312 | EUC_CN | GB2312, EUC encoding, Simplified Chinese |
| GBK | GBK | GBK, Simplified Chinese |
| IBM00858 | Cp858 | Code page 858 (also known as CP 858) |
| IBM01140 | Cp1140 | Variant of Cp037 with Euro character |
| IBM01141 | Cp1141 | Variant of Cp273 with Euro character |
| IBM01142 | Cp1142 | Variant of Cp277 with Euro character |
| IBM01143 | Cp1143 | Variant of Cp278 with Euro character |
| IBM01144 | Cp1144 | Variant of Cp280 with Euro character |
| IBM01145 | Cp1145 | Variant of Cp284 with Euro character |
| IBM01146 | Cp1146 | Variant of Cp285 with Euro character |
| IBM01147 | Cp1147 | Variant of Cp297 with Euro character |
| IBM01148 | Cp1148 | Variant of Cp500 with Euro character |
| IBM01149 | Cp1149 | Variant of Cp871 with Euro character |
| IBM037 | Cp037 | USA, Canada (Bilingual, French), Netherlands, Portugal, Brazil, Australia |
| IBM1026 | Cp1026 | IBM Latin-5, Turkey |
| IBM1047 | Cp1047 | Latin-1 character set for EBCDIC hosts |
| IBM273 | Cp273 | IBM Austria, Germany |
| IBM277 | Cp277 | IBM Denmark, Norway |
| IBM278 | Cp278 | IBM Finland, Sweden |
| IBM280 | Cp280 | IBM Italy |
| IBM284 | Cp284 | IBM Catalan/Spain, Spanish Latin America |
| IBM285 | Cp285 | IBM United Kingdom, Ireland |
| IBM290 | Cp290 | EBCDIC-JP-kana, Japanese EBCDIC |
| IBM297 | Cp297 | IBM France |
| IBM420 | Cp420 | IBM Arabic |
| IBM424 | Cp424 | IBM Hebrew |
| IBM437 | Cp437 | MS-DOS United States, Australia, New Zealand, South Africa |
| IBM500 | Cp500 | EBCDIC 500V1 |
| IBM775 | Cp775 | PC Baltic |
| IBM850 | Cp850 | MS-DOS Latin-1 |
| IBM852 | Cp852 | MS-DOS Latin-2 |
| IBM855 | Cp855 | IBM Cyrillic |
| IBM857 | Cp857 | IBM Turkish |
| IBM860 | Cp860 | MS-DOS Portuguese |
| IBM861 | Cp861 | MS-DOS Icelandic |
| IBM862 | Cp862 | PC Hebrew |
| IBM863 | Cp863 | MS-DOS Canadian French |
| IBM864 | Cp864 | PC Arabic |
| IBM865 | Cp865 | MS-DOS Nordic |
| IBM866 | Cp866 | MS-DOS Russian |
| IBM868 | Cp868 | MS-DOS Pakistan |
| IBM869 | Cp869 | IBM Modern Greek |
| IBM870 | Cp870 | IBM Multilingual Latin-2 |
| IBM871 | Cp871 | IBM Iceland |
| IBM918 | Cp918 | IBM Pakistan (Urdu) |
| IBM-Thai | Cp838 | IBM Thailand extended SBCS |
| ISO-2022-CN | ISO2022CN | GB2312 and CNS11643 in ISO 2022 CN form, Simplified and Traditional Chinese (conversion to Unicode only) |
| ISO-2022-JP | ISO2022JP | JIS X 0201, 0208, in ISO 2022 form, Japanese |
| ISO-2022-JP-2 | Not Available | Multilingual Extension of ISO-2022-JP |
| ISO-2022-KR | ISO2022KR | ISO 2022 KR, Korean |
| ISO-8859-1 | ISO8859_1 | ISO-8859-1, Latin Alphabet No. 1 |
| ISO-8859-13 | ISO8859_13 | Latin Alphabet No. 7 |
| ISO-8859-15 | ISO8859_15 | Latin Alphabet No. 9 |
| ISO-8859-16 | ISO8859_16 | Latin Alphabet No. 10 |
| ISO-8859-2 | ISO8859_2 | Latin Alphabet No. 2 |
| ISO-8859-3 | ISO8859_3 | Latin Alphabet No. 3 |
| ISO-8859-4 | ISO8859_4 | Latin Alphabet No. 4 |
| ISO-8859-5 | ISO8859_5 | Latin/Cyrillic Alphabet |
| ISO-8859-6 | ISO8859_6 | Latin/Arabic Alphabet |
| ISO-8859-7 | ISO8859_7 | Latin/Greek Alphabet (ISO-8859-7:2003) |
| ISO-8859-8 | ISO8859_8 | Latin/Hebrew Alphabet |
| ISO-8859-9 | ISO8859_9 | Latin Alphabet No. 5 |
| JIS_X0201 | JIS_X0201 | JIS X 0201 |
| JIS_X0212-1990 | JIS_X0212-1990 | JIS X 0212 |
| KOI8-R | KOI8_R | KOI8-R, Russian |
| KOI8-U | KOI8_U | KOI8-U, Ukrainian |
| Not Available | UnicodeBig | Sixteen-bit Unicode (or UCS) Transformation Format, big-endian byte order, with byte-order mark |
| Shift_JIS | SJIS | Shift-JIS, Japanese |
| TIS-620 | TIS620 | TIS620, Thai |
| US-ASCII | ASCII | American Standard Code for Information Interchange |
| UTF-16 | UTF-16 | Sixteen-bit Unicode (or UCS) Transformation Format, byte order identified by an optional byte-order mark |
| UTF-16BE | UnicodeBigUnmarked | Sixteen-bit Unicode (or UCS) Transformation Format, big-endian byte order |
| UTF-16LE | UnicodeLittleUnmarked | Sixteen-bit Unicode (or UCS) Transformation Format, little-endian byte order |
| UTF-32 | UTF_32 | 32-bit Unicode (or UCS) Transformation Format, byte order identified by an optional byte-order mark |
| UTF-32BE | UTF_32BE | 32-bit Unicode (or UCS) Transformation Format, big-endian byte order |
| UTF-32LE | UTF_32LE | 32-bit Unicode (or UCS) Transformation Format, little-endian byte order |
| UTF-8 | UTF8 | Eight-bit Unicode (or UCS) Transformation Format |
| windows-1250 | Cp1250 | Windows Eastern European |
| windows-1251 | Cp1251 | Windows Cyrillic |
| windows-1252 | Cp1252 | Windows Latin-1 |
| windows-1253 | Cp1253 | Windows Greek |
| windows-1254 | Cp1254 | Windows Turkish |
| windows-1255 | Cp1255 | Windows Hebrew |
| windows-1256 | Cp1256 | Windows Arabic |
| windows-1257 | Cp1257 | Windows Baltic |
| windows-1258 | Cp1258 | Windows Vietnamese |
| windows-31j | MS932 | Windows Japanese |
| x-Big5-HKSCS-2001 | big5hk | Hong Kong Supplementary Character Set |
| x-Big5-Solaris | Big5_Solaris | Big5 with seven additional Hanzi ideograph character mappings for the Solaris zh_TW.BIG5 locale |
| x-euc-jp-linux | EUC_JP_LINUX | JISX 0201, 0208, EUC encoding Japanese |
| x-eucJP-Open | EUC_JP_Solaris | JISX 0201, 0208, 0212, EUC encoding Japanese |
| x-EUC-TW | EUC_TW | CNS11643 (Plane 1-7,15), EUC encoding, Traditional Chinese |
| x-IBM1006 | Cp1006 | IBM AIX Pakistan (Urdu) |
| x-IBM1025 | Cp1025 | IBM Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia (FYR) |
| x-IBM1046 | Cp1046 | IBM Arabic - Windows |
| x-IBM1097 | Cp1097 | IBM Iran (Farsi)/Persian |
| x-IBM1098 | Cp1098 | IBM Iran (Farsi)/Persian (PC) |
| x-IBM1112 | Cp1112 | IBM Latvia, Lithuania |
| x-IBM1122 | Cp1122 | IBM Estonia |
| x-IBM1123 | Cp1123 | IBM Ukraine |
| x-IBM1124 | Cp1124 | IBM AIX Ukraine |
| x-IBM1129 | Not Available | ISO-8 Vietnamese |
| x-IBM1166 | Cp1166 | IBM Cyrillic Multilingual with euro for Kazakhstan |
| x-IBM1364 | Cp1364 | IBM EBCDIC KS X 1005-1 |
| x-IBM1381 | Cp1381 | IBM OS/2, DOS People’s Republic of China (PRC) |
| x-IBM1383 | Cp1383 | IBM AIX People’s Republic of China (PRC) |
| x-IBM29626C | Cp33722 | IBM-eucJP - Japanese (superset of 5050) |
| x-IBM300 | Cp300 | IBM Japanese Latin Host Double-Byte |
| x-IBM33722 | Cp33722 | IBM-eucJP - Japanese (superset of 5050) |
| x-IBM737 | Cp737 | PC Greek |
| x-IBM833 | Cp833 | IBM Korean Host Extended SBCS |
| x-IBM834 | Cp834 | IBM EBCDIC DBCS-only Korean |
| x-IBM856 | Cp856 | IBM Hebrew |
| x-IBM874 | Cp874 | IBM Thai |
| x-IBM875 | Cp875 | IBM Greek |
| x-IBM921 | Cp921 | IBM Latvia, Lithuania (AIX, DOS) |
| x-IBM922 | Cp922 | IBM Estonia (AIX, DOS) |
| x-IBM930 | Cp930 | Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026 |
| x-IBM933 | Cp933 | Korean Mixed with 1880 UDC, superset of 5029 |
| x-IBM935 | Cp935 | Simplified Chinese Host mixed with 1880 UDC, superset of 5031 |
| x-IBM937 | Cp937 | Traditional Chinese Host mixed with 6204 UDC, superset of 5033 |
| x-IBM939 | Cp939 | Japanese Latin Kanji mixed with 4370 UDC, superset of 5035 |
| x-IBM942 | Cp942 | IBM OS/2 Japanese, superset of Cp932 |
| x-IBM942C | Cp942C | Variant of Cp942 |
| x-IBM943 | Cp943 | IBM OS/2 Japanese, superset of Cp932 and Shift-JIS |
| x-IBM943C | Cp943C | Variant of Cp943 |
| x-IBM948 | Cp948 | OS/2 Chinese (Taiwan) superset of 938 |
| x-IBM949 | Cp949 | PC Korean |
| x-IBM949C | Cp949C | Variant of Cp949 |
| x-IBM950 | Cp950 | PC Chinese (Hong Kong, Taiwan) |
| x-IBM964 | Cp964 | AIX Chinese (Taiwan) |
| x-IBM970 | Cp970 | AIX Korean |
| x-ISCII91 | ISCII91 | ISCII91 encoding of Indic scripts |
| x-ISO-2022-CN-CNS | ISO2022_CN_CNS | CNS11643 in ISO 2022 CN form, Traditional Chinese (conversion from Unicode only) |
| x-ISO-2022-CN-GB | ISO2022_CN_GB | GB2312 in ISO 2022 CN form, Simplified Chinese (conversion from Unicode only) |
| x-iso-8859-11 | x-iso-8859-11 | Latin/Thai Alphabet |
| x-JIS0208 | x-JIS0208 | JIS X 0208 |
| x-JISAutoDetect | JISAutoDetect | Detects and converts from Shift-JIS, EUC-JP, ISO 2022 JP (conversion to Unicode only) |
| x-Johab | x-Johab | Korean, Johab character set |
| x-MacArabic | MacArabic | Macintosh Arabic |
| x-MacCentralEurope | MacCentralEurope | Macintosh Latin-2 |
| x-MacCroatian | MacCroatian | Macintosh Croatian |
| x-MacCyrillic | MacCyrillic | Macintosh Cyrillic |
| x-MacDingbat | MacDingbat | Macintosh Dingbat |
| x-MacGreek | MacGreek | Macintosh Greek |
| x-MacHebrew | MacHebrew | Macintosh Hebrew |
| x-MacIceland | MacIceland | Macintosh Iceland |
| x-MacRoman | MacRoman | Macintosh Roman |
| x-MacRomania | MacRomania | Macintosh Romania |
| x-MacSymbol | MacSymbol | Macintosh Symbol |
| x-MacThai | MacThai | Macintosh Thai |
| x-MacTurkish | MacTurkish | Macintosh Turkish |
| x-MacUkraine | MacUkraine | Macintosh Ukraine |
| x-MS932_0213 | x-MS932_0213 | Shift_JISX0213 Windows MS932 Variant |
| x-MS950-HKSCS | MS950_HKSCS | Windows Traditional Chinese with Hong Kong extensions |
| x-MS950-HKSCS-XP | MS950_HKSCS_XP | HKSCS Windows XP Variant |
| x-mswin-936 | MS936 | Windows Simplified Chinese |
| x-PCK | PCK | Solaris version of Shift_JIS |
| x-SJIS_0213 | x-SJIS_0213 | Shift_JISX0213 |
| x-UTF-16LE-BOM | UnicodeLittle | Sixteen-bit Unicode (or UCS) Transformation Format, little-endian byte order, with byte-order mark |
| X-UTF-32BE-BOM | UTF_32BE_BOM | 32-bit Unicode (or UCS) Transformation Format, big-endian byte order, with byte-order mark |
| X-UTF-32LE-BOM | UTF_32LE_BOM | 32-bit Unicode (or UCS) Transformation Format, little-endian byte order, with byte-order mark |
| x-windows-50220 | Cp50220 | Windows Codepage 50220 (7-bit implementation) |
| x-windows-50221 | Cp50221 | Windows Codepage 50221 (7-bit implementation) |
| x-windows-874 | MS874 | Windows Thai |
| x-windows-949 | MS949 | Windows Korean |
| x-windows-950 | MS950 | Windows Traditional Chinese |
| x-windows-iso2022jp | x-windows-iso2022jp | Variant ISO-2022-JP (MS932 based) |
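
The two name columns refer to the same underlying charsets. As a minimal sketch using only the standard JDK (not an Ataccama API), the example below resolves a few names from the table with `java.nio.charset.Charset`. The class name `EncodingLookup` and its helper methods are hypothetical, chosen for illustration. It relies only on the charsets that every Java platform is required to provide; the other entries in the table depend on the Java runtime in use, so the sketch checks them with `Charset.isSupported` rather than assuming they exist.

```java
import java.nio.charset.Charset;

public class EncodingLookup {

    /** Resolves any recognized name or alias to its java.nio canonical name. */
    public static String canonicalName(String nameOrAlias) {
        return Charset.forName(nameOrAlias).name();
    }

    /** Returns how many bytes the text occupies in the given encoding. */
    public static int encodedLength(String text, String nameOrAlias) {
        return text.getBytes(Charset.forName(nameOrAlias)).length;
    }

    public static void main(String[] args) {
        // java.io/java.lang names are accepted as aliases of the java.nio charset.
        System.out.println(canonicalName("UTF8"));               // UTF-8
        System.out.println(canonicalName("ISO8859_1"));          // ISO-8859-1
        System.out.println(canonicalName("ASCII"));              // US-ASCII
        System.out.println(canonicalName("UnicodeBigUnmarked")); // UTF-16BE

        // The same text takes a different number of bytes per encoding.
        String text = "caf\u00e9"; // "café", written with an escape to keep the source ASCII
        System.out.println(encodedLength(text, "ISO8859_1"));    // 4
        System.out.println(encodedLength(text, "UTF8"));         // 5

        // Encodings outside the mandatory set may be missing from a given runtime,
        // so test for them before use instead of catching an exception later.
        String optional = "windows-1250";
        if (Charset.isSupported(optional)) {
            System.out.println(optional + " is available as " + canonicalName(optional));
        } else {
            System.out.println(optional + " is not available in this runtime");
        }
    }
}
```

`Charset.forName` throws `UnsupportedCharsetException` for a name the runtime does not provide, which is why the optional lookup is guarded.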