Document Type | Technical Information
Category | Administration
Applicable Product Versions | 6FS04, 6FS05, 6FS06, 6FS07, 6FS07PS, 7FS01, 7FS02, 7FS02PS
Document Number | TADTI069
Overview
Method
Check terminal encoding
Query DB Character set
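Check whether the database character set supports the characters you need to store. In the example below, the database character set is EUCTW, which covers Traditional Chinese as used in Taiwan.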
SQL> select * from sys._vt_nls_character_set;

CHARACTERSET_NAME NCHAR_CHARACTERSET_NAME
----------------- -----------------------
EUCTW             UTF16
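On the client side, a similar question can be asked of the JVM: can a given Java charset encode the characters you intend to insert? The following is a minimal sketch, not a Tibero API. The class name `CharsetSupportCheck` and the sample text are hypothetical, and it assumes that the JDK charset named `x-EUC-TW` corresponds to the database's EUCTW, which this document does not confirm. That charset may also be missing from some runtimes, so the code checks for it first.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of Tibero or its JDBC driver.
class CharsetSupportCheck {

    // Returns true if every character of text can be encoded in the named Java charset.
    static boolean canEncodeAll(String javaCharsetName, String text) {
        return Charset.forName(javaCharsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Two CJK characters written as Unicode escapes, so the source file stays pure ASCII.
        String sample = "\u4e2d\u6587";

        // "x-EUC-TW" is assumed to be the JDK name for EUC-TW; it may be absent in some JVMs.
        String[] names = {"ISO-8859-1", "UTF-8", "x-EUC-TW"};
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this JVM");
                continue;
            }
            System.out.println(name + ": can encode sample = " + canEncodeAll(name, sample));
        }
    }
}
```

A `false` result for a charset means the sample text cannot be represented in it, so some form of substitution or corruption should be expected if that charset is used for conversion.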
Check Linux locale
A locale is a string that defines the language, regional settings, and output formats used in the user interface.
# View list of supported locales
$ locale -a
C
POSIX
EN_US.UTF-8
EN_US
KO_KR.UTF-8
KO_KR
en_US.8859-15
en_US.ISO8859-1
en_US.UTF-8
en_US
ko_KR.IBM-eucKR
ko_KR.UTF-8
ko_KR

# Check currently set locale
$ locale
LANG=ko_KR.UTF-8
LC_COLLATE=ko_KR.UTF-8
LC_CTYPE=ko_KR.UTF-8
LC_MONETARY=ko_KR.UTF-8
LC_NUMERIC=ko_KR.UTF-8
LC_TIME=ko_KR.UTF-8
LC_MESSAGES=ko_KR.UTF-8
LC_ALL=ko_KR.UTF-8
Check TB_NLS_LANG
TB_NLS_LANG is an environment variable that specifies the character set used by the client. If it is not set, the database's default character set is used.
$ echo $TB_NLS_LANG
UTF8
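Because JDBC programs run inside a JVM, it can also help to print the encoding-related values as the JVM itself sees them, since the shell you checked and the process that runs the program may not share the same environment. The sketch below is illustrative only: the class name `JvmEncodingInfo` is hypothetical, and it makes no claim about which of these values a particular JDBC driver actually consults.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper; it only reports values, it does not change them.
class JvmEncodingInfo {

    // Replaces a missing value with a readable marker.
    static String orUnset(String value) {
        return value == null ? "(not set)" : value;
    }

    // Collects encoding-related settings visible to this JVM process.
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        info.put("file.encoding", orUnset(System.getProperty("file.encoding")));
        info.put("LANG", orUnset(System.getenv("LANG")));
        info.put("LC_ALL", orUnset(System.getenv("LC_ALL")));
        info.put("TB_NLS_LANG", orUnset(System.getenv("TB_NLS_LANG")));
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Run it with the same user, shell, and startup script that launch the real application, so that the reported values match what the application sees.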
Check target table dump
Use the DUMP function to see the byte values that are actually stored for a string.
SQL> select dump([column name]) from [table name];
Usage example
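SQL> select dump(c2) from t302494;

DUMP(C2)
----------------------------------
NULL
Len=1: 63
Len=1: 63
Len=1: 63
NULL
Len=1: 63
Len=1: 63
NULL
NULL
Len=1: 63
Len=1: 63
Len=1: 63
NULL
NULL
Note: In Len=1: 63, Len=1 means the data length is 1 byte, and 63 is the ASCII code value of the stored byte. In other words, the value occupies 1 byte internally and holds the character with ASCII code 63, which is the question mark (?). A question mark is typically what gets written when a character cannot be converted to the target character set, so the original character has already been lost at this point.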
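The same byte value can be reproduced in plain Java, which may help when judging whether a stored 63 is a real question mark or a substitute for a lost character. The sketch below encodes one CJK character with a charset that cannot represent it and with one that can. The class name `QuestionMarkBytes` is hypothetical, and the example does not show which component (client, driver, or server) performed the substitution in a given case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demonstration class; it does not connect to any database.
class QuestionMarkBytes {

    // Encodes text with the given charset and returns the bytes as unsigned values (0-255).
    static int[] unsignedBytes(String text, Charset charset) {
        byte[] raw = text.getBytes(charset);
        int[] result = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            result[i] = raw[i] & 0xFF;
        }
        return result;
    }

    public static void main(String[] args) {
        String ch = "\u4e2d"; // one CJK character, written as a Unicode escape

        // US-ASCII cannot represent the character, so it is replaced by '?'.
        System.out.println(Arrays.toString(unsignedBytes(ch, StandardCharsets.US_ASCII))); // [63]

        // UTF-8 can represent it and produces three bytes.
        System.out.println(Arrays.toString(unsignedBytes(ch, StandardCharsets.UTF_8)));    // [228, 184, 173]
    }
}
```

`String.getBytes(Charset)` silently substitutes the charset's replacement byte for unmappable characters, which is why no exception is raised and the only visible symptom is the 63.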
Solution for character set corruption when inputting via JDBC
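If data inserted or updated through JDBC is corrupted after the Java source has been compiled, the compiler may have read the source file with the wrong encoding. Recompile with the source encoding specified explicitly through the -encoding option and check the result.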
$ javac -classpath [jdbc jar path] [JAVA filename] -encoding utf-8
Usage example
--- Tibero 6
$ javac -classpath .:$TB_HOME/client/lib/jar/tibero6-jdbc.jar IMS236665.java -encoding utf-8
--- Tibero 7
$ javac -classpath .:$TB_HOME/client/lib/jar/tibero7-jdbc.jar IMS236665.java -encoding utf-8
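To see why the -encoding option matters, the sketch below simulates the mismatch in memory: text saved as UTF-8 is read back as if it were ISO-8859-1, which is roughly what happens to a string literal when the compiler decodes a UTF-8 source file with a different encoding. The class name `SourceEncodingMismatch` is hypothetical, and ISO-8859-1 is only an example of a wrong encoding. The literal is written as a Unicode escape so that this file compiles the same way regardless of the -encoding setting.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class; it simulates the mismatch without invoking javac.
class SourceEncodingMismatch {

    // Encodes text as UTF-8, then decodes those bytes as ISO-8859-1 (the "wrong" encoding).
    static String misread(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u4e2d"; // one CJK character
        String garbled = misread(original);

        System.out.println(original.length());        // 1
        System.out.println(garbled.length());         // 3
        System.out.println(garbled.equals(original)); // false
    }
}
```

Pure ASCII text survives this round trip unchanged, which is why such a problem often goes unnoticed until non-ASCII literals are added to the source.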
Solution for encoding issues when saving/reopening after editing with vi editor
If text is garbled when a file is saved and reopened in the vi editor, and the terminal and Linux settings show no problem, check the encoding of the file itself.
$ file -i [filename]
Usage example
$ file -i example.txt
example.txt: text/plain; charset=utf-8
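Where the `file` command is not available, a strict decode can serve as a rough cross-check. The following sketch reports whether a file's bytes form valid UTF-8. The class name `Utf8Check` is hypothetical, and the check is not equivalent to the heuristics that `file -i` uses: a pure ASCII file also passes, and passing does not prove that the author intended UTF-8.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; a rough cross-check, not a replacement for `file -i`.
class Utf8Check {

    // Returns true if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Utf8Check <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0] + ": " + (isValidUtf8(data) ? "valid UTF-8" : "not valid UTF-8"));
    }
}
```

A "not valid UTF-8" result for a file that is expected to be UTF-8 suggests that the editor saved it in another encoding, such as one derived from the locale.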
Check list of supported character sets by DB version
In the output, Charset name is the Tibero character set name, and Equivalent Oracle Charset name is the corresponding Oracle character set name.
$ tbboot -C
Available character set list
Charset name        Equivalent Oracle Charset name
AR8ISO8859P6        AR8ISO8859P6
AR8MSWIN1256        AR8MSWIN1256
ASCII               US7ASCII
CL8ISO8859P5        CL8ISO8859P5
CL8KOI8R            CL8KOI8R
CL8MSWIN1251        CL8MSWIN1251
EE8ISO8859P2        EE8ISO8859P2
EL8ISO8859P7        EL8ISO8859P7
EL8MSWIN1253        EL8MSWIN1253
EUCKR               KO16KSC5601
EUCTW               ZHT32EUC
GB18030             ZHS32GB18030
GBK                 ZHS16GBK
IW8ISO8859P8        IW8ISO8859P8
JA16EUC             JA16EUC
JA16EUCTILDE        JA16EUCTILDE
JA16SJIS            JA16SJIS
JA16SJISTILDE       JA16SJISTILDE
MSWIN949            KO16MSWIN949
RU8PC866            RU8PC866
SJIS                SJISTILDE
TH8TISASCII         TH8TISASCII
UTF16               AL16UTF16
UTF8                AL32UTF8
VN8VN3              VN8VN3
WE8ISO8859P1        WE8ISO8859P1
WE8ISO8859P15       WE8ISO8859P15
WE8ISO8859P9        WE8ISO8859P9
WE8MSWIN1252        WE8MSWIN1252
ZHT16BIG5           ZHT16BIG5
ZHT16HKSCS          ZHT16HKSCS
ZHT16MSWIN950       ZHT16MSWIN950

Available nls_date_lang set list
AMERICAN
BRAZILIAN PORTUGUESE
JAPANESE
KOREAN
RUSSIAN
SIMPLIFIED CHINESE
THAI
TRADITIONAL CHINESE
VIETNAMESE
Note: In one unusual case, the corruption is caused by an issue specific to the AIX operating system. When the DB character set is EUCTW (Traditional Chinese, Taiwan), the same JDBC class file runs normally on Linux but produces corrupted characters on AIX. The cause was confirmed to be a bug in the IBM JDK on AIX: the data was encoded as ISO-8859-1 instead of being converted to the expected EUC-TW bytes.