Character set related issue verification items

Document Type | Technical Information

Category | Administration

Applicable Product Versions | 6FS04, 6FS05, 6FS06, 6FS07, 6FS07PS, 7FS01, 7FS02, 7FS02PS

Document Number | TADTI069

Overview

This document describes the items to check when characters are displayed or stored incorrectly because the character set is not properly recognized.

Method

Check terminal encoding

Query DB Character set

Check whether the database character set supports the characters in question. In the example below, the database character set is EUCTW, which covers Traditional Chinese (Taiwanese) characters.

SQL> select * from sys._vt_nls_character_set;

CHARACTERSET_NAME NCHAR_CHARACTERSET_NAME
----------------- -----------------------
EUCTW             UTF16
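The same question (can a given character set represent a given character?) can also be checked on the client side. The following is a minimal Java sketch, not a Tibero API: the class and method names are hypothetical, and it assumes that the Java charset x-EUC-TW corresponds to the Tibero character set EUCTW.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of Tibero or its JDBC driver.
class CharsetSupportCheck {

    // Returns true when every character of text can be encoded in the named Java charset.
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Traditional Chinese sample, written as Unicode escapes so the
        // result does not depend on the encoding of this source file.
        String sample = "\u81fa\u7063";
        // Assumption: Java's "x-EUC-TW" corresponds to Tibero's EUCTW.
        String[] names = {"US-ASCII", "UTF-8", "x-EUC-TW"};
        for (String name : names) {
            if (Charset.isSupported(name)) {
                System.out.println(name + " can encode sample: " + canEncode(name, sample));
            } else {
                System.out.println(name + " is not available in this JVM");
            }
        }
    }
}
```

A false result for the target charset suggests that the characters cannot be stored in a database using that character set without loss.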

Check Linux locale

Locale is a string that defines the language, regional settings, output format, etc. used in the user interface.

#  View list of supported locales
$ locale -a

C
POSIX
EN_US.UTF-8
EN_US
KO_KR.UTF-8
KO_KR
en_US.8859-15
en_US.ISO8859-1
en_US.UTF-8
en_US
ko_KR.IBM-eucKR
ko_KR.UTF-8
ko_KR

#  Check currently set locale
$ locale

LANG=ko_KR.UTF-8
LC_COLLATE=ko_KR.UTF-8
LC_CTYPE=ko_KR.UTF-8
LC_MONETARY=ko_KR.UTF-8
LC_NUMERIC=ko_KR.UTF-8
LC_TIME=ko_KR.UTF-8
LC_MESSAGES=ko_KR.UTF-8
LC_ALL=ko_KR.UTF-8
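When the client is a Java program, it can help to see which charset the JVM derived from these settings. The following is a minimal sketch with a hypothetical class name. Note the assumptions: the native.encoding property is only defined on JDK 17 and later (it prints null on older versions), and on JDK 18 and later the default charset is UTF-8 regardless of the locale unless it is overridden.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class; prints what the JVM sees, nothing Tibero-specific.
class JvmCharsetReport {

    // Collects the JVM's charset-related settings into one line.
    static String describe() {
        return "defaultCharset=" + Charset.defaultCharset().name()
            + ", file.encoding=" + System.getProperty("file.encoding")
            + ", native.encoding=" + System.getProperty("native.encoding") // null before JDK 17
            + ", LANG=" + System.getenv("LANG")
            + ", LC_ALL=" + System.getenv("LC_ALL");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported charset differs between two hosts running the same class file, the difference in locale or JDK defaults is a reasonable first suspect.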

Check TB_NLS_LANG

TB_NLS_LANG specifies the character set used by the client. If it is not set, the database's default character set is used.

$ echo $TB_NLS_LANG
UTF8
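A mismatch between the terminal locale and TB_NLS_LANG is one thing to rule out. The sketch below compares the codeset part of LANG with TB_NLS_LANG. It is a heuristic under stated assumptions: the class and method names are hypothetical, and the name normalization (for example UTF-8 to UTF8) is only a guess that happens to fit names such as UTF8 and EUCKR; it is not a Tibero rule.

```java
import java.util.Locale;

// Hypothetical heuristic check; the name normalization is an assumption.
class ClientCharsetCheck {

    // Extracts the codeset from a locale string such as "ko_KR.UTF-8" and
    // normalizes it ("UTF-8" -> "UTF8"). Returns "" when no codeset is present.
    static String codesetOf(String lang) {
        if (lang == null) {
            return "";
        }
        int dot = lang.indexOf('.');
        if (dot < 0) {
            return "";
        }
        String codeset = lang.substring(dot + 1);
        int at = codeset.indexOf('@'); // drop modifiers such as "@euro"
        if (at >= 0) {
            codeset = codeset.substring(0, at);
        }
        return codeset.replaceAll("[^A-Za-z0-9]", "").toUpperCase(Locale.ROOT);
    }

    // True when the normalized locale codeset equals the TB_NLS_LANG value.
    static boolean matches(String lang, String tbNlsLang) {
        return tbNlsLang != null
            && codesetOf(lang).equals(tbNlsLang.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String lang = System.getenv("LANG");
        String tbNlsLang = System.getenv("TB_NLS_LANG");
        if (tbNlsLang == null) {
            System.out.println("TB_NLS_LANG is not set");
        } else if (matches(lang, tbNlsLang)) {
            System.out.println("LANG and TB_NLS_LANG look consistent");
        } else {
            System.out.println("LANG=" + lang + " and TB_NLS_LANG=" + tbNlsLang + " differ");
        }
    }
}
```

A reported difference is only a hint to investigate; some platform-specific codeset names will not normalize to a Tibero character set name.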

Check target table dump

Use the DUMP function to check which byte values a string is actually stored as.

SQL> select dump([column name]) from [table name];

Usage example

SQL> select dump(c2) from t302494;
DUMP(C2)
----------------------------------
NULL
Len=1: 63
Len=1: 63
Len=1: 63
NULL
Len=1: 63
Len=1: 63
NULL
NULL
Len=1: 63
Len=1: 63
Len=1: 63
NULL
NULL

Note
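For reference, in Len=1: 63, Len=1 means the data is 1 byte long and 63 is the decimal value of that byte. ASCII code 63 is the question mark (?), so the column stores a literal ? instead of the original character. This typically indicates that the character was replaced during character set conversion before it was stored.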
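How a character ends up as byte 63 can be reproduced outside the database. The following minimal Java sketch encodes a Traditional Chinese character with a charset that cannot represent it; Java then substitutes the replacement byte, which is ? for US-ASCII. The class and method names are hypothetical, and the output format only loosely imitates the DUMP output above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch: shows how an unmappable character degrades to byte 63 ('?').
class ReplacementByteDemo {

    // Formats the encoded bytes in a style loosely resembling the DUMP output.
    static String dumpLike(String text, Charset charset) {
        // getBytes replaces unmappable characters with the charset's replacement byte.
        byte[] bytes = text.getBytes(charset);
        StringBuilder sb = new StringBuilder("Len=" + bytes.length + ":");
        for (int i = 0; i < bytes.length; i++) {
            sb.append(i == 0 ? " " : ",").append(bytes[i] & 0xFF);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String ch = "\u81fa"; // Traditional Chinese character as a Unicode escape
        System.out.println(dumpLike(ch, StandardCharsets.US_ASCII)); // Len=1: 63
        System.out.println(dumpLike(ch, StandardCharsets.UTF_8));    // Len=3: 232,135,186
    }
}
```

Seeing 63 in the stored data therefore points to a lossy conversion somewhere between the client and the database, not to a display problem.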

Solution for character set corruption when inputting via JDBC

If data inserted or updated through JDBC is corrupted, the Java source file may have been compiled with the wrong source encoding. Recompile with the -encoding option set explicitly and check the result again.

 $ javac -classpath [jdbc jar path]  [JAVA filename] -encoding utf-8

Usage example

--- Tibero 6
$ javac -classpath .:$TB_HOME/client/lib/jar/tibero6-jdbc.jar IMS236665.java -encoding utf-8

--- Tibero 7
$ javac -classpath .:$TB_HOME/client/lib/jar/tibero7-jdbc.jar IMS236665.java -encoding utf-8
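Why the -encoding option matters can be simulated in a few lines. The sketch below assumes the source file is saved as UTF-8 and that the compiler reads it as ISO-8859-1, which can happen on JDK versions before 18 when -encoding is omitted and the platform default is not UTF-8. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch: simulates a compiler misreading UTF-8 source bytes.
class SourceEncodingDemo {

    // Returns the string a compiler would see if it decoded the UTF-8 bytes
    // of a literal as ISO-8859-1 (one char per byte).
    static String misread(String literal) {
        byte[] sourceBytes = literal.getBytes(StandardCharsets.UTF_8);
        return new String(sourceBytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u81fa\u7063"; // two Traditional Chinese characters
        String corrupted = misread(original);
        System.out.println("original length:  " + original.length());  // 2
        System.out.println("corrupted length: " + corrupted.length()); // 6
        System.out.println("equal: " + original.equals(corrupted));    // false
    }
}
```

The corrupted literal is what the program would then send through JDBC, so the data is already wrong before it reaches the database.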

Solution for encoding issues when saving/reopening after editing with vi editor

If characters are corrupted when a file is saved and reopened in the vi editor, and the terminal and Linux locale settings are correct, check the encoding of the file itself.

 $ file -i [filename]

Usage example

$ file -i example.txt
example.txt: text/plain; charset=utf-8
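The result of file -i is a guess based on the file contents. As a stricter cross-check, the following minimal Java sketch tests whether a file decodes as UTF-8 without errors. The class and method names are hypothetical. A clean decode does not prove the file was written as UTF-8 (pure ASCII passes for many charsets), but a failure shows that it is not valid UTF-8.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustrative sketch: strict decoding check for a byte sequence.
class EncodingCheckDemo {

    // Returns true when data decodes in the given charset with no
    // malformed or unmappable sequences.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java EncodingCheckDemo <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0] + ": valid UTF-8 = "
            + decodesCleanly(data, StandardCharsets.UTF_8));
    }
}
```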

Check list of supported character sets by DB version

Run tbboot -C to list the character sets supported by the installed DB version. In the output, Charset name is the Tibero character set name, and Equivalent Oracle Charset name is the corresponding Oracle character set name.

$ tbboot -C
Available character set list

 Charset name          Equivalent Oracle Charset name
 
 AR8ISO8859P6          AR8ISO8859P6
 AR8MSWIN1256          AR8MSWIN1256
 ASCII                 US7ASCII
 CL8ISO8859P5          CL8ISO8859P5
 CL8KOI8R              CL8KOI8R
 CL8MSWIN1251          CL8MSWIN1251
 EE8ISO8859P2          EE8ISO8859P2
 EL8ISO8859P7          EL8ISO8859P7
 EL8MSWIN1253          EL8MSWIN1253
 EUCKR                 KO16KSC5601
 EUCTW                 ZHT32EUC
 GB18030               ZHS32GB18030
 GBK                   ZHS16GBK
 IW8ISO8859P8          IW8ISO8859P8
 JA16EUC               JA16EUC
 JA16EUCTILDE          JA16EUCTILDE
 JA16SJIS              JA16SJIS
 JA16SJISTILDE         JA16SJISTILDE
 MSWIN949              KO16MSWIN949
 RU8PC866              RU8PC866
 SJIS                  
 SJISTILDE            
 TH8TISASCII           TH8TISASCII
 UTF16                 AL16UTF16
 UTF8                  AL32UTF8
 VN8VN3                VN8VN3
 WE8ISO8859P1          WE8ISO8859P1
 WE8ISO8859P15         WE8ISO8859P15
 WE8ISO8859P9          WE8ISO8859P9
 WE8MSWIN1252          WE8MSWIN1252
 ZHT16BIG5             ZHT16BIG5
 ZHT16HKSCS            ZHT16HKSCS
 ZHT16MSWIN950         ZHT16MSWIN950


Available nls_date_lang set list

 AMERICAN
 BRAZILIAN PORTUGUESE
 JAPANESE
 KOREAN
 RUSSIAN
 SIMPLIFIED CHINESE
 THAI
 TRADITIONAL CHINESE
 VIETNAMESE

Note
In one unusual case, characters were corrupted because of an issue specific to the AIX operating system.
With the DB character set EUCTW (Traditional Chinese, as used in Taiwan), the same JDBC class file ran normally on Linux but produced corrupted characters on AIX. The cause was confirmed to be a bug in the IBM JDK for AIX: the data was encoded as ISO-8859-1 instead of being converted to the expected EUC-TW bytes.
