Document Type | Troubleshooting

Category | Administration

Applicable Product Versions | 5SP1FS01, 5SP1FS02, 5SP1FS03, 5SP1FS04, 5SP1FS06, 6FS01, 6FS02, 6FS03, 6FS04, 6FS05, 6FS06, 6FS07, 6FS07PS, 7FS01, 7FS02, 7FS02PS

Document Number | TADTS015

Issue

Certain string data is not displayed as expected in query results.

Case 1: Values look identical when queried but actually contain different data
Case 2: When content from a document editor such as Hancom Office is inserted directly through the application (AP), certain special characters are not displayed correctly

-- Example of the phenomenon

drop table tibero.pk_test;
create table tibero.pk_test(id1 varchar2(10), id2 varchar2(10), data1 varchar2(10), data2 varchar2(10));
alter table tibero.pk_test add constraint pk primary key(id1, id2);

insert into tibero.pk_test values ('key1　','key2',1,1); -- ID1 ends with a full-width space (U+3000)
insert into tibero.pk_test values ('key1 ','key2',1,1); -- ID1 ends with a half-width space (U+0020)
insert into tibero.pk_test values ('key1	','key2',1,1); -- ID1 ends with a tab character (U+0009)
insert into tibero.pk_test values ('key1󰊱','key2',1,1); -- The broken character is a special character copied and pasted from Hangul (Hancom Office)
insert into tibero.pk_test values ('key1󰊲','key2',1,1); -- The broken character is a special character copied and pasted from Hangul (Hancom Office)


commit;


desc tibero.pk_test


COLUMN_NAME TYPE CONSTRAINT 
ID1 VARCHAR(10) PRIMARY KEY 
ID2 VARCHAR(10) PRIMARY KEY 
D1 NUMBER
D2 NUMBER


INDEX_NAME TYPE COLUMN_NAME 
PK1 NORMAL ID1 ID2


select id1,id2 from tibero.pk_test;


  ID1 ID2 
1 key1 key2 
2 key1 key2 
3 key1　 key2 -- It is difficult to visually distinguish the differences in data for rows 1, 2, and 3
4 key1󰊱 key2 
5 key1󰊲 key2  -- The last character of ID1 in rows 4 and 5 is broken

Cause

Case 1: A full-width space, a half-width space, and a tab are three different characters with different byte values, but they are hard to tell apart in query output.
(Rows 1, 2, and 3 in the example above)
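As an illustration of Case 1, the following minimal Python sketch encodes three look-alike values and prints their character counts, byte counts, and hex bytes. It assumes the data is stored as UTF-8, which the byte values in this article suggest. The helper name utf8_hex is hypothetical and not part of Tibero.

```python
# Three ID1 values that look alike on screen but end in different whitespace.
values = {
    "full-width space": "key1\u3000",  # U+3000 IDEOGRAPHIC SPACE
    "half-width space": "key1\u0020",  # U+0020 SPACE
    "tab": "key1\u0009",               # U+0009 CHARACTER TABULATION
}


def utf8_hex(text):
    """Return the UTF-8 bytes of text as an uppercase hex string (hypothetical helper)."""
    return text.encode("utf-8").hex().upper()


for label, value in values.items():
    byte_count = len(value.encode("utf-8"))
    print(f"{label:17} chars={len(value)} bytes={byte_count} hex={utf8_hex(value)}")
```

All three values have five characters, but the full-width space takes three bytes in UTF-8 while the space and the tab take one byte each, so the stored values differ even though the display does not.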

Case 2: Hancom Office assigns its own special characters to Unicode code points that have no standard character assigned (a private use area). Hancom Office reads these values and displays its own characters, but UTF8 and other character sets define no character for them, so other programs cannot display them correctly. Because no suitable character exists for the value, it is not rendered properly after insert, but the value itself is retained and stored as is.
(Rows 4 and 5 in the example above)
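To make Case 2 concrete, the sketch below uses Python's standard unicodedata module to inspect the two code points discussed in this article. It shows that both fall in the Unicode general category "Co" (private use), have no standard name, and encode to four bytes in UTF-8. The function name describe is hypothetical.

```python
import unicodedata


def describe(code_point):
    """Summarize one code point: category, standard name, and UTF-8 bytes (hypothetical helper)."""
    ch = chr(code_point)
    return {
        "code_point": f"U+{code_point:04X}",
        "category": unicodedata.category(ch),  # "Co" means private use
        "name": unicodedata.name(ch, "<no standard name>"),
        "utf8_hex": ch.encode("utf-8").hex().upper(),
    }


# The two characters pasted from Hancom Office in the example.
for cp in (0xF02B1, 0xF02B2):
    print(describe(cp))
```

Because the category is private use, the meaning of these code points is defined only by the application that wrote them, which is why they render as a boxed number in Hancom Office and as a broken character elsewhere.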

Note
The input is stored exactly as the user entered it. When the data is displayed, each stored code value is looked up in the character set and whatever character is mapped to that value is output.

Solutions

This is not an error but a characteristic of character set processing. If analysis is needed, check the byte values with to_char(rawtohex(string)).

The value "key1" has the byte value 6B657931. The trailing bytes in each row are:
1 : E38080 = Full-width space character (U+3000) used in Japanese character sets
2 : 20 = Space
3 : 09 = Tab character
4 : F3B08AB1 = Unused area in UTF8 / U+F02B1, in an unused (private use) area of Unicode / Used in Hangul (Hancom Office) as a square box containing the number 1. Looks the same as row 5, but a different character is stored
5 : F3B08AB2 = Unused area in UTF8 / U+F02B2, in an unused (private use) area of Unicode / Used in Hangul (Hancom Office) as a square box containing the number 2. Looks the same as row 4, but a different character is stored

select id1,id2,length(id1),lengthb(id1),to_char(rawtohex(id1)) from tibero.pk_test;

  ID1 ID2 LENGTH(ID1) LENGTHB(ID1) TO_CHAR(RAWTOHEX(ID1)) 
1 key1　 key2 5 7 6B657931E38080 
2 key1 key2 5 5 6B65793120 
3 key1 key2 5 5 6B65793109 
4 key1󰊱 key2 5 8 6B657931F3B08AB1
5 key1󰊲 key2 5 8 6B657931F3B08AB2
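Once the hex strings have been collected, they can be decoded outside the database to identify each character. The following is a minimal Python sketch, assuming the hex output is UTF-8 as in the result above. The function name describe_chars is hypothetical.

```python
import unicodedata

# Hex strings copied from the TO_CHAR(RAWTOHEX(ID1)) column above.
hex_values = [
    "6B657931E38080",
    "6B65793120",
    "6B65793109",
    "6B657931F3B08AB1",
    "6B657931F3B08AB2",
]


def describe_chars(hex_string):
    """Decode UTF-8 hex and label every character with its code point (hypothetical helper)."""
    text = bytes.fromhex(hex_string).decode("utf-8")
    labels = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Co":
            note = "private use"
        else:
            # Control characters such as tab have no name, so fall back to the category.
            note = unicodedata.name(ch, f"category {category}")
        labels.append(f"U+{ord(ch):04X} ({note})")
    return labels


for row, hex_string in enumerate(hex_values, start=1):
    # Only the last character differs between the rows.
    print(row, describe_chars(hex_string)[-1])
```

Labeling by code point rather than by glyph avoids depending on what the local font can render, which is the root of the display problem in both cases.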

How to Check When Specific String Data Is Not Displayed Correctly

Issue

Cause

Solutions

업무 외 시간 안내

Search

Welcome to Tibero GTS!

Issue

Cause

Solutions