Full-Text Search in MSSQL 2008 shows wrong display_item for Thai language
- by ensecoz
I am working on MSSQL 2008. My task is to investigate why FTS cannot find the right results for Thai.
First, I have a table with FTS enabled on the column 'ItemName', which is nvarchar. The catalog is created with the Thai language. Note that Thai is one of the languages that does not separate words with spaces, so '????' '???' '????' are written like this in a sentence: '???????????'
In the table, many rows include the word (????), for example row#1 (ItemName: '???????????').
On the webpage, I search for '????', but SQL Server cannot find it.
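The same search can also be reproduced directly in T-SQL, which takes the web page out of the picture. This is only a minimal sketch: the table name dbo.Items is hypothetical (the real table name is not given here), and the search term is a placeholder to be replaced with the Thai word. The N prefix keeps the literal as Unicode instead of converting it to the database code page.

```sql
-- dbo.Items is a hypothetical table name; ItemName is the full-text indexed nvarchar column.
-- Placeholder: put the Thai word between the double quotes.
DECLARE @term nvarchar(100) = N'"<Thai word>"';

SELECT ItemName
FROM dbo.Items
WHERE CONTAINS(ItemName, @term, LANGUAGE 1054);  -- 1054 is the LCID for Thai
```

If this returns no rows either, the problem is on the SQL Server side and not in how the web page passes the search term.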
So I investigated by running the following query in SQL Server:
select * from sys.dm_fts_parser(N'"???????????"', 1054, 0, 0)
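This shows how the text is broken into words. The first parameter is the text to break. The second parameter is the LCID; 1054 selects Thai (word breaker and so on). The third and fourth parameters are the stoplist ID (0 is the system stoplist) and the accent sensitivity (0 is insensitive). Here is the result: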
row#1 (display_item: '????', source_item: '???????????')
row#2 (display_item: '????', source_item: '???????????')
row#3 (display_item: '??', source_item: '???????????')
Notice that the first and second rows show the wrong display_item: the '?' in '????' isn't even a Thai character, and the '?' in '????' is not a Thai character either.
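To see exactly which characters are not Thai, the parser output can be expanded into one row per character with its Unicode code point. This is a sketch under a few assumptions: the input text is a placeholder, sys.all_objects is used only as a convenient source of row numbers, and the column names display_term, source_term and occurrence are the ones the function actually returns (they correspond to the display_item and source_item shown above). The Thai Unicode block is U+0E00 to U+0E7F, which is 3584 to 3711 in decimal.

```sql
-- Placeholder: put the Thai text between the double quotes.
DECLARE @text nvarchar(4000) = N'"<Thai text>"';

WITH n AS (
    -- Small numbers table: 1..64, enough for short terms.
    SELECT TOP (64) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
    FROM sys.all_objects
)
SELECT p.occurrence,
       p.display_term,
       n.i AS position,
       UNICODE(SUBSTRING(p.display_term, n.i, 1)) AS code_point,
       CASE WHEN UNICODE(SUBSTRING(p.display_term, n.i, 1)) BETWEEN 3584 AND 3711
            THEN 'Thai' ELSE 'not Thai' END AS unicode_block
FROM sys.dm_fts_parser(@text, 1054, 0, 0) AS p
JOIN n ON n.i <= LEN(p.display_term)   -- one row per character of each term
ORDER BY p.occurrence, n.i;
```

Any row marked 'not Thai' points at a character the word breaker produced that was not in the Thai block, and its code_point value shows what it actually is.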
So the question is: where do those alien characters come from? I guess this is why I cannot search for '????': the word breaker splits the text incorrectly and keeps the wrong characters in the index.
Please help!