How do I read Unicode characters from an MS Access 2007 database through Java?
Posted by Peter on Stack Overflow
        
        Published on 2010-05-09T10:02:37Z
        
In Java, I have written a program that reads a UTF-8 text file containing a SQL SELECT query. The program executes the query against a Microsoft Access 2007 database and writes all fields of the first row to a UTF-8 text file.
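For reference, the file handling described above could look roughly like the sketch below. It is not the original program: the class and method names, the table `posts`, and the column `title` are made up for illustration, and it assumes Java 7 or later for `java.nio.file.Files`. It also strips a leading byte order mark, since Notepad writes one when saving as UTF-8 and it would otherwise end up at the start of the SQL string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the original program.
final class Utf8FileSketch {

    // Reads the whole file as UTF-8 and drops a leading BOM (U+FEFF) if present.
    static String readUtf8(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    // Writes the text as UTF-8 without a BOM.
    static void writeUtf8(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("query", ".sql");
        // Table and column names are placeholders; \u00e5 is a non-ASCII letter.
        String sql = "SELECT * FROM posts WHERE title = '\u00e5'";
        writeUtf8(tmp, sql);
        System.out.println(sql.equals(readUtf8(tmp)));
        Files.delete(tmp);
    }
}
```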
The problem arises when the returned row contains Unicode characters, such as "?". These characters show up as "?" in the text file.
I know that the text files are read and written correctly: a dummy UTF-8 character ("?") is read from the text file containing the SQL query and written to the text file containing the resulting row, and it looks correct when the written file is opened in Notepad. So the reading and writing of the text files is not part of the problem.
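One way to confirm where the characters are lost is to print the code points of each string as it comes back from the database, before it is written anywhere. The sketch below is a hypothetical helper (the name `CodePointDump` is made up for illustration). If a value that should hold a non-ASCII character already shows U+003F here, the substitution happened before the string reached the Java code rather than during file output.

```java
// Hypothetical diagnostic helper, not part of the original program.
final class CodePointDump {

    // Renders every code point of s as U+XXXX, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("a?"));      // U+0061 U+003F
        System.out.println(describe("a\u00e5")); // U+0061 U+00E5
    }
}
```

In the program described here, the argument would be the value returned by `ResultSet.getString(...)` for each field.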
This is how I connect to the database and how I execute the SQL query:
---- START CODE
Connection c = DriverManager.getConnection("jdbc:odbc:Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:/database.accdb;Pwd=temp");
ResultSet r = c.createStatement().executeQuery(sql);
---- END CODE
I have tried setting a charSet property on the connection, but it makes no difference:
---- START CODE
Properties p = new Properties();
p.put("charSet", "utf-8");
p.put("lc_ctype", "utf-8");
p.put("encoding", "utf-8");
Connection c = DriverManager.getConnection("...", p);
---- END CODE
I tried "utf8", "UTF8" and "UTF-8" with no difference. If I enter "UTF-16", I get the following exception: "java.lang.IllegalArgumentException: Illegal replacement".
I have been searching for hours without results and now turn to you. Please help!
I also accept workaround suggestions. =) What I want is to be able to run a Unicode query (for example, one that searches for posts containing the "?" character) and to have results with Unicode characters received and saved correctly.
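A possible explanation for the "Illegal replacement" message, offered as a guess since the driver source is not shown here: the standard `java.nio.charset.CharsetEncoder.replaceWith` method rejects replacement bytes that are not valid in the encoder's charset, and a single `?` byte is valid in UTF-8 but not in UTF-16. Whether the JDBC-ODBC bridge actually makes such a call is an assumption. The sketch below only checks the JDK behaviour; the class name `ReplacementCheck` is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical check of JDK behaviour; it does not touch the database driver.
final class ReplacementCheck {

    // Returns true if the charset's encoder accepts a single '?' byte
    // as its replacement sequence, false if replaceWith rejects it.
    static boolean acceptsQuestionMark(String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        try {
            encoder.replaceWith(new byte[] { (byte) '?' });
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8:  " + acceptsQuestionMark("UTF-8"));
        System.out.println("UTF-16: " + acceptsQuestionMark("UTF-16"));
    }
}
```

If this guess is right, the exception only shows that the charSet value is being used for byte-level encoding somewhere; it does not by itself explain why the other values leave the "?" output unchanged.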
Thank you!
© Stack Overflow or respective owner