Search Results

Search found 1474 results on 59 pages for 'unicode'.


UnicodeEncodeError: 'ascii' codec can't encode character [...]

- by user1461135

I have read the Unicode HOWTO in the official docs and a full, very detailed article as well. Still, I don't get why it throws this error.

Here is what I attempt: I open an XML file that contains characters outside the ASCII range (but inside the allowed XML range). I do that with cfg = codecs.open(filename, encoding='utf-8', mode='r'), which runs fine. Looking at the string with repr() also shows me a unicode string. Now I go ahead and read that with parseString(cfg.read().encode('utf-8')). Of course, my XML file starts with this: <?xml version="1.0" encoding="utf-8"?>.

Although I suppose it is not relevant, I also defined utf-8 for my Python script, but since I am not writing unicode characters directly in it, this should not apply here. The same goes for the line from __future__ import unicode_literals, which is also right at the beginning.

Next, I pass the generated object to my own class, where I read tags into variables like this: xmldata.getElementsByTagName(tagName)[0].firstChild.data and assign the result to a variable in my class.

These commands work perfectly (obj is an instance of the class):

    for element in obj:
        print element

And this command works as well:

    print obj.__repr__()

I defined __iter__() to just yield every variable, while __repr__() uses the typical printf-style formatting: "%s" % self.varname. Both commands print perfectly and can output the unicode character. What does not work is this:

    print obj

And now I am stuck, because this throws the dreaded UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 47:

So what am I missing? What am I doing wrong? I am looking for a general solution. I always want to handle strings as unicode, just to avoid any possible errors and write a compatible program.

Edit: I also defined this:

    def __str__(self):
        return self.__repr__()

    def __unicode__(self):
        return self.__repr__()

From documentation I got that this
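A minimal sketch of the failure named in the traceback, written for Python 3 rather than the Python 2 used in the question. It only shows that the character u'\xfc' cannot be encoded with the ASCII codec while UTF-8 handles it. The sample text and the helper try_encode are made up for illustration, and the link to how Python 2 converts the unicode result of __str__() is an assumption, not something the excerpt confirms.

```python
# The sample string is hypothetical; it contains u'\xfc', the character
# named in the error message.
text = "f\u00fcr"


def try_encode(value, codec):
    """Return the encoded bytes, or the exception name if encoding fails."""
    try:
        return value.encode(codec)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


ascii_result = try_encode(text, "ascii")  # fails: not an ASCII character
utf8_result = try_encode(text, "utf-8")   # succeeds: two bytes for u'\xfc'

print(ascii_result)
print(utf8_result)
```

If the implicit conversion is indeed the cause, encoding explicitly with a codec that can represent the character is the usual way out.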

Read the article
Unicode Collations problem ?

- by Bayonian

(.NET 3.5 SP1, VS 2008, VB.NET, MSSQL Server 2008) I'm writing a small web app to test Khmer Unicode and Lao Unicode. I have a table that stores text in Khmer Unicode with the following structure:

    [t_id] [int] IDENTITY(1,1) NOT NULL
    [t_chid] [int] NOT NULL
    [t_vn] [int] NOT NULL
    [t_v] [nvarchar](max) NOT NULL

I can use Linq to SQL to do CRUD normally. The text displays properly on the web page, even though I didn't change the default collation of MSSQL Server 2008. When I search the column [t_v], the page takes a very long time to load and, in fact, it loads every row of that column. It never compares against the keyword criteria that I use for the search. Here's my query for the search:

    Public Shared Function SearchTestingKhmerTable(ByVal keyword As String) As DataTable
        Dim db As New BibleDataClassesDataContext()
        Dim query = From b In db.khmer_books _
                    From ch In db.khmer_chapters _
                    From v In db.testing_khmers _
                    Where v.t_v.Contains(keyword) And ch.kh_book_id = b.kh_b_id And v.t_chid = ch.kh_ch_id _
                    Select b.kh_b_id, b.kh_b_title, ch.kh_ch_id, ch.kh_ch_number, v.t_id, v.t_vn, v.t_v
        Dim dtDataTableOne = New DataTable("dtOne")
        dtDataTableOne.Columns.Add("bid", GetType(Integer))
        dtDataTableOne.Columns.Add("btitle", GetType(String))
        dtDataTableOne.Columns.Add("chid", GetType(Integer))
        dtDataTableOne.Columns.Add("chn", GetType(Integer))
        dtDataTableOne.Columns.Add("vid", GetType(Integer))
        dtDataTableOne.Columns.Add("vn", GetType(Integer))
        dtDataTableOne.Columns.Add("verse", GetType(String))
        For Each r In query
            dtDataTableOne.Rows.Add(New Object() {r.kh_b_id, r.kh_b_title, r.kh_ch_id, r.kh_ch_number, r.t_id, r.t_vn, r.t_v})
        Next
        Return dtDataTableOne
    End Function

Please note that I use the exact same code and database design with Lao Unicode and it works just fine. I get the expected results for the search. I can't figure out what the problem is with searching the Khmer table.

Read the article
Robocopy unilog output is gibberish

- by miro

I tried to get robocopy in Windows 7 to generate a Unicode log, since I have files with Unicode characters. The command I used: robocopy C:\mysource D:\mydest /mir /unilog:backup.log /tee While the copy works and the onscreen output is correct, the log file itself just contains gibberish. This happens regardless of whether I use the Command Prompt or PowerShell. What gives? Am I doing something wrong?

Read the article
Get unicode character with curl in PHP

- by saturngod

I tried to fetch a URL with curl. The return value contains Unicode characters, but curl gives them to me as \u escapes. How can I get the actual Unicode characters with curl? This is my code:

    <?php
    $ch = curl_init();
    // set URL and other appropriate options
    curl_setopt($ch, CURLOPT_URL, "http://www.ornagai.com/index.php/api/word/q/test/format/json");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $return=curl_exec($ch);
    echo $return;
    ?>
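Since the URL ends in format/json, the \u sequences are probably JSON string escapes produced by the server rather than something curl adds. That is an assumption, as the excerpt does not show the response. A small Python sketch of decoding such escapes follows; the response body and its "word" key are invented for illustration.

```python
import json

# Hypothetical response body: JSON may escape non-ASCII characters as \uXXXX.
raw_body = '{"word": "\\u1000\\u1031\\u102c\\u1004\\u103a\\u1038"}'

# Parsing the JSON turns each \uXXXX escape back into a real character.
decoded = json.loads(raw_body)
word = decoded["word"]

# Re-serialising with ensure_ascii=False keeps the characters unescaped.
unescaped = json.dumps(decoded, ensure_ascii=False)

print(len(word))
print(unescaped)
```

In other words, the escaped and unescaped forms carry the same text; a JSON parser on the receiving side recovers the characters.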

Read the article
Difference between WinMain and wWinMain

- by Sherwood Hu

The only difference is that WinMain takes char* for the lpCmdLine parameter, while wWinMain takes wchar_t*. On Windows XP, if an application's entry point is WinMain, does Windows convert the command line from Unicode to ANSI and pass it to the application? If the command line parameter must be in Unicode (for example, a Unicode file name, where conversion would lose some characters), does that mean that I must use wWinMain as the entry function?

Read the article
Saving a file in a CSV type in Excel always removes the BOM

- by rickp

I've been trying to find a reasonable solution/explanation (unsuccessfully) to find out why Excel defaults to removing the BOM when saving a file to the CSV type. Please forgive me if you find this a duplicate of this question. This handles reading CSV files with non-ASCII encoding, but it doesn't cover saving the file back out (which is where the biggest issue lies). Here is my current situation (which I'm going to gather is common among localized software dealing with Unicode characters and a CSV format): We export data to a CSV format using UTF-16LE, ensuring the BOM is set (0xFFFE). We validate after the file is generated with a Hex editor to ensure it was set correctly. Open the file in Excel (for this example we're exporting Japanese characters) and witness that Excel handles loading the file with the correct encoding. Attempts to save this file will prompt you with a warning message indicating that the file may contain features that may not be compatible with Unicode encoding, but asks if you'd like to save anyway. If you select the Save As dialog, it will immediately ask you to save the file as "Unicode Text" rather than CSV. If you select the "CSV" extension and save the file it removes the BOM (obviously along with all the Japanese characters). Why would this happen? Is there a solution to this problem, or is this a known 'bug'/limitation of Excel? Additionally (as a side issue) it appears that Excel, when loading UTF-16LE encoded CSV files, only uses TAB delimiters. Again, is this another known 'bug'/limitation of Excel?
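For reference, a short Python sketch of what the export step described above produces: tab-delimited text encoded as UTF-16LE with the byte order mark written first (bytes FF FE on disk). The rows are invented sample data, and this says nothing about what Excel does on save.

```python
import codecs
import csv
import io

# Invented sample data with Japanese text.
rows = [["id", "name"], ["1", "日本語"]]

# Build tab-delimited text in memory first.
buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\r\n").writerows(rows)

# Prepend the UTF-16LE BOM explicitly, then encode without a second BOM.
payload = codecs.BOM_UTF16_LE + buffer.getvalue().encode("utf-16-le")

has_bom = payload.startswith(b"\xff\xfe")
# The generic "utf-16" decoder consumes the BOM and picks the byte order from it.
round_trip = payload.decode("utf-16")

print(has_bom)
print(round_trip)
```

Checking the first two bytes this way is the programmatic equivalent of the hex-editor validation mentioned in the question.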

Read the article
Storing unicode strings to SQL Server via ActiveRecord

- by ripper234

I am using Castle ActiveRecord as my ORM. When I try to store Unicode strings, I get question marks instead. Saving Unicode strings worked perfectly when I was using MySQL, but when I recently switched to SQL Server it broke. How should I go about fixing this?

Read the article
Unicode To ASCII Conversion

- by Yuvaraj

Hi all, I am creating a small application in Delphi 2009. My problem is that the application works when I run it on Windows XP, but it does not work on Windows 95. I know the cause is that Windows 95 does not support Unicode. If anyone knows a solution, please tell me. I also have another idea: converting Unicode to ASCII. Is that possible? If so, please tell me how to do it. Thanks in advance. Warm regards, Yuvaraj

Read the article
Why do we need to put N before strings in Microsoft SQL Server?

- by user61752

I'm learning T-SQL. From the examples I've seen, to insert text in a varchar() cell, I can write just the string to insert, but for nvarchar() cells, every example prefixes the strings with the letter N. I tried the following query on a table which has nvarchar() columns, and it works fine, so the prefix N is not required: insert into [TableName] values ('Hello', 'World') Why are the strings prefixed with N in every example I've seen? What are the pros and cons of using this prefix?

Read the article
Error while zipping files with unicode characters in names with Win7's "send to > compressed (zipped) folder"

- by user1306322

When I try to zip files containing Unicode characters in their names, such as © or ™, I get the following error: [Window Title] Compressed (zipped) Folders Error [Content] 'C:\Asd™.txt' cannot be compressed because it includes characters that cannot be used in a compressed folder, such as ™. You should rename this file or directory. [OK] This only became a problem when I reinstalled Windows 7. I probably had some resources that let this error be resolved automatically, but it's an almost clean installation now and I can't zip files. How do I fix this? UPD: Some time has passed since I posted this question and I have installed some of my usual applications, but the problem still exists, and I'm not sure whether it can be fixed by installing some specific application from before.

Read the article
Cannot set "Language for Non-Unicode Programs" in Regional and Language Settings

- by cornjuliox

I'm trying to set the Language for Non-Unicode Programs from English to Japanese (I'm using Windows XP SP 3), but it won't let me. It looks like I've got the East Asian Language packs installed, but when I select "Japanese" from the drop-down box and hit "Apply" I get an error that says "Setup was unable to install the chosen locale. Please contact your system administrator". I'm already logged in as the administrator, and I've restarted several times but it still won't let me. Can anyone give me an idea as to how to solve this problem? Reinstalling Windows is absolutely out of the question.

Read the article
Problems with vim/locale as non-root user on Solaris

- by Lyle

I do some work on a Solaris 10 machine, and my .vimrc is set up to show unicode characters for tabs and line endings: set listchars=tab:?\ ,eol:¬ This works out of the box on my OS X machine. On Linux as well as Solaris I get the following error when I start vim: Error detected while processing /home/lhanson/.vimrc: line 17: E474: Invalid argument: listchars=tab:?~V?\ ,eol:¬ I solved this on my Linux box by setting LANG=en_US.utf8 ('locale -a' shows this as being an option). On Solaris, however, 'locale -a' shows the following: C POSIX iso_8859_1 Setting LANG to C or POSIX yields the same error, and even though iso_8859_1 probably wouldn't work it doesn't successfully change the locale anyway. As a non-root user, is there any way I can have my unicode characters show up?

Read the article
Excel transpose via paste

- by David Oneill

I want to transpose data in Excel. Normally, I cut the cells I need and use paste special - transpose. However, sometimes when I do paste special, a box comes up asking me if I want to use unicode text vs normal text. How do I transpose this text? Is there a way to get past the unicode dialog box and get to the normal Paste special dialog box (that has the 'transpose' option)? Or is there another simple way to transpose cells?

transpose = flip rows and columns, i.e. 1, 2, 3 becomes:

    1
    2
    3

Read the article
How do I fix font corruption in Google Chrome 9.0.597.44beta in Windows XP?

- by snicker

I am not sure what is causing this problem, but I think it is related to Unicode. Google Chrome, seemingly out of nowhere a month ago, stopped rendering Unicode characters in certain fonts. For example, this ?_? looks fine in some fonts, but is garbled in others. It renders fine in other browsers. Most recently, I visited the FourSquare website and got complete font corruption (compared with how the same page looks in IE). What gives? Has anyone else seen this? How can I fix it?

Read the article
Converting an AnsiString to a Unicode String

- by jrodenhi

I'm converting a D2006 program to D2010. I have a value stored in a single byte per character string in my database and I need to load it into a control that has a LoadFromStream, so my plan was to write the string to a stream and use that with LoadFromStream. But it did not work. In studying the problem, I see an issue that tells me that I don't really understand how conversion from AnsiString to Unicode string works. Here is some code I am puzzling over:

    oStringStream := TStringStream.Create(sBuffer);
    sUnicodeStream := oPayGrid.sStream; //explicit conversion to unicode string
    iSize1 := StringElementSize(oPaygrid.sStream);
    iSize2 := StringElementSize(sUnicodeStream);
    oStringStream.WriteString(sUnicodeStream);

When I get to the last line, iSize1 does equal 1 and iSize2 does equal 2, so that part is what I understood from my reading. But, on the last line, after I write the string to the stream, and look at the Bytes property of the stream, it shows this (the string starts as '16,159'):

    (49 {$31}, 54 {$36}, 44 {$2C}, 49 {$31}, 53 {$35}, 57 {$39} ...

I was expecting that it might look something like

    (49 {$31}, 00 {$00}, 54 {$36}, 00 {$00}, 44 {$2C}, 00 {$00}, 49 {$31}, 00 {$00}, 53 {$35}, 00 {$00}, 57 {$39}, 00 {$00} ...

I'm not getting the right results out of the LoadFromStream because it is reading from the stream two bytes at a time, but the data it is receiving is not arranged that way. What is it that I should do to give the LoadFromStream a well formed stream of data based on a unicode string? Thank you for your help.
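The two byte dumps above correspond to the same text under two different encodings. A small Python sketch (not Delphi) reproduces both; treating cp1252 as the single-byte encoding is an assumption, and whether the stream class applies such an encoding when writing is not confirmed by the excerpt.

```python
value = "16,159"

# One byte per character, as in the dump the poster actually observed.
single_byte = list(value.encode("cp1252"))

# Two bytes per character (UTF-16 little-endian), as the poster expected.
two_byte = list(value.encode("utf-16-le"))

print(single_byte)
print(two_byte)
```

So the string itself can be two bytes per character in memory while the bytes written to a stream depend on whichever encoding the stream uses.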

Read the article
Why do Unicode characters show up properly in database, but as ? when printed in Java via Hibernate?

- by lupefiasco

I'm writing a webapp and interfacing with MySQL using Hibernate 3.5. Using "デスクトップ ინგლისური" as my test string, I can input the string and see that it is properly persisted into the database. However, when I later pull the value out of the database and print it to the console as a String, I see "?????? ?????????". If I use new OutputStreamWriter(System.out,"UTF-8"); then I get "„Éá„Çπ„ÇØ„Éà„ÉÉ„Éó ·Éò·Éú·Éí·Éö·Éò·É°·É£·É†·Éò". Why don't I see the original string?

These are my hibernate.cfg.xml settings:

    <property name="hibernate.connection.useUnicode">true</property>
    <property name="hibernate.connection.characterEncoding">UTF-8</property>
    <property name="hibernate.connection.charSet">UTF-8</property>

and this is my database connection string:

    hibernate.connection.url = jdbc:mysql://localhost/mydatabase?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8
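The garbled output quoted above looks like correct UTF-8 bytes being displayed by a console that interprets them as Mac OS Roman. The Python sketch below reproduces that pattern. Two things here are assumptions inferred from the garbled text rather than stated in the post: that the test string begins with the Japanese word デスクトップ, and that the console encoding is Mac OS Roman.

```python
# Assumed original text (inferred from the mojibake, not stated in the post).
original = "デスクトップ"

# Encode as UTF-8, then misread those bytes as Mac OS Roman.
utf8_bytes = original.encode("utf-8")
mojibake = utf8_bytes.decode("mac_roman")

# Reversing the misreading recovers the original text.
recovered = mojibake.encode("mac_roman").decode("utf-8")

print(mojibake)
print(recovered)
```

If that reading is right, the data coming out of the database is intact and only the console's encoding differs from the writer's.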

Read the article
Should UTF-16 be considered harmful?

- by Artyom

I'm going to ask what is probably quite a controversial question: "Should one of the most popular encodings, UTF-16, be considered harmful?" Why do I ask this question? How many programmers are aware of the fact that UTF-16 is actually a variable-length encoding? By this I mean that there are code points that, represented as surrogate pairs, take more than one element. I know; lots of applications, frameworks and APIs use UTF-16, such as Java's String, C#'s String, Win32 APIs, Qt GUI libraries, the ICU Unicode library, etc. However, with all of that, there are lots of basic bugs in the processing of characters outside the BMP (characters that have to be encoded using two UTF-16 elements). For example, try to edit one of these characters: 𝄞 𝕥 𝟶 𠂊 You may miss some, depending on what fonts you have installed. These characters are all outside of the BMP (Basic Multilingual Plane). If you cannot see these characters, you can also try looking at them in the Unicode Character reference. For example, try to create file names in Windows that include these characters; try to delete these characters with a "backspace" to see how they behave in different applications that use UTF-16. I did some tests and the results are quite bad:

- Opera has problems with editing them
- Notepad can't deal with them correctly (delete, for example)
- File name editing in Windows dialogs is broken
- All Qt3 applications can't deal with them
- StackOverflow seems to remove these characters if edited directly in as Unicode characters, and only seems to allow them as HTML Unicode escapes

So... This was a very simple test. Do you think that UTF-16 should be considered harmful?
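To make the "more than one element" point concrete, here is a short Python sketch that splits a character into its UTF-16 code units. The helper name utf16_units is made up for this example.

```python
import struct


def utf16_units(text):
    """Return the list of 16-bit UTF-16 code units for the given text."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

bmp_units = utf16_units("A")    # one code unit
clef_units = utf16_units(clef)  # a surrogate pair: two code units

print([hex(u) for u in bmp_units])
print([hex(u) for u in clef_units])
```

Code that treats each 16-bit unit as a whole character will split a pair like this one, which is the class of bug the question describes.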

Read the article
PHP PCRE differences on testing and hosting servers

- by Gary Pearman

Hi all, I've got the following regular expression that works fine on my testing server, but just returns an empty string on my hosted server:

    $text = preg_replace('~[^\\pL\d]+~u', $use, $text);

Now I'm pretty sure this comes down to the hosting server's version of PCRE not being compiled with Unicode property support enabled. The differences between the two versions are as follows.

My server:

    PCRE version 7.8 2008-09-05
    Compiled with
      UTF-8 support
      Unicode properties support
      Newline sequence is LF
      \R matches all Unicode newlines
      Internal link size = 2
      POSIX malloc threshold = 10
      Default match limit = 10000000
      Default recursion depth limit = 10000000
      Match recursion uses stack

Hosting server:

    PCRE version 4.5 01-December-2003
    Compiled with
      UTF-8 support
      Newline character is LF
      Internal link size = 2
      POSIX malloc threshold = 10
      Default match limit = 10000000
      Match recursion uses stack

Also note that the version on the hosting server (the same version PHP is compiled against) is pretty old. What confuses me, though, is that pcretest fails on both servers from the command line with

    re> ~[^\\pL\d]+~u
    ** Unknown option 'u'

although this regexp works fine when run from PHP on my server. So, I guess my questions are: does the regular expression fail on the hosting server because of the lack of Unicode properties? Or is there something else that I'm missing? Thanks all, Gaz.

Read the article
SHGetFolderPath returns path with question marks in it

- by Colen

Hi, Our application calls SHGetFolderPath when it runs, to get the My Documents folder. This normally works great. However, for three users - ???????, Jörg and Jörgen (see if you can spot the pattern!) - the call returns some very strange results. For example, for ???????, the call returns: c:\Users\???????\Documents I assume there's some sort of character encoding shenanigan going on here, possibly related to Unicode, but I don't have any experience with that sort of thing. How can I get a useful path to the folder (and other related folders) out of Windows, without grovelling through registry keys for the information? In an email to me, ??????? ("Dmitry") told me his "My Documents" folder was actually located here: C:\Users\43D6~1\Documents So I know there's a way to get a "normal" version of the path out of Windows, I just don't know what it is. Background: Our application is not Unicode-aware, and uses standard "char *" strings. How can we get the "normal" path? I'm not opposed to calling the "Unicode" version of the function, then converting it to "normal" text, if that's possible. Converting the application entirely to use Unicode is not an option here (we don't have the time). Thanks.
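One plausible way question marks end up in such a path is a lossy conversion from Unicode to a single-byte code page that lacks the characters. The Python sketch below illustrates that mechanism only. The Cyrillic spelling of the user name and the choice of cp1252 as the code page are both assumptions; the post shows neither.

```python
# Hypothetical spelling of the user name; the post only shows question marks.
user_name = "Дмитрий"
wide_path = "C:\\Users\\" + user_name + "\\Documents"

# Converting to a Western single-byte code page replaces every character
# that the code page cannot represent with '?'.
narrow_path = wide_path.encode("cp1252", errors="replace")

print(narrow_path)
```

Win32 does provide wide-character variants such as SHGetFolderPathW and GetShortPathNameW, so a char*-based program could, in principle, fetch the wide path first and then ask for the 8.3 short form the user reported.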

Read the article
What is the correct JNA mapping for UniChar on Mac OS X?

- by Trejkaz

I have a C struct like this:

    struct HFSUniStr255 {
        UInt16 length;
        UniChar unicode[255];
    };

I have mapped this in the expected way:

    public class HFSUniStr255 extends Structure {
        public UInt16 length; // UInt16 is just an IntegerType with length 2 for convenience.
        public /*UniChar*/ char[] unicode = new char[255];
        //public /*UniChar*/ byte[] unicode = new byte[255*2];
        //public /*UniChar*/ UInt16[] unicode = new UInt16[255];

        public HFSUniStr255() { }

        public HFSUniStr255(Pointer pointer) { super(pointer); }
    }

If I use this version, I get every second character of the string into my char[] ("aits D" for "Macintosh HD"). I am assuming that this is something to do with being on a 64-bit platform and JNA mapping the value to a 32-bit wchar_t but then chopping off the high 16 bits of each wchar_t on copying them back. If I use the byte[] version, I get data which decodes correctly using the UTF-16LE charset. If I use the UInt16[] version, I get the right code point for each character but it is then inconvenient to convert them back into a string. Is there some way I can define my type as char[], and yet have it convert correctly?
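The "every second character" symptom can be reproduced with plain byte arithmetic. The Python sketch below packs the struct as described (a 2-byte length followed by 2-byte code units) and then reads it back under a hypothetical wrong layout: 4-byte elements starting at a 4-byte-aligned offset, keeping only the low 16 bits. That wrong layout is the poster's guess turned into code, not confirmed JNA behaviour.

```python
import struct

name = "Macintosh HD"

# Native layout: UInt16 length, then 2-byte UTF-16LE code units.
raw = struct.pack("<H", len(name)) + name.encode("utf-16-le")

# Correct reading: 2-byte elements starting right after the length field.
correct = raw[2:].decode("utf-16-le")

# Hypothetical misreading: 4-byte elements aligned to offset 4, of which
# only the low 16 bits are kept.
low_halves = [
    struct.unpack_from("<H", raw, offset)[0]
    for offset in range(4, len(raw) - 1, 4)
]
misread = "".join(chr(unit) for unit in low_halves)

print(correct)
print(misread)
```

Under those assumptions the misreading yields exactly the string reported in the question, which is consistent with a 2-byte versus 4-byte element size mismatch.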

Read the article
php turn unicode http link into clickable

- by newinjs

Hello, I have Unicode links that need to be turned into clickable links. Is it possible to make Unicode links clickable? Currently I'm using this piece of code to make links clickable:

    function clickable_link($text) {
        $ret = ' ' . $text;
        $ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a class=\"hrefLink\" href=\"\\2\" target=\"_blank\">\\2</a>", $ret);
        $ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a class=\"hrefLink\" href=\"http://\\2\" target=\"_blank\">\\2</a>", $ret);
        $ret = preg_replace("#(^|[\n ])([a-z0-9&\-_.]+?)@([\w\-]+\.([\w\-\.]+\.)*[\w]+)#i", "\\1<a href=\"mailto:\\2@\\3\">\\2@\\3</a>", $ret);
        $ret = substr($ret, 1);
        return $ret;
    }

Any help would be deeply appreciated.

Read the article
how to convert japanese characters to unicode?

- by TopCoder

Can you point me to a tool to convert Japanese characters to Unicode?

Read the article
How to convert from unicode to ASCII

- by Hanny

Is there any way to convert unicode values to ASCII?
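One common approach, sketched here in Python since the question names no language, is to decompose accented characters and drop whatever has no ASCII equivalent. The function name to_ascii is made up for this example, and the conversion is lossy: characters with no ASCII base form simply disappear.

```python
import unicodedata


def to_ascii(text):
    """Strip accents via NFKD decomposition and drop non-ASCII leftovers."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


accented = to_ascii("Jörg café")
no_equivalent = to_ascii("日本")

print(accented)
print(repr(no_equivalent))
```

Whether that loss is acceptable depends on the data; for text in scripts without ASCII base letters, nothing useful survives.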

Read the article
Postgres 8.1 pg_trgm plugin doesn’t work with unicode chars

- by Konsi

Is it possible to get the pg_trgm plugin in postgres server 8.1 to work with unicode chars?

Read the article
