Search Results

Search found 1649 results on 66 pages for 'unicode normalization'.

Page 4/66 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

VB.NET - Convert Unicode in one TB to Shift-JIS in another TB

- by Yiu Korochko

Trying to develop a text editor, I've got two textboxes, and a button below each one. When the button below textbox1 is pressed, it is supposed to convert the Unicode text (intended to be Japanese) to Shift-JIS. The reason why I am doing this is because the software VOCALOID2 only allows ANSI and Shift-JIS encoding text to be pasted into the lyrics system. Users of the application normally have their keyboard set to change to Japanese already, but it types in Unicode. How can I convert Unicode text to Shift-JIS when SJIS isn't available in the System.Text.Encoding types?

Read the article
Converting datetime.ctime() values to Unicode

- by Malcolm

I would like to convert datetime.ctime() values to Unicode. Using Python 2.6.4 running under Windows I can set my locale to Spanish like below: import locale locale.setlocale(locale.LC_ALL, 'esp' ) Then I can pass %a, %A, %b, and %B to ctime() to get day and month names and abbreviations. import datetime dateValue = datetime.date( 2010, 5, 15 ) dayName = dateValue.strftime( '%A' ) dayName 's\xe1bado' How do I convert the 's\xe1bado' value to Unicode? Specifically what encoding do I use? I'm thinking I might do something like the following, but I'm not sure this is the right approach. codePage = locale.getdefaultlocale()[ 1 ] dayNameUnicode = unicode( dayName, codePage ) dayNameUnicode u's\xe1bado' Malcolm

Read the article
Output unicode strings in Windows console app

- by Andrew

Hi I was trying to output unicode string to a console with iostreams and failed. I found this: Using unicode font in c++ console app and this snippet works. SetConsoleOutputCP(CP_UTF8); wchar_t s[] = L"èéøÞ???Sæca"; int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL); char* m = new char[bufferSize]; WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL); wprintf(L"%S", m); However, I did not find any way to output unicode correctly with iostreams. Any suggestions?

Read the article
C Writting to a file in Unicode

- by Lefteris

Hey all, I am having some problems writting to a file in unicode inside my c program. I am trying to write a unicode Japanese string to a file. When I go to check the file though it is empty. If I try a non-unicode string it works just fine. What am I doing wrong? setlocale(LC_CTYPE, ""); FILE* f; f = _wfopen(COMMON_FILE_PATH,L"w"); fwprintf(f,L"???"); fclose(f);

Read the article
Writing to a file in Unicode

- by Lefteris

I am having some problems writing to a file in unicode inside my c program. I am trying to write a unicode Japanese string to a file. When I go to check the file though it is empty. If I try a non-unicode string it works just fine. What am I doing wrong? setlocale(LC_CTYPE, ""); FILE* f; f = _wfopen(COMMON_FILE_PATH,L"w"); fwprintf(f,L"???"); fclose(f); Oh about my system: I am running Windows. And my IDE is Visual Studio 2008.

Read the article
Unicode generated by toEscapedUnicode method is without spaces

- by vishvesha

For this word ????????????? the Unicode is== \u0938\u0941\u0916\u091A\u0948\u0928\u093E\u0928\u0940 \u0930\u0940\u091D\u0941\u092E\u0932 \u091C\u093F\u0935\u0924\u0930\u093E\u092E and look it has spaces before \u0930 and \u091C But when I am trying in my code String tempString=Strings.toEscapedUnicode(strString); This method to convert to Unicode gives a result without spaces: \u0938\u0941\u0916\u091A\u0948\u0928\u093E\u0928\u0940\u0930\u0940\u091D\u0941\u092E\u0932\u091C\u093F\u0935\u0924\u0930\u093E\u092E and that's why they are not matching. My 'toEscapeUnicode' method generates Unicode without spaces. I want the spaces, so how to do it?

Read the article
How to write to a file in Unicode in Vb.Net

- by Craig Johnston

How should I modify the following Vb.Net code to write str to the file in unicode? Do I need to convert str to Unicode before writing to the file? Using sw As StreamWriter = New StreamWriter(fname) sw.Write(str) sw.Close() End Using

Read the article
Python 3.1, trying to unescape html/unicode/xml characters

- by Sho Minamimoto

I found my problem here, but there is only an answer for Python 2.6. Basically, I need to unescape strings such as this: 'a altieri_joão' to show the proper characters. http://stackoverflow.com/questions/990169/how-do-convert-unicode-escape-sequences-to-unicode-characters-in-a-python-string I need to do this in 3.1, but when I try print (u'a altieri_joão') if gives me invalid syntax. And when I try name.decode('latin-1') it says 'str' has no method 'decode'.

Read the article
complete, monospaced Unicode font?

- by nachik

I'm looking for a good programming font that lets me add comments and string literals in Unicode, usually Japanese and Chinese along with some Latin and Cyrillic languages. So far the situation seems to be "complete, monospace, free, pick 2" and Google is failing me with this (maybe because there are no good ones?). The best I found is Arial Unicode but it's not monospace, which is a big nuisance for me and the editors I use. Not to mention Python indentation when I'm coding Python. (Links, edits are welcome)

Read the article
Using a Unicode format for Python's `time.strftime()`

- by Hosam Aly

I am trying to call Python's time.strftime() function using a Unicode format string: u'%d\u200f/%m\u200f/%Y %H:%M:%S' (\u200f is the "Right-To-Left Mark" (RLM).) However, I am getting an exception that the RLM character cannot be encoded into ascii: UnicodeEncodeError: 'ascii' codec can't encode character u'\u200f' in position 2: ordinal not in range(128) I have tried searching for an alternative but could not find a reasonable one. Is there an alternative to this function, or a way to make it work with Unicode characters?

Read the article
Regex Not Matching Unicode

- by cam

How would I go about using Regex to match Unicode strings? I'm loading in a couple keywords from a text file and using them with Regex on another file. The keywords both contain unicode (such á, etc). I'm not sure where the problem is. Is there some option I have to set?

Read the article
Converting Unicode strings to escaped ascii string

- by Ali

How can I convert this string: This string contains the unicode character Pi(p) into an escaped ascii string: This string contains the unicode character Pi(\u03a0) and vice versa ? The current Encoding available in C#, converts the p character into "?". I need to preserve that character.

Read the article
Unicode replacement characters for text matching

- by Christian Harms

I have some fun with unicode text sources (all correct encodet) and I want to match names. The classic problem, one source comes correctly, an other has more flatten names: "Elblag" vs. "Elblag" (see the character a) How can I "flatten" a, á, â or à to a for better matching? Are there unicode to ascii- matching tables?

Read the article
Java Unicode encoding

- by Marcus

A Java char is 2 bytes (max size of 65,536) but there are 95,221 Unicode characters. Does this mean that you can't handle certain Unicode characters in a Java application? Does this boil down to what character encoding you are using?

Read the article
passing unicode string from C# exe to C++ DLL

- by Martin

Using this function in my C# exe, I try to pass a Unicode string to my C++ DLL: [DllImport("Test.dll", CharSet = CharSet.Unicode, CallingConvention = CallingConvention.StdCall)] public static extern int xSetTestString(StringBuilder xmlSettings); This is the function on the C++ DLL side: __declspec(dllexport) int xSetTestString(char* pSettingsXML); Before calling the function in C#, I do a MessageBox.Show(string) and it displays all characters properly. On the C++ side, I do: OutputDebugStringW((wchar_t*)pString);, but that shows that the non-ASCII characters were replaced by '?'.

Read the article
Python, Unicode, and the Windows console

- by James Sulak

When I try to print a Unicode string in a windows console, I get a "UnicodeEncodeError: 'charmap' codec can't encode character ...." error. I assume this is because the Windows console does not accept Unicode-only characters. What's the best way around this? Is there any way I can make Python automatically print a "?" instead of failing in this situation? Edit: I'm using Python 2.5.

Read the article
Django approximate matching of unicode strings with ascii equivalents

- by c

I have the following model and instance: class Bashable(models.Model): name = models.CharField(max_length=100) >>> foo = Bashable.objects.create(name=u"piñata") Now I want to be able to search for objects, but using ascii characters rather than unicode, something like this: >>> Bashable.objects.filter(name__lookslike="pinata") Is there a way in Django to do this sort of approximate string matching, using ascii stand-ins for the unicode characters in the database? Here is a related question, but for Apple's Core Data.

Read the article
Unicode characters and IE

- by findmeahamper

I just built a site that relies on certain Unicode characters like Ⓐ, but have just realized that IE doesn't show these characters? Is there some meta tag to get the browser to show it or how do you update IE to handle these Unicode characters?

Read the article
Entering Unicode characters in LaTeX

- by John D. Cook

How do I enter Unicode characters in LaTeX? What packages do I need to install and what escape sequence do I type to specify Unicode characters in an ASCII source file?

Read the article
IIS 6.0 Server and Unicode Characters

- by Srikanth

We are performing a pen test on a simple asp application that uses MS SQL Database. It seems for the authentication they are using dynamic constructed queries but escaping single qoutes. When we use Unicode quotes like %uFFO7,%u02b9 etc we are able to successfully inject SQL injections. Want to understand is it more a kind of configuration issue of IIS server to cannonicalize Unicode characters or the way the validation function to escape single quotes is written is the cause of the problem?

Read the article
Displaying unicode character U+2661 ("White Heart Suit") in Windows 7

- by Jordan

I can't get this character: ? to display properly in Windows Explorer, it instead shows up as a symbol of three lines, similar to this ?. The strangest thing is that if i use the heart symbol beside another unusual symbol, such as one of these: ??????, it will display correctly as a heart; yet if I delete the symbol which is next to the heart it will revert to the 3 lines symbol. All of these other symbols display correctly when used alone. Does anybody else have this problem? Is it possible that Windows has 2 different characters listed for U+2661? Thanks for any help

Read the article
SED and Unicode Quotation Marks

- by Jonathan Patt

When testing against this string: “… so that’s that… ” The following should, but does not, match the opening quotation mark and following ellipsis and space: sed "s/\([“‘\"']…\) /\1/g" However, this correctly matches the second ellipsis and following space and closing quotation mark: sed "s/… \([”’\"'.!?]\)/…\1/g" If I split the first apart it works fine: sed -e "s/\(“…\) /\1/g" \ -e "s/\(‘…\) /\1/g" \ -e "s/\(\"…\) /\1/g" \ -e "s/\('…\) /\1/g" So why doesn't it work when it's grouped together? Especially when it works fine with the closing quotation marks.

Read the article
Telugu (unicode) font rendering in emacs

- by Prakash K

[I asked the following question in stackoverflow, and I have been redirected here. I hope I can get some answers here. My question at stackoverflow had two small images showing the example rendering of text. As a new user at superuser, I am not being allowed to include them here, nor I am allowed to post more than one hyperlink. And, I don't have enough reputation on SO to migrate that question. Please look at the stackoverflow question for the images. Sorry about the inconvenience.] I sometimes edit text in telugu language. However, when I open the file (UTF-8 encoded) in GNU emacs (version 23.1.50.1 on Ubuntu Jaunty) the text rendering is incorrect. The same text file opened in gedit is rendered correctly. Here's a snippet: ????????? ???? ???? ???????? rendred in gedit: Please see the SO question for the image showing telugu text rendering in gedit And, the emacs rendering of the same text: Please see the SO question for the image showing telugu text rendering in emacs Wherever glyphs need to be composited (not sure if it's the right word), emacs (or whatever library it uses) is not doing it right. Is there anyway to fix this? Perhaps tuning some setting in my configuration? Any ideas, please?

Read the article
Telugu (unicode) font rendering in emacs

- by Prakash K

I sometimes edit text in telugu language. However, when I open the file (UTF-8 encoded) in GNU emacs (version 23.1.50.1 on Ubuntu Jaunty) the text rendering is incorrect. The same text file opened in gedit is rendered correctly. Here's a snippet: ????????? ???? ???? ???????? rendred in gedit: And, the emacs rendering of the same text: Wherever glyphs need to be composited (not sure if it's the right word), emacs (or whatever library it uses) is not doing it right. Is there anyway to fix this? Perhaps tuning some setting in my configuration? Any ideas, please?

Read the article
Cannot use Alt code for Unicode character insertion any more

- by Bergi

I've been using the Alt code for the ellipsis, 8230, for some time now, in several applications. A few days ago it stopped working, and & is displayed instead of … when pressing Alt+8+2+3+0 (on numpad). This happened both on my desktop and on my laptop (where I use it with Fn). Both run on 64bit-Win-7 with code page 850, and both might have recently updated Windows and Opera 12. What could be the reason this input method got disabled, and how do I switch it back? Btw, I just found out that Alt+0+1+3+3 does work.

Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >