Search Results

Search found 1474 results on 59 pages for 'unicode'.

Page 20/59

  • How to use Unicode in C++?

    - by Dox
    Assuming a very simple program that: asks for a name, stores the name in a variable, and displays the variable's contents on the screen. It's so simple that it's the first thing one learns. But my problem is that I don't know how to do the same thing if I enter the name using Japanese characters. So, if you know how to do this in C++, please show me an example (that I can compile and test). Thanks.
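
    For reference, a minimal sketch of one way this can work, assuming a UTF-8 terminal (the common case on Linux and macOS; a Windows console typically needs extra setup such as SetConsoleOutputCP(CP_UTF8)). A plain std::string just stores the UTF-8 bytes the terminal sends, so Japanese input round-trips without any conversion:

        #include <iostream>
        #include <string>

        int main() {
            // std::string holds raw bytes; on a UTF-8 terminal those bytes are
            // the UTF-8 encoding of whatever was typed, Japanese included.
            std::string name;
            std::cout << "Name? ";
            std::getline(std::cin, name);
            std::cout << "Hello, " << name << "\n";  // same bytes echoed back
            return 0;
        }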

    Read the article

  • python check value not in unicode list

    - by John
    Hi, I have a list and a value and want to check that the value is not in the list: list = [u'first record', u'second record'] value = 'first record' if value not in list: do something. However, this is not working, and I think it has something to do with the list values having a u prefix. How can I fix this? And before someone suggests it: the list is returned from a Django queryset, so I can't just take the u out of the code :) Thanks
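
    For what it's worth, a small Python 2 sketch of the comparison (sample values made up): for ASCII-only text a plain str compares equal to the matching unicode, so the membership test works as-is, and only non-ASCII byte strings need decoding first. Renaming the variable also avoids shadowing the built-in list:

        # -*- coding: utf-8 -*-
        records = [u'first record', u'second record']   # e.g. from a queryset

        value = 'first record'
        print value not in records                    # False: equal to u'first record'

        value = 'premi\xc3\xa8re'                     # UTF-8 bytes for a non-ASCII word
        print value.decode('utf-8') not in records    # decode before comparing: True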

    Read the article

  • Square Bullet in XSL-FO

    - by Igman
    I am attempting to create a list in XSL-FO with a square bullet. I have been able to get it working using the standard Unicode bullet character (&#8226;), but I just can't seem to get it working for square bullets. I have tried using &#9632;, but that does not seem to work. It is important that I get the square bullets working because I am matching an existing file format. Any help in getting this working would be greatly appreciated. <fo:list-item-label end-indent="label-end()"> <fo:block>&#x2022;</fo:block> </fo:list-item-label>
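
    One thing worth checking (a sketch, not a guaranteed fix): U+25A0 BLACK SQUARE only renders if the font selected for the label actually contains that glyph, which many default body fonts do not, so naming a glyph-complete font on the block is often enough. The font name below is a placeholder for whatever suitable font the FO processor has configured:

        <fo:list-item-label end-indent="label-end()">
          <!-- &#x25A0; = BLACK SQUARE; needs a font that carries the glyph -->
          <fo:block font-family="DejaVu Sans">&#x25A0;</fo:block>
        </fo:list-item-label>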

    Read the article

  • Can HTTP URIs have non-ASCII characters?

    - by Cheeso
    I tried to find this in the relevant RFC, IETF RFC 3986, but couldn't figure it out. Do URIs for HTTP allow Unicode, or non-ASCII of any kind? Can you please cite the section and the RFC that supports your answer. NB: For those who might think this is not programming related - it is. It's related to an ISAPI filter I'm building. Addendum: I've read section 2.5 of RFC 3986. But RFC 2616, which I believe is the current HTTP protocol, predates 3986, and for that reason I'd suppose it cannot be compliant with 3986. Furthermore, even if or when the HTTP RFC is updated, there will still be the issue of rationalization - in other words, does an HTTP URI support ALL of the RFC 3986 provisos, including whatever is appropriate to include non-US-ASCII characters?
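
    As background for the ISAPI angle, a rough C++ sketch of the common practice: the URI itself stays ASCII, and non-ASCII text travels as percent-encoded UTF-8 octets (the idea behind the IRI-to-URI mapping). This is deliberately crude - a real encoder also escapes reserved characters per URI component:

        #include <cstdio>
        #include <string>

        // Percent-encode the non-ASCII (and control/space) octets of a UTF-8 string.
        std::string pct_encode_utf8(const std::string& utf8) {
            std::string out;
            for (unsigned char c : utf8) {
                if (c > 0x7E || c <= 0x20) {
                    char buf[4];
                    std::snprintf(buf, sizeof buf, "%%%02X", c);
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
            }
            return out;
        }
        // pct_encode_utf8("caf\xC3\xA9") yields "caf%C3%A9"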

    Read the article

  • printf field width : bytes or chars?

    - by leonbloy
    The printf/fprintf/sprintf family supports a width field in its format specifier. I have a doubt for the case of (non-wide) char array arguments: is the width field supposed to mean bytes or characters? What is the correct (or de facto) behaviour if the char array corresponds to, say, a raw UTF-8 string? (I know that normally I should use some wide char type; that's not the point.) For example, in char s[] = "ni\xc3\xb1o"; // utf8 encoded "niño" fprintf(f,"%5s",s); is that function supposed to output just 5 bytes (plain C chars), with you taking responsibility for misalignment and other problems if two bytes form one textual character? Or is it supposed to compute the length of the array in "textual characters" (decoding it according to the current locale)? (In the example, this would amount to finding out that the string has 4 Unicode chars, so it would add a space for padding.)
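
    A quick experiment settles the de facto behaviour, which also matches the C standard: the field width for %s counts bytes, not decoded characters. A sketch:

        #include <stdio.h>

        int main(void) {
            char s[] = "ni\xc3\xb1o";   /* UTF-8 "niño": 5 bytes, 4 characters */
            printf("[%5s]\n", s);       /* [niño]  - already 5 bytes, no padding  */
            printf("[%5s]\n", "abc");   /* [  abc] - padded out to a 5-byte field */
            return 0;
        }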

    Read the article

  • Why Reading a UTF-16LE File Won't Convert "\r\n" Into "\n" on Windows

    - by Dbger
    I am using Perl to read UTF-16LE files on Windows 7. If I read in an ASCII file with the following code: open CUR_FILE, "<", $asciiFile; then each "\r\n" in the file is converted into a "\n" in memory. If I read in a UTF-16LE (Windows codepage 1200) file with the following code: open CUR_FILE, "<:encoding(UTF-16LE)", $utf16leFile; then "\r\n" stays unchanged. This inconsistency causes problems when I try to match lines with line breaks using a regexp. My question is: is this how Unicode works in Perl on Windows, or am I using the wrong code? Thanks so much!
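
    A sketch of one layering that restores the translation: the default :crlf layer sits below the encoding layer, where the UTF-16LE bytes "\r\0\n\0" never look like "\r\n", so push :crlf above the decoder instead, where it sees decoded characters:

        open my $fh, "<:raw:encoding(UTF-16LE):crlf", $utf16leFile
            or die "Cannot open $utf16leFile: $!";
        while (my $line = <$fh>) {
            # $line now ends in "\n", matching the ASCII-file behaviour
        }
        close $fh;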

    Read the article

  • pound sign is not working in mail content using javax.mail package?

    - by kumar kasimala
    Hi all, I am using the javax.mail package's MimeMessage and MimeMultipart classes to send a mail, but even though I specify UTF-8, Unicode characters such as the pound sign do not come through in the body text. Please help: what should I do? Here are my message headers: To: [email protected] Message-ID: <875158456.1.1294898905049.JavaMail.root@nextrelease> Subject: My Site Free Trial - 5 days left MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_0_1733237863.1294898905008" MyHeaderName: myHeaderValue Date: Thu, 13 Jan 2011 06:08:25 +0000 (UTC) ------=_Part_0_1733237863.1294898905008 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
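
    A sketch of the usual fix (session setup omitted; the session variable is assumed): the charset must be supplied where the body part's content is set, otherwise JavaMail can fall back to the platform default encoding:

        import javax.mail.internet.MimeBodyPart;
        import javax.mail.internet.MimeMessage;
        import javax.mail.internet.MimeMultipart;

        MimeMessage msg = new MimeMessage(session);
        MimeBodyPart body = new MimeBodyPart();
        // \u00A3 is the pound sign; declare UTF-8 where the content is attached
        body.setContent("Price: \u00A3100", "text/html; charset=UTF-8");
        MimeMultipart multipart = new MimeMultipart();
        multipart.addBodyPart(body);
        msg.setContent(multipart);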

    Read the article

  • Exclude all normal alphanumeric characters from a mixed Chinese-and-alphanumeric word list

    - by Christine
    I have a list of Chinese characters and normal alphanumeric characters, mixed together, and I want to get rid of any element that contains an alphanumeric character. Is there a simple way to do this? If I simply exclude any element that contains an alphanumeric character, I get no result, because the Chinese characters (in UTF-8) are similarly affected. I also tried [w for w in fourchar if w.startswith("\x")] to try to get the Chinese characters, but I'm not sure if that's valid at all. I'm having difficulty figuring out what the alphanumeric characters are in Unicode. Thanks for any help!
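
    A Python 2 sketch of one way to do the filtering, assuming the list holds UTF-8 byte strings as described (sample data made up). Note that testing isalnum() on the decoded text would not help, since decoded Chinese characters count as alphanumeric too; testing for ASCII letters and digits explicitly does:

        # -*- coding: utf-8 -*-
        import re

        ascii_alnum = re.compile(r'[0-9A-Za-z]')

        fourchar = ['\xe4\xb8\xad\xe6\x96\x87\xe8\xaf\x8d\xe8\xaf\xad', 'abc1']
        chinese_only = [w for w in fourchar
                        if not ascii_alnum.search(w.decode('utf-8'))]
        print chinese_only    # the all-Chinese entry survives; 'abc1' is dropped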

    Read the article

  • Python + PostgreSQL + strange ascii = UTF8 encoding error

    - by Claudiu
    I have ASCII strings which contain the character "\x80" to represent the euro symbol: >>> print "\x80" € When inserting string data containing this character into my database, I get: psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0x80 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". I'm a Unicode newbie. How can I convert my strings containing "\x80" to valid UTF-8 containing that same euro symbol? I've tried calling .encode and .decode on various strings, but run into errors: >>> "\x80".encode("utf-8") Traceback (most recent call last): File "<pyshell#14>", line 1, in <module> "\x80".encode("utf-8") UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
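
    If those strings come from a Windows source, then "\x80" is the euro sign in the Windows-1252 codepage rather than ASCII - a sketch of the round trip under that assumption:

        raw = "price: \x80 42"                     # Windows-1252 bytes
        utf8 = raw.decode('cp1252').encode('utf-8')
        print repr(utf8)                           # 'price: \xe2\x82\xac 42'
        # calling .encode('utf-8') directly on a str first does an implicit
        # ASCII decode, which is exactly the UnicodeDecodeError above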

    Read the article

  • Characters in string changed after downloading HTML from the internet.

    - by Callum Rogers
    Using the following code, I can download the HTML of a page from the internet: WebClient wc = new WebClient(); // .... string downloadedFile = wc.DownloadString("http://www.myurl.com/"); However, sometimes the file contains "interesting" characters: é arrives as é, and other characters arrive as similar multi-character garbage (↠and フシギダãƒ in my examples). I think it may be something to do with different Unicode encodings, as each character gets changed into 2 new ones, perhaps each character being split in half, but I have very little knowledge in this area. What do you think is wrong?
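
    The two-characters-per-character pattern is the classic sign of UTF-8 bytes being decoded as a single-byte encoding. A sketch of the usual remedy - WebClient.Encoding controls how DownloadString decodes the response bytes (UTF-8 is an assumption about the page here):

        using System;
        using System.Net;
        using System.Text;

        class Fetch
        {
            static void Main()
            {
                var wc = new WebClient { Encoding = Encoding.UTF8 };
                // DownloadString now decodes the bytes as UTF-8, so é stays é
                // instead of turning into two Latin-1 characters.
                string downloadedFile = wc.DownloadString("http://www.myurl.com/");
                Console.WriteLine(downloadedFile.Length);
            }
        }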

    Read the article

  • Interpretation of Greek characters by FOP

    - by Geek
    Hi, can you please help me interpret the Greek character with HTML entity &#8062; (hex value 0x1F7E)? Details of this character can be found at: http://www.isthisthingon.org/unicode/index.php?page=01&subpage=F&hilite=01F7E When I run this character through Apache FOP, it gives me an ArrayIndexOutOfBoundsException: Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.fop.text.linebreak.LineBreakUtils.getLineBreakPairProperty(LineBreakUtils.java:668) at org.apache.fop.text.linebreak.LineBreakStatus.nextChar(LineBreakStatus.java:117) When I looked into the FOP code, I was unable to understand the need for the lineBreakProperties[][] array in LineBreakUtils.java. I also noticed that FOP fails with a similar error for all the Greek characters mentioned on the above page which are non-displayable. What are these special characters? Why is there no display for them - are they line breaks or tabs? Has anyone solved a similar issue with FOP?
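
    For what it's worth, U+1F7E is an unassigned codepoint in the Greek Extended block, which is consistent with it having no display. One workaround sketch (not an official FOP fix; text stands for the string being fed to FOP) is to strip unassigned codepoints first:

        // Keep only codepoints that Unicode actually assigns.
        StringBuilder clean = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isDefined(cp)) {
                clean.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        String safe = clean.toString();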

    Read the article

  • Encoding issue with form and HTML Purifier / MySQL

    - by Andrew Heath
    Driving me nuts... The page with the form is encoded as Unicode (UTF-8) via: <meta http-equiv="content-type" content="text/html; charset=utf-8"> The entry column in the database is text with collation utf8_unicode_ci. Copying text from a Word document with smart quotes in it, like “1922.”, is insta-fail and ends up in the database as â??1922.â?? (Typing new data into the form, including ", works fine... it's the cut and pasting from Word...) The PHP steps behind the scenes are: grab the value from POST, run it through HTML Purifier with default settings, run it through mysql_real_escape_string, send the insert query to the database. Help?
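
    One likely culprit, sketched under the assumption of the old mysql_* API implied by mysql_real_escape_string: the connection charset defaults to latin1, so UTF-8 bytes are mangled on the way in even when the page and the column are UTF-8. Declaring the connection charset addresses exactly this â??1922.â?? symptom (credentials, table name, and the $purifier instance are placeholders):

        <?php
        $link = mysql_connect('localhost', 'user', 'pass');
        mysql_select_db('mydb', $link);
        mysql_set_charset('utf8', $link);   // make the connection UTF-8 as well

        $clean = $purifier->purify($_POST['entry']);   // HTML Purifier step
        $sql = sprintf("INSERT INTO entries (entry) VALUES ('%s')",
                       mysql_real_escape_string($clean, $link));
        mysql_query($sql, $link);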

    Read the article

  • Is it safe to convert varchar and char into nvarchar and nchar in SQL Server?

    - by Svish
    We currently have a number of columns in the database which are of type varchar. The application that uses them is in C#, and it uses Linq2Sql for the communication (or whatever it's called). We would like to support Unicode characters, which means we would have to convert the varchar columns into nvarchar. Is this a safe operation? Is it just a matter of changing the column type and updating the dbml file, or is there more that needs to be done? Any changes in the C# code? Do I need to somehow convert the text that already exists in the database manually, or is that handled for me?
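
    A sketch of the column conversion itself (table and column names made up): SQL Server widens the existing varchar data automatically during the ALTER, though nvarchar doubles the storage per character, so it's worth testing on a copy first. The dbml column types then need regenerating to match:

        ALTER TABLE dbo.Customers
            ALTER COLUMN Name nvarchar(100) NOT NULL;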

    Read the article

  • Convert byte array to understandable String

    - by Ender
    I have a program that handles byte arrays in Java, and now I would like to write these into an XML file. However, I am unsure as to how I can convert the following byte array into a sensible String to write to the file. Assuming that it was Unicode characters, I attempted the following code: String temp = new String(encodedBytes, "UTF-8"); only to have the debugger show that encodedBytes contains "\ufffd\ufffd ^\ufffd\ufffd-m\ufffd\ufffd\ufffd \ufffd\ufffdIA\ufffd\ufffd". The String should contain a hash in alphanumeric format. How would I turn the above bytes into a sensible String for output?
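
    Since the bytes are a hash, they aren't text in any encoding, which is why decoding them yields the U+FFFD replacement characters seen in the debugger. The usual route is hex (or Base64) encoding; a sketch:

        // Render arbitrary bytes as lowercase hex, e.g. for XML output.
        static String toHex(byte[] bytes) {
            StringBuilder sb = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        }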

    Read the article

  • How do I convert from a possibly Windows 1252 'ANSI' encoded uploaded file to UTF8 in .NET?

    - by qqq123
    I've got a FileUpload control in an ASP.NET web page which is used to upload a file, the contents of which (in a stream) are processed in the C# code-behind and output on the page later, using HtmlEncode. But some of this output is becoming mangled; specifically, the symbol '£' is output as the Unicode U+FFFD REPLACEMENT CHARACTER. I've tracked this down to the input file, which is Windows-1252 ('ANSI') encoded. The questions are: how do I determine whether the file is encoded as 1252 or UTF-8? And how do I convert it to UTF-8 if it is in Windows-1252, preserving symbols like '£'? I've looked online but cannot find a satisfactory answer.
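
    There is no fully certain test, but a common heuristic sketch: try a strict UTF-8 decode first - Windows-1252 text is rarely valid UTF-8 by accident - and fall back to codepage 1252 on failure:

        using System.IO;
        using System.Text;

        static string ReadUploadedText(Stream upload)
        {
            var ms = new MemoryStream();
            upload.CopyTo(ms);                 // .NET 4+; copy manually before that
            byte[] bytes = ms.ToArray();
            try
            {
                // throwOnInvalidBytes: true makes bad sequences throw
                // instead of silently becoming U+FFFD
                return new UTF8Encoding(false, true).GetString(bytes);
            }
            catch (DecoderFallbackException)
            {
                return Encoding.GetEncoding(1252).GetString(bytes);   // '£' survives
            }
        }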

    Read the article

  • Large static arrays are slowing down class load, need a better/faster lookup method

    - by Visualize
    I have a class with a couple of static arrays: an int[] with 17,720 elements and a string[] with 17,720 elements. I noticed that when I first access this class it takes almost 2 seconds to initialize, which causes a pause in the GUI that's accessing it. Specifically, it's a lookup for Unicode character names. The first array is an index into the second array. static readonly int[] NAME_INDEX = { 0x0000, 0x0001, 0x0005, 0x002C, 0x003B, ... static readonly string[] NAMES = { "Exclamation Mark", "Digit Three", "Semicolon", "Question Mark", ... The following code is how the arrays are used (given a character code). [Note: this code isn't a performance problem.] int nameIndex = Array.BinarySearch<int>(NAME_INDEX, code); if (nameIndex > 0) { return NAMES[nameIndex]; } I guess I'm looking at other options for how to structure the data so that 1) the class loads quickly, and 2) I can quickly get the "name" for a given character code. Should I not be storing all these thousands of elements in static arrays?
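
    One option, sketched with a hypothetical resource name and class: large managed array initializers (the string array in particular) are populated element by element when the type initializes, so shipping the tables as an embedded binary resource and reading them on first use is a common cure:

        static int[] nameIndex;
        static string[] names;

        static void EnsureLoaded()
        {
            if (names != null) return;
            using (var s = typeof(CharNames).Assembly
                               .GetManifestResourceStream("CharNames.dat"))
            using (var r = new System.IO.BinaryReader(s))
            {
                int n = r.ReadInt32();
                nameIndex = new int[n];
                names = new string[n];
                for (int i = 0; i < n; i++) nameIndex[i] = r.ReadInt32();
                for (int i = 0; i < n; i++) names[i] = r.ReadString();
            }
        }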

    Read the article

  • Understanding character encoding in typical Java web app

    - by Marcus
    Some pseudocode from a typical web app: String a = "A bunch of text"; //UTF-16 saveTextInDb(a); //Write to Oracle VARCHAR(15) column String b = readTextFromDb(); //UTF-16 out.write(b); //Write to http response In the first line we create a Java String, which uses UTF-16. When you save to an Oracle VARCHAR(15) column, does Oracle also store this as UTF-16? Does the length of an Oracle VARCHAR refer to the number of Unicode characters (and not the number of bytes)? And when we write b to the ServletResponse, is it written as UTF-16, or are we by default converting to another encoding like UTF-8?
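
    On the last point, a sketch: the response writer encodes with whatever charset the servlet declares (many containers default to ISO-8859-1), never with Java's internal UTF-16, so it pays to set it explicitly before obtaining the writer:

        response.setContentType("text/html; charset=UTF-8");
        java.io.PrintWriter out = response.getWriter();   // writer now emits UTF-8
        out.write(b);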

    Read the article

  • Text to a PNG on App Engine (Python)

    - by Bemmu
    Note: I am cross-posting this from the App Engine group because I got no answers there. As part of my site about Japan, I have a feature where the user can get a large PNG, for use as a desktop background, that shows the user's name in Japanese. After switching my site hosting entirely to App Engine, I removed this particular feature because I could not find any way to render text to a PNG using the image API. In other words, how would you go about outputting a Unicode string on top of an image of known dimensions (1024x768, for example), so that the text will be as large as possible horizontally and centered vertically? Is there a way to do this in App Engine, or is there some external service besides App Engine that could make this easier for me, that you could recommend (besides running ImageMagick on your own server)?

    Read the article

  • Best way to convert between [Char] and [Word8]?

    - by cmars232
    I'm new to Haskell and I'm trying to use a pure SHA1 implementation in my app (Data.Digest.Pure.SHA) with a JSON library (AttoJSON). AttoJSON uses Data.ByteString.Char8 bytestrings, SHA uses Data.ByteString.Lazy bytestrings, and some of my string literals in my app are [Char]. This article seems to indicate this is something still being worked out in the Haskell language/Prelude: http://hackage.haskell.org/trac/haskell-prime/wiki/CharAsUnicode And this one lists a few libraries, but it's a couple of years old: http://blog.kfish.org/2007/10/survey-haskell-unicode-support.html What is the current best way to convert between these types, and what are some of the tradeoffs? I don't want to pick something that is obsolete... Thanks!
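
    A sketch of glue code that fits those libraries, assuming the utf8-string package for the actual encoding (Data.ByteString.Char8.pack alone truncates each Char to 8 bits, so it should be avoided for real text):

        import qualified Data.ByteString as S
        import qualified Data.ByteString.Lazy as L
        import qualified Data.ByteString.UTF8 as U   -- package: utf8-string

        -- strict <-> lazy ByteString
        toLazy :: S.ByteString -> L.ByteString
        toLazy = L.fromChunks . (:[])

        toStrict :: L.ByteString -> S.ByteString
        toStrict = S.concat . L.toChunks

        -- [Char] -> lazy ByteString via a real UTF-8 encoding
        strToLazyUtf8 :: String -> L.ByteString
        strToLazyUtf8 = toLazy . U.fromString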

    Read the article

  • Efficient mapping for a particular finite integer set

    - by R..
    I'm looking for a small, fast (in both directions) bijective mapping between the following list of integers and a subset of the range 0-127: 0x200C, 0x200D, 0x200E, 0x200F, 0x2013, 0x2014, 0x2015, 0x2017, 0x2018, 0x2019, 0x201A, 0x201C, 0x201D, 0x201E, 0x2020, 0x2021, 0x2022, 0x2026, 0x2030, 0x2039, 0x203A, 0x20AA, 0x20AB, 0x20AC, 0x20AF, 0x2116, 0x2122 One obvious solution is: y = x>>2 & 0x40 | x & 0x3f; x = 0x2000 | y<<2 & 0x100 | y & 0x3f; Edit: I was missing some of the values, particularly 0x20Ax, which don't work with the above. Another obvious solution is a lookup table, but without making it unnecessarily large, a lookup table would require some bit rearrangement anyway and I suspect the whole task can be better accomplished with simple bit rearrangement. For the curious, those magic numbers are the only "large" Unicode codepoints that appear in legacy ISO-8859 and Windows codepages.

    Read the article

  • Why does SQL Server consider N'????' and N'???' to be equal?

    - by Aidan Ryan
    We are testing our application for Unicode compatibility and have been selecting random characters outside the Latin character set for testing. On both Latin and Japanese-collated systems the following equality is true (U+3422): N'????' = N'???' but the following is not (U+30C1): N'????' = N'???' This was discovered when a test case using the first example (using U+3422) violated a unique index. Do we need to be more selective about the characters we use for testing? Obviously we don't know the semantic meaning of the above comparisons. Would this behavior be obvious to a native speaker?
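
    A self-contained way to reproduce the effect without pasting CJK text is to build the strings with NCHAR(); a sketch, on the understanding that older Windows collations assign some codepoints no sort weight, so strings differing only by such a codepoint compare equal while a binary collation distinguishes them:

        DECLARE @a nvarchar(10), @b nvarchar(10);
        SET @a = N'ab' + NCHAR(0x3422);
        SET @b = N'ab';
        SELECT CASE WHEN @a = @b THEN 'equal' ELSE 'different' END AS default_collation,
               CASE WHEN @a = @b COLLATE Latin1_General_BIN
                    THEN 'equal' ELSE 'different' END AS binary_collation;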

    Read the article

  • What is Causing This Memory Leak in Delphi?

    - by lkessler
    I just can't figure out this memory leak that EurekaLog is reporting for my program. I'm using Delphi 2009. Here it is: Memory Leak: Type=Data; Total size=26; Count=1; The stack is: System.pas _UStrSetLength 17477 System.pas _UStrCat 17572 Process.pas InputGedcomFile 1145 That is all there is in the stack. EurekaLog is pointing me to the location where the memory that was not released was first allocated. According to it, the line in my program is line 1145 of InputGedcomFile. That line is: CurStruct0Key := 'HEAD' + Level0Key; where CurStruct0Key and Level0Key are simply defined in the procedure as local variables that should be dynamically handled by the Delphi memory manager when entering and leaving the procedure: var CurStruct0Key, Level0Key: string; So now I look at the _UStrCat procedure in the System Unit. Line 17572 is: CALL _UStrSetLength // Set length of Dest and I go to the _UStrSetLength procedure in the System Unit, and the relevant lines are: @@isUnicode: CMP [EAX-skew].StrRec.refCnt,1 // !!! MT safety JNE @@copyString // not unique, so copy SUB EAX,rOff // Offset EAX "S" to start of memory block ADD EDX,EDX // Double length to get size JO @@overflow ADD EDX,rOff+2 // Add string rec size JO @@overflow PUSH EAX // Put S on stack MOV EAX,ESP // to pass by reference CALL _ReallocMem POP EAX ADD EAX,rOff // Readjust MOV [EBX],EAX // Store MOV [EAX-skew].StrRec.length,ESI MOV WORD PTR [EAX+ESI*2],0 // Null terminate TEST EDI,EDI // Was a temp created? JZ @@exit PUSH EDI MOV EAX,ESP CALL _LStrClr POP EDI JMP @@exit where line 17477 is the "CALL _ReallocMem" line. So then what is the memory leak? Surely a simple concatenate of a string constant to a local string variable should not be causing a memory leak. Why is EurekaLog pointing me to the ReallocMem line in a _UStrSetLength routine that is part of Delphi? This is Delphi 2009 and I am using the new unicode strings. Any help or explanation here will be much appreciated.

    Read the article

  • Emacs Lisp: how to set encoding for call-process

    - by RamyenHead
    I thought I knew how to set coding-system (or encoding): use process-coding-system-alist. Apparently, it's not working. ;; -*- coding: utf-8 -*- (require 'cl) (let ((process-coding-system-alist '("cygwin/bin/bash" . (utf-8-dos . utf-8-unix)))) (setq my-words (list "Lilo" "?_?" "_?" "?_" "?" "Stitch") my-cygwin-bash "C:/cygwin/bin/bash.exe" my-outbuf (get-buffer-create "*my cygwin bash echo test*") ) (with-current-buffer my-outbuf (goto-char (point-max)) (loop for word in my-words do (insert (concat "echo " word "\n")) (call-process my-cygwin-bash nil my-outbuf nil "-c" (concat "echo " word))) ) (display-buffer my-outbuf) ) Running the above code, the output is this: echo Lilo Lilo echo ?_? /usr/bin/bash: -c: line 0: unexpected EOF while looking for matching `"' /usr/bin/bash: -c: line 1: syntax error: unexpected end of file echo _? /usr/bin/bash: -c: line 0: unexpected EOF while looking for matching `"' /usr/bin/bash: -c: line 1: syntax error: unexpected end of file echo ?_ /usr/bin/bash: $'echo \346\267\205?': command not found echo ? /usr/bin/bash: -c: line 0: unexpected EOF while looking for matching `"' /usr/bin/bash: -c: line 1: syntax error: unexpected end of file echo Stitch Stitch Anything sent to cygwin in unicode is failing (MS Windows, Korean).
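
    Two things worth checking, sketched below: process-coding-system-alist must be a list of (REGEXP . CODING) pairs rather than the single pair quoted above, and for a one-off call it is simpler to bind coding-system-for-read and coding-system-for-write, which take precedence over the alist:

        ;; Force the coding systems for this call only; the Korean text in
        ;; `word' is then encoded as UTF-8 on its way to bash.
        (let ((coding-system-for-write 'utf-8-unix)
              (coding-system-for-read  'utf-8-dos))
          (call-process my-cygwin-bash nil my-outbuf nil
                        "-c" (concat "echo " word)))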

    Read the article

  • JSF ISO-8859-2 charset

    - by Vladimir
    Hi! I have a problem with setting the proper charset on my JSF pages. I use a MySQL db with the latin2 (ISO-8859-2) charset and latin2_croatian_ci collation. But I have problems setting values on backing managed bean properties. The page directive at the top of my page is: <%@ page language="java" pageEncoding="ISO-8859-2" contentType="text/html; charset=ISO-8859-2" %> In the head I included: <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-2"> And my form tag is: <h:form id="entityDetails" acceptcharset="ISO-8859-2"> I've created and registered a Filter in web.xml with the following doFilter method implementation: public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException { request.setCharacterEncoding("ISO-8859-2"); response.setCharacterEncoding("ISO-8859-2"); chain.doFilter(request, response); } But when I set a managed bean property through, e.g., an inputText, all special (Unicode) characters are replaced with the '?' character. I really don't have any other ideas for how to set the charset so the pages work well. Any suggestions? Thanks in advance.

    Read the article

  • DjangoUnicodeDecodeError while storing pickle'd data.

    - by Jack M.
    I've got a simple dict object I'm trying to store in the database after it has been run through pickle. It seems that Django doesn't like trying to encode this data. I've checked with MySQL, and the query isn't even getting there before the error is thrown, so I don't believe that is the problem. The dict I'm storing looks like this: { 'ordered': [ { 'value': u'First\xd1ame Last\xd1ame', 'label': u'Full Name' }, { 'value': u'123-456-7890', 'label': u'Phone Number' }, { 'value': u'[email protected]', 'label': u'Email Address' } ], 'cleaned_data': { u'Phone Number': u'123-456-7890', u'Full Name': u'First\xd1ame Last\xd1ame', u'Email Address': u'[email protected]' }, 'post_data': <QueryDict: { u'Phone Number': [u'1234567890'], u'Full Name_1': [u'Last\xd1ame'], u'Full Name_0': [u'First\xd1ame'], u'Email Address': [u'[email protected]'] }>, 'user': <User: itis> } The error that gets thrown is: 'utf8' codec can't decode bytes in position 52-53: invalid data. Position 52-53 is the first instance of \xd1 (Ñ) in the pickled data. So far, I've dug around StackOverflow and found a few questions where the database encoding for the objects was wrong. That doesn't help me, because there is no MySQL query yet; this is happening before the database. Google also didn't help much when searching for Unicode errors on pickled data. It is probably worth mentioning that if I don't use the Ñ, this code works fine.
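
    A common workaround sketch, with data standing for the dict above: pickle output is binary rather than UTF-8 text (even protocol 0 passes Latin-1 bytes such as \xd1 through raw), so anything that coerces it to unicode chokes; base64-wrapping the pickle keeps the stored value pure ASCII:

        import base64
        import cPickle as pickle

        blob = base64.b64encode(pickle.dumps(data, pickle.HIGHEST_PROTOCOL))
        # store `blob` in the text column; to restore later:
        data = pickle.loads(base64.b64decode(blob))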

    Read the article
