Search Results

Search found 36172 results on 1447 pages for 'unicode string'.

Page 95/1447 | < Previous Page | 91 92 93 94 95 96 97 98 99 100 101 102  | Next Page >

  • How do I create JavaScript escape sequences in PHP?

    - by ordinarytoucan
    I'm looking for a way to create valid UTF-16 JavaScript escape sequence characters (including surrogate pairs) from within PHP. I'm using the code below to get the UTF-32 code points (from a UTF-8 encoded character). This works as JavaScript escape characters (eg. '\u00E1' for 'á') - until you get into the upper ranges where you get surrogate pairs (eg '??' comes out as '\u1D715' but should be '\uD835\uDF15')... function toOrdinal($chr) { if (ord($chr{0}) >= 0 && ord($chr{0}) <= 127) { return ord($chr{0}); } elseif (ord($chr{0}) >= 192 && ord($chr{0}) <= 223) { return (ord($chr{0}) - 192) * 64 + (ord($chr{1}) - 128); } elseif (ord($chr{0}) >= 224 && ord($chr{0}) <= 239) { return (ord($chr{0}) - 224) * 4096 + (ord($chr{1}) - 128) * 64 + (ord($chr{2}) - 128); } elseif (ord($chr{0}) >= 240 && ord($chr{0}) <= 247) { return (ord($chr{0}) - 240) * 262144 + (ord($chr{1}) - 128) * 4096 + (ord($chr{2}) - 128) * 64 + (ord($chr{3}) - 128); } elseif (ord($chr{0}) >= 248 && ord($chr{0}) <= 251) { return (ord($chr{0}) - 248) * 16777216 + (ord($chr{1}) - 128) * 262144 + (ord($chr{2}) - 128) * 4096 + (ord($chr{3}) - 128) * 64 + (ord($chr{4}) - 128); } elseif (ord($chr{0}) >= 252 && ord($chr{0}) <= 253) { return (ord($chr{0}) - 252) * 1073741824 + (ord($chr{1}) - 128) * 16777216 + (ord($chr{2}) - 128) * 262144 + (ord($chr{3}) - 128) * 4096 + (ord($chr{4}) - 128) * 64 + (ord($chr{5}) - 128); } } How do I adapt this code to give me proper UTF-16 code points? Thanks!

    Read the article

  • Code to strip diacritical marks using ICU

    - by Paul J. Lucas
    Can somebody please provide some sample code to strip diacritical marks (i.e., replace characters having accents, umlauts, etc., with their unaccented, unumlauted, etc., character equivalents, e.g., every accented é would become a plain ASCII e) from a UnicodeString using the ICU library in C++? E.g.: UnicodeString strip_diacritics( UnicodeString const &s ) { UnicodeString result; // ... return result; } Assume that s has already been normalized. Thanks.

    Read the article

  • python unichr problem

    - by jacob
    I've got some problem with unichr() on my server. Please see below: On my server (Ubuntu 9.04): >>> print unichr(255) Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in position 0: ordinal not in range(128) On my desktop (Ubuntu 9.10): >>> print unichr(255) ÿ I'm fairly new to python so I don't know how to solve this. Anyone care to help? Thanks.

    Read the article

  • to escape or not to escape: well formed XHTML with diacritics

    - by andresmh
    Say that you have a XHTML document in English but it has accented characters (e.g. meta name="author" content="José"). Let's say you have no control over the HTTP headers. Should the characters be replaced for their corresponding named entities (e.g. &aacute;, etc)? Should the doc type and the xml:lang attribute be set to English? I know I can check the W3C recommendation but I am asking more from a practical point of view.

    Read the article

  • ajax(search suggest) funny character problem

    - by Jason
    ajax(search suggest), if input funny character(like Ô) to search it, "?" is displayed in firefox or empty box is displayed in IE. i am using xmlhttp.open("post", "*****.asp", true); xmlhttp.setRequestHeader('Content-type','application/x-www-form-urlencoded; charset=UTF-8'); and there is <%@CODEPAGE=65001%> in *****.asp file how can i fix it?

    Read the article

  • Ruby character encoding problems in netbeans and command wíndow

    - by salgo60
    I use netbeans as development IDE and runs the application from cmd but have problems to display ISO 8859-1 characters like åäö correct in both cmd window and when I run the application from netbeans Question: What is best practice to set it up Right now I do @output.puts indent + "V" + 132.chr + "lkommen till Ruby Camping!" to get ä My environment chcp 65001 Active code page: 65001 ruby main.rb Source encoding: <Encoding:US-ASCII> Default external: #<Encoding:UTF-8> Default internal: nil Locale charmap: "CP65001" where I have in the code def self.printEncoding puts "Source encoding: #{__ENCODING__.inspect}" if defined? __ENCODING__ if defined? Environment::Encoding puts "Default external: #{Encoding.default_external.inspect}" puts "Default internal: #{Encoding.default_internal.inspect}" puts "Locale charmap: #{ Encoding.locale_charmap.inspect}" end puts "LANG environment variable: #{ENV['LANG'].inspect}" unless ENV['LANG'].nil? end ruby -v ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-mingw32]

    Read the article

  • Why isn't wchar_t widely used in code for Linux / related platforms?

    - by Ninefingers
    This intrigues me, so I'm going to ask - for what reason is wchar_t not used so widely on Linux/Linux-like systems as it is on Windows? Specifically, the Windows API uses wchar_t internally whereas I believe Linux does not and this is reflected in a number of open source packages using char types. My understanding is that given a character c which requires multiple bytes to represent it, then in a char[] form c is split over several parts of char* whereas it forms a single unit in wchar_t[]. Is it not easier, then, to use wchar_t always? Have I missed a technical reason that negates this difference? Or is it just an adoption problem?

    Read the article

  • How to generate pdf files _with_ utf-8 multibyte characters using Zend Framework

    - by Sejanus
    Hello, I've got a "little" problem with Zend Framework Zend_Pdf class. Multibyte characters are stripped from generated pdf files. E.g. when I write aabccdee it becomes abcd with lithuanian letters stripped. I'm not sure if it's particularly Zend_Pdf problem or php in general. Source text is encoded in utf-8, as well as the php source file which does the job. Thank you in advance for your help ;) P.S. I run Zend Framework v. 1.6 and I use FONT_TIMES_BOLD font. FONT_TIMES_ROMAN does work

    Read the article

  • Convert UTF-16 to UTF-8 under Windows and Linux, in C

    - by DooriBar
    I was wondering if there is a recommended 'cross' Windows and Linux method for the purpose of converting strings from UTF-16LE to UTF-8? or one should use different methods for each environment? I've managed to google few references to 'iconv' , but for somreason I can't find samples of basic conversions, such as - converting a wchar_t UTF-16 to UTF-8. Anybody can recommend a method that would be 'cross', and if you know of references or a guide with samples, would very appreciate it. Thanks, Doori Bar

    Read the article

  • tchar safe functions -- count parameter for UTF-8 constants

    - by Dustin Getz
    I'm porting a library from char to TCHAR. the count parameter of this fragment, according to MSDN, is the number of multibyte characters, not the number of bytes. so, did I get this right? _tcsncmp(access, TEXT("ftp"), 3); //or do i want _tcsnccmp? "Supported on Windows platforms only, _mbsncmp and _mbsnbcmp are multibyte versions of strncmp. _mbsncmp will compare at most count multibyte characters and _mbsnbcmp will compare at most count bytes. They both use the current multibyte code page. _tcsnccmp and _tcsncmp are the corresponding Generic functions for _mbsncmp and _mbsnbcmp, respectively. _tccmp is equivalent to _tcsnccmp."

    Read the article

  • Space-saving character encoding for japanese?

    - by Constantin
    In my opinion a common problem: character encoding in combination with a bitmap-font. Most multi-language encodings have an huge space between different character types and even a lot of unused code points there. So if I want to use them I waste a lot of memory (not only for saving multi-byte text - i mean specially for spaces in my bitmap-font) - and VRAM is mostly really valuable... So the only reasonable thing seems to be: Using an custom mapping on my texture for i.e. UTF-8 characters (so that no space is waste). BUT: This effort seems to be same with use an own proprietary character encoding (so also own order of characters in my texture). In my specially case I got texture space for 4096 different characters and need characters to display latin languages as well as japanese (its a mess with utf-8 that only support generall cjk codepages). Had somebody ever a similiar problem (I really wonder, if not)? If theres already any approach? Edit: The same Problem is described here http://www.tonypottier.info/Unicode_And_Japanese_Kanji/ but it doesnt provide an real solution how to save these bitmapfont mappings to utf-8 space efficent. So any further help is welcome!

    Read the article

  • Ruby character encoding problems in netabenas and command wíndow

    - by salgo60
    I use netbeans as development IDE and runs the application from cmd but have problems to display ISO 8859-1 characters like åäö correct in both cmd window and when I run the application from netbeans Question: What is best practice to set it up Right now I do @output.puts indent + "V" + 132.chr + "lkommen till Ruby Camping!" to get ä My environment chcp 65001 Active code page: 65001 ruby main.rb Source encoding: <Encoding:US-ASCII> Default external: #<Encoding:UTF-8> Default internal: nil Locale charmap: "CP65001" where I have in the code def self.printEncoding puts "Source encoding: #{__ENCODING__.inspect}" if defined? __ENCODING__ if defined? Environment::Encoding puts "Default external: #{Encoding.default_external.inspect}" puts "Default internal: #{Encoding.default_internal.inspect}" puts "Locale charmap: #{ Encoding.locale_charmap.inspect}" end puts "LANG environment variable: #{ENV['LANG'].inspect}" unless ENV['LANG'].nil? end ruby -v ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-mingw32]

    Read the article

  • Most Lite-Weight XML Parser with XPath and Wide-char Support

    - by Mystagogue
    I want a lite-weight C++ XML parser/DOM that: Can take UTF-8 as input, and parse into UTF-16. Maybe it does this directly (ideal!), or perhaps it provides a hook for the conversion (such as taking a custom stream object that does the conversion before parsing). Offers some XPath support. I've been looking at RapidXML, the Kranf xmlParser, and pugiXML. The first two of those might permit requirement #1 by way of a hook. The third, pugiXML, supports the #2 requirement. But none of those three fulfill both requirements. What is the smallest (free) library that can handle both requirements?

    Read the article

  • Weird SQL Server 2005 Collation difference between varchar() and nvarchar()

    - by richardtallent
    Can someone please explain this: SELECT CASE WHEN CAST('iX' AS nvarchar(20)) > CAST('-X' AS nvarchar(20)) THEN 1 ELSE 0 END, CASE WHEN CAST('iX' AS varchar(20)) > CAST('-X' AS varchar(20)) THEN 1 ELSE 0 END Results: 0 1 SELECT CASE WHEN CAST('i' AS nvarchar(20)) > CAST('-' AS nvarchar(20)) THEN 1 ELSE 0 END, CASE WHEN CAST('i' AS varchar(20)) > CAST('-' AS varchar(20)) THEN 1 ELSE 0 END Results: 1 1 On the first query, the nvarchar() result is not what I'm expecting, and yet removing the X make the nvarchar() sort happen as expected. (My original queries used the '' and N'' literal syntax to distinguish varchar() and nvarchar() rather than CAST() and got the same result.) Collation setting for the database is SQL_Latin1_General_CP1_CI_AS.

    Read the article

  • Emailing HTML from within an iPhone app is stopping at special characters

    - by user141146
    Hi, I have an iPhone app that will let users email some pre-determined text as HTML. I'm having a problem in that if the text contains special characters within the text (e.g., ampersand &, , <), the NSString variable that I use for sending the body of the email gets truncated at the special character. I'm not sure how to fix this (I tried using the method stringByAddingPercentEscapesUsingEncoding…but this hasn't fixed the problems). Thoughts on what I'm doing wrong / how to fix it? Here is sample code showing what I'm trying to do Thanks!!! - (void)send_an_email:(id)sender { NSString *subject_string = [NSString stringWithFormat:@"Summary of %@", commercial_name]; NSString *body_string = [NSString stringWithFormat:@"%@<br /><br />", [self.dl email_message]]; // email_message returns the body of text that should be shipped as html. If email_message contains special characters, the text truncates at the special character NSString *full_string = [NSString stringWithFormat:@"mailto:?to=&subject=%@&body=%@", [subject_string stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding], [body_string stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]]; [[UIApplication sharedApplication] openURL:[[NSURL alloc] initWithString:full_string]]; }

    Read the article

  • Using Markov models to convert all caps to mixed case and related problems

    - by hippietrail
    I've been thinking about using Markov techniques to restore missing information to natural language text. Restore mixed case to text in all caps Restore accents / diacritics to languages which should have them but have been converted to plain ASCII Convert rough phonetic transcriptions back into native alphabets That seems to be in order of least difficult to most difficult. Basically the problem is resolving ambiguities based on context. I can use Wiktionary as a dictionary and Wikipedia as a corpus using n-grams and Markov chains to resolve the ambiguities. Am I on the right track? Are there already some services, libraries, or tools for this sort of thing? Examples GEORGE LOST HIS SIM CARD IN THE BUSH - George lost his SIM card in the bush tantot il rit a gorge deployee - tantôt il rit à gorge déployée

    Read the article

  • In MySQL how can I tell what character set a particular table is using?

    - by muudscope
    I have a large mysql table that I think might be using the wrong character set. If so I'll need to change it using ALTER TABLE mytable CONVERT TO CHARACTER SET utf8 But since this is a very large table, I'd rather not run this command unless I have to. So my question is, how can I ask mysql what the character set is on a particular table? I can call status in mysql to see the database's character set, but that doesn't necessarily mean all the tables have the same character set, right?

    Read the article

  • Facelets charset problem

    - by Vladimir
    Hi! In my earlier post there was a problem with JSF charset handling, but also the other part of the problem was MySQL connection parameters for inserting data into db. The problem was solved. But, I migrated the same application from JSP to facelets and the same problem happened again. Characters from input fields are replaced when inserting to database (c is replaced with Ä), but data inserted into db from sql scripts with proper charset are displayed correctly. I'm still using registered filter and page templates are used with head meta tag as following: <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-2"> If I insert into h:form tag the following attribute: acceptcharset="iso-8859-2" I get correct characters in Firefox, but not in IE7. Is there anything else I should do to make it work? Thanks in advance.

    Read the article

  • Google app engine error when I login.

    - by zjm1126
    i am using http://code.google.com/p/gaema/source/browse/#hg/demos/webapp, and this is my traceback: Traceback (most recent call last): File "D:\Program Files\Google\google_appengine\google\appengine\ext\webapp\__init__.py", line 510, in __call__ handler.get(*groups) File "D:\gaema\demos\webapp\main.py", line 31, in get google_auth.get_authenticated_user(self._on_auth) File "D:\gaema\demos\webapp\gaema\auth.py", line 641, in get_authenticated_user OpenIdMixin.get_authenticated_user(self, callback) File "D:\gaema\demos\webapp\gaema\auth.py", line 83, in get_authenticated_user url = self._OPENID_ENDPOINT + "?" + urllib.urlencode(args) File "D:\Python25\lib\urllib.py", line 1250, in urlencode v = quote_plus(str(v)) UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128) how to do this thanks updated i change the code from args = dict((k, v[-1]) for k, v in self.request.arguments.iteritems()) args["openid.mode"] = u"check_authentication" url = self._OPENID_ENDPOINT + "?" + urllib.urlencode(args) to args = dict((k, v[-1].encode('utf-8')) for k, v in self.request.arguments.iteritems()) args["openid.mode"] = u"check_authentication" url = self._OPENID_ENDPOINT + "?" + urllib.urlencode(args) but also error.

    Read the article

< Previous Page | 91 92 93 94 95 96 97 98 99 100 101 102  | Next Page >