Search Results

Search found 36172 results on 1447 pages for 'unicode string'.


ICU MessageFormat Currency Value Precision Lost

- by Travis

This may be a niche question, but I'm working with ICU to format currency strings and I've bumped into a situation that I don't quite understand. When using the MessageFormat class, why does it round currency values for certain locales (South Korea, "ko_KR", for example), so that 100.50 becomes ₩101? For most locales (such as the US, "en_US"), the precision of the argument passed in remains untouched (e.g. 100.50 becomes $100.50). I thought this might be a default rounding issue that some locales have (Swiss francs in "fr_CH", for example, have a default rounding of 0.05), but South Korea ("ko_KR") has none. Any ideas?

Does Perl's Net::Cassandra module support UTF-8?

- by knorv

I've run into a really strange UTF-8 problem with Net::Cassandra::Easy (which is built upon Net::Cassandra): UTF-8 strings written to Cassandra are garbled upon retrieval. The following code shows the problem:

    use strict;
    use utf8;
    use warnings;
    use Net::Cassandra::Easy;

    binmode(STDOUT, ":utf8");

    my $key = "some_key";
    my $column = "some_column";
    my $set_value = "\x{2603}";

    my $cassandra = Net::Cassandra::Easy->new(keyspace => "Keyspace1", server => "localhost");
    $cassandra->connect();
    $cassandra->mutate([$key], family => "Standard1", insertions => { $column => $set_value });

    my $result = $cassandra->get([$key], family => "Standard1", standard => 1);
    my $get_value = $result->{$key}->{"Standard1"}->{$column};

    if ($set_value eq $get_value) {
        # this is the path I want.
        print "OK: $set_value == $get_value\n";
    } else {
        # this is the path I get.
        print "ERR: $set_value != $get_value\n";
    }

When running the code above, $set_value eq $get_value evaluates to false. What am I doing wrong?

Converting latin mysql data to utf8

- by Oguz

I want to use UTF-8 from now on, but all my data is latin1. What is an efficient way to convert the data?

Dreaded python encoding errors, how to stop them?

- by Rhubarb

These have been plaguing me endlessly. Why? It seems that my console can't handle the encoding, while I take it that my browser and word processor can. I don't have a master list of all the possible characters that it's choking on. What is the best way to relieve this without modifying my data?

    'charmap' codec can't encode character u'\xca'
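One way to print such text without touching the underlying data is to transcode only the copy that goes to the console. The following is a minimal sketch written for Python 3 (the question appears to use Python 2); the helper name `console_safe` is hypothetical, and the default target encoding is only an assumption standing in for whatever the console actually reports:

```python
def console_safe(text, encoding="ascii"):
    """Return a copy of text that the given console encoding can represent.

    Characters the codec cannot encode are replaced with '?'.
    The original string is left untouched.
    """
    return text.encode(encoding, errors="replace").decode(encoding)


original = "caf\xca"            # contains U+00CA, as in the error message
print(console_safe(original))   # display copy only; 'original' is unchanged
```

Passing `errors="backslashreplace"` instead of `"replace"` keeps the code point visible as an escape rather than a question mark.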

Python interface to PayPal - urllib.urlencode non-ASCII characters failing

- by krys

I am trying to implement PayPal IPN functionality. The basic protocol is as follows:

1. The client is redirected from my site to PayPal's site to complete payment. He logs into his account and authorizes payment.
2. PayPal calls a page on my server, passing in details as POST. Details include a person's name, address, payment info, etc.
3. I need to call a URL on PayPal's site internally from my processing page, passing back all the params that were passed in above and an additional one called 'cmd' with a value of '_notify-validate'.

When I try to urllib.urlencode the params which PayPal has sent to me, I get:

    While calling send_response_to_paypal.
    Traceback (most recent call last):
      File "<snip>/account/paypal/views.py", line 108, in process_paypal_ipn
        verify_result = send_response_to_paypal(params)
      File "<snip>/account/paypal/views.py", line 41, in send_response_to_paypal
        params = urllib.urlencode(params)
      File "/usr/local/lib/python2.6/urllib.py", line 1261, in urlencode
        v = quote_plus(str(v))
    UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 9: ordinal not in range(128)

I understand that urlencode does ASCII encoding, and that in certain cases a user's contact info can contain non-ASCII characters. This is understandable. My question is: how do I encode non-ASCII characters for POSTing to a URL using urllib2.urlopen(req) (or another method)?

Details: I read the params in PayPal's original request as follows (the GET is for testing):

    def read_ipn_params(request):
        if request.POST:
            params = request.POST.copy()
            if "ipn_auth" in request.GET:
                params["ipn_auth"] = request.GET["ipn_auth"]
            return params
        else:
            return request.GET.copy()

The code I use for sending the request back to PayPal from the processing page is:

    def send_response_to_paypal(params):
        params['cmd'] = '_notify-validate'
        params = urllib.urlencode(params)
        req = urllib2.Request(PAYPAL_API_WEBSITE, params)
        req.add_header("Content-type", "application/x-www-form-urlencoded")
        response = urllib2.urlopen(req)
        status = response.read()
        if not status == "VERIFIED":
            logging.warn("PayPal cannot verify IPN responses: " + status)
            return False
        return True

Obviously, the problem only arises if someone's name, address or other field used for the PayPal payment does not fall into the ASCII range.
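A common approach to this kind of error is to encode every text value to bytes before URL-encoding it. The sketch below is written for Python 3, where `urllib.parse.urlencode` replaces the `urllib.urlencode` used in the question; the helper name `encode_params` and the sample field `first_name` are hypothetical, and UTF-8 is only an assumed charset, since the sketch does not detect what the sender actually used:

```python
from urllib.parse import urlencode


def encode_params(params, charset="utf-8"):
    """URL-encode a mapping after encoding every text value to bytes."""
    encoded = {}
    for key, value in params.items():
        if isinstance(value, str):
            # Explicit encoding avoids any implicit ASCII conversion.
            value = value.encode(charset)
        encoded[key] = value
    return urlencode(encoded)


body = encode_params({"first_name": "Jos\u00e9", "cmd": "_notify-validate"})
print(body)
```

In Python 2 the same idea would mean calling `.encode(charset)` on each unicode value before handing the dictionary to `urllib.urlencode`.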

PHP: Convert curl_exec output to UTF8

- by Paul Tarjan

I would like to work only with UTF-8. The problem is that I don't know the charset of every web page. How can I detect it and convert to UTF-8?

    <?php
    $url = "http://vkontakte.ru";
    $ch = curl_init($url);
    $options = array(
        CURLOPT_RETURNTRANSFER => true,
    );
    curl_setopt_array($ch, $options);
    $data = curl_exec($ch);
    // $data = magic($data);
    print $data;

See this at: http://paulisageek.com/tmp/curl-utf8

What is magic()?

Python: UnicodeEncodeError when reading from stdin

- by hansfbaier

When running a Python program that reads from stdin, I get the following error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 320: ordinal not in range(128) How can I fix it?
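An error like this usually means bytes were decoded with the default ASCII codec. One way around it is to read raw bytes and decode them explicitly. The following is a minimal Python 3 sketch (the traceback suggests the question is about Python 2); the helper name `read_text` is hypothetical, and UTF-8 is an assumption about the input, consistent with the 0xc3 lead byte in the message but not confirmed by it:

```python
import io


def read_text(byte_stream, encoding="utf-8"):
    """Read all bytes from a binary stream and decode them explicitly."""
    return byte_stream.read().decode(encoding)


# A BytesIO stands in for standard input so the sketch is self-contained.
sample = io.BytesIO(b"na\xc3\xafve\n")
print(read_text(sample))
```

In a real script the binary stream would be `sys.stdin.buffer` rather than the `BytesIO` stand-in.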

feedparser fails during script run, but can't reproduce in interactive python console

- by Rhubarb

It's failing with this when I run my script in Eclipse or in IPython:

    'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)

I don't know why, but when I simply execute the feedparser.parse(url) statement using the same url, there is no error thrown. This is stumping me big time. The code is as simple as:

    try:
        d = feedparser.parse(url)
    except Exception, e:
        logging.error('Error while retrieving feed.')
        logging.error(e)
        logging.error(formatExceptionInfo(None))
        logging.error(formatExceptionInfo1())

Here is the stack trace:

    d = feedparser.parse(url)
      File "C:\Python26\lib\site-packages\feedparser.py", line 2623, in parse
        feedparser.feed(data)
      File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
        sgmllib.SGMLParser.feed(self, data)
      File "C:\Python26\lib\sgmllib.py", line 104, in feed
        self.goahead(0)
      File "C:\Python26\lib\sgmllib.py", line 143, in goahead
        k = self.parse_endtag(i)
      File "C:\Python26\lib\sgmllib.py", line 320, in parse_endtag
        self.finish_endtag(tag)
      File "C:\Python26\lib\sgmllib.py", line 360, in finish_endtag
        self.unknown_endtag(tag)
      File "C:\Python26\lib\site-packages\feedparser.py", line 476, in unknown_endtag
        method()
      File "C:\Python26\lib\site-packages\feedparser.py", line 1318, in _end_content
        value = self.popContent('content')
      File "C:\Python26\lib\site-packages\feedparser.py", line 700, in popContent
        value = self.pop(tag)
      File "C:\Python26\lib\site-packages\feedparser.py", line 641, in pop
        output = _resolveRelativeURIs(output, self.baseuri, self.encoding)
      File "C:\Python26\lib\site-packages\feedparser.py", line 1594, in _resolveRelativeURIs
        p.feed(htmlSource)
      File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
        sgmllib.SGMLParser.feed(self, data)
      File "C:\Python26\lib\sgmllib.py", line 104, in feed
        self.goahead(0)
      File "C:\Python26\lib\sgmllib.py", line 138, in goahead
        k = self.parse_starttag(i)
      File "C:\Python26\lib\sgmllib.py", line 296, in parse_starttag
        self.finish_starttag(tag, attrs)
      File "C:\Python26\lib\sgmllib.py", line 338, in finish_starttag
        self.unknown_starttag(tag, attrs)
      File "C:\Python26\lib\site-packages\feedparser.py", line 1588, in unknown_starttag
        attrs = [(key, ((tag, key) in self.relative_uris) and self.resolveURI(value) or value) for key, value in attrs]
      File "C:\Python26\lib\site-packages\feedparser.py", line 1584, in resolveURI
        return _urljoin(self.baseuri, uri)
      File "C:\Python26\lib\site-packages\feedparser.py", line 286, in _urljoin
        return urlparse.urljoin(base, uri)
      File "C:\Python26\lib\urlparse.py", line 215, in urljoin
        params, query, fragment))
      File "C:\Python26\lib\urlparse.py", line 184, in urlunparse
        return urlunsplit((scheme, netloc, url, query, fragment))
      File "C:\Python26\lib\urlparse.py", line 192, in urlunsplit
        url = scheme + ':' + url
      File "C:\Python26\lib\encodings\cp1252.py", line 15, in decode
        return codecs.charmap_decode(input,errors,decoding_table)

i18n / Markdown - Does Markdown support internationalization?

- by John Himmelman

I'm building a CMS which needs to manage content in English, Chinese, and Spanish at a minimum. Do most Markdown implementations handle UTF-8 encoded text? Is the Markdown language designed to be used with non-English languages? I'm currently using Markdown Extra by Michel Fortin.

Non-ascii characters in velocity templates are broken when displayed

- by glaz666

Hi! I have non-ascii chars in velocity template files. And when processed they are garbled. The files are saved in UTF-8 encoding and response header contentType is also set to text/html;charset=UTF-8. What else can be done?

Windows C API for UTF8 to 1252

- by Paul

I'm familiar with WideCharToMultiByte and MultiByteToWideChar conversions and could use these to do something like: UTF8 -> UTF16 -> 1252. I know that iconv will do what I need, but does anybody know of any MS libs that will allow this in a single call? I should probably just pull in the iconv library, but am feeling lazy. Thanks

regular expression to read the string between <title> and </title>

- by user262325

Hello everyone, I hope to read the contents between <title> and </title> in an HTML string. I think the pattern in Objective-C should be:

    @"<title([\\s\\S]*)</title>"

Below is the code, rewritten for regular expressions:

    //source of NSStringCategory.h
    #import <Foundation/Foundation.h>
    #import <regex.h>

    @interface NSStringCategory:NSObject
    {
        regex_t preg;
    }
    -(id)initWithPattern:(NSString *)pattern options:(int)options;
    -(void)dealloc;
    -(BOOL)matchesString:(NSString *)string;
    -(NSString *)matchedSubstringOfString:(NSString *)string;
    -(NSArray *)capturedSubstringsOfString:(NSString *)string;
    +(NSStringCategory *)regexWithPattern:(NSString *)pattern options:(int)options;
    +(NSStringCategory *)regexWithPattern:(NSString *)pattern;
    +(NSString *)null;
    +(void)initialize;
    @end

    @interface NSString (NSStringCategory)
    -(BOOL)matchedByPattern:(NSString *)pattern options:(int)options;
    -(BOOL)matchedByPattern:(NSString *)pattern;
    -(NSString *)substringMatchedByPattern:(NSString *)pattern options:(int)options;
    -(NSString *)substringMatchedByPattern:(NSString *)pattern;
    -(NSArray *)substringsCapturedByPattern:(NSString *)pattern options:(int)options;
    -(NSArray *)substringsCapturedByPattern:(NSString *)pattern;
    -(NSString *)escapedPattern;
    @end

and the .m file:

    #import "NSStringCategory.h"

    static NSString *nullstring=nil;

    @implementation NSStringCategory

    -(id)initWithPattern:(NSString *)pattern options:(int)options
    {
        if(self=[super init])
        {
            int err=regcomp(&preg,[pattern UTF8String],options|REG_EXTENDED);
            if(err)
            {
                char errbuf[256];
                regerror(err,&preg,errbuf,sizeof(errbuf));
                [NSException raise:@"CSRegexException" format:@"Could not compile regex \"%@\": %s",pattern,errbuf];
            }
        }
        return self;
    }

    -(void)dealloc
    {
        regfree(&preg);
        [super dealloc];
    }

    -(BOOL)matchesString:(NSString *)string
    {
        if(regexec(&preg,[string UTF8String],0,NULL,0)==0) return YES;
        return NO;
    }

    -(NSString *)matchedSubstringOfString:(NSString *)string
    {
        const char *cstr=[string UTF8String];
        regmatch_t match;
        if(regexec(&preg,cstr,1,&match,0)==0)
        {
            return [[[NSString alloc] initWithBytes:cstr+match.rm_so length:match.rm_eo-match.rm_so encoding:NSUTF8StringEncoding] autorelease];
        }
        return nil;
    }

    -(NSArray *)capturedSubstringsOfString:(NSString *)string
    {
        const char *cstr=[string UTF8String];
        int num=preg.re_nsub+1;
        regmatch_t *matches=calloc(sizeof(regmatch_t),num);
        if(regexec(&preg,cstr,num,matches,0)==0)
        {
            NSMutableArray *array=[NSMutableArray arrayWithCapacity:num];
            int i;
            for(i=0;i<num;i++)
            {
                NSString *str;
                if(matches[i].rm_so==-1&&matches[i].rm_eo==-1) str=nullstring;
                else str=[[[NSString alloc] initWithBytes:cstr+matches[i].rm_so length:matches[i].rm_eo-matches[i].rm_so encoding:NSUTF8StringEncoding] autorelease];
                [array addObject:str];
            }
            free(matches);
            return [NSArray arrayWithArray:array];
        }
        free(matches);
        return nil;
    }

    +(NSStringCategory *)regexWithPattern:(NSString *)pattern options:(int)options
    {
        return [[[NSStringCategory alloc] initWithPattern:pattern options:options] autorelease];
    }

    +(NSStringCategory *)regexWithPattern:(NSString *)pattern
    {
        return [[[NSStringCategory alloc] initWithPattern:pattern options:0] autorelease];
    }

    +(NSString *)null
    {
        return nullstring;
    }

    +(void)initialize
    {
        if(!nullstring) nullstring=[[NSString alloc] initWithString:@""];
    }

    @end

    @implementation NSString (NSStringCategory)

    -(BOOL)matchedByPattern:(NSString *)pattern options:(int)options
    {
        NSStringCategory *re=[NSStringCategory regexWithPattern:pattern options:options|REG_NOSUB];
        return [re matchesString:self];
    }

    -(BOOL)matchedByPattern:(NSString *)pattern
    {
        return [self matchedByPattern:pattern options:0];
    }

    -(NSString *)substringMatchedByPattern:(NSString *)pattern options:(int)options
    {
        NSStringCategory *re=[NSStringCategory regexWithPattern:pattern options:options];
        return [re matchedSubstringOfString:self];
    }

    -(NSString *)substringMatchedByPattern:(NSString *)pattern
    {
        return [self substringMatchedByPattern:pattern options:0];
    }

    -(NSArray *)substringsCapturedByPattern:(NSString *)pattern options:(int)options
    {
        NSStringCategory *re=[NSStringCategory regexWithPattern:pattern options:options];
        return [re capturedSubstringsOfString:self];
    }

    -(NSArray *)substringsCapturedByPattern:(NSString *)pattern
    {
        return [self substringsCapturedByPattern:pattern options:0];
    }

    -(NSString *)escapedPattern
    {
        int len=[self length];
        NSMutableString *escaped=[NSMutableString stringWithCapacity:len];
        for(int i=0;i<len;i++)
        {
            unichar c=[self characterAtIndex:i];
            if(c=='^'||c=='.'||c=='['||c=='$'||c=='('||c==')'
               ||c=='|'||c=='*'||c=='+'||c=='?'||c=='{'||c=='\\')
                [escaped appendFormat:@"\\%C",c];
            else
                [escaped appendFormat:@"%C",c];
        }
        return [NSString stringWithString:escaped];
    }

    @end

I use the code below to get the string between "<title>" and "</title>":

    NSStringCategory *a=[[NSStringCategory alloc] initWithPattern:@"<title([\s\S]*)</title>" options:0];//

Unfortunately, [a matchedSubstringOfString:response] always returns nil. I do not know if the regular expression is wrong or if there is some other reason. Any comment is welcome. Thanks, interdev

Unexpected output of std::wcout << L"élève"; in Windows Shell

- by chmike

While testing some functions to convert strings between wchar_t and UTF-8, I met the following weird result with Visual C++ Express 2008: std::wcout << L"élève" << std::endl; prints out "ÚlÞve:", which is obviously not what is expected. This is obviously a bug. How can that be? How am I supposed to deal with such a "feature"?

Can you get access to the NumberFormatter used by ICU MessageFormat

- by Travis

This may be a niche question, but I'm working with ICU to format currency strings and I've bumped into a situation that I don't quite understand. When using the MessageFormat class, is it possible to get access to the NumberFormat object it uses to format currency strings? When you create a NumberFormat instance yourself, you can specify attributes like the precision and rounding used when creating currency strings. I have an issue where, for the South Korean locale ("ko_KR"), the MessageFormat class seems to create currency strings with rounding (100.50 becomes ₩100). In areas where I use NumberFormat directly, I set setMaximumFractionDigits and setMinimumFractionDigits to 2, but I can't seem to set this in the MessageFormat. Any ideas?

Writing UTF8 text to file

- by sonofdelphi

I am using the following function to save text to a file (on IE-8 w/ActiveX).

    function saveFile(strFullPath, strContent)
    {
        var fso = new ActiveXObject( "Scripting.FileSystemObject" );
        var flOutput = fso.CreateTextFile( strFullPath, true ); //true for overwrite
        flOutput.Write( strContent );
        flOutput.Close();
    }

The code works fine if the text is fully Latin-9, but when the text contains even a single UTF-8 encoded character, the write fails. The ActiveX FileSystemObject does not support UTF-8, it seems. I tried UTF-16 encoding the text first, but the result was garbled. What is a workaround?

UnicodeDecodeError on attempt to save file through django default filebased backend

- by Ivan Kuznetsov

When I attempt to add a file with Russian symbols in its name to the model instance through the default instance.file_field.save method, I get a UnicodeDecodeError (ASCII decoding error, not in range(128)) from the storage backend (the stack trace ends at os.path.exists). If I write this file through the default Python file open/write, all goes right. All filenames are in UTF-8. I get this error only on the testing Gentoo machine; on my Ubuntu workstation all works fine.

    class Article(models.Model):
        file = models.FileField(null=True, blank=True, max_length = 300, upload_to='articles_files/%Y/%m/%d/')

    Traceback:
    File "/usr/lib/python2.6/site-packages/django/core/handlers/base.py" in get_response
      100. response = callback(request, *callback_args, **callback_kwargs)
    File "/usr/lib/python2.6/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
      24. return view_func(request, *args, **kwargs)
    File "/var/www/localhost/help/wiki/views.py" in edit_article
      338. new_article.file.save(fp, fi, save=True)
    File "/usr/lib/python2.6/site-packages/django/db/models/fields/files.py" in save
      92. self.name = self.storage.save(name, content)
    File "/usr/lib/python2.6/site-packages/django/core/files/storage.py" in save
      47. name = self.get_available_name(name)
    File "/usr/lib/python2.6/site-packages/django/core/files/storage.py" in get_available_name
      73. while self.exists(name):
    File "/usr/lib/python2.6/site-packages/django/core/files/storage.py" in exists
      196. return os.path.exists(self.path(name))
    File "/usr/lib/python2.6/genericpath.py" in exists
      18. st = os.stat(path)

    Exception Type: UnicodeEncodeError at /edit/
    Exception Value: ('ascii', u'/var/www/localhost/help/i/articles_files/2010/03/17/\u041f\u0440\u0438\u0432\u0435\u0442', 52, 58, 'ordinal not in range(128)')

How do I eliminate TT's "Wide character in print" warning ?

- by planetp

I have this warning every time I run my CGI script (output is rendered by Template::Toolkit):

    Wide character in print at /usr/local/lib/perl5/site_perl/5.8.9/mach/Template.pm line 163.

What's the right way to eliminate it? I create the tt object using this config:

    my %config = (
        ENCODING     => 'utf8',
        INCLUDE_PATH => $ENV{TEMPLATES_DIR},
        EVAL_PERL    => 1,
    );
    my $tt = Template->new(\%config);

PHP: Cyrillic characters not displayed correctly

- by user295502

Recently I switched hosting from one provider to the other and I have problems displaying Cyrillic characters. The characters which are read from the database are displayed correctly, but characters which are hardcoded in the php file aren't (they are displayed as question marks). The files which contain the php source code are saved in utf-8 form. Help anybody?

Display WCHAR Strings in Xcode Debugger

- by Nicholaz

I'd like to preview WCHAR strings in the variable display of the Xcode 3.2 debugger. Basically, if I have

    WCHAR wtext[128];
    wcscpy(wtext, L"Hello World");

I'd like to see "Hello World" for wtext when tracing into the function.

Is there any reason to prefer UTF-16 over UTF-8?

- by Oak

Examining the attributes of UTF-16 and UTF-8, I can't find any reason to prefer UTF-16. However, checking out Java and C#, it looks like strings and chars there default to UTF-16. I was thinking that it might be for historic reasons, or perhaps for performance reasons, but couldn't find any information. Does anyone know why these languages chose UTF-16? And is there any valid reason for me to do that as well?
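One measurable difference between the two encodings is storage size, which depends on the script being encoded. The following Python sketch compares byte counts for a few sample strings; the helper name `encoded_sizes` and the samples are illustrative assumptions, and size is only one of several factors that may matter:

```python
def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for a string.

    'utf-16-le' is used so that no byte-order mark is counted.
    """
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# ASCII text, CJK text, and a character outside the Basic Multilingual Plane.
for sample in ("hello", "\u65e5\u672c\u8a9e", "\U0001F600"):
    print(repr(sample), encoded_sizes(sample))
```

ASCII text is half the size in UTF-8, the CJK sample is smaller in UTF-16, and the character outside the Basic Multilingual Plane takes four bytes in both.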

When uploading Arabic files in Spring, filename ends up with XML entities instead of Arabic glyphs

- by sword101

I am using Spring upload to upload files. When uploading an Arabic file and getting the original file name in the controller, I get something like: &#1575;&#1604;&#1605;&#1594;&#1601;&#1604;&#1610;&#1606;.png I expect it to be: المغفلين.png Any ideas why this problem occurs?
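If the filename really does arrive as numeric character references, one workaround is to decode the references back into characters on the server. The sketch below shows the idea in Python rather than in the question's Java/Spring stack; the helper name `decode_filename` is hypothetical, and it does not address why the client sent references in the first place:

```python
import html


def decode_filename(name):
    """Turn numeric character references such as '&#1575;' into characters."""
    return html.unescape(name)


# Two references (Arabic alef and lam) followed by a plain extension.
print(decode_filename("&#1575;&#1604;.png"))
```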

How can I use ToUnicode without breaking dead key support?

- by Cypherjb

A similar question has already been asked, so I'm not going to waste time re-explaining it, an existing discussion can be found here: http://stackoverflow.com/questions/1964614/toascii-tounicode-in-a-keyboard-hook-destroys-dead-keys The reason I'm posting a new question however is that I seem to have come across a 'solution', but I'm not quite sure how to implement it. This blog post seems to propose a solution to the problem of ToUnicode killing dead-key support: http://blogs.msdn.com/michkap/archive/2005/01/19/355870.aspx However I'm not sure how to implement the suggested solution. A push in the right direction would be greatly appreciated. To be clear, the part I'm referring to is this: "There are two ways to work around this: 1) You can keep calling ToUnicode with the same info until it is cleared out and then call it one more time to put the state back where it was if you had never typed anything, or 2) You can load all of the keyboard info ahead of time and then when they type information you can look up in your own info cache what the keystrokes mean, without having to call APIs later." I'm not quite sure how to do either of those things (keyboards and internationalization are far from my strong point), so any help would be greatly appreciated. Thanks

python unichr problem

- by jacob

I've got a problem with unichr() on my server. Please see below.

On my server (Ubuntu 9.04):

    >>> print unichr(255)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in position 0: ordinal not in range(128)

On my desktop (Ubuntu 9.10):

    >>> print unichr(255)
    ÿ

I'm fairly new to Python so I don't know how to solve this. Anyone care to help? Thanks.
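The differing results suggest that the two machines report different output encodings. One way to make the output independent of that setting is to encode explicitly and write bytes. This is a minimal Python 3 sketch (`chr` replaces Python 2's `unichr`); the helper name `write_encoded` is hypothetical, and UTF-8 is an assumption about what the terminal can display:

```python
import io


def write_encoded(text, binary_stream, encoding="utf-8"):
    """Encode text explicitly and write the bytes to a binary stream."""
    binary_stream.write(text.encode(encoding))


# A BytesIO stands in for the terminal so the sketch is self-contained.
buffer = io.BytesIO()
write_encoded(chr(255), buffer)
print(buffer.getvalue())
```

In a real script the target would be `sys.stdout.buffer` instead of the `BytesIO` stand-in.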

Amazon SQS invalid binary character in message body

- by letronje

I have a web app that sends messages to an Amazon SQS queue. The Amazon SQS lib throws an 'AmazonSQSException' since the message contained an invalid binary character. The message is the referrer obtained from an incoming HTTP request. This is what it looks like:

http://ads.vrx.adbrite.com/adserver/display_iab_ads.php?sid=1220459&title_color=0000FF&text_color=000000&background_color=FFFFFF&border_color=CCCCCC&url_color=008000&newwin=0&zs=3330305f323530&width=300&height=250&url=http%3A%2F%2Funblockorkutproxy.com%2Fsearch.php%2FOi8vZG93%2FbmxvYWRz%2FLnppZGR1%2FLmNvbS9k%2Fb3dubG9h%2FZGZpbGUv%2FNTY5MTQ3%2FNi9NeUN1%2FdGVHaXJs%2FZnJpZW5k%2FWmFoaXJh%2FLndtdi5o%2FdG1s%2Fb0%2F^FÃ´}ÃºÃ<99Ã«)j

It looks like the characters at the end of the URL are the invalid ones. Is there an easy way to filter out characters that are not accepted by Amazon? Here are the characters allowed by Amazon in the message body. I am not sure what regex I should use to replace invalid characters with ''.
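A regex that removes everything outside an allowed set can be sketched as follows, in Python rather than the question's own stack. The ranges below are an assumption: they are the XML 1.0 character ranges (#x9, #xA, #xD, #x20 to #xD7FF, #xE000 to #xFFFD, #x10000 to #x10FFFF), which is what the SQS documentation is understood to list; the names `_INVALID` and `strip_invalid` are hypothetical:

```python
import re

# Matches any character NOT in the assumed allowed ranges.
_INVALID = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid(message):
    """Remove characters that fall outside the assumed allowed ranges."""
    return _INVALID.sub("", message)


# U+0006 is a control character outside the allowed set.
print(strip_invalid("ab\x06c"))
```

Stripping silently changes the message, so logging the original value before filtering may be worthwhile.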

to escape or not to escape: well formed XHTML with diacritics

- by andresmh

Say that you have an XHTML document in English but it has accented characters (e.g. meta name="author" content="José"). Let's say you have no control over the HTTP headers. Should the characters be replaced with their corresponding named entities (e.g. &aacute;, etc.)? Should the doc type and the xml:lang attribute be set to English? I know I can check the W3C recommendation but I am asking more from a practical point of view.

