Search Results

Search found 36172 results on 1447 pages for 'unicode string'.

Page 94/1447 | < Previous Page | 90 91 92 93 94 95 96 97 98 99 100 101  | Next Page >

  • ICU MessageFormat Currency Value Precison Lost

    - by Travis
    This may be a niche question but I'm working with ICU to format currency strings. I've bumped into a situation that I don't quite understand. When using the MesssageFormat class, why for a certain locale (Korea "ko"KR" for example) does it round currency values (e.g. 100.50 becomes ?101). For most locales (such as the US "en_US"), the precision of the argument passed in remains untouched (e.g. 100.50 becomes $100.50). I thought this might be a default rounding issue that some locales have (Swiss Francs "fr_CH" for example have a default 0.05 rounding) but South Korea "ko_KR" has none. Any ideas?

    Read the article

  • Does Perl's Net::Cassandra module support UTF-8?

    - by knorv
    I've run into a really strange UTF-8 problem with Net::Cassandra::Easy (which is built upon Net::Cassandra): UTF-8 strings written to Cassandra are garbled upon retrieval. The following code shows the problem: use strict; use utf8; use warnings; use Net::Cassandra::Easy; binmode(STDOUT, ":utf8"); my $key = "some_key"; my $column = "some_column"; my $set_value = "\x{2603}"; my $cassandra = Net::Cassandra::Easy->new(keyspace => "Keyspace1", server => "localhost"); $cassandra->connect(); $cassandra->mutate([$key], family => "Standard1", insertions => { $column => $set_value }); my $result = $cassandra->get([$key], family => "Standard1", standard => 1); my $get_value = $result->{$key}->{"Standard1"}->{$column}; if ($set_value eq $get_value) { # this is the path I want. print "OK: $set_value == $get_value\n"; } else { # this is the path I get. print "ERR: $set_value != $get_value\n"; } When running the code above $set_value eq $get_value evaluates to false. What am I doing wrong?

    Read the article

  • Dreaded python encoding errors, how to stop them?

    - by Rhubarb
    These have been plaguing me endlessly. Why? It seems that my console can't handle the encoding. I take it that the my browser and word processor can handle it. I don't have a master list of all the possible characters that it's choking on. What is the best way to relieve this without modifying my data? 'charmap' codec can't encode character u'\xca'

    Read the article

  • PHP: Convert curl_exec output to UTF8

    - by Paul Tarjan
    I would like to only work with UTF8. The problem is I don't know the charset of every webpage. How can I detect it and convert to UTF8? <?php $url = "http://vkontakte.ru"; $ch = curl_init($url); $options = array( CURLOPT_RETURNTRANSFER => true, ); curl_setopt_array($ch, $options); $data = curl_exec($ch); // $data = magic($data); print $data; See this at: http://paulisageek.com/tmp/curl-utf8 What is magic()?

    Read the article

  • Python interface to PayPal - urllib.urlencode non-ASCII characters failing

    - by krys
    I am trying to implement PayPal IPN functionality. The basic protocol is as such: The client is redirected from my site to PayPal's site to complete payment. He logs into his account, authorizes payment. PayPal calls a page on my server passing in details as POST. Details include a person's name, address, and payment info etc. I need to call a URL on PayPal's site internally from my processing page passing back all the params that were passed in abovem and an additional one called 'cmd' with a value of '_notify-validate'. When I try to urllib.urlencode the params which PayPal has sent to me, I get a: While calling send_response_to_paypal. Traceback (most recent call last): File "<snip>/account/paypal/views.py", line 108, in process_paypal_ipn verify_result = send_response_to_paypal(params) File "<snip>/account/paypal/views.py", line 41, in send_response_to_paypal params = urllib.urlencode(params) File "/usr/local/lib/python2.6/urllib.py", line 1261, in urlencode v = quote_plus(str(v)) UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 9: ordinal not in range(128) I understand that urlencode does ASCII encoding, and in certain cases, a user's contact info can contain non-ASCII characters. This is understandable. My question is, how do I encode non-ASCII characters for POSTing to a URL using urllib2.urlopen(req) (or other method) Details: I read the params in PayPal's original request as follows (the GET is for testing): def read_ipn_params(request): if request.POST: params= request.POST.copy() if "ipn_auth" in request.GET: params["ipn_auth"]=request.GET["ipn_auth"] return params else: return request.GET.copy() The code I use for sending back the request to PayPal from the processing page is: def send_response_to_paypal(params): params['cmd']='_notify-validate' params = urllib.urlencode(params) req = urllib2.Request(PAYPAL_API_WEBSITE, params) req.add_header("Content-type", "application/x-www-form-urlencoded") response = urllib2.urlopen(req) status = response.read() if not status == "VERIFIED": logging.warn("PayPal cannot verify IPN responses: " + status) return False return True Obviously, the problem only arises if someone's name or address or other field used for the PayPal payment does not fall into the ASCII range.

    Read the article

  • feedparser fails during script run, but can't reproduce in interactive python console

    - by Rhubarb
    It's failing with this when I run eclipse or when I run my script in iPython: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128) I don't know why, but when I simply execute the feedparse.parse(url) statement using the same url, there is no error thrown. This is stumping me big time. The code is as simple as: try: d = feedparser.parse(url) except Exception, e: logging.error('Error while retrieving feed.') logging.error(e) logging.error(formatExceptionInfo(None)) logging.error(formatExceptionInfo1()) Here is the stack trace: d = feedparser.parse(url) File "C:\Python26\lib\site-packages\feedparser.py", line 2623, in parse feedparser.feed(data) File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed sgmllib.SGMLParser.feed(self, data) File "C:\Python26\lib\sgmllib.py", line 104, in feed self.goahead(0) File "C:\Python26\lib\sgmllib.py", line 143, in goahead k = self.parse_endtag(i) File "C:\Python26\lib\sgmllib.py", line 320, in parse_endtag self.finish_endtag(tag) File "C:\Python26\lib\sgmllib.py", line 360, in finish_endtag self.unknown_endtag(tag) File "C:\Python26\lib\site-packages\feedparser.py", line 476, in unknown_endtag method() File "C:\Python26\lib\site-packages\feedparser.py", line 1318, in _end_content value = self.popContent('content') File "C:\Python26\lib\site-packages\feedparser.py", line 700, in popContent value = self.pop(tag) File "C:\Python26\lib\site-packages\feedparser.py", line 641, in pop output = _resolveRelativeURIs(output, self.baseuri, self.encoding) File "C:\Python26\lib\site-packages\feedparser.py", line 1594, in _resolveRelativeURIs p.feed(htmlSource) File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed sgmllib.SGMLParser.feed(self, data) File "C:\Python26\lib\sgmllib.py", line 104, in feed self.goahead(0) File "C:\Python26\lib\sgmllib.py", line 138, in goahead k = self.parse_starttag(i) File "C:\Python26\lib\sgmllib.py", line 296, in parse_starttag self.finish_starttag(tag, attrs) File "C:\Python26\lib\sgmllib.py", line 338, in finish_starttag self.unknown_starttag(tag, attrs) File "C:\Python26\lib\site-packages\feedparser.py", line 1588, in unknown_starttag attrs = [(key, ((tag, key) in self.relative_uris) and self.resolveURI(value) or value) for key, value in attrs] File "C:\Python26\lib\site-packages\feedparser.py", line 1584, in resolveURI return _urljoin(self.baseuri, uri) File "C:\Python26\lib\site-packages\feedparser.py", line 286, in _urljoin return urlparse.urljoin(base, uri) File "C:\Python26\lib\urlparse.py", line 215, in urljoin params, query, fragment)) File "C:\Python26\lib\urlparse.py", line 184, in urlunparse return urlunsplit((scheme, netloc, url, query, fragment)) File "C:\Python26\lib\urlparse.py", line 192, in urlunsplit url = scheme + ':' + url File "C:\Python26\lib\encodings\cp1252.py", line 15, in decode return codecs.charmap_decode(input,errors,decoding_table)

    Read the article

  • Windows C API for UTF8 to 1252

    - by Paul
    I'm familiar with WideCharToMultiByte and MultiByteToWideChar conversions and could use these to do something like: UTF8 - UTF16 - 1252 I know that iconv will do what I need, but does anybody know of any MS libs that will allow this in a single call? I should probably just pull in the iconv library, but am feeling lazy. Thanks

    Read the article

  • Writing UTF8 text to file

    - by sonofdelphi
    I am using the following function to save text to a file (on IE-8 w/ActiveX). function saveFile(strFullPath, strContent) { var fso = new ActiveXObject( "Scripting.FileSystemObject" ); var flOutput = fso.CreateTextFile( strFullPath, true ); //true for overwrite flOutput.Write( strContent ); flOutput.Close(); } The code works fine if the text is fully Latin-9 but when the text contains even a single UTF-8 encoded character, the write fails. The ActiveX FileSystemObject does not support UTF-8, it seems. I tried UTF-16 encoding the text first but the result was garbled. What is a workaround?

    Read the article

  • regular expression to read the string between <title> and </title>

    - by user262325
    Hello every one I hope to read the contents between and in a html string. I think it should be in objective-c @"<title([\\s\\S]*)</title>" below are the codes that rewrited for regular expression //source of NSStringCategory.h #import <Foundation/Foundation.h> #import <regex.h> @interface NSStringCategory:NSObject { regex_t preg; } -(id)initWithPattern:(NSString *)pattern options:(int)options; -(void)dealloc; -(BOOL)matchesString:(NSString *)string; -(NSString *)matchedSubstringOfString:(NSString *)string; -(NSArray *)capturedSubstringsOfString:(NSString *)string; +(NSStringCategory *)regexWithPattern:(NSString *)pattern options:(int)options; +(NSStringCategory *)regexWithPattern:(NSString *)pattern; +(NSString *)null; +(void)initialize; @end @interface NSString (NSStringCategory) -(BOOL)matchedByPattern:(NSString *)pattern options:(int)options; -(BOOL)matchedByPattern:(NSString *)pattern; -(NSString *)substringMatchedByPattern:(NSString *)pattern options:(int)options; -(NSString *)substringMatchedByPattern:(NSString *)pattern; -(NSArray *)substringsCapturedByPattern:(NSString *)pattern options:(int)options; -(NSArray *)substringsCapturedByPattern:(NSString *)pattern; -(NSString *)escapedPattern; @end and .m file #import "NSStringCategory.h" static NSString *nullstring=nil; @implementation NSStringCategory -(id)initWithPattern:(NSString *)pattern options:(int)options { if(self=[super init]) { int err=regcomp(&preg,[pattern UTF8String],options|REG_EXTENDED); if(err) { char errbuf[256]; regerror(err,&preg,errbuf,sizeof(errbuf)); [NSException raise:@"CSRegexException" format:@"Could not compile regex \"%@\": %s",pattern,errbuf]; } } return self; } -(void)dealloc { regfree(&preg); [super dealloc]; } -(BOOL)matchesString:(NSString *)string { if(regexec(&preg,[string UTF8String],0,NULL,0)==0) return YES; return NO; } -(NSString *)matchedSubstringOfString:(NSString *)string { const char *cstr=[string UTF8String]; regmatch_t match; if(regexec(&preg,cstr,1,&match,0)==0) { return [[[NSString alloc] initWithBytes:cstr+match.rm_so length:match.rm_eo-match.rm_so encoding:NSUTF8StringEncoding] autorelease]; } return nil; } -(NSArray *)capturedSubstringsOfString:(NSString *)string { const char *cstr=[string UTF8String]; int num=preg.re_nsub+1; regmatch_t *matches=calloc(sizeof(regmatch_t),num); if(regexec(&preg,cstr,num,matches,0)==0) { NSMutableArray *array=[NSMutableArray arrayWithCapacity:num]; int i; for(i=0;i<num;i++) { NSString *str; if(matches[i].rm_so==-1&&matches[i].rm_eo==-1) str=nullstring; else str=[[[NSString alloc] initWithBytes:cstr+matches[i].rm_so length:matches[i].rm_eo-matches[i].rm_so encoding:NSUTF8StringEncoding] autorelease]; [array addObject:str]; } free(matches); return [NSArray arrayWithArray:array]; } free(matches); return nil; } +(NSStringCategory *)regexWithPattern:(NSString *)pattern options:(int)options { return [[[NSStringCategory alloc] initWithPattern:pattern options:options] autorelease]; } +(NSStringCategory *)regexWithPattern:(NSString *)pattern { return [[[NSStringCategory alloc] initWithPattern:pattern options:0] autorelease]; } +(NSString *)null { return nullstring; } +(void)initialize { if(!nullstring) nullstring=[[NSString alloc] initWithString:@""]; } @end @implementation NSString (NSStringCategory) -(BOOL)matchedByPattern:(NSString *)pattern options:(int)options { NSStringCategory *re=[NSStringCategory regexWithPattern:pattern options:options|REG_NOSUB]; return [re matchesString:self]; } -(BOOL)matchedByPattern:(NSString *)pattern { return [self matchedByPattern:pattern options:0]; } -(NSString *)substringMatchedByPattern:(NSString *)pattern options:(int)options { NSStringCategory *re=[NSStringCategory regexWithPattern:pattern options:options]; return [re matchedSubstringOfString:self]; } -(NSString *)substringMatchedByPattern:(NSString *)pattern { return [self substringMatchedByPattern:pattern options:0]; } -(NSArray *)substringsCapturedByPattern:(NSString *)pattern options:(int)options { NSStringCategory *re=[NSStringCategory regexWithPattern:pattern options:options]; return [re capturedSubstringsOfString:self]; } -(NSArray *)substringsCapturedByPattern:(NSString *)pattern { return [self substringsCapturedByPattern:pattern options:0]; } -(NSString *)escapedPattern { int len=[self length]; NSMutableString *escaped=[NSMutableString stringWithCapacity:len]; for(int i=0;i<len;i++) { unichar c=[self characterAtIndex:i]; if(c=='^'||c=='.'||c=='['||c=='$'||c=='('||c==')' ||c=='|'||c=='*'||c=='+'||c=='?'||c=='{'||c=='\\') [escaped appendFormat:@"\\%C",c]; else [escaped appendFormat:@"%C",c]; } return [NSString stringWithString:escaped]; } @end I use the codes below to get the string between "" and "" NSStringCategory *a=[[NSStringCategory alloc] initWithPattern:@"<title([\s\S]*)</title>" options:0];// Unfortunately [a matchedSubstringOfString:response]] always returns nil I do not if the regular expression is wrong or any other reason. Welcome any comment Thanks interdev

    Read the article

  • How do I eliminate TT's "Wide character in print" warning ?

    - by planetp
    I have this warning every time I run my CGI-script (output is rendered by Template::Toolkit): Wide character in print at /usr/local/lib/perl5/site_perl/5.8.9/mach/Template.pm line 163. What's the right way to eliminate it? I create the tt object using this config: my %config = ( ENCODING => 'utf8', INCLUDE_PATH => $ENV{TEMPLATES_DIR}, EVAL_PERL => 1, } my $tt = Template->new(\%config);

    Read the article

  • Can you get access to the NumberFormatter used by ICU MessageFormat

    - by Travis
    This may be a niche question but I'm working with ICU to format currency strings. I've bumped into a situation that I don't quite understand. When using the MesssageFormat class, is it possible to get access to the NumberFormat object it uses to format currency strings. When you create a NumberFormat instance yourself, you can specify attributes like precision and rounding used when creating currency strings. I have an issue where for the South Korean locale ("ko_KR"), the MessageFormat class seems to create currency strings w/ rounding (100.50 - ?100). In areas where I use NumberFormat directly, I set setMaximumFractionDigits and setMinimumFractionDigits to 2 but I can't seem to set this in the MessageFormat. Any ideas?

    Read the article

  • UnicodeDecodeError on attempt to save file through django default filebased backend

    - by Ivan Kuznetsov
    When i attempt to add a file with russian symbols in name to the model instance through default instance.file_field.save method, i get an UnicodeDecodeError (ascii decoding error, not in range (128) from the storage backend (stacktrace ended on os.exist). If i write this file through default python file open/write all goes right. All filenames in utf-8. I get this error only on testing Gentoo, on my Ubuntu workstation all works fine. class Article(models.Model): file = models.FileField(null=True, blank=True, max_length = 300, upload_to='articles_files/%Y/%m/%d/') Traceback: File "/usr/lib/python2.6/site-packages/django/core/handlers/base.py" in get_response 100. response = callback(request, *callback_args, **callback_kwargs) File "/usr/lib/python2.6/site-packages/django/contrib/auth/decorators.py" in _wrapped_view 24. return view_func(request, *args, **kwargs) File "/var/www/localhost/help/wiki/views.py" in edit_article 338. new_article.file.save(fp, fi, save=True) File "/usr/lib/python2.6/site-packages/django/db/models/fields/files.py" in save 92. self.name = self.storage.save(name, content) File "/usr/lib/python2.6/site-packages/django/core/files/storage.py" in save 47. name = self.get_available_name(name) File "/usr/lib/python2.6/site-packages/django/core/files/storage.py" in get_available_name 73. while self.exists(name): File "/usr/lib/python2.6/site-packages/django/core/files/storage.py" in exists 196. return os.path.exists(self.path(name)) File "/usr/lib/python2.6/genericpath.py" in exists 18. st = os.stat(path) Exception Type: UnicodeEncodeError at /edit/ Exception Value: ('ascii', u'/var/www/localhost/help/i/articles_files/2010/03/17/\u041f\u0440\u0438\u0432\u0435\u0442', 52, 58, 'ordinal not in range(128)')

    Read the article

  • Is there any reason to prefer UTF-16 over UTF-8?

    - by Oak
    Examining the attributes of UTF-16 and UTF-8, I can't find any reason to prefer UTF-16. However, checking out Java and C#, it looks like strings and chars there default to UTF-16. I was thinking that it might be for historic reasons, or perhaps for performance reasons, but couldn't find any information. Anyone knows why these languages chose UTF-16? And is there any valid reason for me to do that as well?

    Read the article

  • PHP: Cyrillic characters not displayed correctly

    - by user295502
    Recently I switched hosting from one provider to the other and I have problems displaying Cyrillic characters. The characters which are read from the database are displayed correctly, but characters which are hardcoded in the php file aren't (they are displayed as question marks). The files which contain the php source code are saved in utf-8 form. Help anybody?

    Read the article

  • python read utf8 text file problem

    - by cpps
    I have a problem with python about reading and print utf8 text file. I have a test.txt in utf8 encoding without BOM. This file has two characters in it: ?? The first character "?" is Chinese and the second "?" is Japanese. Now, When I use Ulipad (a python editor) to run the following code to read the txt file, and print these two characters. import codecs infile = "C:\\test.txt" f = codecs.open(infile, "r", "utf-8") s = f.read() print(s) I got this error, "UnicodeEncodeError: 'cp950' codec can't encode character '\u58f0' in position 1: illegal multibyte sequence" I found it caused from the second character "?" . But when I use the same code to test in python default GUI IDLE, it works to print the two characters with no error. So, how can I fix the problem. My running environment is python 3.1 , windows xp traditional Chinese.

    Read the article

  • How can I use ToUnicode without breaking dead key support?

    - by Cypherjb
    A similar question has already been asked, so I'm not going to waste time re-explaining it, an existing discussion can be found here: http://stackoverflow.com/questions/1964614/toascii-tounicode-in-a-keyboard-hook-destroys-dead-keys The reason I'm posting a new question however is that I seem to have come across a 'solution', but I'm not quite sure how to implement it. This blog post seems to propose a solution to the problem of ToUnicode killing dead-key support: http://blogs.msdn.com/michkap/archive/2005/01/19/355870.aspx However I'm not sure how to implement the suggested solution. A push in the right direction would be greatly appreciated. To be clear, the part I'm referring to is this: "There are two ways to work around this: 1) You can keep calling ToUnicode with the same info until it is cleared out and then call it one more time to put the state back where it was if you had never typed anything, or 2) You can load all of the keyboard info ahead of time and then when they type information you can look up in your own info cache what the keystrokes mean, without having to call APIs later." I'm not quite sure how to do either of those things (keyboards and internationalization are far from my strong point), so any help would be greatly appreciated. Thanks

    Read the article

  • Amazon SQS invalid binary character in message body

    - by letronje
    I have a web app that sends messages to an Amazon SQS Queue. Amazon sqs lib throws a 'AmazonSQSException' since the message contained invalid binary character. The message is the referrer obtained from an incoming http request. This is what it looks like: http://ads.vrx.adbrite.com/adserver/display_iab_ads.php?sid=1220459&title_color=0000FF&text_color=000000&background_color=FFFFFF&border_color=CCCCCC&url_color=008000&newwin=0&zs=3330305f323530&width=300&height=250&url=http%3A%2F%2Funblockorkutproxy.com%2Fsearch.php%2FOi8vZG93%2FbmxvYWRz%2FLnppZGR1%2FLmNvbS9k%2Fb3dubG9h%2FZGZpbGUv%2FNTY5MTQ3%2FNi9NeUN1%2FdGVHaXJs%2FZnJpZW5k%2FWmFoaXJh%2FLndtdi5o%2FdG1s%2Fb0%2F^Fô}úÃ<99ë)j Looks like the characters in bold are the invalid characters. Is there an easy way to filter out characters characters that are not accepted by amazon ? Here are the characters allowed by amazon in message body. I am not sure what regex i should use to replace invalid characters by ''

    Read the article

< Previous Page | 90 91 92 93 94 95 96 97 98 99 100 101  | Next Page >