Search Results

Search found 4604 results on 185 pages for 'utf 8'.

Page 14/185

  • Error reading file with accented vowels

    - by Daniel Dcs
    The following statement to fill a list from a file:

        action = []
        with open(os.getcwd() + "/files/" + "actions.txt") as temp:
            action = list(temp)

    gives me the following error:

        (result, consumed) = self._buffer_decode(data, self.errors, end)
        UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 67: invalid continuation byte

    If I add errors='ignore':

        action = []
        with open(os.getcwd() + "/files/" + "actions.txt", errors='ignore') as temp:
            action = list(temp)

    the file is read, but without the ñ and the accented vowels á-é-í-ó-ú. As I understand it, Python 3 defaults to 'utf-8'. I have been looking for a solution for two days or more, and I am only getting more confused. Thank you very much in advance for any suggestions.
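
    Byte 0xf1 is "ñ" in Latin-1/Windows-1252, which suggests the file simply is not UTF-8. A minimal sketch of the likely fix, assuming the file really is Latin-1 encoded:

        import os

        path = os.path.join(os.getcwd(), "files", "actions.txt")
        # Declare the file's actual encoding instead of discarding bytes:
        with open(path, encoding="latin-1") as temp:
            action = list(temp)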

    Read the article

  • How to convert pdf to utf-8

    - by Apple
    I am trying to upload a PDF file using a web service API. The API works fine for text files but does not work for PDF files. When I try to upload a PDF file it gives the error:

        Client-SOAP-ERROR: Encoding: string '%PDF-1.4 %\xc7...' is not a valid utf-8 string

    Can I convert this PDF file into a UTF-8 string? I am using PHP as the scripting language.
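
    A PDF is binary data, so it can never be a valid UTF-8 string. The usual workaround is to base64-encode it before sending, assuming the receiving service can decode it again. A sketch (file name illustrative):

        <?php
        $pdf = file_get_contents('document.pdf');
        // base64 output is plain ASCII, hence valid UTF-8 for the SOAP layer:
        $payload = base64_encode($pdf);
        // the server side recovers the bytes with base64_decode($payload)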

    Read the article

  • How to allow a UTF-8 charset in preg_match?

    - by Shri.harry
    Hello everyone, I am using the preg_match() function to accept only specific characters. It allows all letters and numbers, but I also want to allow UTF-8 characters such as "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ". How can I allow these characters with preg_match()? Please advise. Thanks in advance. Regards, Shri
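
    A sketch of one way, assuming the input really is UTF-8: the u modifier makes PCRE treat both pattern and subject as UTF-8, and the \p{L} / \p{N} properties match any Unicode letter or digit:

        <?php
        $input = 'Ñoño123';
        // ^...$ anchors the whole string; /u switches PCRE into UTF-8 mode
        if (preg_match('/^[\p{L}\p{N}]+$/u', $input)) {
            echo 'accepted';
        }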

    Read the article

  • Broken UTF-8 string in Ruby

    - by josh
    While reading a file I get a "broken UTF-8 string" error whenever the file contains "través"; if I change the é to a plain e it works. What is the way to fix this? The error only happens if I call line.lstrip or any other string method; just printing the lines is fine. The problem even happens when I try to match the string with a regex.
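
    An é stored as the single byte 0xE9 is Latin-1, not valid UTF-8, which is why anything that inspects the bytes chokes while plain printing does not. A sketch assuming Ruby 1.9+ and a genuinely Latin-1 file, transcoded to UTF-8 on read:

        # r:ISO-8859-1:UTF-8 = read as Latin-1, hand the lines over as UTF-8
        File.open('data.txt', 'r:ISO-8859-1:UTF-8') do |f|
          f.each_line { |line| puts line.lstrip }
        end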

    Read the article

  • Is UTF-8 enough for all common languages?

    - by jack
    I want to develop a translation app in a Django project which enables registered users with certain permissions to translate every single message as it appears in the latest version. My question is: what character set should I use for the database tables in this translation app? It looks like some European language characters cannot be stored in UTF-8?
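
    UTF-8 itself can encode every Unicode character; what usually bites people is MySQL's legacy 3-byte "utf8" charset, which cannot store characters outside the Basic Multilingual Plane. A sketch assuming MySQL 5.5.3+ where the full 4-byte variant exists:

        -- utf8mb4 covers all of Unicode; legacy "utf8" (3 bytes/char max) does not
        CREATE TABLE translation (
            id INT PRIMARY KEY AUTO_INCREMENT,
            message TEXT
        ) DEFAULT CHARSET = utf8mb4;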

    Read the article

  • Mongo Client RedHat EL5 UTF-8 Support

    - by Michael Irey
    When I start the shell:

        # mongo
        MongoDB shell version: 1.6.4
        Fri Mar 16 11:55:46 *** warning: spider monkey build without utf8 support. consider rebuilding with utf8 support
        connecting to: test

    The Mongo server seems to handle UTF-8 characters fine, as does my php-mongo-client driver. But when I try to query a record that has a UTF-8 character from the mongo command-line client I get:

        > db.Users.find({age:33});
        error: non ascii character detected
        Fri Mar 16 11:55:43 mongo got signal 11 (Segmentation fault), stack trace:
        Fri Mar 16 11:55:43 0x440b50 0x3664c302d0 0x3f47e7b6e0 0x3f47e83bbd 0x3f47e254f3 0x3f47e25660 0x3f47e256ee 0x3f47e25792 0x3f47e2876e 0x4b031d 0x443b72 0x445476 0x3664c1d994 0x43fd39
        mongo(_Z12quitAbruptlyi+0x3b0) [0x440b50]
        /lib64/libc.so.6 [0x3664c302d0]
        /usr/lib64/libjs.so.1 [0x3f47e7b6e0]
        /usr/lib64/libjs.so.1(js_CompileTokenStream+0x3d) [0x3f47e83bbd]
        /usr/lib64/libjs.so.1 [0x3f47e254f3]
        /usr/lib64/libjs.so.1(JS_CompileUCScriptForPrincipals+0x60) [0x3f47e25660]
        /usr/lib64/libjs.so.1(JS_EvaluateUCScriptForPrincipals+0x3e) [0x3f47e256ee]
        /usr/lib64/libjs.so.1(JS_EvaluateUCScript+0x22) [0x3f47e25792]
        /usr/lib64/libjs.so.1(JS_EvaluateScript+0x6e) [0x3f47e2876e]
        mongo(_ZN5mongo7SMScope4execERKSsS2_bbbi+0xed) [0x4b031d]
        mongo(_Z5_mainiPPc+0x14a2) [0x443b72]
        mongo(main+0x26) [0x445476]
        /lib64/libc.so.6(__libc_start_main+0xf4) [0x3664c1d994]
        mongo(__gxx_personality_v0+0x269) [0x43fd39]

    Any ideas or suggestions would be welcome.
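
    The startup warning points at the cause: the system libjs (SpiderMonkey) was built without UTF-8 support, and the shell then crashes on non-ASCII input. A sketch of the remedy the MongoDB build docs of that era described, rebuilding SpiderMonkey with UTF-8 enabled and rebuilding the shell against it; paths and versions here are assumptions, so treat it as a starting point only:

        # build SpiderMonkey with UTF-8 C-string support, then rebuild mongo against it
        cd js/src
        export CFLAGS="-DJS_C_STRINGS_ARE_UTF8"
        make -f Makefile.ref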

    Read the article

  • Strange xml/html accent issue

    - by Ayrad
    I have an XML file that contains a message with HTML tags in it. The XML file is read by a Java class that mails it to people. When the mail is received, the accents do not show; for example é doesn't show. I have tried &eacute; in the XML, but it gives an error in Eclipse saying that the entity has not been declared. I also tried simply inserting &#233;, but that shows nothing in the final output. The third thing I tried was <![CDATA[é]]>, but that broke the parser, since it didn't output anything after it. However, I noticed something weird: when I put something like <message>text bla bla blaa é&lt; in the XML and added UTF-16 encoding, it did output the é at the end, like bla bla blaa blaa é. EDIT: <message>text bla bla blaa éé&lt; outputs ?é or just one é. The file looks something like this:

        <?xml version="1.0"? encoding="UTF-16">
        <message>
          &lt;b&gt;hello é &lt;/b&gt;
        </message>
        </xml>

    What gives?
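
    Nothing in the description rules out the mail step itself: if the message is sent without an explicit charset, receivers fall back to their own default and accents get mangled. A sketch assuming JavaMail is doing the sending (the method and variable names are illustrative):

        import javax.mail.MessagingException;
        import javax.mail.internet.MimeMessage;

        static void fillMessage(MimeMessage message, String subject, String body)
                throws MessagingException {
            // Declare the charset explicitly instead of relying on the platform default
            message.setSubject(subject, "UTF-8");
            message.setContent(body, "text/html; charset=UTF-8");
        }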

    Read the article

  • WCF method is returning an XML fragment but no XML UTF-8 header

    - by horls
    My method does not return the header, just the root-element XML.

        internal Message CreateReturnMessage(string output, string contentType)
        {
            // create dictionary reader for the Message
            byte[] resultBytes = Encoding.UTF8.GetBytes(output);
            XmlDictionaryReader xdr = XmlDictionaryReader.CreateTextReader(resultBytes, 0, resultBytes.Length, Encoding.UTF8, XmlDictionaryReaderQuotas.Max, null);

            if (WebOperationContext.Current != null)
                WebOperationContext.Current.OutgoingResponse.ContentType = contentType;

            // create Message
            return Message.CreateMessage(MessageVersion.None, "", xdr);
        }

    However, the output I get is:

        <Test>
          <Message>Hello World!</Message>
        </Test>

    I would like the output to render as:

        <?xml version="1.0" encoding="utf-8" standalone="yes"?>
        <Test>
          <Message>Hello World!</Message>
        </Test>
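
    WCF's Message serialization writes only the infoset, never an XML declaration. A minimal sketch of one workaround, prepending the declaration yourself before handing the bytes to the response (this bypasses Message rather than configuring it):

        using System.Text;
        using System.Xml;

        static byte[] WithXmlDeclaration(string xmlFragment)
        {
            var doc = new XmlDocument();
            doc.LoadXml(xmlFragment);
            // prepend the declaration the Message pipeline leaves out
            XmlDeclaration decl = doc.CreateXmlDeclaration("1.0", "utf-8", "yes");
            doc.InsertBefore(decl, doc.DocumentElement);
            return Encoding.UTF8.GetBytes(doc.OuterXml);
        }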

    Read the article

  • Creating files with French characters and encoding

    - by Kevin
    Hi, I am creating a file like so:

        FileStream temp = File.Create(this.FileName);

    Then putting data in the file like so:

        this.Writer = new StreamWriter(this.Stream);
        this.Writer.WriteLine(strMessage);

    That code is encapsulated in a class hierarchy, but that is the meat and potatoes of it. My problem is this: MSDN says that the default encoding for creating a file this way is UTF-8, and when I write a French character such as é, TextPad interprets the file as a UTF-8 file, but Notepad++ says it's "ANSI as UTF8" (or maybe it's an ANSI file being read as UTF-8). When I create a file the same way without the French character, both TextPad and Notepad++ read the file as an ANSI file, even though according to MSDN it should still be a UTF-8 file. Which program should be trusted, Notepad++ or TextPad? Notepad++ seems to be more consistent, but is still the opposite of what MSDN says it should be. My problem is that we create files that get sent off to another company, and depending on whether there are French characters the encoding seems to keep changing. Or is there a better way to determine the encoding of a file? I've read about byte order marks and preambles, but as far as I understand neither is guaranteed to be there. We initially thought that all the files we were building were ANSI. Also please note that both ANSI and UTF-8 should handle the French characters appropriately, as the characters are part of both character sets.
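
    A file containing only ASCII bytes is valid in both encodings, so editors can only guess; StreamWriter's default is UTF-8 without a byte-order mark, which gives them nothing to go on. A sketch that makes the encoding unambiguous by writing a BOM (file name illustrative):

        using System.IO;
        using System.Text;

        // UTF8Encoding(true) emits the EF BB BF byte-order mark up front
        using (var writer = new StreamWriter("out.txt", false, new UTF8Encoding(true)))
        {
            writer.WriteLine("é à ç");
        }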

    Read the article

  • ruby 1.9: invalid byte sequence in UTF-8

    - by Marc Seeger
    I'm writing a crawler in Ruby (1.9) that consumes lots of HTML from a lot of random sites. When trying to extract links, I decided to just use .scan(/href="(.*?)"/i) instead of nokogiri/hpricot (major speedup). The problem is that I now receive a lot of "invalid byte sequence in UTF-8" errors. From what I understand, the net/http library doesn't have any encoding-specific options, and the stuff that comes in is basically not properly tagged. What would be the best way to actually work with that incoming data? I tried .encode with the replace and invalid options set, but no success so far...
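
    One known Ruby 1.9 gotcha: String#encode from UTF-8 to UTF-8 is a no-op, so invalid: :replace never fires. Round-tripping through another encoding forces the scrub; a sketch, with 'html' standing in for the downloaded body:

        # UTF-8 -> UTF-16 applies the :replace options, then convert back to UTF-8
        html.encode!('UTF-16', 'UTF-8', invalid: :replace, undef: :replace, replace: '')
        html.encode!('UTF-8', 'UTF-16')
        links = html.scan(/href="(.*?)"/i)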

    Read the article

  • Delphi dbExpress and Interbase: UTF8 migration steps and risks?

    - by mjustin
    Currently, our database uses Win1252 as the only character encoding. We will have to support Unicode in the database tables soon, which means we have to perform this migration for four databases and around 80 Delphi applications which run in-house in a 24/7 environment. Are there recommendations for database migrations to UTF-8 (or UNICODE_FSS) for Delphi applications? Some questions are listed below. Many thanks in advance for your answers!

    • Are there tools which help with the migration of the existing databases (sizes between 250 MB and 2 GB, no blob fields) by dumping the data, recreating the database with UNICODE_FSS or UTF-8, and loading the data back?
    • Are there known problems with Delphi 2009, dbExpress and InterBase 7.5 related to Unicode character sets?
    • Would you recommend upgrading the databases to InterBase 2009 first? (This upgrade is planned but does not have a high priority.)
    • Can we simply migrate the database and Delphi will handle the Unicode character sets automatically, or will we have to change all character field types in every data module (dfm and source code) too?
    • Which strategy would you recommend for working on the migration in parallel with the normal development and maintenance of the existing application? The application runs in-house, so development and database administration are done internally.
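
    For the dump-and-reload route the list above mentions, the InterBase shape is usually a backup, a fresh database created with the new default charset, and a reload into it. A sketch of the creation step only; whether your InterBase version offers UTF8 in addition to UNICODE_FSS, and the file name, are assumptions to verify:

        -- isql: create the target database with the new default character set;
        -- existing column charsets still need to be changed when the data is pumped back
        CREATE DATABASE 'newdb.ib' DEFAULT CHARACTER SET UNICODE_FSS;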

    Read the article

  • HTTP Digest Authentication: handling different browser charsets

    - by user160561
    Hi all, I tried to use the HTTP digest authentication scheme with my PHP (Apache module) based website. In general it works fine, but when it comes to verifying the username/hash against my user database I run into a problem. Of course I do not want to store the user's password in my database, so I tend to store the A1 hash value (which is md5($username . ':' . $realm . ':' . $password)) in my DB. This is just how the browser does it, too, to create the hashes it sends back. The problem: I am not able to detect whether the browser does this with an ISO-8859-1 fallback (like Firefox and IE) or in UTF-8 (Opera) or whatever. I chose to do the calculation in UTF-8 and store that md5 hash, which leads to non-authentication in Firefox and IE. How do you solve this problem? Just not use this auth scheme? Store an md5 hash for each charset? Force users onto Opera? (Terms like A1 refer to the http://php.net/manual/en/features.http-auth.php example; for digest access authentication see the corresponding Wikipedia entry.)
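
    One pragmatic option the question already hints at: store one A1 hash per charset a browser might use, and accept a response that validates against either. A sketch, assuming the source strings are UTF-8 inside PHP:

        <?php
        $a1_utf8   = md5("$username:$realm:$password");
        $a1_latin1 = md5(mb_convert_encoding(
            "$username:$realm:$password", 'ISO-8859-1', 'UTF-8'));
        // store both; at verify time, a digest matching either hash passes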

    Read the article

  • How to generate pdf files _with_ utf-8 multibyte characters using Zend Framework

    - by Sejanus
    Hello, I've got a "little" problem with the Zend Framework Zend_Pdf class: multibyte characters are stripped from generated PDF files. E.g. when I write aąbcčdęė it becomes abcd, with the Lithuanian letters stripped. I'm not sure if it's a Zend_Pdf problem in particular or PHP in general. The source text is encoded in UTF-8, as is the PHP source file which does the job. Thank you in advance for your help ;) P.S. I run Zend Framework 1.6 and I use the FONT_TIMES_BOLD font. FONT_TIMES_ROMAN does work.
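
    The standard built-in PDF fonts (Times included) only cover a Latin-1-like repertoire, so Zend_Pdf drops characters they cannot map. A sketch of the usual fix, loading a TrueType font that has the glyphs (font path illustrative):

        <?php
        $pdf  = new Zend_Pdf();
        $page = new Zend_Pdf_Page(Zend_Pdf_Page::SIZE_A4);
        $font = Zend_Pdf_Font::fontWithPath('/path/to/DejaVuSans.ttf');
        $page->setFont($font, 14);
        // the fourth argument names the character encoding of the string
        $page->drawText('aąbcčdęė', 100, 700, 'UTF-8');
        $pdf->pages[] = $page;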

    Read the article

  • Changing character encoding in MySQL, PHP scripts, HTML

    - by Sandman
    So, I have built on this system for quite some time, and it is currently outputting Latin1 (ISO-8859-1) to the web browser. These are the components:

    • MySQL - all data is stored with the Latin1 character set
    • PHP - all PHP text files are stored on disk with Latin1 encoding
    • HTML - the output has the http-equiv="content-type" content="text/html; charset=iso-8859-1" meta tag

    So, I'm trying to understand how the encoding of the different parts comes into play in my workflow. If I open a PHP script, change its encoding within the text editor to UTF-8, save it back to disk and reload the web browser, the text is all messed up, unless the text comes from the DB. If I change the encoding of the DB to UTF-8 and keep the PHP files in Latin1, I have to use utf8_decode() for the data to display correctly. And if I change the HTML code, the browser will read it incorrectly.

    So yeah, I realise that if I want to "upgrade" to UTF-8, I have to update all three parts of this setup for it to work correctly, but since it's a huge system with some 180k lines of PHP code and millions of posts in a lot of databases/tables, I don't want to start something like this without understanding everything correctly. What haven't I thought about? What could mess this up beyond fixing? What are the procedures for changing the encoding of an entire MySQL installation, and what's the easiest way to change the encoding of hundreds or thousands of PHP files on disk? The META tag is luckily added dynamically, so I'll change that in one place only :) Let me hear about your experiences with this.
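
    For the database leg, a commonly used path is to dump in the current charset, transcode the dump, adjust the charset declarations and reimport. A sketch (database name illustrative; rehearse on a copy first):

        mysqldump --default-character-set=latin1 --skip-set-charset mydb > dump.sql
        iconv -f ISO-8859-1 -t UTF-8 dump.sql > dump.utf8.sql
        # edit dump.utf8.sql: replace DEFAULT CHARSET=latin1 with DEFAULT CHARSET=utf8
        mysql --default-character-set=utf8 mydb < dump.utf8.sql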

    Read the article

  • Displaying windows-1252 text in a literal control

    - by GordonB
    I currently have an .aspx page that has a placeholder on it. In the code-behind I'm adding a literal control to the placeholder's Controls collection. The literal control just contains text/HTML read from a SQL Server database field. The only text character encoding I've used so far is UTF-8, but I have the requirement that one specific page use windows-1252 encoding. I've strapped this to the page, and browsers now recognise the proper encoding:

        <% Response.Charset = "windows-1252" %>

    My issue is that various German characters (ö, ü, etc.) aren't displaying correctly, as presumably they are still being written to the page in UTF-8, not in windows-1252. I'm looking at:

        Dim textEncoder = System.Text.Encoding.GetEncoding(1252)

    which seems to be more geared up for dealing with byte arrays than text. Do I have to change my text to a byte array, encode it as windows-1252 and then get the text back out again, or is there a simpler way of achieving what I'm after?
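
    Response.Charset only changes the label in the Content-Type header; the bytes are still produced by the response's actual encoder. Setting Response.ContentEncoding makes ASP.NET really emit windows-1252, so no manual byte juggling should be needed. A sketch in the page's code-behind, assuming VB.NET as in the snippet above:

        ' Emit the bytes as windows-1252 and label them accordingly
        Response.ContentEncoding = System.Text.Encoding.GetEncoding(1252)
        Response.Charset = "windows-1252"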

    Read the article

  • Batch convert latin-1 files to utf-8 using iconv

    - by Jasmo
    I have this one PHP project on my OS X machine which is in Latin1 encoding. Now I need to convert the files to UTF-8. I'm not much of a shell coder, and I tried something I found on the internet:

        mkdir new
        for a in `ls -R *`; do iconv -f iso-8859-1 -t utf-8 < "$a" > new/"$a"; done

    But that does not create the directory structure, and it gives me a heck of a lot of errors when run. Can anyone come up with a neat solution?
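
    A sketch that recreates the directory tree with find instead of parsing ls output (run from the project root; it converts every regular file outside new/, so narrow it with a -name pattern if needed):

        mkdir -p new
        find . -path ./new -prune -o -type f -print | while read -r f; do
          mkdir -p "new/$(dirname "$f")"
          iconv -f ISO-8859-1 -t UTF-8 "$f" > "new/$f"
        done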

    Read the article

  • C++ Unicode UTF-16 encoding

    - by Dan
    Hi all, I have the wide char string L"hao123--我的上网主页", and it must be encoded to "hao123--\u6211\u7684\u4E0A\u7F51\u4E3B\u9875". I was told that the encoded string is a special "%uNNNN" format for encoding Unicode UTF-16 code points. This website (http://rishida.net/tools/conversion/) tells me it's JavaScript escapes. But I don't know how to encode it with C++. Is there any library to do this work, or can you give me some tips? Thanks, my friends!
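
    A sketch without any library (C++11): walk the UTF-16 code units and escape everything outside ASCII. It assumes wchar_t holds UTF-16 code units, as on Windows; on Linux wchar_t is UTF-32 and supplementary characters would need splitting into surrogate pairs first:

        #include <cstdio>
        #include <string>

        std::string js_escape(const std::wstring& in) {
            std::string out;
            char buf[8];
            for (std::wstring::size_type i = 0; i < in.size(); ++i) {
                wchar_t c = in[i];
                if (c < 0x80) {
                    out += static_cast<char>(c);   // ASCII passes through
                } else {
                    std::snprintf(buf, sizeof buf, "\\u%04X",
                                  static_cast<unsigned>(c));
                    out += buf;                    // e.g. \u6211
                }
            }
            return out;
        }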

    Read the article

  • C char question about signed/unsigned encoding

    - by drigoSkalWalker
    Hi guys. I read that C does not define whether char is signed or unsigned, and the GCC page says it can be signed on x86 and unsigned on PowerPC and ARM. OK: I'm writing a program with GLib, which defines char as gchar (nothing more than an alias for standardization's sake). My question is: what about UTF-8? Does it use more than one block of memory? Say I have the variable

        unsigned char *string = "My string with UTF8 encoding ~ çã";

    If the char is signed, will I have only 127 values (so my program will have to store more blocks of memory), or do the UTF-8 bytes simply become negative? Sorry if I can't explain it correctly, but I think it is a bit complex. NOTE: Thanks for all the answers. I don't understand how it is interpreted normally. I think that, like ASCII, if I have a signed and an unsigned char in my program the strings have different values, and that leads to confusion; imagine that in UTF-8, then.
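
    Signedness changes only how a byte's value is interpreted, never the storage: a UTF-8 string occupies exactly the same bytes either way, and bytes >= 0x80 merely read as negative when char is signed. A small sketch:

        #include <stdio.h>

        int main(void) {
            const char *s = "çã";            /* 4 bytes in UTF-8: C3 A7 C3 A3 */
            for (const char *p = s; *p; ++p)
                printf("%02X ", (unsigned char)*p);  /* cast recovers 0..255 */
            putchar('\n');
            return 0;
        }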

    Read the article

  • Loading XML with UTF-16 encoding using XDocument

    - by Sangram
    Hi, I am trying to read an XML document using the XDocument method, but I get an error when the XML has <?xml version="1.0" encoding="utf-16"?>. When I remove the encoding manually, it works perfectly. The error is "There is no Unicode byte order mark. Cannot switch to Unicode." I tried searching and landed here: "Why does C# XmlDocument.LoadXml(string) fail when an XML header is included?" But I could not solve my problem. My code:

        XDocument xdoc = XDocument.Load(path);

    Any suggestions? Thank you.
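
    The declaration promises UTF-16, but the bytes on disk evidently are not (there is no byte-order mark), so the reader refuses to switch. Loading through a StreamReader with the file's real encoding sidesteps the autodetection, since the declaration is ignored once characters are already decoded. A sketch assuming the file is actually UTF-8:

        using System.IO;
        using System.Text;
        using System.Xml.Linq;

        XDocument xdoc;
        using (var reader = new StreamReader(path, Encoding.UTF8))
        {
            xdoc = XDocument.Load(reader);
        }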

    Read the article

  • Decode S-JIS string to UTF-8

    - by user566613
    Hi, I am working on a Japanese file and I have no knowledge of the language. The file is encoded in S-JIS. Now I am supposed to convert the contents into UTF-8 so that the content looks like Japanese, and here I am completely blank. I tried the following code that I found somewhere on the internet, but no luck:

        byte[] arrByte = Encoding.UTF8.GetBytes(arrActualData[x]);
        string str = ASCIIEncoding.ASCII.GetString(arrByte);

    Can anyone help me with this? Thanks in advance, Kunal
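
    The snippet above decodes through ASCII, which throws away everything outside 7 bits. The raw bytes need to be decoded with the Shift-JIS code page first; once they are .NET chars, re-encode as UTF-8 only if bytes are required. A sketch (file name illustrative):

        using System.IO;
        using System.Text;

        // code page 932, registered under the name "shift_jis"
        Encoding sjis = Encoding.GetEncoding("shift_jis");
        string text = File.ReadAllText("japanese.txt", sjis);   // proper chars now
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);        // UTF-8 bytes if needed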

    Read the article
