Search Results

Search found 38289 results on 1532 pages for 'text encoding'.


  • Efficient way to ASCII encode UTF-8

    - by Andreas Gohr
    I'm looking for a simple and efficient way to store UTF-8 strings in 7-bit ASCII. By "efficient" I mean the following:

    - all ASCII characters in the input should stay ASCII characters in the output
    - the resulting string should be as short as possible
    - the operation needs to be reversible without any data loss
    - there should be no restriction on the input length
    - the whole UTF-8 range should be allowed

    My first idea was to use Punycode (IDNA), as it fits the first three requirements, but it fails at the last two. Can anyone recommend an alternative encoding scheme? Even better if there's some code available to look at.
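    For what it's worth, a minimal reversible sketch in Python (purely illustrative, and not as compact as Punycode output): escape the backslash itself first so the round trip stays unambiguous, then let the codec backslash-escape everything outside ASCII.

      def to_ascii7(s):
          # Escape the escape character first, then backslash-escape
          # every non-ASCII character (\xNN, \uNNNN, \UNNNNNNNN).
          return s.replace('\\', '\\\\').encode('ascii', 'backslashreplace').decode('ascii')

      def from_ascii7(s):
          # unicode_escape undoes both the doubled backslashes and the escapes.
          return s.encode('ascii').decode('unicode_escape')

      assert from_ascii7(to_ascii7('mixed: ü 漢 \\ ascii')) == 'mixed: ü 漢 \\ ascii'

    ASCII input passes through untouched, so requirement one holds; the output is not minimal, though, so a real scheme would still beat this on length.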

    Read the article

  • Load JSON in Python as header character set

    - by mridang
    Hi everyone, I've always found character sets and encodings complicated to understand, and here I'm faced with another problem. My apologies for any inaccuracies; I'll do my best. I'm requesting data from a server which returns JSON. In the HTTP headers it also returns the character set, like so: Content-Type: text/html; charset=UTF-8. I'm using the JSON library in Python to load the JSON using the json.loads method. When I pass it the returned JSON, it gives me a dictionary in Unicode. I've Googled around and I know that JSON should return Unicode, as JavaScript strings are Unicode objects. How can I load the JSON as UTF-8? I would like to use the same encoding as specified in the response header. I've read this post but it didn't help. Thank you.
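    A hedged sketch of the usual pattern (Python 2 era, matching the question; the body and key below are stand-ins): let json.loads hand you Unicode, then encode explicitly wherever byte strings in the header's charset are needed.

      import json

      response_body = u'{"title": "\u00fcber"}'   # stand-in for the real HTTP body
      data = json.loads(response_body)            # keys and values come back as unicode
      utf8_bytes = data[u'title'].encode('utf-8') # re-encode to match the header's charset
      print(repr(utf8_bytes))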

    Read the article

  • Should I convert overlong UTF-8 strings to their shortest normal form?

    - by Grant McLean
    I've just been reworking my Encoding::FixLatin Perl module to handle overlong UTF-8 byte sequences and convert them to the shortest normal form. My question is quite simply: is this a bad idea? A number of sources (including this RFC) suggest that any overlong UTF-8 should be treated as an error and rejected. They caution against "naive implementations" and leave me with the impression that these things are inherently unsafe. Since the whole purpose of my module is to clean up messy data files with mixed encodings and convert them to nice clean UTF-8, this seems like just one more thing I can clean up so the application layer doesn't have to deal with it. My code does not concern itself with any semantic meaning the resulting characters might have; it simply converts them into a normalised form. Am I missing something? Is there a hidden danger I haven't considered?
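    For context, a small Python illustration (an assumption-laden sketch, not the module's Perl code). The classic danger is an overlong encoding of '/' slipping past path filters before normalisation; strict decoders therefore reject such sequences outright, while normalising means decoding the payload bits by hand and re-encoding in shortest form:

      overlong_slash = b'\xc0\xaf'        # overlong two-byte encoding of U+002F '/'

      try:
          overlong_slash.decode('utf-8')  # strict decoders refuse it outright
      except UnicodeDecodeError as e:
          print('rejected:', e)

      # Normalising instead: pull out the payload bits and re-encode them.
      cp = ((overlong_slash[0] & 0x1f) << 6) | (overlong_slash[1] & 0x3f)
      print(chr(cp).encode('utf-8'))      # b'/' -- the shortest normal form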

    Read the article

  • Stream/string/bytearray transformations in Python 3

    - by Craig McQueen
    Python 3 cleans up Python's handling of Unicode strings. I assume that as part of this effort, the codecs in Python 3 have become more restrictive, according to the Python 3 documentation compared to the Python 2 documentation. For example, codecs that conceptually convert a bytestream to a different form of bytestream have been removed:

    - base64_codec
    - bz2_codec
    - hex_codec

    And codecs that conceptually convert Unicode to a different form of Unicode have also been removed (in Python 2 it actually went between Unicode and bytestream, but conceptually it's really Unicode to Unicode, I reckon):

    - rot_13

    My main question is: what is the "right way" in Python 3 to do what these removed codecs used to do? They're not codecs in the strict sense, but "transformations", yet the interface and implementation would be very similar to codecs. I don't care about rot_13, but I'm interested to know what would be the "best way" to implement a transformation of line ending styles (Unix line endings vs Windows line endings), which should really be a Unicode-to-Unicode transformation done before encoding to a byte stream, especially when UTF-16 is being used, as discussed in this other SO question.
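    As a point of reference, a sketch of what later Python 3 releases offer (assuming Python 3.4+, where the bytes-to-bytes codecs returned under explicit names):

      import codecs, base64, binascii

      data = b'hello'
      print(base64.b64encode(data))               # replaces data.encode('base64')
      print(binascii.hexlify(data))               # replaces data.encode('hex')
      print(codecs.encode(data, 'base64_codec'))  # the codec survives under an explicit name
      print(codecs.encode('hello', 'rot_13'))     # str-to-str transforms still work this way

      # Line endings as a Unicode-to-Unicode step, done before encoding:
      text = 'line one\nline two\n'
      windows_text = text.replace('\n', '\r\n')   # then windows_text.encode('utf-16'), etc.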

    Read the article

  • Python BOM error in ASCII file

    - by Intosia
    I have a weird, annoying problem with Python 2.6. I'm trying to run this file (and the others) on my embedded Linux ARM board: http://svn.tuxisalive.com/software_suite_v3/smart-core/smart-server/trunk/TDSService.py I get this error:

      File "tuxhttpserver.py", line 1
      SyntaxError: encoding problem: with BOM

    I know that error is about the BOM bytes, etc. BUT there are NO BOM bytes; it's plain ASCII. I checked with a hex editor, and the Linux file command says it's ASCII. I'm freaking out here... The code worked fine on my Sheevaplug (also an ARM-based system).
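    A quick way to double-check (a sketch; the filename is the one from the error) is to look at the raw leading bytes rather than trusting an editor's rendering:

      with open('tuxhttpserver.py', 'rb') as f:
          head = f.read(8)

      print(repr(head))                           # a UTF-8 BOM would show as \xef\xbb\xbf
      print(head.startswith(b'\xef\xbb\xbf'))     # True only if a BOM is really there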

    Read the article

  • Batch convert latin-1 files to utf-8 using iconv

    - by Jasmo
    I have this one PHP project on my OS X machine which is in latin1 encoding. Now I need to convert the files to UTF-8. I'm not much of a shell coder and I tried something I found on the internet:

      mkdir new
      for a in `ls -R *`; do iconv -f iso-8859-1 -t utf-8 < "$a" > new/"$a"; done

    But that does not create the directory structure, and it gives me a heck of a lot of errors when run. Can anyone come up with a neat solution?
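    A hedged alternative sketch in Python (the source and output directory names are assumptions) that mirrors the directory tree instead of dumping everything flat:

      import os

      SRC, DST = '.', 'new'
      for root, dirs, files in os.walk(SRC):
          if root == SRC:
              dirs[:] = [d for d in dirs if d != DST]   # skip the output tree itself
          for name in files:
              src_path = os.path.join(root, name)
              dst_path = os.path.join(DST, os.path.relpath(src_path, SRC))
              os.makedirs(os.path.dirname(dst_path), exist_ok=True)
              with open(src_path, encoding='iso-8859-1') as f:
                  text = f.read()
              with open(dst_path, 'w', encoding='utf-8') as f:
                  f.write(text)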

    Read the article

  • I can't change HTTP request header Content-Type value using jQuery

    - by Matt
    Hi, I tried to override the HTTP request header content by using jQuery's AJAX function. It looks like this:

      $.ajax({
          type: "POST",
          url: url,
          data: data,
          contentType: "application/x-www-form-urlencoded;charset=big5",
          beforeSend: function(xhr) {
              xhr.setRequestHeader("Accept-Charset", "big5");
              xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded;charset=big5");
          },
          success: function(rs) {
              target.html(rs);
          }
      });

    The Content-Type header defaults to "application/x-www-form-urlencoded; charset=UTF-8", but obviously I can't override its value, no matter whether I use the 'contentType' or the 'beforeSend' approach. Could anyone advise me on how I can change the HTTP request's Content-Type value? Thanks a lot. By the way, is there any good documentation where I can study JavaScript's XMLHttpRequest encoding handling?

    Read the article

  • Silverlight Video Player that plays .MP4 & .FLV

    - by YeahStu
    I am currently using the Silverlight 2 Video Player to stream videos. I have been very pleased with it, but it only seems to stream .WMV files. Does anyone know if there is a good Silverlight video player that will stream other types of video files, especially .MP4 and .FLV? I would be happy to use Silverlight 3 if necessary. EDIT: Because I like this player and have not found a great option, I am considering encoding files as I receive them so that they will always be streamed later as .WMV. Unless I find a good player (I am considering Flash at this point), I will have to go down this road.

    Read the article

  • Watermarking Flash Videos (server-side)

    - by Roberto Aloi
    Hi all, I have a bunch of Flash videos that I need to watermark with user-related information, to make illegal redistribution of these files harder. I'm wondering how this can be done server-side. If done client-side, it would be quite easy for the user to intercept the videos before they are watermarked. Since the watermark should contain user-specific information, I can't really watermark the videos before encoding them (unless I keep one encoded video per user, which is not feasible). I'm expecting this to affect streaming performance a lot, though. Any idea how this can be done (possibly in an efficient way)?

    Read the article

  • European signs in img src problem

    - by Rakoon
    Hey. I recently encountered a strange problem on my website. Images with æ, ø and å in their filenames (Western European characters) won't display. The character encoding on all pages is ISO-8859-1, and I can print æ, ø and å on the page without problems. If I right-click the broken image and choose Properties, it displays the filename with the European characters (/admin/content/galleri/å.jpg). The code for the img tag looks like this:

      <img name='bilde' src='content/{$_SESSION["linkname"]}/{$row["img"]}'
           class='topmargin_ss leftmargin_ms rightmargin_s' width='80' height='80'>

    I made 4 files: z.jpg, æ.jpg, ø.jpg and å.jpg. Only z.jpg shows up, and they are the exact same JPEG. The images are uploaded using PHP code, which works: it uploads to the right directory and has no problem with the European characters. Does anybody know what could be causing this?
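    One common culprit is the URL itself: the browser percent-encodes non-ASCII characters in the src, and the encoding it picks has to match the bytes of the filename on disk. A small hedged Python illustration of how the same name yields two different requests:

      from urllib.parse import quote

      filename = 'å.jpg'
      print(quote(filename.encode('iso-8859-1')))   # %E5.jpg     -- Latin-1 bytes
      print(quote(filename.encode('utf-8')))        # %C3%A5.jpg  -- UTF-8 bytes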

    Read the article

  • Which video codecs have the most content, and are thus most popular at present/in the future?

    - by goldenmean
    Hi, I want to find out if I can get some data on the percentage-wise distribution of video content across the different video codecs currently used for video encoding. I know there are different applications and use-case scenarios which use different encoders, but I want to consider all of that and arrive at an overall usage number (%). My guess is (highest to lowest % of content):

    1. H.264 (AVC)
    2. DivX
    3. MPEG-2
    4. VP6

    Where do H.263, MPEG-4, VC-1, RV, Theora, etc. fit in here? How might this look in the future? PS: I would like this to be community wiki to get a wider range of inputs; if someone with privileges can do it for me, please. Thank you. -AD

    Read the article

  • Java application failing on special characters

    - by Scottm
    An application I am working on reads information from files to populate a database. Some of the characters in the files are non-English, for example accented French characters. The application is working fine on Windows, but on our Solaris machine it is failing to recognise the special characters and is throwing an exception. For example, when it encounters the accented e in "Gérer" it says:

      Encountered: "\u0161" (353), after : "\'G\u00c3\u00a9rer les mod\u00c3"

    (an exception which is thrown from our application). I suspect that in order to stop this from happening I need to change the file.encoding property of the JVM. I tried to do this via System.setProperty(), but it has not stopped the error from occurring. Are there any suggestions for what I could do? I was thinking about setting the basic locale of the Solaris platform in /etc/default/init to UTF-8. Does anyone think this might help? Any thoughts are much appreciated.

    Read the article

  • [Integrity] of an HTTP POST request from iPhone to web server

    - by gotye
    Hey everyone, I am currently building a module that makes it possible to comment on a news item, and as you probably understood, I will need to insert this new comment into my web database. I know this stuff can be very tedious, so I would like to know if someone has a method which could assure the integrity of the request by checking some of the usual important things, like: trimming the string, encoding it, escaping it, and so on... If you have some tips to achieve a good insert, do not hesitate ;) Thank you for your time, Gotye.
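    On the database side, a hedged sketch of the usual advice (table, column, and driver are all made up for illustration; sqlite3 stands in for whatever the web backend really uses): trim in application code, but let a parameterised query handle the escaping.

      import sqlite3

      conn = sqlite3.connect(':memory:')   # hypothetical database; a file path in real use
      conn.execute('CREATE TABLE comments (body TEXT)')

      raw_comment = '  Une r\u00e9ponse avec des accents  '
      comment = raw_comment.strip()        # trim whitespace in application code
      conn.execute('INSERT INTO comments (body) VALUES (?)', (comment,))
      conn.commit()                        # the driver handled escaping and encoding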

    Read the article

  • How to search for a string including spaces in Objective-C?

    - by AlexCu
    I have a really basic command-line program, in Objective-C, that searches for user-inputted information. Unfortunately, the code will only read the first word in a series of words that the user enters. For example, if the user enters "Apples are great", only "Apples" is kept (and hence searched later on), excluding the "are great" part of the sentence. Here's what I have so far:

      char enteredQuery[128];  // array to hold the scanf string
      NSString *searchQuery;   // ending NSString to hold and compare the user-inputted data

      NSLog(@"Enter search query:");
      scanf("%s", enteredQuery); // will read the next line
      // converts scanf data into an NSString
      searchQuery = [NSString stringWithCString:enteredQuery encoding:NSASCIIStringEncoding];

    I know it's got to do with me using scanf or the character-encoding conversion, but I can't seem to figure it out. Any help in solving the problem is very appreciated! Thanks.

    Read the article

  • Why does the Solr admin query page interpret UTF-8 as ISO-8859-1?

    - by Scott Chu
    I deployed a war to my Tomcat 6.0.35 on Win7 64-bit, and when I use the full-interface query page (I mean form.jsp) in Solr Admin to query two Chinese characters (say they're C1C2), the debug info shows:

      <lst name="debug">
        <str name="rawquerystring">æ°è</str>
        <str name="querystring">æ°è</str>
        <str name="parsedquery">NEWSID:æ°è</str>
        <str name="parsedquery_toString">NEWSID:æ°è</str>
        ...

    You can see C1C2 becomes æ°è. I deployed the same war file to Tomcat on Linux and on another Win7 64-bit machine belonging to a colleague, and the encoding behaves correctly there. Does anyone know why, and how can I avoid this problem? Thanks in advance!

    Read the article

  • Manipulating both Unicode and ASCII character sets in C#

    - by Murlex
    I have this mapping in my C# application:

      string[,] unicode2Ascii = { { "&#3001;", "\x86" } };

    &#3001; is the unicode value for the Tamil literal "ஹ"; this is the raw hex literal for the unicode value saved by MS Word as a byte sequence. I am trying to map these unicode value "strings" to a hex value under 255 (so as to accommodate non-unicode-supported systems). I am trying to use string.Replace like this:

      S = S.Replace(unicode2Ascii[0, 0], unicode2Ascii[0, 1]);

    However the resulting output has a ? instead of the actual hex 0x86 stored. Any pointer on how I could set the encoding for the second element of that array to something like windows-1252? Or is there a better way to do this conversion? Thanks in advance.
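    For comparison, a hedged sketch of the same idea in Python (the mapping is the one from the question): build an explicit code-point translation table, then serialise through a codec that passes the 0x80-0x9F range through byte-for-byte. Note that latin-1 does this, while cp1252 would reject U+0086, since byte 0x86 means "†" there:

      table = {0x0BB9: 0x86}              # U+0BB9 TAMIL LETTER HA -> stand-in byte 0x86
      text = '\u0bb9 plus plain ASCII'
      mapped = text.translate(table)      # swaps the code point, leaves ASCII alone
      print(mapped.encode('latin-1'))     # b'\x86 plus plain ASCII'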

    Read the article

  • How to get rid of "d»z" or "" characters

    - by Cassandra
    I have a website based on Umbraco 5. I have installed a contact form plugin (http://cultivjupitercontact.codeplex.com/), and on the web page, at the end of this contact form, there are always the characters "d»z". It looks like this:

      ...
      <input type="submit" value="Send" />
      </fieldset>
      <input name='uformpostroutevals' type='hidden' value='somevalue' /></form>d»z

    I suspect there is something wrong with the encoding. I have tried to change it (to ANSI or UTF-8 without BOM) but it didn't help. Perhaps I changed it in the wrong file, because I don't really know where exactly this "d»z" is coming from. All I know is that it came with this plugin. On a different server those extra characters are "". How can I get rid of those extra characters? Any help much appreciated!

    Read the article

  • Python "string_escape" vs "unicode_escape"

    - by Mike Boers
    According to the docs, the builtin string encoding string_escape:

      Produce[s] a string that is suitable as string literal in Python source code

    ...while unicode_escape:

      Produce[s] a string that is suitable as Unicode literal in Python source code

    So they should have roughly the same behaviour. BUT, they appear to treat single quotes differently:

      >>> print """before '" \0 after""".encode('string-escape')
      before \'" \x00 after
      >>> print """before '" \0 after""".encode('unicode-escape')
      before '" \x00 after

    string_escape escapes the single quote while the Unicode one does not. Is it safe to assume that I can simply do:

      >>> escaped = my_string.encode('unicode-escape').replace("'", "\\'")

    ...and get the expected behaviour?
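    One hedged way to test that assumption (Python 2 semantics assumed, since string_escape does not exist in Python 3): escape, patch the quotes, then evaluate the resulting literal and compare. The eval here is only a round-trip check, not production code:

      original = u"it's a test \0 with a 'quote'"
      escaped = original.encode('unicode-escape').replace("'", "\\'")
      assert eval("u'" + escaped + "'") == original
      print(escaped)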

    Read the article

  • Get rid of gray brackets around editable text in restricted Word docs

    - by Brendan
    I'm trying to work out a problem in Word that I thought was simply a glitch in 2003, until we upgraded to 2010 and the problem persisted. For our corporate letterhead, we set up the template with placeholder text, highlight the text, and then make the document read-only with the exception of the selected text. The editable text turns yellow and gains brackets around it. Once these brackets appear, they always show on the screen. That I can handle, though I'd like to learn how to hide them on-screen if that's possible. When the document is printed while protected, it works fine. When the document is printed while NOT protected, part of the bracket shows up on the paper! I guess the ultimate question is: how can I get rid of the brackets altogether? I can see why they exist, but in my use case they create more problems than they solve. I'd like someone to be able to read the doc without seeing brackets, and I'd like other people in my department to be able to print without having to re-restrict the document first. I tried turning off bookmarks, because that's what came up when I searched around, but that didn't do anything.

    Read the article

  • Setting values and display text in Android Spinner

    - by kaibuki
    Hi, I need help in setting up the value and display text in a spinner. As of now I am populating my spinner with an array adapter, e.g. mySpinner.setAdapter(myAdapter), and as far as I know, after doing this the display text and the value of the spinner at the same position are the same. The other attribute that I can get from the spinner is the position of the item. Now, in my case, I want to make the spinner like the drop-down box we have in .NET, which holds a text and a value, where the text is displayed and the value sits at the back end. So when the drop-down box changes, I can use either its selected text or its value. But that's not happening in the Android spinner's case. For example:

      Text        Value
      Cat         10
      Mountain    5
      Stone       9
      Fish        14
      River       13
      Lion        17

    So from the above array I am only displaying the text, and what I want is that when the user selects an item, I get its value, i.e. when Mountain is selected I get 5. I hope this example made my question a bit clearer... thanks.

    Read the article

  • iTextSharp error - cannot convert type 'Collections.Generic.List' to 'iTextSharp.text.Element'

    - by mike
    I am trying to export a PDF file using ASPX and C#. I got the following error:

      Cannot implicitly convert type 'System.Collections.Generic.List<IElement>' to 'iTextSharp.text.Element'

    I have the following code:

      using iTextSharp.text;
      using iTextSharp.text.pdf;
      using iTextSharp.text.html.simpleparser;

      StringBuilder strB = new StringBuilder();
      document.Open();
      if (text.Length.Equals(0)) // export the grid
      {
          GridView1.DataBind();
          using (StringWriter sWriter = new StringWriter(strB))
          {
              using (HtmlTextWriter htWriter = new HtmlTextWriter(sWriter))
              {
                  GridView1.RenderControl(htWriter);
              }
          }
      }
      else // export the text
      {
          strB.Append(text);
      }
      using (TextReader sReader = new StringReader(strB.ToString()))
      {
          StyleSheet styles = new StyleSheet();
          List<Element> list = new List<Element>();
          list = HTMLWorker.ParseToList(sReader, styles);
          foreach (IElement elm in list)
          {
              document.Add(elm);
          }
      }

    I get the error on this line:

      list = HTMLWorker.ParseToList(sReader, styles);

    It's the first time I've tried to export PDF files. I tried to cast the list element, but this did not solve my error. Any advice would be helpful!

    Read the article

  • jQuery: AJAX umlauts & special characters are a mess

    - by rayne
    I've just created my first AJAX function with jQuery, which actually works, but unfortunately the character encoding (for characters like ä, ö, ü, ß, å, ø) is a nightmare. My files and my database are all UTF-8. I've tried a multitude of options in the AJAX function and the PHP function, none of which were satisfactory. This is my AJAX:

      var dataString = {
          'name': name,
          'mail': mail
          // other stuff
      }
      $.ajax({
          type: "POST",
          url: "/post.php",
          data: dataString,
          contentType: "application/x-www-form-urlencoded;charset=UTF-8",
          cache: false,
          success: function(html){
              // do stuff
          }
      });

    I've tried it without contentType: "application/x-www-form-urlencoded;charset=UTF-8", and I've tried to wrap the affected data in encodeURIComponent(), neither of which worked. When I use this AJAX with htmlentities() in my PHP, my umlauts look like this in plain text:

      UE Ã?, AE Ã?, OE Ã?, ue ü, ae ä, oe o

    And like this in the database:

      UE Ãœ, AE Ä, OE Ö, ue ü, ae ä, oe o

    If I don't use htmlentities() but mysql_real_escape_string() instead (or neither), they look good in plain text, but they look like this in the database:

      AE Ä, OE Ö, UE Ãœ, ae ä, oe ö, ue ü

    I've been trying tons of options for hours now, but I can't find a solution that works. So far the only option I seem to have is having them look like a total mess in the database, but that would be very counterproductive if those data sets need to be edited.
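    For what it's worth, the "Ã¤"-style garbage is the signature of UTF-8 bytes being read back as Latin-1. A small Python sketch that reproduces (and reverses) the effect, which can help pin down which layer is re-decoding the data:

      s = 'ä'
      mojibake = s.encode('utf-8').decode('latin-1')         # 'Ã¤' -- the classic mess
      repaired = mojibake.encode('latin-1').decode('utf-8')  # back to 'ä'
      print(mojibake, repaired)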

    Read the article

  • Marshal.StringToCoTaskMemAnsi converting non-Latin characters when sending raw data to a printer

    - by rem
    For sending raw data to a thermal DATAMAX printer I'm using the RawPrinterHelper class from this Microsoft KB article. When a string sent to the printer contains only Latin characters, everything is OK. But non-Latin (in my case Russian) characters in a string are not printed correctly. I think the problem is in using the Marshal.StringToCoTaskMemAnsi method for converting the string:

      public static bool SendStringToPrinter(string szPrinterName, string szString)
      {
          IntPtr pBytes;
          Int32 dwCount;
          // How many characters are in the string?
          dwCount = szString.Length;
          // Assume that the printer is expecting ANSI text, and then convert
          // the string to ANSI text.
          pBytes = Marshal.StringToCoTaskMemAnsi(szString);
          // Send the converted ANSI string to the printer.
          SendBytesToPrinter(szPrinterName, pBytes, dwCount);
          Marshal.FreeCoTaskMem(pBytes);
          return true;
      }

    Just to note, Russian characters in the string are put in hex format, like "\x83", but nevertheless the method doesn't put this hex value into unmanaged memory as-is; it converts it, I think, in accordance with the ANSI code page, to a character which the printer then cannot read correctly. If I compose a file using a hex editor, put the correct hex values in place of the non-Latin characters, and then send the file to the printer using another method from the same class, SendFileToPrinter, everything, including the Russian characters, is printed correctly. How could the problem with sending a string containing non-Latin characters be solved in this case?

    Read the article
