Search Results

Search found 5325 results on 213 pages for 'huffman encoding'.

Page 33/213 | < Previous Page | 29 30 31 32 33 34 35 36 37 38 39 40 | Next Page >

Debugging ASP.NET Strings Downloaded to Browser (MontrÃ©al instead of Montréal)

- by jdk

I'm downloading a vCard to the browser using Response.Write to output .NET strings with special accented characters. The MIME type is text/x-vcard, and French characters appear wrong in Outlook: for example, the .NET string Montréal;Québec shows as MontrÃ©al QuÃ©bec in the browser. I'm using this vCard generator code from CodeProject.com. I've played with the System.Encoding sample code at the bottom of this linked MSDN page to convert the Unicode string into bytes and then write the ASCII bytes, but then I get Montr?al Qu?bec (progress but not a win). I've also tried setting the content type of the response to both us-ascii and utf-8. If I open the downloaded vCard in Windows Notepad, save it as ANSI text (instead of the default Unicode format) and open it in Outlook, it's okay. So my assumption is that I need to cause a download in the ANSI charset, but I'm unsure whether I'm doing it wrong or have a misunderstanding of where to start.

Update: Looking at the raw HTTP, it appears my French characters are being downloaded in the unexpected format, so it looks like I need to do some work on the server side...
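
The two garbled forms in this question are what UTF-8 bytes look like when read with a single-byte Windows code page, and what a forced ASCII conversion produces. A minimal sketch in Python (not the asker's ASP.NET code; the variable names are illustrative, and treating "ANSI" as Windows-1252 is an assumption about the asker's system):

```python
# Reproduce the two garbled forms described in the question.
original = "Montréal"

# UTF-8 encodes "é" as the two bytes C3 A9.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as Windows-1252 turns each byte into its own character.
mojibake = utf8_bytes.decode("cp1252")

# Forcing the text into ASCII replaces "é" with "?".
ascii_lossy = original.encode("ascii", errors="replace").decode("ascii")

# Assumption: "ANSI" in Notepad means Windows-1252, where "é" is the byte E9.
ansi_bytes = original.encode("cp1252")

print(mojibake)     # MontrÃ©al
print(ascii_lossy)  # Montr?al
```

If the consumer only accepts a single-byte code page, the bytes written to the response would need to match `ansi_bytes` rather than `utf8_bytes`.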

How can I extract similarities/patterns from a collection of binary strings?

- by JohnIdol

I have a collection of binary strings of a given size encoding effective solutions to a given problem. By looking at them, I can spot obvious similarities and intuitively see patterns of symmetry and periodicity. Are there mathematical/algorithmic tools I can "feed" this set of strings to and get results that might give me an idea of what the strings have in common? By doing so I would be able to impose a structure (or at least favor some features over others) on candidate solutions in order to greatly reduce the search space, maximizing the chances of finding optimal solutions for my problem (I am using genetic algorithms as the search tool, but this is not pivotal to the question). Any pointers/approaches appreciated.

Problem with UTF-8

- by Pablo Fernandez

I'm using Castor as an OXM mapper, and I'm having a problem with UTF-8 encoding. The code here shows the issue:

    // Marshaller configuration
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    OutputStreamWriter os = new OutputStreamWriter(baos, UTF_8);
    Marshaller marshaller = new Marshaller(os);
    marshaller.setSuppressXSIType(true);

    // Mappings configuration
    Mapping map = new Mapping();
    map.loadMapping(MarshallingService.class.getResource(MAPPINGS_PATH));
    marshaller.setMapping(map);

    // Example
    // BEFORE MARSHALLING: this prints the UTF-8 chars correctly
    object.getName();
    marshaller.marshal(object);
    // AFTER MARSHALLING: this returns the characters like \435\235\654\345
    return baos.toString(UTF_8);

PHP File unreadable after being downloaded

- by Drew

Hi, I have a script that creates a file and stores it on the server. The file is encoded in UTF-8 and is a kind of XML file for the cmap software. If I open the file directly from the server, there is no problem and the file can be read. I am forcing a download of this file when a user goes to a specific URL. After such a download, the file is unreadable by the cmap software. I have to go into my text editor (Notepad++) and change the encoding from UTF-8 to UTF-8 without BOM. Am I sending the wrong headers? Is PHP doing something to the file when it is downloading it? Any advice on this would really be appreciated. Cheers, Drew
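
The Notepad++ workaround suggests the downloaded bytes begin with a UTF-8 byte order mark (EF BB BF). A minimal sketch in Python, not PHP, of detecting and removing it; the function name and the sample payload are hypothetical:

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark (EF BB BF), if any."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical payload standing in for the generated file.
with_bom = codecs.BOM_UTF8 + b"<?xml version='1.0'?><cmap/>"
clean = strip_utf8_bom(with_bom)

print(clean[:5])  # b'<?xml'
```

Comparing the first three bytes of the file on the server with those of the downloaded copy would show whether the mark is added during the download step or was present in a script file that produces output.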

Java servlet and UTF-8 problem

- by Gabriele

I have a problem with UTF-8. My client (written in GWT) makes a request to my servlet, with some parameters in the URL, as follows:

    http://localhost:8080/servlet?param=value

When I retrieve the URL in the servlet, I have a problem with UTF-8 characters. I use this code:

    protected void service(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        request.setCharacterEncoding("UTF-8");
        String reqUrl = request.getRequestURL().toString();
        String queryString = request.getQueryString();
        System.out.println("Request: "+reqUrl + "?" + queryString);
        ...

So, if I call this URL:

    http://localhost:8080/servlet?param=così

the result is like this:

    Request: http://localhost:8080/servlet?param=cos%C3%AC

What can I do to set up the character encoding properly?
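
The value printed here is the raw, percent-encoded query string: %C3%AC is the UTF-8 encoding of "ì" written as escaped bytes. A small sketch in Python (not the servlet API) showing that decoding the escapes as UTF-8 gives back the original text:

```python
from urllib.parse import quote, unquote

raw_query_value = "cos%C3%AC"   # the value as it appears in the raw query string

# Percent-decoding with UTF-8 turns the two escaped bytes back into one character.
decoded = unquote(raw_query_value, encoding="utf-8")
print(decoded)                          # così

# Encoding the text again gives back the escaped form.
print(quote("così", encoding="utf-8"))  # cos%C3%AC
```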

Debugging ASP.NET Strings Downloaded to Browser

- by jdk

I'm downloading a vCard to the browser using Response.Write to output .NET strings with special accented characters. The MIME type is text/x-vcard, and French characters appear wrong in Outlook: for example, the .NET string Montréal;Québec shows as MontrÃ©al QuÃ©bec in the browser. I'm using this vCard code from CodeProject.com. I've played with the System.Encoding sample code at the bottom of this linked MSDN page to convert the Unicode string into bytes and then write the ASCII bytes, but then I get Montr?al Qu?bec (progress but not a win). I've also tried setting the content type of the response to both us-ascii and utf-8. Apparently the vCard file downloads as Unicode. If I save it as ASCII text and open it in Outlook, it's okay. So my assumption is that I need to cause a download of ASCII, but I'm unsure whether I'm doing it wrong or have a misunderstanding of where to start.

C#: How to print a unicode string to console?

- by Lopper

How do I print out the value of a Unicode string in C# to the console?

    byte[] unicodeBytes = new byte[] {0x61, 0x70, 0x70, 0x6C, 0x69, 0x63, 0x61, 0x74, 0x69, 0x6F, 0x6E, 0x2F, 0x70, 0x63, 0x61, 0x70};
    string unicodeString = Encoding.Unicode.GetString(unicodeBytes);
    Console.WriteLine(unicodeString);

What I get for the above is "?????????". However, I see the following in the Autos and Locals windows in debug mode for the value of unicodeString, which is what I wanted to display: "??????????". How do I print the correct result to the console, as shown in the Autos and Locals windows?
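
Encoding.Unicode in .NET is UTF-16 little-endian, so each pair of bytes becomes one character. A sketch in Python (not C#) that decodes the same 16 bytes both ways; it shows that the bytes are plain ASCII text and that the UTF-16 reading yields 8 unrelated characters:

```python
unicode_bytes = bytes([0x61, 0x70, 0x70, 0x6C, 0x69, 0x63, 0x61, 0x74,
                       0x69, 0x6F, 0x6E, 0x2F, 0x70, 0x63, 0x61, 0x70])

# One byte per character: the data is ASCII text.
as_ascii = unicode_bytes.decode("ascii")

# Two bytes per character: 16 bytes collapse into 8 code units.
as_utf16 = unicode_bytes.decode("utf-16-le")

print(as_ascii)       # application/pcap
print(len(as_utf16))  # 8
```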

How can I encode four unsigned bytes (0-255) to a float and back again using HLSL?

- by Statement

Hello! I am facing a task where one of my HLSL shaders requires multiple texture lookups per pixel. My 2D textures are fixed to 256*256, so two bytes should be sufficient to address any given texel given this constraint. My idea is then to put two xy-coordinates in each float, giving me eight xy-coordinates in pixel space when packed in a Vector4 format image. These eight coordinates are then used to sample another texture (or textures). The reason for doing this is to save graphics memory and to try to optimize processing time, since then I don't require multiple texture lookups. By the way: does anyone know if encoding/decoding 16 bytes from/to 4 floats using 1 sampling is slower than 4 samplings with unencoded data?

ruby 1.9: invalid byte sequence in UTF-8

- by Marc Seeger

I'm writing a crawler in ruby (1.9) that consumes lots of HTML from a lot of random sites. When trying to extract links, I decided to just use .scan(/href="(.*?)"/i) instead of nokogiri/hpricot (major speedup). The problem is that I now receive a lot of "invalid byte sequence in UTF-8" errors. From what I understood, the net/http library doesn't have any encoding specific options and the stuff that comes in is basically not properly tagged. What would be the best way to actually work with that incoming data? I tried .encode with the replace and invalid options set, but no success so far...
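
The general idea of decoding untrusted bytes while substituting a placeholder for invalid sequences can be sketched as follows. This is Python, not Ruby, and the sample input is made up for illustration:

```python
import re

# Made-up input: the Latin-1 byte E9 is not a valid UTF-8 sequence here.
raw = b'<a href="caf\xe9.html">'

# Invalid bytes become U+FFFD instead of raising an error.
text = raw.decode("utf-8", errors="replace")

links = re.findall(r'href="(.*?)"', text, flags=re.IGNORECASE)
print(links)
```

The link is still extracted, with the undecodable byte replaced by U+FFFD.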

Saving CSV in cocoa

- by happyCoding25

Hello, I need to make a CSV file in Cocoa. To see how to set it up, I created one in Numbers and opened it with TextEdit. It looked like this:

    Results,,,,,,,,,,,,
    ,,,,,,,,,,,,
    A,10,,,,,,,,,,,
    B,10,,,,,,,,,,,
    C,10,,,,,,,,,,,
    D,10,,,,,,,,,,,
    E,10,,,,,,,,,,,

So to replicate this in Cocoa I used:

    NSString *CVSData = [NSString stringWithFormat:@"Results\n,,,,,,,,,,,,\nA,%@,,,,,,,,,,,\nB,%@,,,,,,,,,,,\nC,%@,,,,,,,,,,,\nD,%@,,,,,,,,,,,\nE,%@,,,,,,,,,,,",[dataA stringValue], [dataB stringValue], [dataC stringValue], [dataD stringValue], [dataE stringValue]];

Then:

    [CVSData writeToFile:[savePanel filename] atomically:YES];

But when I try to open the saved file with Numbers I get the error: “Untitled.cvs” could not be handled because Numbers cannot open files in the “Numbers Document” format. Could this have something to do with the way Cocoa is encoding the file? Thanks for any help.

problem using base64 encoder and InputStreamReader

- by karoberts

I have some CLOB columns in a database that I need to put Base64-encoded binary files in. These files can be large, so I need to stream them; I can't read the whole thing in at once. I'm using org.apache.commons.codec.binary.Base64InputStream to do the encoding, and I'm running into a problem. My code is essentially this:

    FileInputStream fis = new FileInputStream(file);
    Base64InputStream b64is = new Base64InputStream(fis, true, -1, null);
    InputStreamReader reader = new InputStreamReader(b64is);
    preparedStatement.setCharacterStream(1, reader);

When I run the above code, I get one of these during the execution of the update:

    java.io.IOException: Underlying input stream returned zero bytes

It is thrown deep in the InputStreamReader code. Why would this not work? It seems to me like the reader would attempt to read from the Base64 stream, which would read from the file stream, and everything should be happy.
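
Independently of the Commons Codec behaviour, Base64 can be streamed by encoding the input in chunks whose size is a multiple of 3 bytes, so that padding only appears at the very end. A minimal sketch in Python (not Java); `base64_chunks` is a hypothetical helper, and it assumes the stream returns full chunks except at end of file, which holds for `io.BytesIO` and buffered files:

```python
import base64
import io

def base64_chunks(stream, chunk_size=3 * 1024):
    """Yield Base64 text for a binary stream, one chunk at a time.

    chunk_size must be a multiple of 3 so that no padding appears mid-stream.
    """
    if chunk_size % 3:
        raise ValueError("chunk_size must be a multiple of 3")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield base64.b64encode(chunk).decode("ascii")

data = bytes(range(256)) * 40
streamed = "".join(base64_chunks(io.BytesIO(data), chunk_size=300))

# The concatenated chunks equal the one-shot encoding of the whole input.
print(streamed == base64.b64encode(data).decode("ascii"))  # True
```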

Why Read In UTF-16LE File Won't Convert "\r\n" Into "\n" In Windows

- by Dbger

I am using Perl to read UTF-16LE files in Windows 7. If I read in an ASCII file with the following code:

    open CUR_FILE, "<", $asciiFile;

then each "\r\n" in the file is converted into a "\n" in memory. If I read in a UTF-16LE (Windows code page 1200) file with the following code:

    open CUR_FILE, "<:encoding(UTF-16LE)", $utf16leFile;

then "\r\n" is kept unchanged. This inconsistency causes problems when I try to match lines with line breaks using regular expressions. My question is: is this how Unicode works in Perl and Windows? Or am I using the wrong code? Thanks so much!
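
One likely explanation is that the CRLF translation is applied to the raw bytes before decoding, and in UTF-16LE a line break is the four bytes 0D 00 0A 00, so the byte pair 0D 0A never occurs. A sketch in Python (not Perl) that illustrates the byte layout and the decode-then-translate order:

```python
import io

text = "line one\r\nline two\r\n"
utf16_bytes = text.encode("utf-16-le")

# The byte pair 0D 0A never appears, so a byte-level CRLF filter finds nothing.
has_byte_crlf = b"\r\n" in utf16_bytes
print(has_byte_crlf)                # False
print("\r\n".encode("utf-16-le"))   # b'\r\x00\n\x00'

# Decoding first and translating newlines afterwards gives the expected text.
reader = io.TextIOWrapper(io.BytesIO(utf16_bytes), encoding="utf-16-le", newline=None)
decoded = reader.read()
print(decoded.splitlines())         # ['line one', 'line two']
```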

Handling over-long UTF-8 sequences

- by Grant McLean

I've just been reworking my Encoding::FixLatin Perl module to handle over-long UTF-8 byte sequences and convert them to the shortest normal form. My question is quite simply "is this a bad idea"? A number of sources (including this RFC) suggest that any over-long UTF-8 should be treated as an error and rejected. They caution against "naive implementations" and leave me with the impression that these things are inherently unsafe. Since the whole purpose of my module is to clean up messy data files with mixed encodings and convert them to nice clean UTF-8, this seems like just one more thing I can clean up so the application layer doesn't have to deal with it. My code does not concern itself with any semantic meaning the resulting characters might have; it simply converts them into a normalised form. Am I missing something? Is there a hidden danger I haven't considered?
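
The caution about "naive implementations" is usually illustrated with a case like the following: the bytes C0 AF are an over-long two-byte form of "/", so a check made on the raw bytes does not see a slash, while a lenient decoder produces one afterwards. A sketch in Python (not the module's Perl code; `naive_decode_2byte` is a hypothetical helper that deliberately skips the over-long check):

```python
overlong_slash = b"\xc0\xaf"   # over-long two-byte form of "/" (U+002F)

def naive_decode_2byte(seq: bytes) -> str:
    """Decode a two-byte UTF-8 sequence without checking for over-long forms."""
    code_point = ((seq[0] & 0x1F) << 6) | (seq[1] & 0x3F)
    return chr(code_point)

# A byte-level filter looking for "/" does not match the over-long form...
print(b"/" in overlong_slash)               # False
# ...but the lenient decoder turns it into a slash anyway.
print(naive_decode_2byte(overlong_slash))   # /

# Python's strict decoder rejects the sequence instead.
try:
    overlong_slash.decode("utf-8")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
print(strict_rejected)                      # True
```

Whether this matters for a clean-up tool depends on whether any validation runs on the data before the conversion rather than after it.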

HTTP Data chunks over multiple packets?

- by myforwik

What is the correct way for an HTTP server to send data over multiple packets? For example, I want to transfer a file. The first packet I send is:

    HTTP/1.1 200 OK
    Content-type: application/force-download
    Content-Type: application/download
    Content-Type: application/octet-stream
    Content-Description: File Transfer
    Content-disposition: attachment; filename=test.dat
    Content-Transfer-Encoding: chunked

    400
    <first 1024 bytes here>
    400
    <next 1024 bytes here>
    400
    <next 1024 bytes here>

Now I need to make a new packet. If I just send:

    400
    <next 1024 bytes here>

all the clients close their connections on me and the files are cut short. What headers do I put in a second packet to continue with the data stream?
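
For reference, in HTTP/1.1 the header that announces a chunked body is Transfer-Encoding: chunked (not Content-Transfer-Encoding). The headers are sent once, each chunk is framed as a hexadecimal length, CRLF, the data, and CRLF, and a zero-length chunk ends the body. How the bytes are split across TCP packets does not matter. A minimal sketch of the framing in Python; `encode_chunk` is a hypothetical helper and the payload is dummy data:

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: hex length, CRLF, payload, CRLF."""
    return b"%x\r\n" % len(data) + data + b"\r\n"

# A zero-length chunk followed by an empty trailer section ends the body.
LAST_CHUNK = b"0\r\n\r\n"

headers = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"Content-Disposition: attachment; filename=test.dat\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
)

payload = b"x" * 2048   # dummy file contents
body = encode_chunk(payload[:1024]) + encode_chunk(payload[1024:]) + LAST_CHUNK

print(encode_chunk(b"abc"))  # b'3\r\nabc\r\n'
```

A 1024-byte chunk is announced as 400 because 0x400 is 1024.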

Python unicode issues (2.6)

- by ephemeralis

I'm currently working on an IRC bot for a multi-lingual channel, and I'm encountering some issues with Unicode which are proving nearly impossible to solve. No matter what configuration of Unicode encoding I try, the list function which the code below sits within either just flat out does nothing (c.notice is a class function which sends a NOTICE command to the IRC server) or, when it does do something, spits out something which obviously isn't encoded correctly. The command should be sending ??, but instead it seems hellbent on sending å¤©å with a previous configuration of the same commands. The one I have specified below is of the 'send nothing' variety. I haven't worked with Unicode before this, and thus I am quite stuck. I'm also positive that I'm doing this completely wrong as a consequence. (compileCMD just takes a list and spits out a single string of all the elements within the list.)

    uk = self.compileCMD(self.faq.keys(),0)
    ukeys = unicode(uk,"utf-8").encode("utf-8")
    c.notice(nick, u"Current list of faq entries: %s" % (uk))

AutoKey - clipboard.get_selection() function fails on certain strings

- by LonnieBest

I've simplified my script so you can focus on the essence of my problem. In AutoKey (not AutoHotKey), I made a hotkey (shift-alt-T) that runs this script on any string I have highlighted (in gedit, for example, but in any other GUI editor too).

    strSelectedText = clipboard.get_selection()
    keyboard.send_keys(" " + strSelectedText)

The script modifies the highlighted text and adds a space to the beginning of the string. It works for most strings I highlight, but not this one:

    * Copyright © 2008–2012 Lonnie Best. Licensed under the MIT License.

It works for this string:

    * Add a Space 2.0.1

but not on this one:

    * Add a Space 2.0.1 –

At the Python command prompt, there is no problem with any of those strings, yet the clipboard.get_selection() function seems to get corrupted by them. I'm rather new to Python scripting, so I'm not sure if this is an AutoKey bug, or if I'm missing some knowledge about encoding/preparing strings in Python. Please help. I'm doing this on Ubuntu 12.04:

    sudo apt-get install autokey-qt

Efficient way to ASCII encode UTF-8

- by Andreas Gohr

I'm looking for a simple and efficient way to store UTF-8 strings in 7-bit ASCII. By efficient I mean the following:

- all ASCII chars in the input should stay ASCII chars in the output
- the resulting string should be as short as possible
- the operation needs to be reversible without any data loss
- there should be no restriction on the input length
- the whole UTF-8 range should be allowed

My first idea was to use Punycode (IDNA) as it fits the first three requirements, but it fails at the last two. Can anyone recommend an alternative encoding scheme? Even better if there's some code available to look at.
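
One candidate that is reversible and produces only 7-bit output is UTF-7, which Python ships as a built-in codec. This is a sketch of one option under that assumption, not a claim that it is the best fit: letters, digits and spaces pass through unchanged, but "+" is escaped, so the first requirement is only approximately met. The helper names are hypothetical:

```python
def to_ascii(text: str) -> bytes:
    """Encode text as 7-bit ASCII bytes using Python's built-in UTF-7 codec."""
    return text.encode("utf-7")

def from_ascii(data: bytes) -> str:
    """Reverse to_ascii without data loss."""
    return data.decode("utf-7")

sample = "così, Montréal"
encoded = to_ascii(sample)

print(all(byte < 128 for byte in encoded))  # True
print(from_ascii(encoded) == sample)        # True
print(to_ascii("plain text"))               # b'plain text'
```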

Load JSON in Python as header character set

- by mridang

Hi everyone, I've always found character sets and encodings complicated to understand and here I'm faced with another problem. My apologies for any inaccuracies. I'll do my best. I'm requesting data from a server which returns JSON. In the HTTP headers it also returns the character set like so: Content-Type: text/html; charset=UTF-8 I'm using the JSON library in Python to load the JSON using the json.loads method. When I pass it the returned JSON, it gives me a dictionary in Unicode. I've Googled around and I know that JSON should return Unicode as JavaScript strings are Unicode objects. How can I load the JSON as UTF-8? I would like to use the same encoding as specified in the response header. I've read this post but it didn't help. Thank you.
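
A sketch of the usual pattern, written for Python 3 (the question predates it): decode the response bytes using the charset from the header, let json.loads return Unicode strings, and encode back to bytes only where a byte string is really needed. The helper name and the sample body are hypothetical:

```python
import json
from email.message import Message

def charset_from_content_type(header_value: str, default: str = "utf-8") -> str:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset() or default

# Hypothetical response body, as bytes off the wire.
raw_body = '{"city": "Montréal"}'.encode("utf-8")
charset = charset_from_content_type("text/html; charset=UTF-8")

data = json.loads(raw_body.decode(charset))   # string values are Unicode text
city_bytes = data["city"].encode(charset)     # encode only at the boundary

print(charset)  # utf-8
```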

Why ????? is displayed instead of non-english characters?

- by smhnaji

I first created a simple HTML page that uses UTF-8 as its character encoding. Then I moved the HTML content into a view in CodeIgniter and it was still OK (I had used non-English characters, which were fine as always). I added a simple dynamic feature (there is a contact-us form in it that emails users' feedback to the site admin). Please note that the characters were OK on localhost (which is a LAMP server running on Ubuntu 12.04 LTS).

The strange thing is that when I uploaded the app to the server, all Persian characters are shown as ???? (for example, ??? (which means Name) is shown as ??? and so on...). I have not even connected to MySQL or any other DBMS. It's the only page in the website (it's more of an under-construction page) and nothing else has been used in it. Maybe I should state that I have also used the session library to thank the user after his feedback was sent to the admins, nothing else. I really have no idea about the problem.

UPDATE: Now I can see that the problem is only with cPanel. On DirectAdmin I can see that everything is normal. Both Chromium and Firefox DO use UTF-8 as the page's character encoding. The URL is http://WEBSITE.COM/dmf/dynamic/ (dmf is the abbreviation of the project name!). There is nothing non-English in the URL. The page's code is as follows:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>??? ???????</title>
    <link rel="stylesheet" type="text/css" href="<?php echo base_url('template/css/style.css'); ?>" />
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"> </script>
    <script src="http://releases.flowplayer.org/5.1.1/flowplayer.min.js"></script>
    <link rel="stylesheet" type="text/css" href="http://releases.flowplayer.org/5.1.1/skin/minimalist.css" />
    </head>
    <body>
    <div id="wrapper">
    <header>
    <h1>??? ???????</h1>
    </header>
    <section id="box-container">
    <?php
    echo form_open('contact', "id='contact-us'");
    echo form_fieldset('???? ?? ??');
    if ($this->session->userdata('mailsent')) {
        echo '<div>??????? ???? ??? ????? ??</div>';
        $this->session->sess_destroy();
    }
    echo '<input tabindex="1" id="name-in" value="???" type="text" name="name"/> <input tabindex="2" id="mail-in" value="?????" type="email" name="email"/> <textarea tabindex="3" id="content-in" name="message">???????</textarea> <input tabindex="4" id="submit" type="submit" value="?????" />';
    echo '<div class="clear"></div>';
    echo form_fieldset_close();
    echo form_close();
    ?>
    <div id="sms-comp">
    <h2>?????? ??????</h2>
    <p> <span id="comp-title">?? ??? ????</span> ???? ??????? ???? ??? </p>
    </div>
    <div id="last-program">
    <h2>?????? ????? ??????</h2>
    <div class="flowplayer">
    <video id="my_video_1" width="212" height="126" poster="<?php echo base_url('template/images/img.jpg'); ?>" controls="controls" src="http://archive.org/download/Pbtestfilemp4videotestmp4/video_test.ogv" type='video/mp4'> </video>
    </div>
    </div>
    <div class="clear"></div>
    </section>
    </div>
    <footer> ????? ? ????? : <a href="http://powered-by.com/" target="_blank">????? ???</a> </footer>
    </body>
    </html>

What is "=C2=A0" in MIME encoded, quoted-printable text?

- by TheSoftwareJedi

This is an example raw email I am trying to parse:

    MIME-version: 1.0
    Content-type: text/html; charset=UTF-8
    Content-transfer-encoding: quoted-printable
    X-Mailer: Verizon Webmail
    X-Originating-IP: [x.x.x.x]

    =C2=A0test testing testing 123

What is =C2=A0? I have tried a half dozen quoted-printable parsers, but none handle this correctly. Honestly, for now, I'm coding:

    //TODO WTF
    encoded = encoded.Replace("=C2=A0", "");

Because I can't figure out why that text is there randomly within the MIME content, and isn't supposed to be rendered into anything. By just removing it, I'm getting the desired effect - but WHY?!
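
In quoted-printable, =C2=A0 stands for the two bytes C2 A0, which are the UTF-8 encoding of U+00A0, the no-break space. A short sketch in Python (not the asker's C#) that decodes the sample line in two steps, first quoted-printable and then the declared charset:

```python
import quopri

raw = b"=C2=A0test testing testing 123"

# Step 1: undo quoted-printable, giving the original bytes.
decoded_bytes = quopri.decodestring(raw)

# Step 2: interpret those bytes in the charset named in the Content-type header.
text = decoded_bytes.decode("utf-8")

print(decoded_bytes[:2])  # b'\xc2\xa0'
print(repr(text[0]))      # '\xa0'
```

A parser that skips the second step, or uses a single-byte charset for it, would show the two bytes as separate characters.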

Load JSON in Python as header character set

- by mridang

Hi everyone, I've always found character sets and encodings complicated to understand and here I'm faced with another problem. My apologies for any inaccuracies. I'll do my best. I'm requesting data from a server which returns JSON. In the HTTP headers it also returns the character set like so: Content-Type: text/html; charset=UTF-8 I'm using the JSON library in Python to load the JSON using the json.loads method. When I pass it the returned JSON, it gives me a dictionary in Unicode. I've Googled around and I know that JSON should return Unicode as JavaScript strings are Unicode objects. How can I load the JSON as UTF-8? I would like to use the same encoding as specified in the response header. I've read this post but it didn't help. Thank you.

Should I convert overlong UTF-8 strings to their shortest normal form?

- by Grant McLean

I've just been reworking my Encoding::FixLatin Perl module to handle overlong UTF-8 byte sequences and convert them to the shortest normal form. My question is quite simply "is this a bad idea"? A number of sources (including this RFC) suggest that any over-long UTF-8 should be treated as an error and rejected. They caution against "naive implementations" and leave me with the impression that these things are inherently unsafe. Since the whole purpose of my module is to clean up messy data files with mixed encodings and convert them to nice clean UTF-8, this seems like just one more thing I can clean up so the application layer doesn't have to deal with it. My code does not concern itself with any semantic meaning the resulting characters might have; it simply converts them into a normalised form. Am I missing something? Is there a hidden danger I haven't considered?

Stream/string/bytearray transformations in Python 3

- by Craig McQueen

Python 3 cleans up Python's handling of Unicode strings. I assume that, as part of this effort, the codecs in Python 3 have become more restrictive, judging by the Python 3 documentation compared to the Python 2 documentation. For example, codecs that conceptually convert a bytestream to a different form of bytestream have been removed:

    base64_codec
    bz2_codec
    hex_codec

And codecs that conceptually convert Unicode to a different form of Unicode have also been removed (in Python 2 it actually went between Unicode and bytestream, but conceptually it's really Unicode to Unicode, I reckon):

    rot_13

My main question is, what is the "right way" in Python 3 to do what these removed codecs used to do? They're not codecs in the strict sense, but "transformations". But the interface and implementation would be very similar to codecs. I don't care about rot_13, but I'm interested to know what would be the "best way" to implement a transformation of line ending styles (Unix line endings vs Windows line endings), which should really be a Unicode-to-Unicode transformation done before encoding to a byte stream, especially when UTF-16 is being used, as discussed in this other SO question.
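
One way to get the same transformations in Python 3 without going through the codec interface is to call the standard-library modules directly for the bytes-to-bytes cases and to use plain string operations for the text-to-text case. A minimal sketch; `to_windows_newlines` is a hypothetical helper, not a standard function:

```python
import base64
import binascii
import bz2

payload = b"hello world"

# Bytes-to-bytes transformations via their own modules.
b64 = base64.b64encode(payload)
hexed = binascii.hexlify(payload)
packed = bz2.compress(payload)

def to_windows_newlines(text: str) -> str:
    """Normalise all line endings to CRLF, working on text rather than bytes."""
    unix = text.replace("\r\n", "\n").replace("\r", "\n")
    return unix.replace("\n", "\r\n")

# Translate line endings first, then encode, so UTF-16 output stays correct.
utf16_out = to_windows_newlines("a\nb\r\nc").encode("utf-16-le")

print(b64)    # b'aGVsbG8gd29ybGQ='
print(hexed)  # b'68656c6c6f20776f726c64'
```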

Python BOM error in Ascii file

- by Intosia

I have a weird, annoying problem with Python 2.6. I'm trying to run this file (and the other) on my embedded Linux ARM board:

    http://svn.tuxisalive.com/software_suite_v3/smart-core/smart-server/trunk/TDSService.py

I get this error:

    File "tuxhttpserver.py", line 1
    SyntaxError: encoding problem: with BOM

I know that the error is about the BOM bytes, etc. BUT there are NO BOM bytes; it's plain ASCII. I checked with a hex editor, and the Linux file command says it's ASCII. I'm freaking out here... The code worked fine on my SheevaPlug (also an ARM-based system).

Batch convert latin-1 files to utf-8 using iconv

- by Jasmo

I have a PHP project on my OS X machine that is in Latin-1 encoding. Now I need to convert the files to UTF-8. I'm not much of a shell coder, and I tried something I found on the internet:

    mkdir new
    for a in ls -R *; do iconv -f iso-8859-1 -t utf-8 <"$a" new/"$a" ; done

But that does not create the directory structure, and it gives me a heck of a load of errors when run. Can anyone come up with a neat solution?
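
As an alternative to the shell loop, the conversion can be sketched in Python: walk the source tree, recreate each directory under a separate output root, and re-encode each file. `convert_tree` is a hypothetical helper; it treats every file as text, so binary files would need to be filtered out first, and the output root should sit outside the source tree:

```python
from pathlib import Path

def convert_tree(src_root: Path, dst_root: Path,
                 src_encoding: str = "iso-8859-1",
                 dst_encoding: str = "utf-8") -> int:
    """Copy every file under src_root to dst_root, re-encoding its text.

    Returns the number of files converted.
    """
    count = 0
    for src in src_root.rglob("*"):
        if not src.is_file():
            continue
        # Mirror the relative path and create missing directories.
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Work on bytes so line endings are left untouched.
        text = src.read_bytes().decode(src_encoding)
        dst.write_bytes(text.encode(dst_encoding))
        count += 1
    return count
```

A call such as `convert_tree(Path("project"), Path("project-utf8"))` would write the converted copy next to the original; both directory names are placeholders.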

