Search Results

Search found 28425 results on 1137 pages for 'source encoding'.

Page 121/1137 | < Previous Page | 117 118 119 120 121 122 123 124 125 126 127 128  | Next Page >

  • Properly handling unicode characters in Rails

    - by Gdeglin
    By default Rails allows users of our application to input non-utf8 data, such as: ¶®«¼ However when we attempt to retrieve the data from our database and render it in a template Rails incorrectly assumes that it is in UTF-8 format and throws an error. ArgumentError: invalid byte sequence in UTF-8 What is the best way to handle this? I have seen one fix that suggested sanitizing the data in every place the user can input it. However, that would involve changing a considerable amount of code and it would strip out the characters entirely. Ideally we would want some characters converted to their UTF-8 equivalents. Our environment: Ruby: 1.9.1 Rails 2.3.5 MySql Gem: 2.8.1 This is a serious and urgent problem for us so your answers are very appreciated!

    Read the article

  • Inconsistent get_class_methods vs method_exists when using UTF8 characters in PHP code

    - by coma
    I have this class in a UTF-8 encoded file called EnUTF8.Class.php: class EnUTF8 { public function ñññ() { return 'ñññ()'; } } and in another UTF-8 encoded file: require_once('EnUTF8.Class.php'); require_once('OneBuggy.Class.php'); $utf8 = new EnUTF8(); //$buggy = new OneBuggy(); echo (method_exists($utf8, 'ñññ')) ? 'ñññ() exists!' : 'ñññ() does not exist...'; echo "\n\n----------------------------------\n\n" print_r(get_class_methods($utf8)); echo "\n----------------------------------\n\n" echo $utf8->ñññ(); that produces the expected result: ñññ() exists! ---------------------------------- Array ( [0] => ñññ ) ---------------------------------- ñññ() but if... require_once('EnUTF8.Class.php'); require_once('OneBuggy.Class.php'); $utf8 = new EnUTF8(); $buggy = new OneBuggy(); echo (method_exists($utf8, 'ñññ')) ? 'ñññ() exists!' : 'ñññ() does not exist...'; echo "\n\n----------------------------------\n\n" print_r(get_class_methods($utf8)); echo "\n----------------------------------\n\n" echo $utf8->ñññ(); then the weirdness appears!!!: ñññ() does not exist! ---------------------------------- Array ( [0] => ñññ ) ---------------------------------- Fatal error: Call to undefined method EnUTF8::ñññ() in /var/www/test.php on line 16 Well, the thing is that OneBuggy.Class.php is UTF-8 encoded too and shares absolutly nothing with EnUTF8.Class.php so... where is the bug? UPDATED: Well, after a long debugging time I found this in OneBuggy.Class.php constructor: setlocale (LC_ALL, "es_ES@euro", "es_ES", "esp"); so I did... //setlocale (LC_ALL, "es_ES@euro", "es_ES", "esp"); and now it works but why?.

    Read the article

  • How to unescape special characters from BeautifulSoup output?

    - by Suhail
    Hi, I am facing issues with the special characters like ° and ® which represent the degree Fahrenheit sign and the registered sign, when i print the string the contains the special characters, it gives output like this: Preheat oven to 350&deg; F Welcome to Lorem Ipsum Inc&reg; Is there a way I can output the exact characters and not their codes? Please let me know.

    Read the article

  • Convert French to ASCII (French speakers are wanted)

    - by Andrey
    i need to convert French text to most correct analog in ASCII. Let me explain. In German you should convert ä to ae, this is not simple removing of diacritics, it is finding most correct analogue. Please help me with French. I found that there is no programmatic way to do it, i create Dictionary<char, string>. To convert (+ capitals): é, à, è, ù, â, ê, î, ô, û, ë, ï, ü, ÿ, ç. and any other you suggest! Please write suggested substitution in ascii. Thanks, Andrey.

    Read the article

  • What is the role/responsibility of a 'shell'?

    - by Rune
    Hi, I have been looking at the source code of the IronPython project and the Orchard CMS project. IronPython operates with a namespace called Microsoft.Scripting.Hosting.Shell (part of the DLR). The Orchard Project also operates with the concept of a 'shell' indirectly in various interfaces (IShellContainerFactory, IShellSettings). None of the projects mentioned above have elaborate documentation, so picking up the meaning of a type (class etc.) from its name is pretty valuable if you are trying to figure out the overall application structure/architecture by reading the source code. Now I am wondering: what do the authors of this source code have in mind when they refer to a 'shell'? When I hear the word 'shell', I think of something like a command line interpreter. This makes sense for IronPython, since it has an interactive interpreter. But to me, it doesn't make much sense with respect to a Web CMS. What should I think of, when I encounter something called a 'shell'? What is, in general terms, the role and responsibility of a 'shell'? Can that question even be answered? Is the meaning of 'shell' subjective (making the term useless)? Thanks.

    Read the article

  • Java UTF-8 to ASCII conversion with supplements

    - by bozo
    Hi, we are accepting all sorts of national characters in UTF-8 string on the input, and we need to convert them to ASCII string on the output for some legacy use. (we don't accept Chinese and Japanese chars, only European languages) We have a small utility to get rid of all the diacritics: public static final String toBaseCharacters(final String sText) { if (sText == null || sText.length() == 0) return sText; final char[] chars = sText.toCharArray(); final int iSize = chars.length; final StringBuilder sb = new StringBuilder(iSize); for (int i = 0; i < iSize; i++) { String sLetter = new String(new char[] { chars[i] }); sLetter = Normalizer.normalize(sLetter, Normalizer.Form.NFC); try { byte[] bLetter = sLetter.getBytes("UTF-8"); sb.append((char) bLetter[0]); } catch (UnsupportedEncodingException e) { } } return sb.toString(); } The question is how to replace all the german sharp s (ß, Ð, d) and other characters that get through the above normalization method, with their supplements (in case of ß, supplement would probably be "ss" and in case od Ð supplement would be either "D" or "Dj"). Is there some simple way to do it, without million of .replaceAll() calls? So for example: Ðonardan = Djonardan, Blaß = Blass and so on. We can replace all "problematic" chars with empty space, but would like to avoid this to make the output as similar to the input as possible. Thank you for your answers, Bozo

    Read the article

  • What is a good resource for HTML character codes -> glyph and...

    - by Ben
    Hi, I've already found a good site to convert HTML character codes to their respective glyphs: http://www.public.asu.edu/~rjansen/glyph_encoding.html However, I need a bit more information. Does anyone know of a site like the one above that also provides information on what type of character code it is? Meaning, is it a special character? Is the glyph visible? Etc... So far I have found some tables with this information, but they aren't as complete as the resource above. I would really like to get my hands on a complete table. Thanks, -Ben

    Read the article

  • Explanation of JAXB error: Invalid byte 1 of 1-byte UTF-8 sequence

    - by Marcus
    We're parsing an XML document using JAXB and get this error: [org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.] at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:315) What exactly does this mean and how can we resolve this?? We are executing the code as: jaxbContext = JAXBContext.newInstance(Results.class); Unmarshaller unmarshaller = jaxbContext.createUnmarshaller(); unmarshaller.setSchema(getSchema()); results = (Results) unmarshaller.unmarshal(new FileInputStream(inputFile)); Update Issue appears to be due to this "funny" character in the XML file: ¿ Why would this cause such a problem??

    Read the article

  • Filtering Wikipedia's XML dump: error on some accents

    - by streetpc
    I'm trying to index Wikpedia dumps. My SAX parser make Article objects for the XML with only the fields I care about, then send it to my ArticleSink, which produces Lucene Documents. I want to filter special/meta pages like those prefixed with Category: or Wikipedia:, so I made an array of those prefixes and test the title of each page against this array in my ArticleSink, using article.getTitle.startsWith(prefix). In English, everything works fine, I get a Lucene index with all the pages except for the matching prefixes. In French, the prefixes with no accent also work (i.e. filter the corresponding pages), some of the accented prefixes don't work at all (like Catégorie:), and some work most of the time but fail on some pages (like Wikipédia:) but I cannot see any difference between the corresponding lines (in less). I can't really inspect all the differences in the file because of its size (5 GB), but it looks like a correct UTF-8 XML. If I take a portion of the file using grep or head, the accents are correct (even on the incriminated pages, the <title>Catégorie:something</title> is correctly displayed by grep). On the other hand, when I rectreate a wiki XML by tail/head-cutting the original file, the same page (here Catégorie:Rock par ville) gets filtered in the small file, not in the original… Any idea ? Alternatives I tried: Getting the file (commented lines were tried wihtout success): FileInputStream fis = new FileInputStream(new File(xmlFileName)); //ReaderInputStream ris = ReaderInputStream.forceEncodingInputStream(fis, "UTF-8" ); //(custom function opening the stream, reading it as UFT-8 into a Reader and returning another byte stream) //InputSource is = new InputSource( fis ); is.setEncoding("UTF-8"); parser.parse(fis, handler); Filtered prefixes: ignoredPrefix = new String[] {"Catégorie:", "Modèle:", "Wikipédia:", "Cat\uFFFDgorie:", "Mod\uFFFDle:", "Wikip\uFFFDdia:", //invalid char "Catégorie:", "Modèle:", "Wikipédia:", // UTF-8 as ISO-8859-1 "Image:", "Portail:", "Fichier:", "Aide:", "Projet:"}; // those last always work

    Read the article

  • [Ruby] Why do I have to URI.encode even safe characters for Net::HTTP requests?

    - by Matthias
    I was trying to send a GET request to Twitter (user ID replaced for privacy reasons) using Net::HTTP: url = URI.parse("http://api.twitter.com/1/friends/ids.json?user_id=12345") resp = Net::HTTP.get_response(url) this throws an exception in Net::HTTP: NoMethodError: undefined method empty?' for #<URI::HTTP:0x59f5c04> from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:1470:ininitialize' just by coincidence, I stumbled upon a similar code snippet, which used URI.encode prior to URI.parse, so I copied that and tried again: url = URI.parse(URI.encode("http://api.twitter.com/1/friends/ids.json?user_id=12345")) resp = Net::HTTP.get_response(url) now it works fine, but why? There are no reserved characters that need escaping in the URL I mentioned, so why do I have to call URI.encode for get_response to succeed?

    Read the article

  • Turning HTML character entities to 'regular' letters... why is it only partially working?

    - by Jack W-H
    I'm using all of the below to take a field called 'code' from my database, get rid of all the HTML entities, and print it 'as usual' to the site: <?php $code = preg_replace('~&#x([0-9a-f]+);~ei', 'chr(hexdec("\\1"))', $code); $code = preg_replace('~&#([0-9]+);~e', 'chr("\\1")', $code); $code = html_entity_decode($code); ?> However the exported code still looks like this: progid:DXImageTransform.Microsoft.AlphaImageLoader(src=’img/the_image.png’); See what's going on there? How many other things can I run on the string to turn them into darn regular characters?! Thanks! Jack

    Read the article

  • Query MySQL with unicode char code.

    - by Ben
    Hi, I have been having trouble searching through a MySQL table, trying to find entries with the character (UTF-16 code 200E) in a particular column. This particular code doesn't have a glyph, so it doesn't seem to work when I try to paste it into my search term. Is there a way to specify characters as their respective code point instead for a query? Thanks, -Ben

    Read the article

  • .aspx character coding

    - by kwek-kwek
    I am having an problem. First time working with a windows server, do you know if there is any problem in character coding? My document is set to content="text/html; charset=UTF-8" but it's giving me funny words... you can check it here. This site is a pure HTML with few includes but anything else is just HTML. I can convert them to HTML entities but that is basically wasting my time. I never had this problem with any website I did except for this. Some others said "The problems seems to be that you have converted the text into utf-8 twice.". But how would Coverted it twice since dreamweaver should convert it for me but in this case it doesn't.

    Read the article

  • JS encodeURIComponent result different from the one created by FORM

    - by Marco Demaio
    I thought values entered in forms are properly encoded by browsers. But this simple test shows it's not true: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html><head> <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> <title></title> </head><body> <form id="test" action="test_get_vs_encodeuri.html" method="GET" onsubmit="alert(encodeURIComponent(this.one.value));"> <input name="one" type="text" value="Euro-€"> <input type="submit" value="SUBMIT"> </form> </body></html> When hitting submit button: encodeURICompenent encodes input value into "Euro-%E2%82%AC" while browser into the GET query writes only a simple "Euro-%80" Could somone explain? Or is encodeURIComponent doing unnecessary conversions?

    Read the article

  • display of umlauts in firefox

    - by Mike D
    I was doing some web searching and found some strange things involving umlauts. For example if you do a google or yahoo search for the word "nther" you are likely to find things like G&#xfc;nther which I take to be Gunther with an umlaut over the u. Now my question is what if anything can I do to cause these characters to be properly displayed by Firefox under windows XP? An amazing thing is that I had to introduce spaces in the G & # etc string otherwise it was properly displayed here as u with umlaut!

    Read the article

  • Serializing chinese characters with Xerces 2.6

    - by Gianluca
    I have a Xerces (2.6) DOMNode object encoded UTF-8. I use to read its TEXT element like this: CBuffer DomNodeExtended::getText( const DOMNode* node ) const { char* p = XMLString::transcode( node->getNodeValue( ) ); CBuffer xNodeText( p ); delete p; return xNodeText; } Where CBuffer is, well, just a buffer object which is lately persisted as it is in a DB. This works until in the TEXT there are just common ASCII characters. If we have i.e. chinese ones they get lost in the transcode operation. I've googled a lot seeking for a solution. It looks like with Xerces 3, the DOMWriter class should solve the problem. With Xerces 2.6 I'm trying the XMLTranscoder, but no success yet. Could anybody help?

    Read the article

  • base 64 URL decode with Ruby/Rails?

    - by seth.vargo
    I am working with the Facebook API and Ruby on Rails and I'm trying to parse the JSON that comes back. The problem I'm running into is that Facebook base64URL encodes their data. There is no built-in base64URL decode for Ruby. For the difference between a base64 encoded and base64URL encoded, see wikipedia. How do I decode this using Ruby/Rails? Edit: Because some people have difficulty reading - base64 URL is DIFFERENT than base64

    Read the article

  • Ruby custom class to and from YAML;

    - by Sanarothe
    Hi. I'm having trouble deserializing a ruby class that I wrote to YAML. Where I want to be I want to be able to pass one object around as a full 'question' which includes the question text, some possible answers (For multi. choice) and the correct answer. One module (The encoder) takes input, builds a 'question' class out of it and appends it to the question pool. Another module reads a question pool and builds an array of 'question' objects. Where I am currently Sample Question Pool --- | --- !ruby/object:MultiQ a: "no" answer: "no" b: "no" c: "no" d: "no" text: "yes?" Encoder dump to YAML file. Object is a MultiQ filled up with input. (See below.) def dump(file, object) File.open(file, 'a') do |out| YAML.dump(object.to_yaml, out) end object = nil end MultiQ Class definition class MultiQ attr_accessor :text, :answer, :a, :b, :c, :d def initialize(text, answer, a, b, c, d) @text = text @answer = answer @a = a @b = b @c = c @d = d end end The decoder (I've been trying different things, so what's here wasn't my first or best guess. But I'm at a loss and the documentation doesn't really explain things thoroughly enough.) File.open( "test_set.yaml" ) do |yf| YAML.load_documents( yf ) { |item| new = YAML.object_maker( MultiQ, item) puts new } end Questions you can answer How do I achieve my goal? What methods should I use, between parsing, loading files or documents, to successfully deserialize a Ruby class? I've already looked over the YAML Rdoc, and I didn't absorb very much, so please don't just link me to it. What other methods would you suggest using? Is there a better way to store questions like this? Should I be using document db, relational db, xml? Some other format?

    Read the article

  • c# Remove special chars from a File

    - by jmpena
    Hello i have a problem, im trying to open a textfile and remove all the special chars ñ Ñ ' á í etc... the file its a Layout that the clients send to me and i parse it to send the file to an AS400 server but i have to remove all special chars. THE PROBLES IS: some files with some special chars when i open it in c# it read the special chars and Two different chars and move the entire line one space to the right and then the information that has to be in that position wont be OK. i take the same file and open it in Notepad and the file is OK but when i open it in WordPad it looks like 2 chars (for just 1 especial char) Example: in the file i have: "0001 0003JUAN PEÑA33441JPENATEST" But in c# it shows "0001 0003JUAN PEï¦A33441JPENATEST" im using the encondig 1251 any help?

    Read the article

  • how to properly display utf encoded characters on my utf-8 encoded page?

    - by Ali
    Hi guys I'm retrieving emails and some of my emails have utf encoded text. However even though my page is encoded as utf 8 - in some places when I try to out put utf text I get funny characters like : =?utf-8?B?Rlc6INqp24zYpyDYotm+INin2LMg2YXYs9qp2LHYp9uB2bkg2qnbjCDZhtmC?= =?utf-8?B?2YQg2qnYsdiz2qnYqtuSINuB24zaug==?= Whereas in other areas of the same page it displays fine. WHats going on?

    Read the article

  • How do you get the glyph for a character encoded as '&#333;' from a utf-8 encoded database field usi

    - by AE
    I have a MySQL database table with a collation of 'utf8_general_ci' and the value in the field is: x & #299; bán yá wén (without the spaces). When this is converted (for example by StackOverflow's editor) it looks like this: xī bán yá wén where the second character looks like a lower case i with a bar over the top. In PHP, what function converts the & #299 ; entity into the ī character? I've tried using html_entity_decode($str,ENT_COMPAT,'UTF-8'), however I get characters like the following: yÄ«n wén or zhÅ•ng wén I'm pretty sure there's something I don't understand about the decoding, which is why I'm using the wrong function. Can anyone shed some light on how to get the single character glyph that's represented by the entity & #299 and similar high-number characters above 255? Many thanks, AE

    Read the article

< Previous Page | 117 118 119 120 121 122 123 124 125 126 127 128  | Next Page >