forbidden characters - Page 19

Lucene and Special Characters

- by Brandon

I am using Lucene.Net 2.0 to index some fields from a database table. One of the fields is a 'Name' field which allows special characters. When I perform a search, it does not find my document that contains a term with special characters. I index my field as such: Directory DALDirectory = FSDirectory.GetDirectory(@"C:\Indexes\Name", false); Analyzer analyzer = new StandardAnalyzer(); IndexWriter indexWriter = new IndexWriter(DALDirectory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED); Document doc = new Document(); doc.Add(new Field("Name", "Test (Test)", Field.Store.YES, Field.Index.TOKENIZED)); indexWriter.AddDocument(doc); indexWriter.Optimize(); indexWriter.Close(); And I search doing the following: value = value.Trim().ToLower(); value = QueryParser.Escape(value); Query searchQuery = new TermQuery(new Term(field, value)); Searcher searcher = new IndexSearcher(DALDirectory); TopDocCollector collector = new TopDocCollector(searcher.MaxDoc()); searcher.Search(searchQuery, collector); ScoreDoc[] hits = collector.TopDocs().scoreDocs; If I perform a search for field as 'Name' and value as 'Test', it finds the document. If I perform the same search as 'Name' and value as 'Test (Test)', then it does not find the document. Even more strange, if I remove the QueryParser.Escape line do a search for a GUID (which, of course, contains hyphens) it finds documents where the GUID value matches, but performing the same search with the value as 'Test (Test)' still yields no results. I am unsure what I am doing wrong. I am using the QueryParser.Escape method to escape the special characters and am storing the field and searching by the Lucene.Net's examples. Any thoughts?

Read the article

Replacing accented/umlauted characters with their unadorned counterparts in C# [closed]

- by Andrew Rollings

Duplicate of 249087 I have a bunch of user generated addresses that may contain characters with diacritic marks. What is the most effective (i.e. generic) way (apart from a straightforward replace) to automatically convert any such characters to their closest English equivalent? E.g. any of àâãäå would become a æ would become the two separate letters ae ç would become c any of èéêë would become e etc. for all possible letter variations (preferably without having to find and encode lookups for each diacritic form of the letter). (Note: I have to pass these addresses on to third party software that is incapable of printing anything other than English characters. I'd rather the software was capable of handling them, but I have no control over that.) EDIT: Never mind... Found the answer [here][2]. It showed up in the "Related" section to the right of the question after I posted, but not in my prior search or as a pre-post suggestion. Hmm. I added the 'diacritics' tag to the other question in any case. EDIT 2: Jeez! Who voted this -1 after I closed it?

Read the article

How to parse XML with special characters?

- by Snooze

Whenever I try to parse XML with special characters such as o or ???? I get an error. The xml documents claims to use UTF-8 encoding but that does not seem to be the case. Here is what the troublesome text looks like when I view the XML in Firefox: Bleach: The Diamond Dust Rebellion - MÅ? Hitotsu no HyÅ?rinmaru; Bleach - The DiamondDust Rebellion - Mou Hitotsu no Hyourinmaru On the actual website, Å? is actually the character o. <br /> One day, Doraemon and his friends meet Professor Mangetsu (æº?æ??å??ç??, Professor Mangetsu?), who studies magic and magical beings such as goblins, and his daughter Miyoko (ç¾?å¤?å?, Miyoko?), and are warned of the dangerous approximation of the "star of the Underworld" to the Earth's orbit.<br /> <br /> And once again, on the actual website, those characters appear as ???? and ???. The actual XML file is formatted properly other than those special characters, which certainly do not appear to be using the UTF-8 encoding. Is there a way to get NSXML to parse these XML files?

Read the article

Word wrap in multiline textbox after 35 characters

- by Kanavi

<asp:TextBox CssClass="txt" ID="TextBox1" runat="server" onkeyup="CountChars(this);" Rows="20" Columns="35" TextMode="MultiLine" Wrap="true"> </asp:TextBox> I need to implement word-wrapping in a multi-line textbox. I cannot allow users to write more then 35 chars a line. I am using the following code, which breaks at precisely the specified character on every line, cutting words in half. Can we fix this so that if there's not enough space left for a word on the current line, we move the whole word to the next line? function CountChars(ID) { var IntermediateText = ''; var FinalText = ''; var SubText = ''; var text = document.getElementById(ID.id).value; var lines = text.split("\n"); for (var i = 0; i < lines.length; i++) { IntermediateText = lines[i]; if (IntermediateText.length <= 50) { if (lines.length - 1 == i) FinalText += IntermediateText; else FinalText += IntermediateText + "\n"; } else { while (IntermediateText.length > 50) { SubText = IntermediateText.substring(0, 50); FinalText += SubText + "\n"; IntermediateText = IntermediateText.replace(SubText, ''); } if (IntermediateText != '') { if (lines.length - 1 == i) FinalText += IntermediateText; else FinalText += IntermediateText + "\n"; } } } document.getElementById(ID.id).value = FinalText; $('#' + ID.id).scrollTop($('#' + ID.id)[0].scrollHeight); } Edit - 1 I have to show total max 35 characters in line without specific word break and need to keep margin of two characters from the right. Again, the restriction should be for 35 characters but need space for total 37 (Just for the Visibility issue.)

Read the article

Corrupt UTF-8 Characters with PHP 5.2.10 and MySQL 5.0.81

- by jkndrkn

We have an application hosted on both a local development server and a live site. We are experiencing UTF-8 corruption issues and are looking to figure out how to resolve them. The system is run using symfony 1.0 with Propel. On our development server, we are running PHP 5.2.0 and MySQL 5.0.32. We do not experience corrupted UTF-8 characters there. On our live site, PHP 5.2.10 and MySQL 5.0.81 is running. On that server, certain characters such as ô´ and S are corrupted once they are stored in the database. The corrupted characters are showing up as either question marks or approximations of the original character with adjacent question marks. Examples of corruption: Uncorrupted: ô´ Corrupted: ô? Uncorrupted: S Corrupted: ? We are currently using the following techniques on both development and live servers: Executing the following queries prior to execution of any other queries: SET NAMES 'utf8' COLLATE 'utf8_unicode_ci' SET CHARSET 'utf8' Setting the <meta> Content-Type value to: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> Adding the following to our .htaccess file: AddDefaultCharset utf-8 Using mb_* (multibyte) PHP functions where necessary. Being sure to set database columns to use utf8_unicode_ci collation. These techniques are sufficient for our development site, but do not work on the live site. On the live site I've also tried adding mysql_set_encoding('ut8', $mysql_connection) but this does not help either. I have found some evidence that newer versions of PHP and MySQL are mishandling UTF-8 character encodings.

Read the article

How to convert string with double high/wide characters to normal string [VC++6]

- by Shaitan00

My application typically recieves a string in the following format: " Item $5.69 " Some contants I always expect: - the LENGHT always 20 characters - the start index of the text always [5] - and most importantly the index of the DECIMAL for the price always [14] In order to identify this string correctly I validate all the expected contants listed above .... Some of my clients have now started sending the string with Doube-High / Double-Wide values (pair of characters which represent a single readable character) similar to the following: " Item $x80x90.x81x91x82x92 " For testing I simply scan the string character-by-character, compare char[i] and char[i+1] and replace these pairs with their corresponding single character when a match is found (works fine) as follows: [Code] for (int i=0; i < sData.length(); i++) { char ch = sData[i] & 0xFF; char ch2 = sData[i+1] & 0xFF; if (ch == '\x80' && ch2 == '\x90') zData.replace("\x80\x90", "0"); else if (ch == '\x81' && ch2 == '\x91') zData.replace("\x81\x91", "1"); else if (ch == '\x82' && ch2 == '\x92') zData.replace("\x82\x92", "2"); ... ... ... } [/Code] But the result is something like this: " Item $5.69 " Notice how this no longer matches my expectation: the lenght is now 17 (instead of 20) due to the 3 conversions and the decimal is now at index 13 (instead of 14) due to the conversion of the "5" before the decimal point. Ideally I would like to convert the string to a normal readable format keeping the constants (length, index of text, index of decimal) at the same place (so the rest of my application is re-usable) ... or any other suggestion (I'm pretty much stuck with this)... Is there a STANDARD way of dealing with these type of characters? Any help would be greatly appreciated, I've been stuck on this for a while now ... Thanks,

Read the article

Number of characters recommended for a statement

- by liaK

Hi, I have been using Qt 4.5 and so do C++. I have been told that it's a standard practice to maintain the length of each statement in the application to 80 characters. Even in Qt creator we can make a right border visible so that we can know whether we are crossing the 80 characters limit. But my question is, Is it really a standard being followed? Because in my application, I use indenting and all, so it's quite common that I cross the boundary. Other cases include, there might be a error statement which will be a bit explanatory one and which is in an inner block of code, so it too will cross the boundary. Usually my variable names look bit lengthier so as to make the names meaningful. When I call the functions of the variable names, again I will cross. Function names will not be in fewer characters either. I agree a horizontal scroll bar shows up and it's quite annoying to move back and forth. So, for function calls including multiple arguments, when the boundary is reached I will make the forth coming arguments in the new line. But besides that, for a single statement (for e.g a very long error message which is in double quotes " " or like longfun1()->longfun2()->...) if I use an \ and split into multiple lines, the readability becomes very poor. So is it a good practice to have those statement length restrictions? If this restriction in statement has to be followed? I don't think it depends on a specific language anyway. I added C++ and Qt tags since if it might. Any pointers regarding this are welcome.

Read the article

nginx inserting extra characters in Multi-status reply body

- by user125011

Here's the setup. I've got one server running apache/php hosting ownCloud. Among other things, I'm using to do CardDAV contact syncing. In order to make things work with my domain I have an nginx server running on the frontend as a reverse-proxy to the ownCloud server. My nginx config is as follows: server { listen 80; server_name cloud.mydomain.com; location / { proxy_set_header X-Forwarded-Host cloud.mydomain.com; proxy_set_header X-Forwarded-Proto http; proxy_set_header X-Forwarded-For $remote_addr; client_max_body_size 0; proxy_redirect off; proxy_pass http://server; } } The problem is that when my phone does a PROPFIND on the server, nginx adds extra characters to the content body that throw the phone off. Specifically, it prepends d611\r\n at the front of the body and appends 0\r\n\r\n to the end of the content. (I got this from wireshark.) It also re-chunks the result. How do I get nginx to send the original content as-is?

Read the article

ash scripting: space-containing variable refuses to be grepped

- by Luci Sandor

I am trying to run the script listed at http://talk.maemo.org/showthread.php?t=70866&page=2 on its intended hardware, a Nokia Linux phone running BusyBox ash. The script receives the name of WiFi network as a parameter, and tries to connect the phone to it. I suspect the script works, but my SSID, BU (802.1x), has space and parentheses in it. So when I type at the command prompt autoconnect.sh BU\ $802.1x$ I get various errors. First, LIST=`iwconfig wlan0 | awk -F":" '/ESSID/{print $2}'` if [ $LIST = "\"$1\"" ]; then ...fails, even I am connected to the network. The error is not avoided by using single or double quotes instead of escaping characters at the command prompt. Second, if [ -z `iwlist wlan0 scan | grep -m 1 -o \"$1\"` ]; then echo SSID \"$1\" not found; shows that grep does not find the string, although the same grep, typed directly into the command prompt, does find 'BU (802.1x)'. How do I quote $1 in the two circumstances above so that it will work with my network SSID, containing spaces and parentheses? Thank you.

Read the article

pdftotext not outputting hebrew characters

- by Ofri Raviv

I'm using Xpdf's pdftotext to get the text out of some hebrew pdf files on Ubuntu. On my local machine this worked fine. I then tried to do it on another machine and the hebrew characters don't show up in the text file. I verified that I have the language package (see below why I think so). Where else can I look for the problem? >> tail -2 /etc/xpdf/xpdfrc include /etc/xpdf/includes >> cat /etc/xpdf/includes # This file was automatically generated by /usr/sbin/update-xpdfrc. # Instead, add or remove files in /etc/xpdf/ then run # /usr/sbin/update-xpdfrc to regenerate this file. include /etc/xpdf/xpdfrc-latin2 include /etc/xpdf/xpdfrc-thai include /etc/xpdf/xpdfrc-greek include /etc/xpdf/xpdfrc-turkish include /etc/xpdf/xpdfrc-arabic include /etc/xpdf/xpdfrc-hebrew include /etc/xpdf/xpdfrc-cyrillic >> cat /etc/xpdf/xpdfrc-hebrew #----- begin Hebrew support package (2003-feb-16) unicodeMap ISO-8859-8 /usr/share/xpdf/hebrew/ISO-8859-8.unicodeMap unicodeMap Windows-1255 /usr/share/xpdf/hebrew/Windows-1255.unicodeMap #----- end Hebrew support package >> ls /usr/share/xpdf/hebrew/ ISO-8859-8.unicodeMap Windows-1255.unicodeMap

Read the article

Trouble with backslash characters and rsyslog writing to postgres

- by Flimzy

I have rsyslog 4.6.4 configured to write mail logs to a PostgreSQL database. It all works fine, until the log message contains a backslash, as in this example: Jun 12 11:37:46 dc5 postfix/smtp[26475]: Vk0nYDKdH3sI: to=<[email protected], relay=----.---[---.---.---.---]:25, delay=1.5, delays=0.77/0.07/0.3/0.35, dsn=4.3.0, status=deferred (host ----.---[199.85.216.241] said: 451 4.3.0 Error writing to file d:\pmta\spool\B\00000414, status = ERROR_DISK_FULL in "DATA" (in reply to end of DATA command)) The above is the log entry, as written to /var/log/mail.log. It is correct. The trouble is that the backslash characters in the file name are interpreted as escapes when sent to the following SQL recipe: $template dcdb, "SELECT rsyslog_insert(('%timereported:::date-rfc3339%'::TIMESTAMPTZ)::TIMESTAMP,'%msg:::escape-cc%'::TEXT,'%syslogtag%'::VARCHAR)",STDSQL :syslogtag, startswith, "postfix" :ompgsql:/var/run/postgresql,dc,root,;dcdb As a result, the rsyslog_insert() stored procedure gets the following value for as msg: Vk0nYDKdH3sI: to=<[email protected], relay=----.---[---.---.---.---]:25, delay=1.5, delays=0.77/0.07/0.3/0.35, dsn=4.3.0, status=deferred (host ----.---[199.85.216.241] said: 451 4.3.0 Error writing to file d:pmtaspoolB The \p, \s, \B and \0 in the file name are interpreted by PostgreSQL as literal p, s, and B followed by a NULL character, thus early-terminating the string. This behavior can be easiily confirmed with: dc=# SELECT 'd:\pmta\spool\B\00000414'; ?column? -------------- d:pmtaspoolB (1 row) dc=# Is there a way to correct this problem? Is there a way I'm not finding in the rsyslog docs to turn \ into \\?

Read the article

How to correctly set GNU Screen to display currently running program in hardstatus

- by johnny_bgoode

In bash, to display the name of the current program in the GNU Screen hardstatus line takes only two configuration lines. First, tell screen what the end of your prompt normally looks like, and supply a default title for a window when you are sitting at in the shell: shelltitle "$ |bash" Next, place this escape sequence in the PS1 variable, before the characters that normally terminate the prompt '$ ' in this case: \033k\033\\ This technique works, to a point. The hardstatus window title is updated to the name of the currently running program, and then switches back to the default title shortly after execution is finished. One major problem, however, is that this escape string is not escaped itself, causing line-wrapping problems with commands longer than the initial line. This was annoying, so I set out looking for a solution. Turns out, simply escaping the previous escape sequence corrects line wrapping: \[\033k\]\[\033\\\] Great! My hardstatus window title still updates to the name of the currently running program, and now my longer commands wrap to the second line correctly. However, with this new escape sequence in my PS1, screen updates the window title to the actual command I am typing, not simply the name of the current program once it is executed. I am wondering, has anyone gotten this working correctly - i.e. line wrapping and proper updating of the hardstatus window title?

Read the article

Batch Script to Trim lines in text to first 30 or 50 characters only

- by SuperUserMan

I am now new to scripts but i find it really difficult understanding "for" command (especially with that tokens and delimiters etc) . Saying so, i think that for command can be used to do what i am doing. If its not and there is an easier way, ignore my ignorance :( Say i have multiple lines in a text file abc.txt with each line starting and ending with " (quotes) E.g. a file of 3 lines "hey what is going on @mike220. I am working on your car. Its engine is in very bad condition" "Because if you knew, you'd get shredded and do it with certainty" "@honey220 Do you know someone who has busted their ass on a diet only for results to come to a screeching halt after a few weeks" How can i trim each line, within the quotes, to a Fixed length say 30 or 50 or 100 characters (including spaces) I want to enter the number of character in batch and it can trim accordingly and produce a file def.txt with trimmed lines within quotes. Say i enter 50, results of above example should be "hey what is going on @mike220. I am working on you" "Because if you knew, you'd get shredded and do it" "@honey220 Do you know someone who has busted their" Thanks P.S. if you use For command, kindly please explain the command. EDIT: Though the answer provided worked, there is an issue with non english text. I am getting garbled text in Output file for non english text in input file . Any help @barlop here is the nonenglish text ( 1 line) "???? ?? ???? ?? ???? ???? ??? ?????? ???"

Read the article

Permanent fix for unicode characters not displaying correctly (as boxes)

- by Chase

Please read this entire message before replying. First I know how to fix the issue on a temporary basis. I am looking for a permanent fix. I work with foreign language files a lot. Unfortunately sometimes all the unicode characters in windows explorer, notepad, and other places (as rendered by windows, probably GDI) do not display correctly. That is they display as square blocks, where as they had just been displaying correctly. There are countless methods to temporarily correct the issue. But again, I want a way to permanently resolve the issue. What I have tried: The silly "Hide fonts based on language settings". This setting only applies to what fonts you see in the fonts folder and font dropdowns. It doesn't disable foreign fonts (doesn't work, or if it does, it is temporary). Deleting the font cache file and rebooting (works.. usually, temporary solution). Changing my locale and then back (sometimes works, temporary solution). Rebooting my PC and getting lucky (50-50 chance, temporary solution). Changing my keyboard input/adding foreign keyboard (temporary solution that only seems to work once). Reinstalling windows (temporary solution, sometimes lasts a few months though, I have done this 7 times across 3 computers) What I have not tried: Buying Windows Ultimate and installing the interface packs. This is not a solution. I can't read Japanese/Chinese and I do not want my interface in those languages. What I will not do: Switch to a different brand operating system (unix, linux, mac os x) Switch to an older version of windows (Windows Vista, XP, 2000, etc). So can anyone recommend a permanent fix for the problem?

Read the article

PuTTY inserts random characters during a session

- by Zachary Polikarpus

I recently started renting space on a remote server so that I could work on a project. I found that a relatively painless way to access it on a windows machine is through PuTTY. However, there is one thing that has always irked me when using it: for seemingly no reason random characters are sometimes inserted at the cursor. Most of the time it is just a single tilde, but rarely it spits out what looks like some escape sequence ([[^8 or the like). It will only occur when I am focused on the window, whether I am typing or 20 feet away from the keyboard. If left for long enough, it will spit tildes at random intervals (average is about 1 minute). Finally, this behavior seems to be inconsistant when running programs such as nano or the mysql interface: in nano, instead of inserting tildes, it will set marks (ctrl-^); in mysql, lines will become un-editable. My question is this: Has anyone else experienced this sort of behavior in PuTTY? And if so, what can be done to prevent/correct this behavior?

Read the article

Python - pyparsing unicode characters

- by mgj

Hi..:) I tried using w = Word(printables), but it isn't working. How should I give the spec for this. 'w' is meant to process Hindi characters (UTF-8) The code specifies the grammar and parses accordingly. 671.assess :: ????? ::2 x=number + "." + src + "::" + w + "::" + number + "." + number If there is only english characters it is working so the code is correct for the ascii format but the code is not working for the unicode format. I mean that the code works when we have something of the form 671.assess :: ahsaas ::2 i.e. it parses words in the english format, but I am not sure how to parse and then print characters in the unicode format. I need this for English Hindi word alignment for purpose. The python code looks like this: # -*- coding: utf-8 -*- from pyparsing import Literal, Word, Optional, nums, alphas, ZeroOrMore, printables , Group , alphas8bit , # grammar src = Word(printables) trans = Word(printables) number = Word(nums) x=number + "." + src + "::" + trans + "::" + number + "." + number #parsing for eng-dict efiledata = open('b1aop_or_not_word.txt').read() eresults = x.parseString(efiledata) edict1 = {} edict2 = {} counter=0 xx=list() for result in eresults: trans=""#translation string ew=""#english word xx=result[0] ew=xx[2] trans=xx[4] edict1 = { ew:trans } edict2.update(edict1) print len(edict2) #no of entries in the english dictionary print "edict2 has been created" print "english dictionary" , edict2 #parsing for hin-dict hfiledata = open('b1aop_or_not_word.txt').read() hresults = x.scanString(hfiledata) hdict1 = {} hdict2 = {} counter=0 for result in hresults: trans=""#translation string hw=""#hin word xx=result[0] hw=xx[2] trans=xx[4] #print trans hdict1 = { trans:hw } hdict2.update(hdict1) print len(hdict2) #no of entries in the hindi dictionary print"hdict2 has been created" print "hindi dictionary" , hdict2 ''' ####################################################################################################################### def translate(d, ow, hinlist): if ow in d.keys():#ow=old word d=dict print ow , "exists in the dictionary keys" transes = d[ow] transes = transes.split() print "possible transes for" , ow , " = ", transes for word in transes: if word in hinlist: print "trans for" , ow , " = ", word return word return None else: print ow , "absent" return None f = open('bidir','w') #lines = ["'\ #5# 10 # and better performance in business in turn benefits consumers . # 0 0 0 0 0 0 0 0 0 0 \ #5# 11 # vHyaapaar mEmn bEhtr kaam upbhOkHtaaomn kE lIe laabhpHrdd hOtaa hAI . # 0 0 0 0 0 0 0 0 0 0 0 \ #'"] data=open('bi_full_2','rb').read() lines = data.split('!@#$%') loc=0 for line in lines: eng, hin = [subline.split(' # ') for subline in line.strip('\n').split('\n')] for transdict, source, dest in [(edict2, eng, hin), (hdict2, hin, eng)]: sourcethings = source[2].split() for word in source[1].split(): tl = dest[1].split() otherword = translate(transdict, word, tl) loc = source[1].split().index(word) if otherword is not None: otherword = otherword.strip() print word, ' <-> ', otherword, 'meaning=good' if otherword in dest[1].split(): print word, ' <-> ', otherword, 'trans=good' sourcethings[loc] = str( dest[1].split().index(otherword) + 1) source[2] = ' '.join(sourcethings) eng = ' # '.join(eng) hin = ' # '.join(hin) f.write(eng+'\n'+hin+'\n\n\n') f.close() ''' if an example input sentence for the source file is: 1# 5 # modern markets : confident consumers # 0 0 0 0 0 1# 6 # AddhUnIk baajaar : AshHvsHt upbhOkHtaa . # 0 0 0 0 0 0 !@#$% the ouptut would look like this :- 1# 5 # modern markets : confident consumers # 1 2 3 4 5 1# 6 # AddhUnIk baajaar : AshHvsHt upbhOkHtaa . # 1 2 3 4 5 0 !@#$% Output Explanation:- This achieves bidirectional alignment. It means the first word of english 'modern' maps to the first word of hindi 'AddhUnIk' and vice versa. Here even characters are take as words as they also are an integral part of bidirectional mapping. Thus if you observe the hindi WORD '.' has a null alignment and it maps to nothing with respect to the English sentence as it doesn't have a full stop. The 3rd line int the output basically represents a delimiter when we are working for a number of sentences for which your trying to achieve bidirectional mapping. What modification should i make for it to work if the I have the hindi sentences in Unicode(UTF-8) format.

Read the article

Convert extended ASCII characters to it's right presentation using Console.ReadKey() method and ConsoleKeyInfo variable

- by mishamosher

Readed about 30 minutes, and didn't found some specific for this in this site. Suppose the following, in C#, console application: ConsoleKeyInfo cki; cki = Console.ReadKey(true); Console.WriteLine(cki.KeyChar.ToString()); //Or Console.WriteLine(cki.KeyChar) as well Console.ReadKey(true); Now, let's put ¿ in the console entry, and asign it to cki via a Console.ReadKey(true). What will be shown isn't the ¿ symbol, the ¨ symbol is the one that's shown instead. And the same happens with many other characters. Examples: ñ shows ¤, ¡ shows -, ´ shows ï. Now, let's take the same code snipplet and add some things for a more Console.ReadLine() like behavior: string data = string.Empty; ConsoleKeyInfo cki; for (int i = 0; i < 10; i++) { cki = Console.ReadKey(true); data += cki.KeyChar; } Console.WriteLine(data); Console.ReadKey(true); The question, how to handle this by the right way, end printing the right characters that should be stored on data, not things like ¨, ¤, -, ï, etc? Please note that I want a solution that works with ConsoleKeyInfo and Console.ReadKey(), not use other variable types, or read methods. EDIT: Because ReadKey() method, that comes from Console namespace, depends on Kernel32.dll and it definetively bad handles the extended ASCII and unicode, it's not an option anymore to just find a valid conversion for what it returns. The only valid way to handle the bad behavior of ReadKey() is to use the cki.Key property that's written in cki = Console.ReadKey(true) execution and apply a switch to it, then, return the right values on dependence of what key was pressed. For example, to handle the Ñ key pressing: string data = string.Empty; ConsoleKeyInfo cki; cki = Console.ReadKey(true); switch (cki.Key) { case ConsoleKey.Oem3: if (cki.Modifiers.ToString().Contains("Shift")) //Could added handlers for Alt and Control, but not putted in here to keep the code small and simple data += "Ñ"; else data += "ñ"; break; } Console.WriteLine(data); Console.ReadKey(true); So, now the question has a wider focus... Which others functions completes it's execution with only one key pressed, and returns what's pressed (a substitute of ReadKey())? I think that there's not such substitutes, but a confirmed answer would be usefull. EDIT2: HA! Found the way, for something I used for so many times Windows 98 SE. There are the codepages, the ones responsibles for how's presented the info in the console. ReadLine() reconfigures the codepage to use properly the extended ASCII and Unicode characters. ReadKey() leaves it in EN-US default (codepage 850). Just use a codepage that prints the characters you want, and that's all. Refer to http://en.wikipedia.org/wiki/Code_page for some of them :) So, for the Ñ key press, the solution is this: Console.OutputEncoding = Encoding.GetEncoding(1252); //Also 28591 is valid for `Ñ` key, and others too string data = string.Empty; ConsoleKeyInfo cki; cki = Console.ReadKey(true); data += cki.KeyChar; Console.WriteLine(data); Console.ReadKey(true); Simple :) Now I'm wrrr with myself... how could I forget those codepages!? Question answered, so, no more about this!

Read the article

PHP: Writing non-english characters to XML - encoding problem

- by Dean

Hello, I wrote a small PHP script to edit the site news XML file. I used DOM to manipulate the XML (Loading, writing, editing). It works fine when writing English characters, but when non-English characters are written, PHP throws an error when trying to load the file. If I manually type non-English characters into the file - it's loaded perfectly fine, but if PHP writes the non-English characters the encoding goes wrong, although I specified the utf-8 encoding. Any help is appreciated. Errors: Warning: DOMDocument::load() [domdocument.load]: Entity 'times' not defined in filepath Warning: DOMDocument::load() [domdocument.load]: Input is not proper UTF-8, indicate encoding ! Bytes: 0x91 0x26 0x74 0x69 in filepath Here are the functions responsible for loading and saving the file (self-explanatory): function get_tags_from_xml(){ // Load news entries from XML file for display $errors = Array(); if(!$xml_file = load_news_file()){ // Load file // String indicates error presence $errors = "file not found"; return $errors; } $taglist = $xml_file->getElementsByTagName("text"); return $taglist; } function set_news_lang(){ // Sets the news language global $news_lang; if($_POST["news-lang"]){ $news_lang = htmlentities($_POST["news-lang"]); } elseif($_GET["news-lang"]){ $news_lang = htmlentities($_GET["news-lang"]); } else{ $news_lang = "he"; } } function load_news_file(){ // Load XML news file for proccessing, depending on language global $news_lang; $doc = new DOMDocument('1.0','utf-8'); // Create new XML document $doc->load("news_{$news_lang}.xml"); // Load news file by language $doc->formatOutput = true; // Nicely format the file return $doc; } function save_news_file($doc){ // Save XML news file, depending on language global $news_lang; $doc->saveXML($doc->documentElement); $doc->save("news_{$news_lang}.xml"); } Here is the code for writing to XML (add news): <?php ob_start()?> <?php include("include/xml_functions.php")?> <?php include("../include/functions.php")?> <?php get_lang();?> <?php //TODO: ADD USER AUTHENTICATION! if(isset($_POST["news"]) && isset($_POST["news-lang"])){ set_news_lang(); $news = htmlentities($_POST["news"]); $xml_doc = load_news_file(); $news_list = $xml_doc->getElementsByTagName("text"); // Get all existing news from file $doc_root_element = $xml_doc->getElementsByTagName("news")->item(0); // Get the root element of the new XML document $new_news_entry = $xml_doc->createElement("text",$news); // Create the submited news entry $doc_root_element->appendChild($new_news_entry); // Append submited news entry $xml_doc->appendChild($doc_root_element); save_news_file($xml_doc); header("Location: /cpanel/index.php?lang={$lang}&news-lang={$news_lang}"); } else{ header("Location: /cpanel/index.php?lang={$lang}&news-lang={$news_lang}"); } ?> <?php ob_end_flush()?>

Read the article

ZPL II Extended Characters

- by Mauro

I'm trying to print extended code page 850 characters using ZPL II to a Zebra S4M. Whenever one of the extended characters I.E. ASCII value 127 is used I get a box of varying shades of grey instead of the actual value. I'm trying to print ± and ° (ALT+0177 and ALT+0176). I suspect its the RawPrinterHelper I am trying to use (as downloaded from MS, and another from CodeProject) however I cant see where the character codes are going wrong. Weirdly, printing direct from Notepad renders the correct characters, which leads me to believe it is a problem with the raw printer helper class. I am not tied to using the Raw Printer Helper class so if there is a better way of doing it, I am more than happy to see them. SAMPLE ZPLII Without escaped chars ^XA ^FO30,200^AD^FH,18,10^FD35 ± 2 ° ^FS ^FS ^XZ With escaped chars (tried both upper and lower case) ^XA ^FO30,200^AD^FH,18,10^FD35 _b0 2 _b1 ^FS ^FS ^XZ Raw Printer Helper [StructLayout(LayoutKind.Sequential)] public struct DOCINFO { [MarshalAs(UnmanagedType.LPWStr)] public string printerDocumentName; [MarshalAs(UnmanagedType.LPWStr)] public string pOutputFile; [MarshalAs(UnmanagedType.LPWStr)] public string printerDocumentDataType; } public class RawPrinter { [ DllImport("winspool.drv", CharSet = CharSet.Unicode, ExactSpelling = false, CallingConvention = CallingConvention.StdCall)] public static extern long OpenPrinter(string pPrinterName, ref IntPtr phPrinter, int pDefault); [ DllImport("winspool.drv", CharSet = CharSet.Unicode, ExactSpelling = false, CallingConvention = CallingConvention.StdCall)] public static extern long StartDocPrinter(IntPtr hPrinter, int Level, ref DOCINFO pDocInfo); [ DllImport("winspool.drv", CharSet = CharSet.Unicode, ExactSpelling = true, CallingConvention = CallingConvention.StdCall)] public static extern long StartPagePrinter(IntPtr hPrinter); [ DllImport("winspool.drv", CharSet = CharSet.Ansi, ExactSpelling = true, CallingConvention = CallingConvention.StdCall)] public static extern long WritePrinter(IntPtr hPrinter, string data, int buf, ref int pcWritten); [ DllImport("winspool.drv", CharSet = CharSet.Unicode, ExactSpelling = true, CallingConvention = CallingConvention.StdCall)] public static extern long EndPagePrinter(IntPtr hPrinter); [ DllImport("winspool.drv", CharSet = CharSet.Unicode, ExactSpelling = true, CallingConvention = CallingConvention.StdCall)] public static extern long EndDocPrinter(IntPtr hPrinter); [ DllImport("winspool.drv", CharSet = CharSet.Unicode, ExactSpelling = true, CallingConvention = CallingConvention.StdCall)] public static extern long ClosePrinter(IntPtr hPrinter); public static void SendToPrinter(string printerJobName, string rawStringToSendToThePrinter, string printerNameAsDescribedByPrintManager) { IntPtr handleForTheOpenPrinter = new IntPtr(); DOCINFO documentInformation = new DOCINFO(); int printerBytesWritten = 0; documentInformation.printerDocumentName = printerJobName; documentInformation.printerDocumentDataType = "RAW"; OpenPrinter(printerNameAsDescribedByPrintManager, ref handleForTheOpenPrinter, 0); StartDocPrinter(handleForTheOpenPrinter, 1, ref documentInformation); StartPagePrinter(handleForTheOpenPrinter); WritePrinter(handleForTheOpenPrinter, rawStringToSendToThePrinter, rawStringToSendToThePrinter.Length, ref printerBytesWritten); EndPagePrinter(handleForTheOpenPrinter); EndDocPrinter(handleForTheOpenPrinter); ClosePrinter(handleForTheOpenPrinter); } }

Read the article

Use HttpGet with illegal characters in the URL

- by kaciula

I am trying to use DefaultHttpClient and HttpGet to make a request to a web service. Unfortunately the web service URL contains illegal characters such as { (ex: domain.com/service/{username}). It's obvious that the web service naming isn't well written but I can't change it. When I do HttpGet(url), I get that I have an illegal character in the url (that is { and }). If I encode the URL before that, there is no error but the request goes to a different URL where there is nothing. The url, although has illegal characters, works from the browser but the HttpGet implementation doesn't let me use it. What should I do or use instead to avoid this problem?

Read the article

C# UTF8 output keep encoded characters intact

- by Stefan Pohl

Hello, i have a very simple question I can't seem to get my head around. I have a properly encoded UTF8-String I parse into a JObject with Json.NET, fiddle around with some values and write it to the commandline, keeping the encoded characters intact. Everything works great except for the keeping the encoded characters intact part. Code: var json = "{roster: [[\"Tulg\u00f4r\", 990, 1055]]}"; var j = JObject.Parse(json); for (int i = 0; i < j["roster"].Count(); i++) { j["roster"][i][1] = ((int)j["roster"][i][1]) * 3; j["roster"][i][2] = ((int)j["roster"][i][2]) * 3; } Console.WriteLine(JsonConvert.SerializeObject(j, Formatting.None)); Actual Output: {"roster":[["Tulgôr",2970,3165]]} Desired Output: {"roster":[["Tulg\u00f4r",2970,3165]]} It seems like my phrasing in Google is inappropriate since nothing useful came up. I'm sure it's something uber-easy and i will feel pretty stupid afterwards. :)

Read the article

VB.NET Debug Error

- by Daniel

I get this error "Illegal characters in path" for this code: Dim strm As System.IO.FileStream strm = New System.IO.FileStream(filepath, IO.FileMode.Open, IO.FileAccess.Read)

Read the article

bash tips needed for understanding how to escape characters in command-line

- by Jesper Rønn-Jensen

My knowledge of commandline bash is missing on a particular area: I constantly forget how to properly escape characters. Today I wanted to echo this string into a file: #!/bin/env bash python -m SimpleHTTPServer echo "#!/bin/env bash\npython -m SimpleHTTPServer" server.sh && chmod +x server.sh -bash: !/bin/env: event not found That's right: Remember to escape ! or bash will think it's a special bash event command. But I can't get the escaping right! \! yields \! in the echoed string, and so does \\!. Furthermore, \n will not translate to a line break. Do you have some general tips that makes it easier for me to understand escaping rules? To be very precise, I'll accept an answer which tells me which characters I should escape on the bash command line? Including how to correctly output newline and exclamation mark in my example.

Read the article

Properly handling unicode characters in Rails

- by Gdeglin

By default Rails allows users of our application to input non-utf8 data, such as: ¶®«¼ However when we attempt to retrieve the data from our database and render it in a template Rails incorrectly assumes that it is in UTF-8 format and throws an error. ArgumentError: invalid byte sequence in UTF-8 What is the best way to handle this? I have seen one fix that suggested sanitizing the data in every place the user can input it. However, that would involve changing a considerable amount of code and it would strip out the characters entirely. Ideally we would want some characters converted to their UTF-8 equivalents. Our environment: Ruby: 1.9.1 Rails 2.3.5 MySql Gem: 2.8.1 This is a serious and urgent problem for us so your answers are very appreciated!

Read the article

Apple Push Notifications With Foreign Accent Characters Not Receiving

- by confeng

I'm sending push notifications and when the message contains foreign characters (Turkish in my case) like I, s, ç, g... The message does not arrive to devices. Here's my code: $message = 'THIS is push'; $passphrase = 'mypass'; $ctx = stream_context_create(); stream_context_set_option($ctx, 'ssl', 'local_cert', 'MyPemFile.pem'); stream_context_set_option($ctx, 'ssl', 'passphrase', $passphrase); // Open a connection to the APNS server $fp = stream_socket_client( 'ssl://gateway.push.apple.com:2195', $err, $errstr, 60, STREAM_CLIENT_CONNECT|STREAM_CLIENT_PERSISTENT, $ctx); if (!$fp) exit("Failed to connect: $err $errstr" . PHP_EOL); echo 'Connected to Apple service. ' . PHP_EOL; // Encode the payload as JSON $body['aps'] = array( 'alert' => $message, 'sound' => 'default' ); $payload = json_encode($body); $result = 'Start'.PHP_EOL; $tokenArray = array('mytoken'); foreach ($tokenArray as $item) { // Build the binary notification $msg = chr(0) . pack('n', 32) . pack('H*', $item) . pack('n', strlen($payload)) . $payload; // Send it to the server $result = fwrite($fp, $msg, strlen($msg)); if (!$result) echo 'Failed message'.PHP_EOL; else echo 'Successful message'.PHP_EOL; } // Close the connection to the server fclose($fp); I have tried encoding $message variable with utf8_encode() but the message received as "THÝS is push". And other ways like iconv() didn't work for me, some of them cropped Turkish characters, some didn't receive at all. I also have header('content-type: text/html; charset: utf-8'); and <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> in my page. I don't think the problem appears while I set the value but maybe with pack() function. Any ideas to solve this without replacing characters with English?

Search Results

Search found 5776 results on 232 pages for 'forbidden characters'.

Page 19/232 | < Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26 | Next Page >

- by Brandon

- by Andrew Rollings

- by Snooze

- by Kanavi

- by jkndrkn

- by Shaitan00

- by liaK

- by user125011

- by Luci Sandor

- by Ofri Raviv

- by Flimzy

- by johnny_bgoode

- by SuperUserMan

- by Chase

- by Zachary Polikarpus

- by mgj

- by mishamosher

- by Dean

- by Mauro

- by kaciula

- by Stefan Pohl

- by Daniel

- by Jesper Rønn-Jensen

- by Gdeglin

- by confeng

< Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26 | Next Page >