chinese characters - Page 18

"sed" regex help: Replacing characters

- by powerbar

I want to change characters in a XML file by using sed. The input looks like this:  <root> <tree foo="abcd" bar="abccdcd" /> <dontTouch foo="asd" bar="abc" /> </root> Now I want to change all c to X in the bar tag of the tree element.  <root> <tree foo="abcd" bar="abXXdXd" /> <dontTouch foo="asd" bar="abc" /> </root> How is the correct sed command? Please consider, there can be more than one occurence of c (next to each other or not) in one tag... I tried this myself, but it won't change multiple c, and it does append a X :( sed -i 's/$<tree.*bar=\".*$c$.*\"\/>$/\1X\2/g' Input.xml

Read the article

Hebrew (utf8) characters in windows cmd console

- by epeleg

I previously asked this Q: utf8 hebrew on mysql console on debian (via putty on windows) And managed to get it working by starting mysql with --default-character-set=utf8 and setting putty to show utf8 as well. Now I need to do the same but on a windows server. The data is again the same but when I start mysql with --default-character-set=utf8 it I see multuple characters where I am supposed to see hebrew. I think the problem is with the set up of windows cmd console that it does not properly display utf8. any ideas ?

Read the article

Meta key in Terminal.app vs national characters

- by yacoob

I'm using Terminal.app, and I'd like to use emacs running inside - either locally, or after sshing to remote server. Problem is, I can't get working Meta modifier. Namely, if I enable 'Use option as meta key', Option key works like proper Meta, but I lose ability to enter Polish diacritics (aelósznzc), that are entered with right Option. If I disable 'Use option as meta key', my Meta is gone, but I can again use Polish characters. In this state they appear only with right Option modifier, so I guess it's Terminal.app's fault that it doesn't make a difference between left and right Option key, when the relevant preference is selected. What are my options then? Is there a good solution for my problem? I can always use ESC as a poor man's Meta replacement, but I don't like that idea.

Read the article

Using sed to convert hex characters in postgresql dump file

- by Bernt

I am working on moving several databases from a Postgresql 8.3 server to a Postgresql 8.4 server. It has worked fine so far, but one base has given me some trouble. The database is listed as unicode-encoded in the 8.3-server, but somehow a client program has managed to inject some invalid unicode data into it. When I do a normal dump and restore using postgres' custom format, the new server won't accept it, complaining about unicode errors. My plan is to do a plain text dump of the database, then use sed to replace the invalid characters with nothing (they are not needed). But how do you make sed work on hex/binary values in a file?

Read the article

VMWare and ALT GR key results in missing characters

- by donat

For some odd reason WMware products hijack the AltGy-key despite I make sure that other keys are used as hot keys to release mouse and keyboard from the virtual machine. While this is not a problem for US keyboards, european however who extensively use AltGR for characters such as pipe (|), at-sign (@), left brace ({) and right brace (}). This seem to happen both in Windows and Linux and I can not seem to find a solution that works for both. :( Anyone have an idea how to fix this without the need to modify the guest OS every time? Thank you.

Read the article

[Linux] Bind/map Character to alt+[some key]?

- by Paul

OS: Ubuntu In programming and various terminal programs (Screen, Vim) the [, ], { and } tends to be used a lot. I'm using a Norwegian keyboard where these are placed such that I have to stretch my fingers a bit too long for whats comfortable. To make it easier I though I'd try to make alt+[some key] be one of these characters. Is there a way that I can bind, say alt+æ (Norwegian letter) to '{' system wide? Btw, is such thing called binding, mapping or something else? I'm getting a bit confused by the terms... :)

Read the article

When I type certain characters, they come out as backquotes

- by JXG

Very strange behavior. Some background: I bought a new lenovo G550 laptop, running Windows 7. I live in Israel. When I type certain keys, in any application, they are prefaced with the backquote (`). These characters are: Insert, Delete, Left Ctrl (the right-hand one is fine), - (the regular dash: the one on the keypad is fine), =, 5 (the regular one), 4 (the one on the numeric keypad, whether or not Num Lock is on), and PgDn (the regular one). When I press the Fn key with these I don't get the behavior. Does anyone know why this is happening, or how I can fix it?

Read the article

Incremental search for un/accented characters

- by user38983

Does emacs have an incremental search mode, where searching for a character will search for itself and for any other versions of the character with accent marks, similar to how Google Chrome (at least v27) will do when searching in a page? Alternatively, is there an additional library or piece of elisp code that can put incremental search in such a mode? For example, incremental search for: 'manana', would find 'manana' or 'mañana' 'motley crue', would also find 'Mötley Crüe' (with case-sensitivity off). Even a solution that only covers a subset of these characters would be helpful.

Read the article

php returns junk characters at end of everything

- by blindJesse

php appears to be adding junk characters to the end of everything it returns on a friend's site. I'm not an admin on the server but I'd like to give an informed complaint to get this fixed. The site is http://daytoncodebreakers.org. You can see some junk at the end of every page on the site (what appear to be question marks with something else in the middle). I originally thought this was a wordpress issue, but check out http://daytoncodebreakers.org/whereisini.php (which is just a call to phpinfo), and http://daytoncodebreakers.org/hello.php (which is just 'Hello World'). I'm not sure if this is the most appropriate site, but I think this is a server config issue, so I'm posting it here (rather than stackoverflow or superuser). Feel free to move it if want.

Read the article

Excel, Lookup special characters and spaces.

- by Sisyphus

I have an excel, spreadsheet that has multiple sheets. The first sheet is an index of files, I am using the following forumla to look up a value in column A, references against the index sheet, if it matches then it copies the value from column B from the index sheet. The forumla is: =IF($A3="", "", (LOOKUP($A3, INDEX!$A$3:$A$26, INEDEX!B$3:B$26))) It works for data that has no spaces and special characters, anybody have any ideas why it doesn't work and how I can make it work? Thanks in advance.

Read the article

How to support 3-byte UTF-8 Characters in ANT

- by efelton

I am trying to support UTF-8 characters in my ANT script. As long as the character string are made up of 2-byte UTF-8 characters, such as: Lògìñ Ùsèr ÌÐ Then things work fine. When I use Unicode Han Character: ? Which, according to this site: http://www.fileformat.info/info/unicode/char/6211/index.htm Has a UTF-8 encoding of 0xE6 0x88 0x91 I can see in UltraEdit, my input properties file has the values "E6 88 91" all in a row, so I'm fairly confident that my input is correct. And When I open the same file in Notepad++ I can see all the characters correctly. Here is my Build Script: <?xml version="1.0" encoding="UTF-8" ?> <project name="utf8test" default="all" basedir="."> <target name="all"> <loadproperties encoding="UTF-8" srcfile="./apps.properties.all.txt" /> <echo>No encoding ${common.app.name}</echo> <echo encoding="UTF-8">UTF-8 ${common.app.name}</echo> <echo encoding="UnicodeLittle">UnicodeLittle ${common.app.name}</echo> <echo encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked ${common.app.name}</echo> <echo>${common.app.ServerName}</echo> <echo>${bb.vendor}</echo> <echo>No encoding ${common.app.UserIdText}</echo> <echo encoding="UTF-8">UTF-8 ${common.app.UserIdText}</echo> <echo encoding="UnicodeLittle">UnicodeLittle ${common.app.UserIdText}</echo> <echo encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked ${common.app.UserIdText}</echo> <echoproperties /> </target> </project> And here is my properties file: common.app=VrvPsLTst common.app.name=?? common.app.description=Pseudo Loc Test App for Build Script testing common.app.ServerName=http://Vèrìvò.com bb.vendor=Vèrìvò common.app.PasswordText=Pàsswòrð bb.override.list=MP_COPYRIGHTTEXT, "Çòpÿrìght 2012 Vèrívó Bùîlð TéàM" common.app.LoginButtonText=Lògìñ common.app.UserIdText=Ùsèr ÌÐ bb.SMSSuccess=Mèssàgéß Sùççêssfúllÿ Sëñt common.app.LoginScreenMessage=WèlçòMé Mêssàgë common.app.LoginProgressMessage=Àùthèñtìçàtíòñ îñ prógréss... ios.RegistrationText=Règìstràtíòñ Téxt ios.RegistrationURL=http://www.josscrowcroft.com/2011/code/utf-8-multibyte-characters-in-url-parameters-%E2%9C%93/ Here is what the output looks like: Buildfile: C:\Temp\utf8\build.xml all: [echo] No encoding ?? [echo] UTF-8 ?? [echo] ÿþU n i c o d e L i t t l e ? ? [echo] U n i c o d e L i t t l e U n m a r k e d ? ? [echo] http://Vèrìvò.com [echo] Vèrìvò [echo] No encoding Ùsèr ÌÐ [echo] UTF-8 Ã™sÃ¨r ÃŒÃ? [echo] ÿþU n i c o d e L i t t l e Ù s è r Ì Ð [echo] U n i c o d e L i t t l e U n m a r k e d Ù s è r Ì Ð [echoproperties] #Ant properties [echoproperties] #Mon Jun 18 15:25:13 EDT 2012 [echoproperties] ant.core.lib=C\:\\ant\\lib\\ant.jar [echoproperties] ant.file=C\:\\Temp\\utf8\\build.xml [echoproperties] ant.file.type=file [echoproperties] ant.file.type.utf8test=file [echoproperties] ant.file.utf8test=C\:\\Temp\\utf8\\build.xml [echoproperties] ant.home=c\:\\ant\\bin\\.. [echoproperties] ant.java.version=1.6 [echoproperties] ant.library.dir=C\:\\ant\\lib [echoproperties] ant.project.default-target=all [echoproperties] ant.project.invoked-targets=all [echoproperties] ant.project.name=utf8test [echoproperties] ant.version=Apache Ant version 1.8.1 compiled on April 30 2010 [echoproperties] awt.toolkit=sun.awt.windows.WToolkit [echoproperties] basedir=C\:\\Temp\\utf8 [echoproperties] bb.SMSSuccess=M\u00E8ss\u00E0g\u00E9\u00DF S\u00F9\u00E7\u00E7\u00EAssf\u00FAll\u00FF S\u00EB\u00F1t [echoproperties] bb.override.list=MP_COPYRIGHTTEXT, "\u00C7\u00F2p\u00FFr\u00ECght 2012 V\u00E8r\u00EDv\u00F3 B\u00F9\u00EEl\u00F0 T\u00E9\u00E0?" [echoproperties] bb.vendor=V\u00E8r\u00ECv\u00F2 [echoproperties] common.app=VrvPsLTst [echoproperties] common.app.LoginButtonText=L\u00F2g\u00EC\u00F1 [echoproperties] common.app.LoginProgressMessage=\u00C0\u00F9th\u00E8\u00F1t\u00EC\u00E7\u00E0t\u00ED\u00F2\u00F1 \u00EE\u00F1 pr\u00F3gr\u00E9ss... [echoproperties] common.app.LoginScreenMessage=W\u00E8l\u00E7\u00F2?\u00E9 M\u00EAss\u00E0g\u00EB [echoproperties] common.app.PasswordText=P\u00E0ssw\u00F2r\u00F0 [echoproperties] common.app.ServerName=http\://V\u00E8r\u00ECv\u00F2.com [echoproperties] common.app.UserIdText=\u00D9s\u00E8r \u00CC\u00D0 [echoproperties] common.app.description=Pseudo Loc Test App for Build Script testing [echoproperties] common.app.name=?? [echoproperties] file.encoding=Cp1252 [echoproperties] file.encoding.pkg=sun.io [echoproperties] file.separator=\\ [echoproperties] ios.RegistrationText=R\u00E8g\u00ECstr\u00E0t\u00ED\u00F2\u00F1 T\u00E9xt [echoproperties] ios.RegistrationURL=http\://www.josscrowcroft.com/2011/code/utf-8-multibyte-characters-in-url-parameters-%E2%9C%93/ [echoproperties] java.awt.graphicsenv=sun.awt.Win32GraphicsEnvironment [echoproperties] java.awt.printerjob=sun.awt.windows.WPrinterJob [echoproperties] java.class.path=c\:\\ant\\bin\\..\\lib\\ant-launcher.jar;C\:\\Temp\\utf8\\.\\;C\:\\Program Files (x86)\\Java\\jre7\\lib\\ext\\QTJava.zip;C\:\\ant\\lib\\ant-antlr.jar;C\:\\ant\\lib\\ant-apache-bcel.jar;C\:\\ant\\lib\\ant-apache-bsf.jar;C\:\\ant\\lib\\ant-apache-log4j.jar;C\:\\ant\\lib\\ant-apache-oro.jar;C\:\\ant\\lib\\ant-apache-regexp.jar;C\:\\ant\\lib\\ant-apache-resolver.jar;C\:\\ant\\lib\\ant-apache-xalan2.jar;C\:\\ant\\lib\\ant-commons-logging.jar;C\:\\ant\\lib\\ant-commons-net.jar;C\:\\ant\\lib\\ant-contrib-1.0b3.jar;C\:\\ant\\lib\\ant-jai.jar;C\:\\ant\\lib\\ant-javamail.jar;C\:\\ant\\lib\\ant-jdepend.jar;C\:\\ant\\lib\\ant-jmf.jar;C\:\\ant\\lib\\ant-jsch.jar;C\:\\ant\\lib\\ant-junit.jar;C\:\\ant\\lib\\ant-launcher.jar;C\:\\ant\\lib\\ant-netrexx.jar;C\:\\ant\\lib\\ant-nodeps.jar;C\:\\ant\\lib\\ant-starteam.jar;C\:\\ant\\lib\\ant-stylebook.jar;C\:\\ant\\lib\\ant-swing.jar;C\:\\ant\\lib\\ant-testutil.jar;C\:\\ant\\lib\\ant-trax.jar;C\:\\ant\\lib\\ant-weblogic.jar;C\:\\ant\\lib\\ant.jar;C\:\\ant\\lib\\bb-ant-tools.jar;C\:\\ant\\lib\\xercesImpl.jar;C\:\\ant\\lib\\xml-apis.jar;C\:\\Program Files\\Java\\jre7\\lib\\tools.jar [echoproperties] java.class.version=51.0 [echoproperties] java.endorsed.dirs=C\:\\Program Files\\Java\\jre7\\lib\\endorsed [echoproperties] java.ext.dirs=C\:\\Program Files\\Java\\jre7\\lib\\ext;C\:\\Windows\\Sun\\Java\\lib\\ext [echoproperties] java.home=C\:\\Program Files\\Java\\jre7 [echoproperties] java.io.tmpdir=C\:\\Users\\efelton\\AppData\\Local\\Temp\\ [echoproperties] java.library.path=C\:\\Windows\\SYSTEM32;C\:\\Windows\\Sun\\Java\\bin;C\:\\Windows\\system32;C\:\\Windows;C\:\\Windows\\SYSTEM32;C\:\\Windows;C\:\\Windows\\SYSTEM32\\WBEM;C\:\\Windows\\SYSTEM32\\WINDOWSPOWERSHELL\\V1.0\\;C\:\\PROGRAM FILES\\INTEL\\WIFI\\BIN\\;C\:\\PROGRAM FILES\\COMMON FILES\\INTEL\\WIRELESSCOMMON\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\;C\:\\PROGRAM FILES\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\;C\:\\PROGRAM FILES\\MICROSOFT SQL SERVER\\100\\DTS\\BINN\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\VSSHELL\\COMMON7\\IDE\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\DTS\\BINN\\;C\:\\Program Files\\ThinkPad\\Bluetooth Software\\;C\:\\Program Files\\ThinkPad\\Bluetooth Software\\syswow64;C\:\\Program Files (x86)\\QuickTime\\QTSystem\\;C\:\\Program Files (x86)\\AccuRev\\bin;C\:\\Program Files\\Java\\jdk1.7.0_04\\bin;C\:\\Program Files (x86)\\IDM Computer Solutions\\UltraEdit\\;. [echoproperties] java.runtime.name=Java(TM) SE Runtime Environment [echoproperties] java.runtime.version=1.7.0_04-b22 [echoproperties] java.specification.name=Java Platform API Specification [echoproperties] java.specification.vendor=Oracle Corporation [echoproperties] java.specification.version=1.7 [echoproperties] java.vendor=Oracle Corporation [echoproperties] java.vendor.url=http\://java.oracle.com/ [echoproperties] java.vendor.url.bug=http\://bugreport.sun.com/bugreport/ [echoproperties] java.version=1.7.0_04 [echoproperties] java.vm.info=mixed mode [echoproperties] java.vm.name=Java HotSpot(TM) 64-Bit Server VM [echoproperties] java.vm.specification.name=Java Virtual Machine Specification [echoproperties] java.vm.specification.vendor=Oracle Corporation [echoproperties] java.vm.specification.version=1.7 [echoproperties] java.vm.vendor=Oracle Corporation [echoproperties] java.vm.version=23.0-b21 [echoproperties] line.separator=\r\n [echoproperties] os.arch=amd64 [echoproperties] os.name=Windows 7 [echoproperties] os.version=6.1 [echoproperties] path.separator=; [echoproperties] sun.arch.data.model=64 [echoproperties] sun.boot.class.path=C\:\\Program Files\\Java\\jre7\\lib\\resources.jar;C\:\\Program Files\\Java\\jre7\\lib\\rt.jar;C\:\\Program Files\\Java\\jre7\\lib\\sunrsasign.jar;C\:\\Program Files\\Java\\jre7\\lib\\jsse.jar;C\:\\Program Files\\Java\\jre7\\lib\\jce.jar;C\:\\Program Files\\Java\\jre7\\lib\\charsets.jar;C\:\\Program Files\\Java\\jre7\\lib\\jfr.jar;C\:\\Program Files\\Java\\jre7\\classes [echoproperties] sun.boot.library.path=C\:\\Program Files\\Java\\jre7\\bin [echoproperties] sun.cpu.endian=little [echoproperties] sun.cpu.isalist=amd64 [echoproperties] sun.desktop=windows [echoproperties] sun.io.unicode.encoding=UnicodeLittle [echoproperties] sun.java.command=org.apache.tools.ant.launch.Launcher -cp .;C\:\\Program Files (x86)\\Java\\jre7\\lib\\ext\\QTJava.zip [echoproperties] sun.java.launcher=SUN_STANDARD [echoproperties] sun.jnu.encoding=Cp1252 [echoproperties] sun.management.compiler=HotSpot 64-Bit Tiered Compilers [echoproperties] sun.os.patch.level=Service Pack 1 [echoproperties] user.country=US [echoproperties] user.dir=C\:\\Temp\\utf8 [echoproperties] user.home=C\:\\Users\\efelton [echoproperties] user.language=en [echoproperties] user.name=efelton [echoproperties] user.script= [echoproperties] user.timezone= [echoproperties] user.variant= BUILD SUCCESSFUL Total time: 1 second Thank you for your help EDIT\UPDATE 6/19/2012 I am developing in a Windows environment. I have installed a TTF from: http://freedesktop.org/wiki/Software/CJKUnifonts/Download I have updated UltraEdit to use the TTF and I can see the Chinese characters. <?xml version="1.0" encoding="UTF-8" ?> <project name="utf8test" default="all" basedir="."> <target name="all"> <echo>??</echo> <echo encoding="ISO-8859-1">ISO-8859-1 ??</echo> <echo encoding="UTF-8">UTF-8 ??</echo> <echo file="echo_output.txt" append="true" >?? ${line.separator}</echo> <echo file="echo_output.txt" append="true" encoding="ISO-8859-1">ISO-8859-1 ?? ${line.separator}</echo> <echo file="echo_output.txt" append="true" encoding="UTF-8">UTF-8 ?? ${line.separator}</echo> <echo file="echo_output.txt" append="true" encoding="UnicodeLittle">UnicodeLittle ?? ${line.separator}</echo> <echo file="echo_output.txt" append="true" encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked ?? ${line.separator}</echo> </target> </project> The output captured by running inside UltraEdit is: Buildfile: E:\temp\utf8\build.xml all: [echo] ?? [echo] ISO-8859-1 ?? [echo] UTF-8 ?? BUILD SUCCESSFUL Total time: 1 second And the echo_output.txt file shows up like this: ?? ISO-8859-1 ?? UTF-8 ?? ÿþU n i c o d e L i t t l e ? ? U n i c o d e L i t t l e U n m a r k e d ? ? So there appears to be somehting fundamentally wrong with how my ANT environment is set up since I cannot simply echo the character to the screen or to a file.

Read the article

Java how can I add an accented "e" to a string?

- by behrk2

Hello, With the help of tucuxi from the existing post Java remove HTML from String without regular expressions I have built a method that will parse out any basic HTML tags from a string. Sometimes, however, the original string contains html hexadecimal characters like é (which is an accented e). I have started to add functionality which will translate these escaped characters into real characters. You're probably asking: Why not use regular expressions? Or a third party library? Unfortunately I cannot, as I am developing on a BlackBerry platform which does not support regular expressions and I have never been able to successfully add a third party library to my project. So, I have gotten to the point where any é is replaced with "e". My question now is, how do I add an actual 'accented e' to a string? Here is my code: public static String removeHTML(String synopsis) { char[] cs = synopsis.toCharArray(); String sb = new String(); boolean tag = false; for (int i = 0; i < cs.length; i++) { switch (cs[i]) { case '<': if (!tag) { tag = true; break; } case '>': if (tag) { tag = false; break; } case '&': char[] copyTo = new char[7]; System.arraycopy(cs, i, copyTo, 0, 7); String result = new String(copyTo); if (result.equals("&#x00E9")) { sb += "e"; } i += 7; break; default: if (!tag) sb += cs[i]; } } return sb.toString(); } Thanks!

Read the article

oracle pl/sql bug: can't put_line more than 2000 characters

- by FrustratedWithFormsDesigner

Has anyone else noticed this phenomenon where dbms_output.put_line is unable to print more than 2000 characters at a time? Script is: set serveroutput on size 100000; declare big_str varchar2(2009); begin for i in 1..2009 loop big_str := big_str||'x'; end loop; dbms_output.put_line(length(big_str)); dbms_output.put_line(big_str); end; / I copied and pasted the output into an editor (Notepad++) which told me there were only 2000 characters, not 2009 which is what I think should have been pasted. This also happens with a few of my test scripts - only 2000 characters get printed. I have a workaround to print like this: dbms_output.put_line(length(big_str)); dbms_output.put_line(substr(big_str,1,1999)); dbms_output.put_line(substr(big_str,2000)); This adds new lines to the output, makes it hard to read when the text you're working with is preformatted. Has anyone else noticed this? Is it really a bug or some sort of obscure feature? Is there a better workaround? Is there any other information on this out there? Oracle version is: 10.2.0.3.0, using PL/SQL Developer (from Allround Automation).

Read the article

Creating files with french characters and encoding.

- by Kevin

HI, I am creating a file like so. FileStream temp = File.Create( this.FileName ); Then putting data in the file like so. this.Writer = new StreamWriter( this.Stream ); this.Writer.WriteLine( strMessage ); That code is encapsulated in a class hierarchy but that is the meat and potatoes of it. My problem is this. MSDN says that the default encoding for creating a file this way is UTF8. And when I write a french character such as é Textpad interprets the file as a UTF 8 file, but notepad++ says it's "ANSI as UTF8" or maybe it's an ansi file but is reading it as UTF8. When I create a file the same way without the french character both textpad and notepad++ read the file as an ansi file even though according to msdn it should be a utf 8 file still. Which program should be trusted. Notepad++ or textpad - Notepad++ seems to be more consistant, but is still the oppossite to what MSDN says it should be. My problem is that we create files that get sent off to another company and depending on whether there are french characters the encoding seems to keep changing. Or is there a better way to determine the encoding of a file. I've read about byte order marks and preambles but as far as I understand neither are guaranteed to be there. We initially thought that all the files we were building were ansi. Also please note that both ansi and utf8 should handle the french characters appropriately as the characters are part of both character sets.

Read the article

Replacing XML reserved characters in SQL Server 2005

- by Barn

I'm working on a system that takes relational data from a sql server DB and uses SSIS to produce an XML extract using sql server 2005's 'FOR XML PATH' command and a schema. The problem lies with replacing the XML reserved characters. 'FOR XML PATH' is only replacing <, , and &, not ' and ", so I need a way of replacing these myself. I've tried pre-processing the fields in the database to replace XML reserved characters with their entitised equivalents (e.g. & becomes &), but once these fields are used to construct XML using FOR XML the leading & is replaced with &, so I end up with &amp; where I should have &. What I've tried so far is altering the element's contents after the XML has been constructed using XQuery inside SQL server like so: DECLARE @data VARCHAR(MAX) SET @data = CONVERT(VARCHAR(MAX), [my xml column].query(' data(/root/node_i_want)') SELECT @data = [function to replace quotes etc](@data) SET [my xml column].modify('replace value of (/root/node_i_want)[1] with sql:variable("@data")') but I get the same problem. Essentially, is there something wrong I'm doing with the above, or a way to tell FOR XML to entitise other characters, or something like that? Basically anything short of having to write a program to change the XML after it has been assembled in large batches and saved to files!

Read the article

Problem with XML encoding of database contents with Latin characters

- by user89691

I have an ASP Access database that contains strings in various European languages. The database was populated prior by agents in the respective countries. It contains entries with accented etc characters as you would expect. If I open the database with MS Access these characters show up fine. For example the the German equivalent of "Open" shows as "Öffnen" (hopefully you can see an "O" with 2 dots above it!). I have ASP code that reads the database and returns records in XML. The text is passed to XMLEncode to construct the XML, but that only seems to deal with the 5 specials like "<", "&", etc. If I dump the XML the accented characters are unchanged. <English>Open</English> <German>Öffnen</German> If I look at the raw packets with Wireshark I see that the "Ö" byte is hex D6, which appears to be it's decimal Unicode and ISO 8859-1 value. The problem starts when I try to parse the XML in client-side JS. I get: "An invalid character was found in text content" from IE. FF and Chrome happily accept the XML without hiccup but the browser shows the "Ö" character as a diamond with a question mark inside. http://www.validome.org/xml/validate/ reports "encoding error." http://www.w3schools.com/dom/dom_validate.asp thinks it is fine. The XML is UTF-8 encoded. What do I need to do to have IE accept my XML without complaint? What do I need to do to have browsers display the stuff correctly?

Read the article

Lucene and Special Characters

- by Brandon

I am using Lucene.Net 2.0 to index some fields from a database table. One of the fields is a 'Name' field which allows special characters. When I perform a search, it does not find my document that contains a term with special characters. I index my field as such: Directory DALDirectory = FSDirectory.GetDirectory(@"C:\Indexes\Name", false); Analyzer analyzer = new StandardAnalyzer(); IndexWriter indexWriter = new IndexWriter(DALDirectory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED); Document doc = new Document(); doc.Add(new Field("Name", "Test (Test)", Field.Store.YES, Field.Index.TOKENIZED)); indexWriter.AddDocument(doc); indexWriter.Optimize(); indexWriter.Close(); And I search doing the following: value = value.Trim().ToLower(); value = QueryParser.Escape(value); Query searchQuery = new TermQuery(new Term(field, value)); Searcher searcher = new IndexSearcher(DALDirectory); TopDocCollector collector = new TopDocCollector(searcher.MaxDoc()); searcher.Search(searchQuery, collector); ScoreDoc[] hits = collector.TopDocs().scoreDocs; If I perform a search for field as 'Name' and value as 'Test', it finds the document. If I perform the same search as 'Name' and value as 'Test (Test)', then it does not find the document. Even more strange, if I remove the QueryParser.Escape line do a search for a GUID (which, of course, contains hyphens) it finds documents where the GUID value matches, but performing the same search with the value as 'Test (Test)' still yields no results. I am unsure what I am doing wrong. I am using the QueryParser.Escape method to escape the special characters and am storing the field and searching by the Lucene.Net's examples. Any thoughts?

Read the article

Replacing accented/umlauted characters with their unadorned counterparts in C# [closed]

- by Andrew Rollings

Duplicate of 249087 I have a bunch of user generated addresses that may contain characters with diacritic marks. What is the most effective (i.e. generic) way (apart from a straightforward replace) to automatically convert any such characters to their closest English equivalent? E.g. any of àâãäå would become a æ would become the two separate letters ae ç would become c any of èéêë would become e etc. for all possible letter variations (preferably without having to find and encode lookups for each diacritic form of the letter). (Note: I have to pass these addresses on to third party software that is incapable of printing anything other than English characters. I'd rather the software was capable of handling them, but I have no control over that.) EDIT: Never mind... Found the answer [here][2]. It showed up in the "Related" section to the right of the question after I posted, but not in my prior search or as a pre-post suggestion. Hmm. I added the 'diacritics' tag to the other question in any case. EDIT 2: Jeez! Who voted this -1 after I closed it?

Read the article

How to parse XML with special characters?

- by Snooze

Whenever I try to parse XML with special characters such as o or ???? I get an error. The xml documents claims to use UTF-8 encoding but that does not seem to be the case. Here is what the troublesome text looks like when I view the XML in Firefox: Bleach: The Diamond Dust Rebellion - MÅ? Hitotsu no HyÅ?rinmaru; Bleach - The DiamondDust Rebellion - Mou Hitotsu no Hyourinmaru On the actual website, Å? is actually the character o. <br /> One day, Doraemon and his friends meet Professor Mangetsu (æº?æ??å??ç??, Professor Mangetsu?), who studies magic and magical beings such as goblins, and his daughter Miyoko (ç¾?å¤?å?, Miyoko?), and are warned of the dangerous approximation of the "star of the Underworld" to the Earth's orbit.<br /> <br /> And once again, on the actual website, those characters appear as ???? and ???. The actual XML file is formatted properly other than those special characters, which certainly do not appear to be using the UTF-8 encoding. Is there a way to get NSXML to parse these XML files?

Read the article

Corrupt UTF-8 Characters with PHP 5.2.10 and MySQL 5.0.81

- by jkndrkn

We have an application hosted on both a local development server and a live site. We are experiencing UTF-8 corruption issues and are looking to figure out how to resolve them. The system is run using symfony 1.0 with Propel. On our development server, we are running PHP 5.2.0 and MySQL 5.0.32. We do not experience corrupted UTF-8 characters there. On our live site, PHP 5.2.10 and MySQL 5.0.81 is running. On that server, certain characters such as ô´ and S are corrupted once they are stored in the database. The corrupted characters are showing up as either question marks or approximations of the original character with adjacent question marks. Examples of corruption: Uncorrupted: ô´ Corrupted: ô? Uncorrupted: S Corrupted: ? We are currently using the following techniques on both development and live servers: Executing the following queries prior to execution of any other queries: SET NAMES 'utf8' COLLATE 'utf8_unicode_ci' SET CHARSET 'utf8' Setting the <meta> Content-Type value to: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> Adding the following to our .htaccess file: AddDefaultCharset utf-8 Using mb_* (multibyte) PHP functions where necessary. Being sure to set database columns to use utf8_unicode_ci collation. These techniques are sufficient for our development site, but do not work on the live site. On the live site I've also tried adding mysql_set_encoding('ut8', $mysql_connection) but this does not help either. I have found some evidence that newer versions of PHP and MySQL are mishandling UTF-8 character encodings.

Read the article

Word wrap in multiline textbox after 35 characters

- by Kanavi

<asp:TextBox CssClass="txt" ID="TextBox1" runat="server" onkeyup="CountChars(this);" Rows="20" Columns="35" TextMode="MultiLine" Wrap="true"> </asp:TextBox> I need to implement word-wrapping in a multi-line textbox. I cannot allow users to write more then 35 chars a line. I am using the following code, which breaks at precisely the specified character on every line, cutting words in half. Can we fix this so that if there's not enough space left for a word on the current line, we move the whole word to the next line? function CountChars(ID) { var IntermediateText = ''; var FinalText = ''; var SubText = ''; var text = document.getElementById(ID.id).value; var lines = text.split("\n"); for (var i = 0; i < lines.length; i++) { IntermediateText = lines[i]; if (IntermediateText.length <= 50) { if (lines.length - 1 == i) FinalText += IntermediateText; else FinalText += IntermediateText + "\n"; } else { while (IntermediateText.length > 50) { SubText = IntermediateText.substring(0, 50); FinalText += SubText + "\n"; IntermediateText = IntermediateText.replace(SubText, ''); } if (IntermediateText != '') { if (lines.length - 1 == i) FinalText += IntermediateText; else FinalText += IntermediateText + "\n"; } } } document.getElementById(ID.id).value = FinalText; $('#' + ID.id).scrollTop($('#' + ID.id)[0].scrollHeight); } Edit - 1 I have to show total max 35 characters in line without specific word break and need to keep margin of two characters from the right. Again, the restriction should be for 35 characters but need space for total 37 (Just for the Visibility issue.)

Read the article

How to convert string with double high/wide characters to normal string [VC++6]

- by Shaitan00

My application typically recieves a string in the following format: " Item $5.69 " Some contants I always expect: - the LENGHT always 20 characters - the start index of the text always [5] - and most importantly the index of the DECIMAL for the price always [14] In order to identify this string correctly I validate all the expected contants listed above .... Some of my clients have now started sending the string with Doube-High / Double-Wide values (pair of characters which represent a single readable character) similar to the following: " Item $x80x90.x81x91x82x92 " For testing I simply scan the string character-by-character, compare char[i] and char[i+1] and replace these pairs with their corresponding single character when a match is found (works fine) as follows: [Code] for (int i=0; i < sData.length(); i++) { char ch = sData[i] & 0xFF; char ch2 = sData[i+1] & 0xFF; if (ch == '\x80' && ch2 == '\x90') zData.replace("\x80\x90", "0"); else if (ch == '\x81' && ch2 == '\x91') zData.replace("\x81\x91", "1"); else if (ch == '\x82' && ch2 == '\x92') zData.replace("\x82\x92", "2"); ... ... ... } [/Code] But the result is something like this: " Item $5.69 " Notice how this no longer matches my expectation: the lenght is now 17 (instead of 20) due to the 3 conversions and the decimal is now at index 13 (instead of 14) due to the conversion of the "5" before the decimal point. Ideally I would like to convert the string to a normal readable format keeping the constants (length, index of text, index of decimal) at the same place (so the rest of my application is re-usable) ... or any other suggestion (I'm pretty much stuck with this)... Is there a STANDARD way of dealing with these type of characters? Any help would be greatly appreciated, I've been stuck on this for a while now ... Thanks,

Read the article

Number of characters recommended for a statement

- by liaK

Hi, I have been using Qt 4.5 and so do C++. I have been told that it's a standard practice to maintain the length of each statement in the application to 80 characters. Even in Qt creator we can make a right border visible so that we can know whether we are crossing the 80 characters limit. But my question is, Is it really a standard being followed? Because in my application, I use indenting and all, so it's quite common that I cross the boundary. Other cases include, there might be a error statement which will be a bit explanatory one and which is in an inner block of code, so it too will cross the boundary. Usually my variable names look bit lengthier so as to make the names meaningful. When I call the functions of the variable names, again I will cross. Function names will not be in fewer characters either. I agree a horizontal scroll bar shows up and it's quite annoying to move back and forth. So, for function calls including multiple arguments, when the boundary is reached I will make the forth coming arguments in the new line. But besides that, for a single statement (for e.g a very long error message which is in double quotes " " or like longfun1()->longfun2()->...) if I use an \ and split into multiple lines, the readability becomes very poor. So is it a good practice to have those statement length restrictions? If this restriction in statement has to be followed? I don't think it depends on a specific language anyway. I added C++ and Qt tags since if it might. Any pointers regarding this are welcome.

Read the article

nginx inserting extra characters in Multi-status reply body

- by user125011

Here's the setup. I've got one server running apache/php hosting ownCloud. Among other things, I'm using to do CardDAV contact syncing. In order to make things work with my domain I have an nginx server running on the frontend as a reverse-proxy to the ownCloud server. My nginx config is as follows: server { listen 80; server_name cloud.mydomain.com; location / { proxy_set_header X-Forwarded-Host cloud.mydomain.com; proxy_set_header X-Forwarded-Proto http; proxy_set_header X-Forwarded-For $remote_addr; client_max_body_size 0; proxy_redirect off; proxy_pass http://server; } } The problem is that when my phone does a PROPFIND on the server, nginx adds extra characters to the content body that throw the phone off. Specifically, it prepends d611\r\n at the front of the body and appends 0\r\n\r\n to the end of the content. (I got this from wireshark.) It also re-chunks the result. How do I get nginx to send the original content as-is?

Read the article

ash scripting: space-containing variable refuses to be grepped

- by Luci Sandor

I am trying to run the script listed at http://talk.maemo.org/showthread.php?t=70866&page=2 on its intended hardware, a Nokia Linux phone running BusyBox ash. The script receives the name of WiFi network as a parameter, and tries to connect the phone to it. I suspect the script works, but my SSID, BU (802.1x), has space and parentheses in it. So when I type at the command prompt autoconnect.sh BU\ $802.1x$ I get various errors. First, LIST=`iwconfig wlan0 | awk -F":" '/ESSID/{print $2}'` if [ $LIST = "\"$1\"" ]; then ...fails, even I am connected to the network. The error is not avoided by using single or double quotes instead of escaping characters at the command prompt. Second, if [ -z `iwlist wlan0 scan | grep -m 1 -o \"$1\"` ]; then echo SSID \"$1\" not found; shows that grep does not find the string, although the same grep, typed directly into the command prompt, does find 'BU (802.1x)'. How do I quote $1 in the two circumstances above so that it will work with my network SSID, containing spaces and parentheses? Thank you.

Search Results

Search found 5604 results on 225 pages for 'chinese characters'.

Page 18/225 | < Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25 | Next Page >

- by powerbar

- by epeleg

- by yacoob

- by Bernt

- by donat

- by Paul

- by JXG

- by user38983

- by blindJesse

- by Sisyphus

- by efelton

- by behrk2

- by FrustratedWithFormsDesigner

- by Kevin

- by Barn

- by user89691

- by Brandon

- by Andrew Rollings

- by Snooze

- by jkndrkn

- by Kanavi

- by Shaitan00

- by liaK

- by user125011

- by Luci Sandor

< Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25 | Next Page >