rxvt unicode - Page 25 - Developer IT

How do you properly use WideCharToMultiByte

- by Obediah Stane

I've read the documentation here: http://msdn.microsoft.com/en-us/library/ms776420(VS.85).aspx I'm stuck on this parameter: lpMultiByteStr [out] Pointer to a buffer that receives the converted string. I'm not quite sure how to properly initialize the variable and feed it into the function

Read the article

Using C# to detect whether a filename character is considered international

- by Morten Mertner

I've written a small console application (source below) to locate and optionally rename files containing international characters, as they are a source of constant pain with most source control systems (some background on this below). The code I'm using has a simple dictionary with characters to look for and replace (and nukes every other character that uses more than one byte of storage), but it feels very hackish. What's the right way to (a) find out whether a character is international? and (b) what the best ASCII substitution character would be? Let me provide some background information on why this is needed. It so happens that the danish Å character has two different encodings in UTF-8, both representing the same symbol. These are known as NFC and NFD encodings. Windows and Linux will create NFC encoding by default but respect whatever encoding it is given. Mac will convert all names (when saving to a HFS+ partition) to NFD and therefore returns a different byte stream for the name of a file created on Windows. This effectively breaks Subversion, Git and lots of other utilities that don't care to properly handle this scenario. I'm currently evaluating Mercurial, which turns out to be even worse at handling international characters.. being fairly tired of these problems, either source control or the international character would have to go, and so here we are. My current implementation: public class Checker { private Dictionary<char, string> internationals = new Dictionary<char, string>(); private List<char> keep = new List<char>(); private List<char> seen = new List<char>(); public Checker() { internationals.Add( 'æ', "ae" ); internationals.Add( 'ø', "oe" ); internationals.Add( 'å', "aa" ); internationals.Add( 'Æ', "Ae" ); internationals.Add( 'Ø', "Oe" ); internationals.Add( 'Å', "Aa" ); internationals.Add( 'ö', "o" ); internationals.Add( 'ü', "u" ); internationals.Add( 'ä', "a" ); internationals.Add( 'é', "e" ); internationals.Add( 'è', "e" ); internationals.Add( 'ê', "e" ); internationals.Add( '¦', "" ); internationals.Add( 'Ã', "" ); internationals.Add( '©', "" ); internationals.Add( ' ', "" ); internationals.Add( '§', "" ); internationals.Add( '¡', "" ); internationals.Add( '³', "" ); internationals.Add( '', "" ); internationals.Add( 'º', "" ); internationals.Add( '«', "-" ); internationals.Add( '»', "-" ); internationals.Add( '´', "'" ); internationals.Add( '`', "'" ); internationals.Add( '"', "'" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 147 } )[ 0 ], "-" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 148 } )[ 0 ], "-" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 153 } )[ 0 ], "'" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 166 } )[ 0 ], "." ); keep.Add( '-' ); keep.Add( '=' ); keep.Add( '\'' ); keep.Add( '.' ); } public bool IsInternationalCharacter( char c ) { var s = c.ToString(); byte[] bytes = Encoding.UTF8.GetBytes( s ); if( bytes.Length > 1 && ! internationals.ContainsKey( c ) && ! seen.Contains( c ) ) { Console.WriteLine( "X '{0}' ({1})", c, string.Join( ",", bytes ) ); seen.Add( c ); if( ! keep.Contains( c ) ) { internationals[ c ] = ""; } } return internationals.ContainsKey( c ); } public bool HasInternationalCharactersInName( string name, out string safeName ) { StringBuilder sb = new StringBuilder(); Array.ForEach( name.ToCharArray(), c => sb.Append( IsInternationalCharacter( c ) ? internationals[ c ] : c.ToString() ) ); int length = sb.Length; sb.Replace( " ", " " ); while( sb.Length != length ) { sb.Replace( " ", " " ); } safeName = sb.ToString().Trim(); string namePart = Path.GetFileNameWithoutExtension( safeName ); if( namePart.EndsWith( "." ) ) safeName = namePart.Substring( 0, namePart.Length - 1 ) + Path.GetExtension( safeName ); return name != safeName; } } And this would be invoked like this: FileInfo file = new File( "Århus.txt" ); string safeName; if( checker.HasInternationalCharactersInName( file.Name, out safeName ) ) { // rename file }

Read the article

How can I get Perl to detect the bad UTF-8 sequences?

- by gorilla

I'm running Perl 5.10.0 and Postgres 8.4.3, and strings into a database, which is behind a DBIx::Class. These strings should be in UTF-8, and therefore my database is running in UTF-8. Unfortunatly some of these strings are bad, containing malformed UTF-8, so when I run it I'm getting an exception DBI Exception: DBD::Pg::st execute failed: ERROR: invalid byte sequence for encoding "UTF8": 0xb5 I thought that I could simply ignore the invalid ones, and worry about the malformed UTF-8 later, so using this code, it should flag and ignore the bad titles. if(not utf8::valid($title)){ $title="Invalid UTF-8"; } $data->title($title); $data->update(); However Perl seems to think that the strings are valid, but it still throws the exceptions. How can I get Perl to detect the bad UTF-8?

Read the article

What character encoding should I use for a web page containing mostly Arabic text? Is utf-8 okay?

- by Paul D. Waite

What character encoding should I use for a web page containing mostly Arabic text? Is utf-8 okay?

Read the article

problem with uploading arabic files

- by sword101

I am using Spring upload to upload files. When uploading an Arabic file and getting the original file name in the controller, I get something like: المغفلين.png Any ideas why this problem occur?

Read the article

utf8 format in xml

- by hussain

i want to know how to store this è (this type of symbols) in xml file if i store this symbol in xml file.. the file shows this symbol like ? i was inserted in front of xml file is <?xml version="1.0" encoding="UTF-8"?> but that doest not shows correct thanks and advance

Read the article

UnicodeDecodeError when redirecting to file

- by zedoo

Hi, I run this snippet twice, in the ubuntu terminal, (encoding set to utf-8) once with ./test.py and then with ./test.py >out.txt: uni = u"\u001A\u0BC3\u1451\U0001D10C" print uni Without redirection it prints garbage. With redirection I get a UnicodeDecodeError. Can someone explain why I get the error only in the second case, or even better give a detailed explanation of what's going on behind the curtain in both cases?

Read the article

PHP detecting filesystem encoding

- by Evert

Hi guys, I need to save files with non-latin filenames on a filesytem, using PHP. I want to make this work cross-platform. How do I know what encoding I can use to write the file? I understand many modern filesystems are UTF-8 based (is this correct?), but I doubt Windows XP is (for instance). So, is there a robust detection mechanism? Evert

Read the article

python input UnicodeDecodeError:

- by The man on the Clapham omnibus

python 3.x >>> a = input() hope >>> a 'hope' >>> b = input() håpe >>> b 'håpe' >>> c = input() start typing hå... delete using backspace... and change to hope Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 1: invalid continuation byte >>> The situation is not terrible, I am working around it, but find it strange that when deleting, the bytes get messed up. Has anyone else experienced this? the terminal history shows that I thought that I entered h?ope any ideas? in the script that is using this, I do import readline to give command line history.

Read the article

Charaters with jquery json

- by Mikk

Hi everyone, I'm using jquery $.getJSON to retrieve list of cities. Everything works fine, but I'm from Estonia (probably most of you don't know much about this country =D) and we are using some characters like õ, ü. ä, ö. When I pass letters like this to callback function, I keep getting empty strings. I've tried to base64 encode(server-side)-decode(jquery base64 plugin) strings (i thought it was a good idea as long as I can compress pages with php, so I don't have to worry about bandwidth), but in this way I end up with some random chinese symbols. What would be the best workaround for this problem. Thank you.

Read the article

Displaying images in webpage without src URL

- by Babiker

Recently i learned that i can display images in a web page without referencing an image URL as follows : <img class="disclosure" img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAkAAAAJCAYAAADgkQYQAAAAAXNSR0IArs4c6QAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9oIGRQbOY8MjgMAAABVSURBVBjTfc6xDcAwCETRM0rt5nbA+49j70DDAqSLsGXyJQqkVxxwNOeMiEA+waW1VuT/inrvG7wikht8UETy2ygVMjO4O8YYTf6AqrZyUwYlygAAXo+QLmeF4c4uAAAAAElFTkSuQmCC"> I had another small bmp image that i wanted to display, so i opened it in vim and the img source looke like: When i paste this code where it needs to be pasted i only get "BM?" How to i convert/paste this code properly to be used as an image source?

Read the article

Characters with jquery json

- by Mikk

Hi everyone, I'm using jquery $.getJSON to retrieve list of cities. Everything works fine, but I'm from Estonia (probably most of you don't know much about this country =D) and we are using some characters like õ, ü. ä, ö. When I pass letters like this to callback function, I keep getting empty strings. I've tried to base64 encode(server-side)-decode(jquery base64 plugin) strings (i thought it was a good idea as long as I can compress pages with php, so I don't have to worry about bandwidth), but in this way I end up with some random chinese symbols. What would be the best workaround for this problem. Thank you.

Read the article

Four byte encoding of U+00F6 (LATIN SMALL LETTER O WITH DIAERESIS)?

- by knorv

Which character encoding represents the character ö (U+00F6, LATIN SMALL LETTER O WITH DIAERESIS or simply put chr(246) in ISO-8859-1) as the four octets combination chr(195) . chr(63) . chr(194) . chr(164)?

Read the article

wchar to char in c++

- by Chris

I have a Windows CE console application that's entry point looks like this int _tmain(int argc, _TCHAR* argv[]) I want to check the contents of argv[1] for "-s" convert argv[2] into an integer. I am having trouble narrowing the arguments or accessing them to test. I initially tried the following with little success if (argv[1] == L"-s") I also tried using the narrow function of wostringstream on each character but this crashed the application. Can anyone shed some light? Thanks

Read the article

\w in PHP preg_replace covers only second byte of UTF-8 chars

- by Andrey

we have this code: $value = preg_replace("/[^\w]/", '', $value); where $value is in utf-8. After this transformation first byte of multibyte characters is stripped. How to make \w cover UTF-8 chars completely? Sorry, i am not very well in PHP

Read the article

Python's string.translate() doesn't fully work?

- by Rhubarb

Given this example, I get the error that follows: print u'\2033'.translate({2033:u'd'}) C:\Python26\lib\encodings\cp437.pyc in encode(self, input, errors) 10 11 def encode(self,input,errors='strict'): ---> 12 return codecs.charmap_encode(input,errors,encoding_map) 13 14 def decode(self,input,errors='strict'): UnicodeEncodeError: 'charmap' codec can't encode character u'\x83' in position 0

Read the article

Why does Perl lose foreign characters on Windows input - can this be fixed (if so, how) or is Perl an outdated dinosaur that just can't handle this?

- by Alex R

Note below how ã changes to a This is causing me a huge problem as foreign characters show up in URLs, e.g. http://pt.wikipedia.org/wiki/Cão The OS is Windows 7, 64-bit. The Perl is: This is perl 5, version 12, subversion 2 (v5.12.2) built for MSWin32-x64-multi-thread (with 8 registered patches, see perl -V for more detail) Copyright 1987-2010, Larry Wall Binary build 1202 [293621] provided by ActiveState http://www.ActiveState.com Built Sep 6 2010 22:53:42 Additional update: To get around my particular problem, I tried using File::Find instead of piped input. The issue actually gets worse:

Read the article

EBCDIC to ASCII conversion. Out of bound error. In C#.

- by mekrizzy

I tried creating a EBCDIC to ASCII convector in C# using this general conversion order(given below). Basically the program converted from ASCII to the equivalent integer and from there into EDCDIC using the order below. Now when I try compiling this in C# and try giving a EBCDIC string(got this from another file from another computer) it is showing 'Out of Bound' exception for some of the EBCDIC character. Why is this like this?? Is it about formating?? or C# ?? or windows? Extra: I tried just printing out all the ASCII and EBCDIC characters using a loop from 0..255 numbers but still its not showing many of the EBCDIC characters. Am I missing any standards? int[] eb2as = new int[256]{ 0, 1, 2, 3,156, 9,134,127,151,141,142, 11, 12, 13, 14, 15, 16, 17, 18, 19,157,133, 8,135, 24, 25,146,143, 28, 29, 30, 31, 128,129,130,131,132, 10, 23, 27,136,137,138,139,140, 5, 6, 7, 144,145, 22,147,148,149,150, 4,152,153,154,155, 20, 21,158, 26, 32,160,161,162,163,164,165,166,167,168, 91, 46, 60, 40, 43, 33, 38,169,170,171,172,173,174,175,176,177, 93, 36, 42, 41, 59, 94, 45, 47,178,179,180,181,182,183,184,185,124, 44, 37, 95, 62, 63, 186,187,188,189,190,191,192,193,194, 96, 58, 35, 64, 39, 61, 34, 195, 97, 98, 99,100,101,102,103,104,105,196,197,198,199,200,201, 202,106,107,108,109,110,111,112,113,114,203,204,205,206,207,208, 209,126,115,116,117,118,119,120,121,122,210,211,212,213,214,215, 216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231, 123, 65, 66, 67, 68, 69, 70, 71, 72, 73,232,233,234,235,236,237, 125, 74, 75, 76, 77, 78, 79, 80, 81, 82,238,239,240,241,242,243, 92,159, 83, 84, 85, 86, 87, 88, 89, 90,244,245,246,247,248,249, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57,250,251,252,253,254,255 }; The whole code is as follows: public string convertFromEBCDICtoASCII(string inputEBCDICString, int initialPos, int endPos) { string inputSubString = inputEBCDICString.Substring(initialPos, endPos); int[] e2a = new int[256]{ 0, 1, 2, 3,156, 9,134,127,151,141,142, 11, 12, 13, 14, 15, 16, 17, 18, 19,157,133, 8,135, 24, 25,146,143, 28, 29, 30, 31, 128,129,130,131,132, 10, 23, 27,136,137,138,139,140, 5, 6, 7, 144,145, 22,147,148,149,150, 4,152,153,154,155, 20, 21,158, 26, 32,160,161,162,163,164,165,166,167,168, 91, 46, 60, 40, 43, 33, 38,169,170,171,172,173,174,175,176,177, 93, 36, 42, 41, 59, 94, 45, 47,178,179,180,181,182,183,184,185,124, 44, 37, 95, 62, 63, 186,187,188,189,190,191,192,193,194, 96, 58, 35, 64, 39, 61, 34, 195, 97, 98, 99,100,101,102,103,104,105,196,197,198,199,200,201, 202,106,107,108,109,110,111,112,113,114,203,204,205,206,207,208, 209,126,115,116,117,118,119,120,121,122,210,211,212,213,214,215, 216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231, 123, 65, 66, 67, 68, 69, 70, 71, 72, 73,232,233,234,235,236,237, 125, 74, 75, 76, 77, 78, 79, 80, 81, 82,238,239,240,241,242,243, 92,159, 83, 84, 85, 86, 87, 88, 89, 90,244,245,246,247,248,249, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57,250,251,252,253,254,255 }; char chrItem = Convert.ToChar("0"); StringBuilder sb = new StringBuilder(); for (int i = 0; i < inputSubString.Length; i++) { try { chrItem = Convert.ToChar(inputSubString.Substring(i, 1)); sb.Append(Convert.ToChar(e2a[(int)chrItem])); sb.Append((int)chrItem); sb.Append((int)00); } catch (Exception ex) { Console.WriteLine("//" + ex.Message); return string.Empty; } } string result = sb.ToString(); sb = null; return result; }

Read the article

strange characters at beginning of file

- by luca

there are strange characters at the beginning of a file I'm editing (using textmate..) I don't know when they appeared, they're invisible in textmate but my script that reads the file goes crazy.. this is the first few chars in the file (as seen with od command): 0000000 177377 000120 000105 000117 000120 000114 000105 000072 the first 2 shouldn't be there I think.. maybe they were caused by some strange dropbox sync? Or something else.. but they tend to reappear (I don't yet know when..) My question: what is that 177377 and a simple way to remove it in my ruby script? thanks

Read the article

What's the deal with char.GetNumericValue?

- by mgroves

I was working on Project Euler 40, and was a bit bothered that there was no int.Parse(char). Not a big deal, but I did some asking around and someone suggested char.GetNumericValue. GetNumericValue seems like a very odd method to me: Takes in a char as a parameter and returns...a double? Returns -1.0 if the char is not '0' through '9' So what's the reasoning behind this method, and what purpose does returning a double serve? I even fired up Reflector and looked at InternalGetNumericValue, but it's just like watching Lost: every answer just leads to another question.

Read the article

Using Python, How to copy files in 'temporary internet files' folder in Windows

- by pythBegin

I am using this code to find files recursively in a folder , with size greater than 50000 bytes. def listall(parent): lis=[] for root, dirs, files in os.walk(parent): for name in files: if os.path.getsize(os.path.join(root,name))>500000: lis.append(os.path.join(root,name)) return lis This is working fine. But when I used this on 'temporary internet files' folder in windows, am getting this error. Traceback (most recent call last): File "<pyshell#4>", line 1, in <module> listall(a) File "<pyshell#2>", line 5, in listall if os.path.getsize(os.path.join(root,name))>500000: File "C:\Python26\lib\genericpath.py", line 49, in getsize return os.stat(filename).st_size WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Documents and Settings\\khedarnatha\\Local Settings\\Temporary Internet Files\\Content.IE5\\EDS8C2V7\\??????+1[1].jpg' I think this is because windows gives names with special characters in this specific folder... Please help to sort out this issue.

Read the article

Obtain File size with os.path.getsize() in Python 2.7.5

- by Ruxuan Ouyang

I am new to python. I am trying to use os.path.getsize() to obtain the file size. However, if the file name is not in Englist, but in Chinese, Gemany, or French, etc, Python cannot recognize it and do not return the size of the file. Could you please help me with it? How can I let Python recognize the file's name and return the size of these kind of files? For example: The file's name is:?????????? ????????????? ? ????????????? ???????? ?? 2030?.doc path="C:\xxxx\xxx\xxxx\?????????? ????????????? ? ????????????? ???????? ?? 2030?.doc" I'd like to use" os.path.getsize(path) But it does not recognize the file name. Could you please kindly tell me what should I do? Thank you very much!

Read the article

Why does Perl lose foreign characters on Windows; can this be fixed (if so, how)?

- by Alex R

Note below how ã changes to a. NOTE2: Before you blame this on CMD.EXE and Windows pipe weirdness, see Experiment 2 below which gets a similar problem using File::Find. The particular problem I'm trying to fix involves working with image files stored on a local drive, and manipulating the file names which may contain foreign characters. The two experiments shown below are intermediate debugging steps. The ã character is common in latin languages. e.g. http://pt.wikipedia.org/wiki/Cão Experiment 1 Experiment 2 To get around my particular problem, I tried using File::Find instead of piped input. The issue actually gets worse: Debugging update: I tried some of the tricks listed at http://perldoc.perl.org/perlunicode.html, e.g. use utf8, use feature 'unicode_strings', etc, to no avail. Environment and Version Info The OS is Windows 7, 64-bit. The Perl is: This is perl 5, version 12, subversion 2 (v5.12.2) built for MSWin32-x64-multi-thread (with 8 registered patches, see perl -V for more detail) Copyright 1987-2010, Larry Wall Binary build 1202 [293621] provided by ActiveState http://www.ActiveState.com Built Sep 6 2010 22:53:42

Read the article

Differentiate between TCHAR and _TCHAR

- by Vulcan Eager

What are the various differences between the two symbols TCHAR and _TCHAR type defined in the Windows header tchar.h? Explain with examples. Briefly describe scenarios where you would use TCHAR as opposed to _TCHAR in your code. (10 marks)

Read the article

If a command line program is unsure of stdout's encoding, what encoding should it output?

- by mackstann

I have a command line program written in Python, and when I pipe it through another program on the command line, sys.stdout.encoding is None. This makes sense, I suppose -- the output could be another program, or a file you're redirecting it into, or whatever, and it doesn't know what encoding is desired. But neither do I! This program will be used by many different people (humor me) in different ways. Should I play it safe and output only ascii (replacing non-ascii chars with question marks)? Or should I output UTF-8, since it's so widespread these days?

Search Results

Search found 1490 results on 60 pages for 'rxvt unicode'.

Page 25/60 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >

- by Obediah Stane

- by Morten Mertner

- by gorilla

- by Paul D. Waite

- by sword101

- by hussain

- by zedoo

- by Evert

- by The man on the Clapham omnibus

- by Mikk

- by Babiker

- by Mikk

- by knorv

- by Chris

- by Andrey

- by Rhubarb

- by Alex R

- by mekrizzy

- by luca

- by mgroves

- by pythBegin

- by Ruxuan Ouyang

- by Alex R

- by Vulcan Eager

- by mackstann

< Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >