unicode - Page 11 - Developer IT

is unicode( codecs.BOM_UTF8, "utf8" ) necessary in Python 2.7/3?

- by Brian M. Hunt

In a code review I came across the following code:

    # Python bug that renders the unicode identifier (0xEF 0xBB 0xBF)
    # as a character.
    # If untreated, it can prevent the page from validating or rendering
    # properly.
    bom = unicode( codecs.BOM_UTF8, "utf8" )
    r = r.replace(bom, '')

This is in a function that passes a string to a Response object (Django or Flask). Is this still a bug that needs this fix in Python 2.7 or 3? Something tells me it isn't, but I thought I'd ask because I don't know this problem very well. Thanks for reading.
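For comparison, here is a minimal Python 3 sketch of the same idea. Python 3 has no `unicode` builtin, so the snippet in the question would need rewriting there in any case. The helper names `strip_bom_text` and `decode_without_bom` are made up for illustration; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the standard library.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF; decoded, it is the single code point U+FEFF.
BOM = codecs.BOM_UTF8.decode("utf-8")


def strip_bom_text(text: str) -> str:
    """Remove one leading U+FEFF from an already-decoded string (hypothetical helper)."""
    return text[1:] if text.startswith(BOM) else text


def decode_without_bom(raw: bytes) -> str:
    """Decode UTF-8 bytes; the 'utf-8-sig' codec drops a leading BOM if one is present."""
    return raw.decode("utf-8-sig")


print(repr(BOM))
print(decode_without_bom(codecs.BOM_UTF8 + b"<html>"))
```

Unlike `r.replace(bom, '')`, both helpers touch only a BOM at the very start of the data. Whether U+FEFF elsewhere in the text should also be removed is a separate decision.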

What does unicode character 
 represent?

- by vette982

The unicode is 
 and it's being used in an XML document.

How can I convert japanese characters to unicode in Perl?

- by TopCoder

Can you point me to a tool to convert Japanese characters to Unicode?

How to remove Unicode characters and/or convert OpenOffice spreadsheet cells to plaintext?

- by gonzobrains

I have an OpenOffice spreadsheet into which I occasionally copy/paste snippets from web pages. However, I need the file, as a whole, to be free of fancy formatting and non-ASCII text. I tried highlighting cells and selecting "Default Formatting", but this still seems to keep extraneous characters even though the text looks normal to the human eye. If this is not possible, is there a way to at least reveal the "raw" data within a cell so that I can manually strip it? Thanks, Jeff
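Outside the spreadsheet itself, the "reveal the raw data" step can be sketched in a few lines of Python, assuming the cell text has been exported or pasted as a string. The function names and the sample cell content are invented for illustration.

```python
import unicodedata


def reveal(text: str) -> str:
    """Show every non-ASCII character as an escape such as \\xa0 or \\u200b."""
    return ascii(text)


def to_plain_ascii(text: str) -> str:
    """Fold compatibility characters (NFKD), then drop anything still outside ASCII."""
    folded = unicodedata.normalize("NFKD", text)
    return folded.encode("ascii", "ignore").decode("ascii")


# Hypothetical cell: no-break space, zero-width space, en dash, accented letter.
cell = "price:\u00a0100\u200b \u2013 caf\u00e9"
print(reveal(cell))
print(to_plain_ascii(cell))
```

Dropping characters is lossy: the dash disappears entirely here. A mapping table for common punctuation may be preferable when the text has to stay readable.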

Why can't I display Unicode math symbols (U+2200..U+22FF)?

- by rodnower

I have Windows XP, and no application on my computer displays this range, neither IE nor Notepad++... It's driving me crazy... Thank you very much in advance.

Different font SIZES in a Text Editor, based on Script(Alphabet) type (ie. per Unicode Code-Block)

- by fred.bear

Some non-Latin-based scripts (alphabets) have more detail in their glyphs than do the Latin-based-script equivalents, and typically need a larger font to give the same degree of legibility (resolution-wise). Sometimes, both script types need to be present in the same file.

Notepad++ allows different font SIZES (and colour, etc.) courtesy of syntax highlighting. This allows me to display larger-fonted non-Latin-based script in a // BIG-FONT comment. Although this has been quite handy for me in some situations, it is quite limited.

A Word Processor can handle this scenario, but I'm not interested in that. I want a nice simple(?) plain(?) Text Editor to do it... on a per script-type basis... e.g. mixing Latin-1 and Devanagari (and Mandarin, and ...). Such a thing may not exist, but Notepad++ has shown that a simple(?) plain(?) Text Editor is capable of it. Does anyone know of such a Text Editor?

Q. Why not a Word Processor?
A. Because GCC and Python don't like that format! But UTF-8 is fine.

SQLBits - Unicode Porn

- by Most Valuable Yak (Rob Volk)

We've just finished up a fantastic event at SQLBits X in London! If you've never been to SQLBits and you can make it to the UK, I highly recommend it. If you didn't attend, here's what you missed. Meanwhile, for those who attended the Lightning Talk sessions and were disappointed that I ran out of time, here's the last part that you would have seen:

    /* How to Lose Friends and Irritate People...With Unicode!
       Rob Volk
       SQLBits X - London - March 31, 2012 */

    -- some sexy SQL
    DECLARE @oohbaby TABLE(i INT NOT NULL UNIQUE, uni_char AS NCHAR(i), hex AS CAST(i AS BINARY(2)))
    INSERT @oohbaby VALUES(664),(1022),(1023),(1120),(1150),(8857),(11609),(42420),(42427)

    -- change results font to larger size, some only work in grid font
    SELECT * FROM @oohbaby
    SELECT NCHAR(1022) + NCHAR(1023) AS Page3Girl

It's probably better that you run this yourself, in the privacy of your own home/office, you know *wink* *wink* *nudge* *nudge* *say no more*

Best way to convert a Unicode URL to ASCII (UTF-8 percent-escaped) in Python?

- by benhoyt

I'm wondering what's the best way -- or if there's a simple way with the standard library -- to convert a URL with Unicode chars in the domain name and path to the equivalent ASCII URL, encoded with domain as IDNA and the path %-encoded, as per RFC 3986.

I get from the user a URL in UTF-8. So if they've typed in http://➡.ws/♥ I get 'http://\xe2\x9e\xa1.ws/\xe2\x99\xa5' in Python. And what I want out is the ASCII version: 'http://xn--hgi.ws/%E2%99%A5'.

What I do at the moment is split the URL up into parts via a regex, and then manually IDNA-encode the domain, and separately encode the path and query string with different urllib.quote() calls.

    # url is UTF-8 here, eg: url = u'http://➡.ws/㉌'.encode('utf-8')
    match = re.match(r'([a-z]{3,5})://(.+\.[a-z0-9]{1,6})'
                     r'(:\d{1,5})?(/.*?)(\?.*)?$', url, flags=re.I)
    if not match:
        raise BadURLException(url)
    protocol, domain, port, path, query = match.groups()

    try:
        domain = unicode(domain, 'utf-8')
    except UnicodeDecodeError:
        return ''  # bad UTF-8 chars in domain
    domain = domain.encode('idna')

    if port is None:
        port = ''

    path = urllib.quote(path)

    if query is None:
        query = ''
    else:
        query = urllib.quote(query, safe='=&?/')

    url = protocol + '://' + domain + port + path + query
    # url is ASCII here, eg: url = 'http://xn--hgi.ws/%E3%89%8C'

Is this correct? Any better suggestions? Is there a simple standard-library function to do this?
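The question's code targets Python 2. As a rough Python 3 sketch of the same split, encode and reassemble approach, `urllib.parse` can replace the hand-written regex. The function name `url_to_ascii` is hypothetical. The sketch assumes the URL has a host, drops any user:password part, and relies on the built-in `idna` codec, which implements the older IDNA 2003 rules.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def url_to_ascii(url: str) -> str:
    """Return an ASCII-only form of `url`: IDNA host, percent-encoded path and query."""
    parts = urlsplit(url)
    # The "idna" codec raises UnicodeError for labels it cannot encode.
    host = parts.hostname.encode("idna").decode("ascii")
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # "%" is kept safe so escapes that are already present are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(url_to_ascii("http://\u27a1.ws/\u2665"))  # http://xn--hgi.ws/%E2%99%A5
```

Letting `urlsplit` find the component boundaries avoids the regex's assumptions about TLD length and port format. The choice of `safe` characters for each component is still a judgment call.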

Fast, Unicode-capable, cross-platform programmer's text editor that shows invisibles like ZWSP?

- by Roger_S

Our publishing workflow includes Windows and Linux machines (there are some Macs too, but not in the critical-path workflow). Many texts include both English and Khmer and are marked up in XML.

XML Copy Editor is the best cross-platform open-source XML editor I've discovered. It utilizes the Scintilla editing component, which is generally good with Unicode but which does not enable non-printing or invisible characters like U+200B (zero-width space) and U+200C (zero-width non-joiner) to be displayed. Khmer does not separate words with a space character as Western languages do, so ZWSP is used in electronic texts to enable applications to break lines easily.

Ideally I'd edit the markup and the content in a single editor, but XML awareness is less important at times than being able to display invisibles. (OpenOffice.org Writer and Microsoft Word are the only two apps I know that will display ZWSP. They are not suitable for the markup and text manipulations that need to be done to prepare manuscripts for publication, unfortunately, although I guess they're fine for authoring.)

I tried out a promising editor last week, but a search-and-replace regex operation that took under a second in TextPad 4.7.3 lasted over twenty seconds. So I want to mention that speed and the ability to handle large (up to 150 MB) files are also a concern.

Is there a good, fast, free or not too expensive text editor, with versions on Windows and Linux and maybe Mac too, Unicode-aware and capable of displaying invisibles like ZWSP? That has syntax highlighting, can handle large files and is customizable enough that I won't tear my hair out in frustration? Thanks, Roger_S
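As a stopgap when no editor shows these characters, a short script can make them visible. This is only a sketch: the function name is invented, and it assumes that tagging every character in Unicode general category Cf (format controls, which include ZWSP and ZWNJ) is the behaviour wanted.

```python
import unicodedata


def show_invisibles(text: str) -> str:
    """Replace format-control characters (category Cf) with visible <U+XXXX> tags."""
    return "".join(
        f"<U+{ord(ch):04X}>" if unicodedata.category(ch) == "Cf" else ch
        for ch in text
    )


sample = "word\u200bbreak\u200cjoin"
print(show_invisibles(sample))  # word<U+200B>break<U+200C>join
```

Run over a file line by line, this gives a quick way to check where ZWSP actually sits in a manuscript, though it does not replace in-editor display.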

If I use Unicode on a ISO-8859-1 site, how will that be interpreted by a browser?

- by grg-n-sox

So I got a site that uses ISO-8859-1 encoding and I can't change that. I want to be sure that the content I enter into the web app on the site gets parsed correctly. The parser works on a character by character basis. I also cannot change the parser, I am just writing files for it to handle. The content in my file I am telling the app to display after parsing contains Unicode characters (or at least I assume so, even if they were produced by Windows Alt Codes mapped to CP437). Using entities is not an option due to the character by character operation of the parser. The only characters that the parser escapes upon output are markup sensitive ones like ampersand, less than, and greater than symbols. I would just go ahead and put this through to see what it looks like, but output can only be seen on a publishing, which has to spend a couple days getting approved and such, and that would be asking too much for just a test case. So, long story short, if I told a site to output ?ÇÑ¥?? on a site with a meta tag stating it is supposed to use ISO-8859-1, will a browser auto-detect the Unicode and display it or will it literally translate it as ISO-8859-1 and get a different set of characters?
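What a decoder does with the bytes can be simulated offline without waiting for a publishing cycle. The sketch below uses only the three legible characters from the question and assumes the file is saved as either UTF-8 or ISO-8859-1. In practice browsers generally treat the `ISO-8859-1` label as windows-1252, which differs only in the 0x80 to 0x9F range.

```python
text = "\u00c7\u00d1\u00a5"            # the three legible characters: Ç Ñ ¥

utf8_bytes = text.encode("utf-8")      # what a UTF-8 file would contain
latin1_bytes = text.encode("latin-1")  # what an ISO-8859-1 file would contain

# A decoder told "ISO-8859-1" maps each byte to exactly one character, so
# UTF-8 input comes out as two characters per original character.
misread = utf8_bytes.decode("latin-1")

print(len(text), len(misread))
print(latin1_bytes.decode("latin-1") == text)
```

If the file is saved as ISO-8859-1 and contains only characters that encoding can represent, the declared charset and the bytes agree and nothing is garbled.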

Which Perl module can handle a variety of date formats with Unicode characters?

- by ram

My requirement is to parse XML files that contain a wide variety of timestamps, depending on the locales in which they were written. They may contain Unicode characters in the case of Chinese or Korean locales. I have to parse these timestamps and put them in a standard format, something like 2009-11-26 12:40:54, to store them in an Oracle database. Sometimes I may not even know the locale, and yet I have to parse the timestamps. I am looking for a module that automatically detects the timestamp format (including Unicode characters for am and pm in the local language) and converts it into epoch time so that I can convert it back to whatever format I like. I have gone through similar questions in this forum. A few suggested the DateFormat module and the Date::Parse module. The Perl distribution I am using is 5.10, so Date::Manip doesn't come as a core module. As I am supposed to use just the basic core modules and a few CPAN modules (on request; I cannot ask for all of them), I request you to kindly suggest a good module that meets all my requirements. Thanks in advance

Sun Fire X4270 M3 SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode) Two-Tier Standard Sales and Distribution (SD) Benchmark

- by Brian

Oracle's Sun Fire X4270 M3 server achieved 8,320 SAP SD Benchmark users running SAP enhancement package 4 for SAP ERP 6.0 with Unicode software using Oracle Database 11g and Oracle Solaris 10.

- The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat both the IBM Flex System x240 and IBM System x3650 M4 servers running DB2 9.7 and Windows Server 2008 R2 Enterprise Edition.
- The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the HP ProLiant BL460c Gen8 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 6%.
- The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat the Cisco UCS C240 M3 server running SQL Server 2008 and Windows Server 2008 R2 Datacenter Edition by 9%.
- The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the Fujitsu PRIMERGY RX300 S7 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 10%.

Performance Landscape

SAP-SD 2-Tier Performance Table (in decreasing performance order).

SAP ERP 6.0 Enhancement Pack 4 (Unicode) Results (benchmark version from January 2009 to April 2012)

| System | OS | Database | Users | SAP ERP/ECC Release | SAPS | SAPS/Proc | Date |
|---|---|---|---|---|---|---|---|
| Sun Fire X4270 M3, 2 x Intel Xeon E5-2690 @ 2.90 GHz, 128 GB | Oracle Solaris 10 | Oracle Database 11g | 8,320 | 2009 6.0 EP4 (Unicode) | 45,570 | 22,785 | 10-Apr-12 |
| IBM Flex System x240, 2 x Intel Xeon E5-2690 @ 2.90 GHz, 128 GB | Windows Server 2008 R2 EE | DB2 9.7 | 7,960 | 2009 6.0 EP4 (Unicode) | 43,520 | 21,760 | 11-Apr-12 |
| HP ProLiant BL460c Gen8, 2 x Intel Xeon E5-2690 @ 2.90 GHz, 128 GB | Windows Server 2008 R2 EE | SQL Server 2008 | 7,865 | 2009 6.0 EP4 (Unicode) | 42,920 | 21,460 | 29-Mar-12 |
| IBM System x3650 M4, 2 x Intel Xeon E5-2690 @ 2.90 GHz, 128 GB | Windows Server 2008 R2 EE | DB2 9.7 | 7,855 | 2009 6.0 EP4 (Unicode) | 42,880 | 21,440 | 06-Mar-12 |
| Cisco UCS C240 M3, 2 x Intel Xeon E5-2690 @ 2.90 GHz, 128 GB | Windows Server 2008 R2 DE | SQL Server 2008 | 7,635 | 2009 6.0 EP4 (Unicode) | 41,800 | 20,900 | 06-Mar-12 |
| Fujitsu PRIMERGY RX300 S7, 2 x Intel Xeon E5-2690 @ 2.90 GHz, 128 GB | Windows Server 2008 R2 EE | SQL Server 2008 | 7,570 | 2009 6.0 EP4 (Unicode) | 41,320 | 20,660 | 06-Mar-12 |

Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark.

Configuration and Results Summary

Hardware Configuration:
- Sun Fire X4270 M3
- 2 x 2.90 GHz Intel Xeon E5-2690 processors
- 128 GB memory
- Sun StorageTek 6540 with 4 * 16 * 300GB 15Krpm 4Gb FC-AL

Software Configuration:
- Oracle Solaris 10
- Oracle Database 11g
- SAP enhancement package 4 for SAP ERP 6.0 (Unicode)

Certified Results (published by SAP):
- Number of benchmark users: 8,320
- Average dialog response time: 0.95 seconds
- Throughput:
  - Fully processed order line: 911,330
  - Dialog steps/hour: 2,734,000
  - SAPS: 45,570
- SAP Certification: 2012014

Benchmark Description

The SAP Standard Application SD (Sales and Distribution) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments. SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.

See Also
- SAP Benchmark Website
- Sun Fire X4270 M3 Server (oracle.com, OTN)
- Oracle Solaris (oracle.com, OTN)
- Oracle Database 11g Release 2 Enterprise Edition (oracle.com, OTN)

Disclosure Statement

Two-tier SAP Sales and Distribution (SD) standard SAP SD benchmark based on SAP enhancement package 4 for SAP ERP 6.0 (Unicode) application benchmark as of 04/11/12:
- Sun Fire X4270 M3 (2 processors, 16 cores, 32 threads) 8,320 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, Oracle 11g, Solaris 10, Cert# 2012014.
- IBM Flex System x240 (2 processors, 16 cores, 32 threads) 7,960 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012016.
- IBM System x3650 M4 (2 processors, 16 cores, 32 threads) 7,855 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012010.
- Cisco UCS C240 M3 (2 processors, 16 cores, 32 threads) 7,635 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 DE, Cert# 2012011.
- Fujitsu PRIMERGY RX300 S7 (2 processors, 16 cores, 32 threads) 7,570 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012008.
- HP ProLiant DL380p Gen8 (2 processors, 16 cores, 32 threads) 7,865 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012012.

SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark
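The percentage claims and the SAPS/Proc column can be re-derived from the published user counts. The short check below copies the figures from the results and assumes each percentage is the ratio of user counts rounded to the nearest whole number; the helper name `advantage_pct` is made up.

```python
# Certified SAP SD user counts copied from the published results.
users = {
    "Sun Fire X4270 M3": 8320,
    "IBM Flex System x240": 7960,
    "HP ProLiant BL460c Gen8": 7865,
    "IBM System x3650 M4": 7855,
    "Cisco UCS C240 M3": 7635,
    "Fujitsu PRIMERGY RX300 S7": 7570,
}


def advantage_pct(winner: int, other: int) -> float:
    """Percentage by which `winner` exceeds `other`."""
    return (winner / other - 1) * 100


baseline = users["Sun Fire X4270 M3"]
for system, count in users.items():
    if system != "Sun Fire X4270 M3":
        print(f"{system}: {advantage_pct(baseline, count):.1f}%")

# SAPS per processor for the two-socket Sun Fire X4270 M3.
print(45570 // 2)  # 22785
```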

Efficient Trie implementation for unicode strings

- by U Mad

I have been looking for an efficient String trie implementation. Mostly I have found code like this: Referential implementation in Java (per Wikipedia).

I dislike these implementations for mostly two reasons:

- They support only 256 ASCII characters. I need to cover things like Cyrillic.
- They are extremely memory inefficient.

Each node contains an array of 256 references, which is 4096 bytes on a 64 bit machine in Java. Each of these nodes can have up to 256 subnodes with 4096 bytes of references each. So a full Trie for every ASCII 2 character string would require a bit over 1MB. Three character strings? 256MB just for arrays in nodes. And so on.

Of course I don't intend to have all of 16 million three character strings in my Trie, so a lot of space is just wasted. Most of these arrays are just null references as their capacity far exceeds the actual number of inserted keys. And if I add Unicode, the arrays get even larger (char has 64k values instead of 256 in Java).

Is there any hope of making an efficient trie for strings? I have considered a couple of improvements over these types of implementations:

- Instead of using an array of references, I could use an array of a primitive integer type, which indexes into an array of references to nodes whose size is close to the number of actual nodes.
- I could break strings into 4 bit parts, which would allow for node arrays of size 16 at the cost of a deeper tree.
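Another common way around the fixed-size child array, different from the two ideas listed in the question, is to store children sparsely in a hash map keyed by character. The sketch below shows that variant in Python rather than Java; the class and method names are illustrative, and it makes no claim about actual memory use on the JVM.

```python
class TrieNode:
    """One trie node; children are stored sparsely, keyed by a single character."""

    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}   # only edges that exist take up space
        self.is_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:  # iterates over code points, so Cyrillic needs no special casing
            child = node.children.get(ch)
            if child is None:
                child = node.children[ch] = TrieNode()
            node = child
        node.is_word = True

    def _walk(self, text: str):
        """Follow `text` from the root; return the final node or None."""
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word: str) -> bool:
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix: str) -> bool:
        return self._walk(prefix) is not None


trie = Trie()
for word in ("кот", "код", "cat"):
    trie.insert(word)
print(trie.contains("кот"), trie.contains("ко"), trie.starts_with("ко"))
```

The trade-off is a hash lookup per character instead of an array index. Space then grows with the number of stored edges rather than with the alphabet size.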

Anyone ported Snoop Component Suite version 3.0 to Delphi 2010 ? (ie. Unicode issues)

- by user296191

Hi, has anyone ported "Snoop Component Suite version 3.0" by http://www.netlab.co.kr to Delphi 2010? It's a great WinPcap library. It just doesn't work on Delphi 2010 (Unicode). Thanks

Is there a Pac-Man-like character in ASCII or Unicode?

- by Ricket

Simple question: is there a character that looks either like Pac-Man, or like the ghost in Pac-Man? With Google's recent Pac-Man logo, everyone should know what these look like, but in case you don't here are some sample images: If you answer "no" please provide a little more proof that you actually searched all unicode characters...

How do I convert from unicode to single byte in C#?

- by xarzu

How do I convert from unicode to single byte in C#? This does not work:

    int level = 1;
    string argument;
    // and then argument is assigned
    if (argument[2] == Convert.ToChar(level))
    {
        // does not work
    }

And this:

    char test1 = argument[2];
    char test2 = Convert.ToChar(level);

produces funky results. test1 can be: 49 '1' while test2 will be 1 ''
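The values 49 and 1 reported above point at the difference between the digit character '1' and the character whose code is 1. The same distinction can be reproduced in Python, where `chr(level)` plays the role of `Convert.ToChar(level)`. The sample value of `argument` is invented for illustration.

```python
level = 1
argument = "ab1"             # hypothetical input whose third character is the digit '1'

as_code_point = chr(level)   # U+0001, a control character
as_digit = str(level)        # the character '1', code point 49

print(ord(argument[2]), ord(as_code_point), ord(as_digit))
print(argument[2] == as_code_point, argument[2] == as_digit)
```

Comparing against the digit's textual form, rather than the character with that numeric code, is what makes the comparison succeed.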

What is better for PHP developers - Unicode or UTF-8?

- by Ole Jak

What is better for PHP developers - Unicode or UTF-8? I am going to create an international CMS. So I am going to have clients all over the world. They will speak all possible languages. What encoding format is better for browser recognition and for DB data storage?
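The two terms in the question are not alternatives: Unicode assigns a number to each character, and UTF-8 is one way of writing those numbers as bytes. The distinction is language-independent; the sketch below shows it in Python with an arbitrary sample string.

```python
text = "Привет, 世界"

code_points = [f"U+{ord(ch):04X}" for ch in text]   # Unicode: one number per character
utf8 = text.encode("utf-8")                          # UTF-8: those numbers as bytes
utf16 = text.encode("utf-16-le")                     # a different encoding of the same text

print(code_points)
print(len(text), len(utf8), len(utf16))

# Both byte sequences decode back to the identical string.
assert utf8.decode("utf-8") == utf16.decode("utf-16-le") == text
```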

How to concatenate two unicode characters in DotNet and not have any space?

- by OutOFTouch

When I concatenate the following two unicode characters I see both but there is a space between them. Is there any way to get rid of this space?

    StringBuilder sb = new StringBuilder();
    int characterCode;
    characterCode = Convert.ToInt32("2758", 16);
    sb.Append((char)characterCode);
    characterCode = Convert.ToInt32("25c4", 16);
    sb.Append((char)characterCode);
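One way to narrow this down is to check whether the concatenated string really contains a space. The Python sketch below builds the same two code points and inspects the result. If the string holds exactly two characters, the visible gap presumably comes from how the font draws the glyphs and not from the string itself, which is an assumption this check cannot confirm.

```python
import unicodedata

# Same two code points as in the question, parsed from hex strings.
chars = [chr(int(code, 16)) for code in ("2758", "25c4")]
joined = "".join(chars)

print(len(joined))                              # number of code points in the result
print([unicodedata.name(ch) for ch in joined])  # official character names
print(" " in joined)                            # is there an actual space character?
```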

What is better for PHP developer - Unicode or UTF-8?

- by Ole Jak

What is better for PHP developer - Unicode or UTF-8? I am going to create an international CMS. So I am going to have clients all over the world. They will speak all possible languages. What encoding format is better for browser recognition and for DB data storage?

Why does Unicode.org no longer offer a reference UTF-8/16/32 converter?

- by Steve Hanov

A reference converter for UTF-8/16/32 in C used to be available at ftp://ftp.unicode.org/Public/PROGRAMS/CVTUTF/. This included the files ConvertUTF.h and ConvertUTF.c. It was freely available and is incorporated into numerous open source projects. But now it's gone! What's the story? Can it still be legally used? Was there a problem with it?

Which of the following Unicode characters should be used in HTML?

- by George Edison

I am aware that any Unicode character can be inserted into an HTML document via the following format:  ...where 0000 is the character code of the desired character. My question is: which of these characters has the most widespread availability when it comes to the client's browser being able to display the character? In other words, what are the ranges of codes that should be used in an HTML document that is going to be widely deployed?
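The exact notation in the question did not survive, but HTML numeric character references come in a decimal form and a hexadecimal form. The Python sketch below generates both for one sample character and checks that they decode to the same thing. The choice of U+2665 is arbitrary.

```python
import html

heart = "\u2665"

# Decimal reference, produced by the standard "xmlcharrefreplace" error handler.
decimal_ref = heart.encode("ascii", "xmlcharrefreplace").decode("ascii")  # '&#9829;'
# Hexadecimal reference, built by hand from the code point.
hex_ref = f"&#x{ord(heart):X};"                                            # '&#x2665;'

print(decimal_ref, hex_ref)
assert html.unescape(decimal_ref) == html.unescape(hex_ref) == heart
```

Either form only names the character. Whether it is displayed still depends on the fonts available on the client.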

How can I convert a Unicode codepoint (\uXXXX) into a character in Perl?

- by Peterim

I have some unicode codepoints (\u5315\u4e03\u58ec\u4e8c\u4e0a\u53b6\u4e4b), which I have to convert into actual characters they represent. What's the simplest way to do so? Thank you.
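The question asks about Perl. The sketch below shows the underlying idea in Python instead: match each literal `\uXXXX` with a regular expression and replace it with the character for that hex value. A similar substitution should carry over to Perl. The function name is hypothetical, and surrogate pairs are not combined.

```python
import re


def decode_u_escapes(text: str) -> str:
    """Replace each literal \\uXXXX (backslash, 'u', four hex digits) with its character."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


raw = r"\u5315\u4e03\u58ec\u4e8c\u4e0a\u53b6\u4e4b"   # raw string: backslashes are literal
print(decode_u_escapes(raw))
```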

Is it a good idea to use unicode symbols as Java identifiers?

- by Eric

I have a snippet of code that looks like this:

    double ?t = lastPollTime - pollTime;
    double a = 1 - Math.exp(-?t / t);
    average += a * (x - average);

Just how bad an idea is it to use unicode characters in Java identifiers? Or is this perfectly acceptable?

Ignore non-unicode programs language when installing software

- by mitya

This is something that has been driving me nuts for a while, and I haven't been able to find a solution to this problem anywhere. I am running Windows 7 and my "Language for non-Unicode programs" setting is set to Russian. I need this for some non-Unicode software that has a Russian UI. However, for most of my software I prefer to use the English UI.

A lot of software out there is multilingual and is too smart for my liking. When installing, it switches the UI to Russian, and the software UI stays in Russian after the installation without an option to change that, besides setting the "non-Unicode language" to English. It switches back to Russian once I revert the setting and reboot. Most of the time it is driver software, e.g. Intel, HP, etc.

How can I force the installation to run in English and stay that way after install, ignoring the "Language for non-Unicode programs" setting? Now, I understand this might be specific to the installer: MSI, InstallShield, etc. But any solution will be good, even if I have to apply it for every software installation. Thanks in advance for any helpful information!

Qt/C++ regular expression library with unicode property support

- by Dave

I'm converting an application from the .Net framework to Qt using C++. The application makes extensive use of regular expression unicode properties, i.e. \p{L}, \p{M}, etc. I've just discovered that the QRegExp class lacks support for this among other things (lookbehinds, etc.)

Can anyone recommend a C++ regular expression library that:

- Supports unicode properties
- Is unicode-aware in other respects (i.e. \w matches more than ASCII word characters)
- As a bonus, supports lookbehinds.

Please don't point me to the wikipedia article; I don't trust it. That article says that QRegExp supports unicode properties. Unless I'm really doing something wrong, it doesn't. I'm looking for someone actually using unicode properties with a regex library in a project.

Search Results

Search found 1474 results on 59 pages for 'unicode'.
