Search Results

Search found 4702 results on 189 pages for 'accented strings'.

Page 57/189 | < Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >

Cannot mount USB 3 harddrive

- by Thijs

I cannot mount an USB 3 harddrive. When looking at dmesg I got the following error: [ 96.463269] usb 3-2.1: >new low-speed USB device number 10 using xhci_hcd [ 96.485777] usb 3-2.1: >New USB device found, idVendor=046d, idProduct=c025 [ 96.485787] usb 3-2.1: >New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ 96.485792] usb 3-2.1: >Product: USB-PS/2 Optical Mouse [ 96.485797] usb 3-2.1: >Manufacturer: B16_b_02 [ 96.486118] usb 3-2.1: >ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes [ 96.490149] input: B16_b_02 USB-PS/2 Optical Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-2/3-2.1/3-2.1:1.0/input/input12 [ 96.490500] hid-generic 0003:046D:C025.0003: >input,hidraw0: USB HID v1.10 Mouse [B16_b_02 USB-PS/2 Optical Mouse] on usb-0000:00:14.0-2.1/input0 [ 114.088984] usb 3-2.3: >new high-speed USB device number 11 using xhci_hcd [ 114.105041] usb 3-2.3: >New USB device found, idVendor=13fd, idProduct=1618 [ 114.105051] usb 3-2.3: >New USB device strings: Mfr=0, Product=0, SerialNumber=0 [ 257.531777] usb 3-2.3: >USB disconnect, device number 11 [ 258.513912] usb 3-2.4: >new high-speed USB device number 12 using xhci_hcd [ 258.514046] usb 3-2.4: >Device not responding to set address. [ 258.717649] usb 3-2.4: >Device not responding to set address. [ 258.921203] usb 3-2.4: >device not accepting address 12, error -71 [ 258.937388] hub 3-2:1.0: >unable to enumerate USB device on port 4 I have tried to mount the drive on ubuntu-12.04 and it mounts just fine.

Read the article
Language parsing to find important words

- by Matt Huggins

I'm looking for some input and theory on how to approach a lexical topic. Let's say I have a collection of strings, which may just be one sentence or potentially multiple sentences. I'd like to parse these strings to and rip out the most important words, perhaps with a score that denotes how likely the word is to be important. Let's look at a few examples of what I mean. Example #1: "I really want a Keurig, but I can't afford one!" This is a very basic example, just one sentence. As a human, I can easily see that "Keurig" is the most important word here. Also, "afford" is relatively important, though it's clearly not the primary point of the sentence. The word "I" appears twice, but it is not important at all since it doesn't really tell us any information. I might expect to see a hash of word/scores something like this: "Keurig" => 0.9 "afford" => 0.4 "want" => 0.2 "really" => 0.1 etc... Example #2: "Just had one of the best swimming practices of my life. Hopefully I can maintain my times come the competition. If only I had remembered to take of my non-waterproof watch." This example has multiple sentences, so there will be more important words throughout. Without repeating the point exercise from example #1, I would probably expect to see two or three really important words come out of this: "swimming" (or "swimming practice"), "competition", & "watch" (or "waterproof watch" or "non-waterproof watch" depending on how the hyphen is handled). Given a couple examples like this, how would you go about doing something similar? Are there any existing (open source) libraries or algorithms in programming that already do this?

Read the article
Algorithmic problem - quickly finding all #'s where value %x is some given value

- by Steve B.

Problem I'm trying to solve, apologies in advance for the length: Given a large number of stored records, each with a unique (String) field S. I'd like to be able to find through an indexed query all records where Hash(S) % N == K for any arbitrary N, K (e.g. given a million strings, find all strings where HashCode(s) % 17 = 5. Is there some way of memoizing this so that we can quickly answer any question of this form without doing the % on every value? The motivation for this is a system of N distributed nodes, where each record has to be assigned to at least one node. The nodes are numbered 0 - (K-1) , and each node has to load up all of the records that match it's number: If we have 3 nodes Node 0 loads all records where Hash % 3 ==0 Node 1 loads all records where Hash % 3 ==1 Node 2 loads all records where Hash % 3 ==2 adding a 4th node, obviously all the assignments have to be recomputed - Node 0 loads all records where Hash % 4 ==0 ... etc I'd like to easily find these records through an indexed query without having to compute the mod individually. The best I've been able to come up with so far: If we take the prime factors of N (p1 * p2 * ... ) if N % M == I then p % M == I % p for all of N's prime factors e.g. 10 nodes : N % 10 == 6 then N % 2 = 0 == 6 %2 N % 5 = 1 == 6 %5 so storing an array of the "%" of N for the first "reasonable" number of primes for my data set should be helpful. For example in the above example we store the hash and the primes HASH PRIMES (array of %2, %3, %5, %7, ... ]) 16 [0 1 1 2 .. ] so looking for N%10 == 6 is equivalent to looking for all values where array[1]==1 and array[2] == 1. However, this breaks at the first prime larger than the highest number I'm storing in the factor table. Is there a better way?

Read the article
Is this high coupling?

- by Bono

Question I'm currently working a on an assignment for school. The assignment is to create a puzzle/calculator program in which you learn how to work with different datastructures (such as Stacks). We have generate infix math strings suchs as "1 + 2 * 3 - 4" and then turn them in to postfix math strings such as "1 2 + 3 * 4 -". In my book the author creates a special class for converting the infix notation to postfix. I was planning on using this but whilst I was about to implement it I was wondering if the following is what you would call "high coupling". I have read something about this (nothing that is taught in the book or anything) and was wondering about the aspect (since I still have to grasp it). Problem I have created a PuzzleGenerator class which generates the infix notation of the puzzle (or math string, whatever you want to call it) when it's instantiated. I was going to make a method getAnswer() in which I would instantiate the InToPost class (the class from the book) to convert the infix to postfox notation and then calculate the answer. But whilst doing this I thought: "Is using the InToPost class inside this method a form a high coupling, and would it be better to place this in a different method?" (such as a "convertPostfixToInfix" method, inside the PuzzleGenerator class) Thanks in advance.

Read the article
Are specific types still necessary?

- by MKO

One thing that occurred to me the other day, are specific types still necessary or a legacy that is holding us back. What I mean is: do we really need short, int, long, bigint etc etc. I understand the reasoning, variables/objects are kept in memory, memory needs to be allocated and therefore we need to know how big a variable can be. But really, shouldn't a modern programming language be able to handle "adaptive types", ie, if something is only ever allocated in the shortint range it uses fewer bytes, and if something is suddenly allocated a very big number the memory is allocated accordinly for that particular instance. Float, real and double's are a bit trickier since the type depends on what precision you need. Strings should however be able to take upp less memory in many instances (in .Net) where mostly ascii is used buth strings always take up double the memory because of unicode encoding. One argument for specific types might be that it's part of the specification, ie for example a variable should not be able to be bigger than a certain value so we set it to shortint. But why not have type constraints instead? It would be much more flexible and powerful to be able to set permissible ranges and values on variables (and properties). I realize the immense problem in revamping the type architecture since it's so tightly integrated with underlying hardware and things like serialization might become tricky indeed. But from a programming perspective it should be great no?

Read the article
High-level strategy for distinguishing a regular string from invalid JSON (ie. JSON-like string detection)

- by Jonline

Disclaimer On Absence of Code: I have no code to post because I haven't started writing; was looking for more theoretical guidance as I doubt I'll have trouble coding it but am pretty befuddled on what approach(es) would yield best results. I'm not seeking any code, either, though; just direction. Dilemma I'm toying with adding a "magic method"-style feature to a UI I'm building for a client, and it would require intelligently detecting whether or not a string was meant to be JSON as against a simple string. I had considered these general ideas: Look for a sort of arbitrarily-determined acceptable ratio of the frequency of JSON-like syntax (ie. regex to find strings separated by colons; look for colons between curly-braces, etc.) to the number of quote-encapsulated strings + nulls, bools and ints/floats. But the smaller the data set, the more fickle this would get look for key identifiers like opening and closing curly braces... not sure if there even are more easy identifiers, and this doesn't appeal anyway because it's so prescriptive about the kinds of mistakes it could find try incrementally parsing chunks, as those between curly braces, and seeing what proportion of these fractional statements turn out to be valid JSON; this seems like it would suffer less than (1) from smaller datasets, but would probably be much more processing-intensive, and very susceptible to a missing or inverted brace Just curious if the computational folks or algorithm pros out there had any approaches in mind that my semantics-oriented brain might have missed. PS: It occurs to me that natural language processing, about which I am totally ignorant, might be a cool approach; but, if NLP is a good strategy here, it sort of doesn't matter because I have zero experience with it and don't have time to learn & then implement/ this feature isn't worth it to the client.

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article
Use ASP.NET 4 Browser Definitions with ASP.NET 3.5

- by Stephen Walther

We updated the browser definitions files included with ASP.NET 4 to include information on recent browsers and devices such as Google Chrome and the iPhone. You can use these browser definition files with earlier versions of ASP.NET such as ASP.NET 3.5. The updated browser definition files, and instructions for installing them, can be found here: http://aspnet.codeplex.com/releases/view/41420 The changes in the browser definition files can cause backwards compatibility issues when you upgrade an ASP.NET 3.5 web application to ASP.NET 4. If you encounter compatibility issues, you can install the old browser definition files in your ASP.NET 4 application. The old browser definition files are included in the download file referenced above. What’s New in the ASP.NET 4 Browser Definition Files The complete set of browsers supported by the new ASP.NET 4 browser definition files is represented by the following figure: If you look carefully at the figure, you’ll notice that we added browser definitions for several types of recent browsers such as Internet Explorer 8, Firefox 3.5, Google Chrome, Opera 10, and Safari 4. Furthermore, notice that we now include browser definitions for several of the most popular mobile devices: BlackBerry, IPhone, IPod, and Windows Mobile (IEMobile). The mobile devices appear in the figure with a purple background color. To improve performance, we removed a whole lot of outdated browser definitions for old cell phones and mobile devices. We also cleaned up the information contained in the browser files. Here are some of the browser features that you can detect: Are you a mobile device? <%=Request.Browser.IsMobileDevice %> Are you an IPhone? <%=Request.Browser.MobileDeviceModel == "IPhone" %> What version of JavaScript do you support? <%=Request.Browser["javascriptversion"] %> What layout engine do you use? <%=Request.Browser["layoutEngine"] %> Here’s what you would get if you displayed the value of these properties using Internet Explorer 8: Here’s what you get when you use Google Chrome: Testing Browser Settings When working with browser definition files, it is useful to have some way to test the capability information returned when you request a page with different browsers. You can use the following method to return the HttpBrowserCapabilities the corresponds to a particular user agent string and set of browser headers: public HttpBrowserCapabilities GetBrowserCapabilities(string userAgent, NameValueCollection headers) { HttpBrowserCapabilities browserCaps = new HttpBrowserCapabilities(); Hashtable hashtable = new Hashtable(180, StringComparer.OrdinalIgnoreCase); hashtable[string.Empty] = userAgent; // The actual method uses client target browserCaps.Capabilities = hashtable; var capsFactory = new System.Web.Configuration.BrowserCapabilitiesFactory(); capsFactory.ConfigureBrowserCapabilities(headers, browserCaps); capsFactory.ConfigureCustomCapabilities(headers, browserCaps); return browserCaps; } At the end of this blog entry, there is a link to download a simple Visual Studio 2008 project – named Browser Definition Test -- that uses this method to display capability information for arbitrary user agent strings. For example, if you enter the user agent string for an iPhone then you get the results in the following figure: The Browser Definition Test application enables you to submit a user-agent string and display a table of browser capabilities information. The browser definition files contain sample user-agent strings for each browser definition. I got the iPhone user-agent string from the comments in the iphone.browser file. Enumerating Browser Definitions Someone asked in the comments whether or not there is a way to enumerate all of the browser definitions. You can do this if you ware willing to use a little reflection and read a private property. The browser definition files in the config\browsers folder get parsed into a class named BrowserCapabilitesFactory. After you run the aspnet_regbrowsers tool, you can see the source for this class in the config\browser folder by opening a file named BrowserCapsFactory.cs. The BrowserCapabilitiesFactoryBase class has a protected property named BrowserElements that represents a Hashtable of all of the browser definitions. Here's how you can read this protected property and display the ID for all of the browser definitions: var propInfo = typeof(BrowserCapabilitiesFactory).GetProperty("BrowserElements", BindingFlags.NonPublic | BindingFlags.Instance); Hashtable browserDefinitions = (Hashtable)propInfo.GetValue(new BrowserCapabilitiesFactory(), null); foreach (var key in browserDefinitions.Keys) { Response.Write("" + key); } If you run this code using Visual Studio 2008 then you get the following results: You get a huge number of outdated browsers and devices. In all, 449 browser definitions are listed. If you run this code using Visual Studio 2010 then you get the following results: In the case of Visual Studio 2010, all the old browsers and devices have been removed and you get only 19 browser definitions. Conclusion The updated browser definition files included in ASP.NET 4 provide more accurate information for recent browsers and devices. If you would like to test the new browser definitions with different user-agent strings then I recommend that you download the Browser Definition Test project: Browser Definition Test Project

Read the article
Anatomy of a .NET Assembly - CLR metadata 2

- by Simon Cooper

Before we look any further at the CLR metadata, we need a quick diversion to understand how the metadata is actually stored. Encoding table information As an example, we'll have a look at a row in the TypeDef table. According to the spec, each TypeDef consists of the following: Flags specifying various properties of the class, including visibility. The name of the type. The namespace of the type. What type this type extends. The field list of this type. The method list of this type. How is all this data actually represented? Offset & RID encoding Most assemblies don't need to use a 4 byte value to specify heap offsets and RIDs everywhere, however we can't hard-code every offset and RID to be 2 bytes long as there could conceivably be more than 65535 items in a heap or more than 65535 fields or types defined in an assembly. So heap offsets and RIDs are only represented in the full 4 bytes if it is required; in the header information at the top of the #~ stream are 3 bits indicating if the #Strings, #GUID, or #Blob heaps use 2 or 4 bytes (the #US stream is not accessed from metadata), and the rowcount of each table. If the rowcount for a particular table is greater than 65535 then all RIDs referencing that table throughout the metadata use 4 bytes, else only 2 bytes are used. Coded tokens Not every field in a table row references a single predefined table. For example, in the TypeDef extends field, a type can extend another TypeDef (a type in the same assembly), a TypeRef (a type in a different assembly), or a TypeSpec (an instantiation of a generic type). A token would have to be used to let us specify the table along with the RID. Tokens are always 4 bytes long; again, this is rather wasteful of space. Cutting the RID down to 2 bytes would make each token 3 bytes long, which isn't really an optimum size for computers to read from memory or disk. However, every use of a token in the metadata tables can only point to a limited subset of the metadata tables. For the extends field, we only need to be able to specify one of 3 tables, which we can do using 2 bits: 0x0: TypeDef 0x1: TypeRef 0x2: TypeSpec We could therefore compress the 4-byte token that would otherwise be needed into a coded token of type TypeDefOrRef. For each type of coded token, the least significant bits encode the table the token points to, and the rest of the bits encode the RID within that table. We can work out whether each type of coded token needs 2 or 4 bytes to represent it by working out whether the maximum RID of every table that the coded token type can point to will fit in the space available. The space available for the RID depends on the type of coded token; a TypeOrMethodDef coded token only needs 1 bit to specify the table, leaving 15 bits available for the RID before a 4-byte representation is needed, whereas a HasCustomAttribute coded token can point to one of 18 different tables, and so needs 5 bits to specify the table, only leaving 11 bits for the RID before 4 bytes are needed to represent that coded token type. For example, a 2-byte TypeDefOrRef coded token with the value 0x0321 has the following bit pattern: 0 3 2 1 0000 0011 0010 0001 The first two bits specify the table - TypeRef; the other bits specify the RID. Because we've used the first two bits, we've got to shift everything along two bits: 000000 1100 1000 This gives us a RID of 0xc8. If any one of the TypeDef, TypeRef or TypeSpec tables had more than 16383 rows (2^14 - 1), then 4 bytes would need to be used to represent all TypeDefOrRef coded tokens throughout the metadata tables. Lists The third representation we need to consider is 1-to-many references; each TypeDef refers to a list of FieldDef and MethodDef belonging to that type. If we were to specify every FieldDef and MethodDef individually then each TypeDef would be very large and a variable size, which isn't ideal. There is a way of specifying a list of references without explicitly specifying every item; if we order the MethodDef and FieldDef tables by the owning type, then the field list and method list in a TypeDef only have to be a single RID pointing at the first FieldDef or MethodDef belonging to that type; the end of the list can be inferred by the field list and method list RIDs of the next row in the TypeDef table. Going back to the TypeDef If we have a look back at the definition of a TypeDef, we end up with the following reprensentation for each row: Flags - always 4 bytes Name - a #Strings heap offset. Namespace - a #Strings heap offset. Extends - a TypeDefOrRef coded token. FieldList - a single RID to the FieldDef table. MethodList - a single RID to the MethodDef table. So, depending on the number of entries in the heaps and tables within the assembly, the rows in the TypeDef table can be as small as 14 bytes, or as large as 24 bytes. Now we've had a look at how information is encoded within the metadata tables, in the next post we can see how they are arranged on disk.

Read the article
Firefox or Chrome - how to force a specific encoding for a page

- by Mike

Hi, I am accessing an intranet site built by amateurs, that was constructed to be "best viewed by IE" (arghhh!). The site is in portuguese. All accented letters are jammed and do not appear as they should. As I create sites myself, I know that the best way to build a site in portuguese and other latin languages is to use the "charset=iso-8859-1" on the page's HTML encoding. This will ensure cross-browser and platforms compatibility. But I have no way to change this, because I am a visitor on this site. I don't know the encoding they are using. What I ask is: is there a way I can force my browser (Chrome or Firefox) to recode the page using the correct charset? I need this to work on Ubuntu.

Read the article
My windows keyboard is being "clever" with the quote keys - how can I stop it?

- by Marcin

I'm using windows 7 on a laptop. On the laptop keyboard, for some reason, the quote key (which has both double and single quote on it) is doing some "clever" annoying things: When I press single-quote (or double-quote), windows doesn't send any characters until I press it twice (resulting in '' or "") When I press it before a vowel, I get some kind of accented character. As I usually only write English, this is annoying. The backtick/tilde key is subject to similar behaviour. I have not attempted to set up my computer to process anything other than English. My keyboard appears to be (in so far as these things are standard on laptops) a standard US qwerty keyboard. How can I stop this happening?

Read the article
Creating a constant Dictionary in C#

- by David Schmitt

What is the most efficient way to create a constant (never changes at runtime) mapping of strings to ints? I've tried using a const Dictionary, but that didn't work out. I could implement a immutable wrapper with appropriate semantics, but that still doesn't seem totally right. For those who have asked, I'm implementing IDataErrorInfo in a generated class and am looking for a way to make the columnName lookup into my array of descriptors. I wasn't aware (typo when testing! d'oh!) that switch accepts strings, so that's what I'm gonna use. Thanks!

Read the article
Help with string manipulation function

- by MusiGenesis

I have a set of strings that contain within them one or more question marks delimited by a comma, a comma plus one or more spaces, or potentially both. So these strings are all possible: BOB AND ? BOB AND ?,?,?,?,? BOB AND ?, ?, ? ,? BOB AND ?,? , ?,? ?, ? ,? AND BOB I need to replace the question marks with @P#, so that the above samples would become: BOB AND @P1 BOB AND @P1,@P2,@P3,@P4,@P5 BOB AND @P1,@P2,@P3,@P4 BOB AND @P1,@P2,@P3,@P4 @P1,@P2,@P3 AND BOB What's the best way to do this without regex or Linq?

Read the article
Regex to match a whole string only if it lacks a given substring/suffix

- by Ivan Krechetov

I've searched for questions like this, but all the cases I found were solved in a problem-specific manner, like using !g in vi to negate the regex matches, or matching other things, without a regex negation. Thus, I'm interested in a “pure” solution to this: Having a set of strings I need to filter them with a regular expression matcher so that it only leaves (matches) the strings lacking a given substring. For example, filtering out "Foo" in: Boo Foo Bar FooBar BooFooBar Baz Would result in: Boo Bar Baz I tried constructing it with negative look aheads/behinds (?!regex)/(?<!regex), but couldn't figure it out. Is that even possible?

Read the article
Render string to texture in Android and OpenGL ES

- by Eddie Ringle

I've googled around everywhere, but cannot find much for rendering strings to textures and then displaying that texture on a quad on the screen. Can someone provide a run-down on the process or provide good resources that describe how? Is rendering strings to textures even the best method for displaying text in an Android OpenGL ES app? EDIT: Okay, so LabelMaker interferes with alpha blending, the texture (created from a PNG with a transparent background) now has a solid black background, rather than a transparent background. If I comment out all the LabelMaker-related code, it works fine.

Read the article
Sending and receiving a TMemoryStream using IdTCPClient and IdTCPServer

- by Martin Melka

I found Remy Lebeau's chat demo of IdTCP components in XE2 and I wanted to play with it a little bit. (It can be found here) I would like to send a picture using these components and the best approach seems to be using TMemoryStream. If I send strings, the connection works fine, the strings are transmitted successfully, however when I change it to Stream instead, it doesn't work. Here is the code: Server procedure TMainForm.IdTCPServerExecute(AContext: TIdContext); var rcvdMsg: string; ms:TMemoryStream; begin // This commented code is working, it receives and sends strings. // rcvdMsg:=AContext.Connection.IOHandler.ReadLn; // LogMessage('<ServerExec> '+rcvdMsg); // // TResponseSync.SendResponse(AContext, rcvdMsg); try ms:=TMemoryStream.Create; AContext.Connection.IOHandler.ReadStream(ms); ms.SaveToFile('c:\networked.bmp'); except LogMessage('Failed to receive',clred); end; end; Client procedure TfrmMain.Button1Click(Sender: TObject); var ms: TMemoryStream; bmp: TBitmap; pic: TPicture; s: string; begin // Again, this code is working for sending strings. // s:=edMsg.Text; // Client.IOHandler.WriteLn(s); ms:=TMemoryStream.Create; pic:=TPicture.Create; pic.LoadFromFile('c:\Back.png'); bmp:=TBitmap.Create; bmp.Width:=pic.Width; bmp.Height:=pic.Height; bmp.Canvas.Draw(0,0,pic.Graphic); bmp.SaveToStream(ms); ms.Position:=0; Client.IOHandler.Write(ms); ms.Free; end; When I try to send the stream from the client, nothing observable happens (breakpoint in the OnExecute doesn't fire). However, when closing the programs(after sending the MemoryStream), two things happen: If the Client is closed first, only then does the except part get processed (the log displays the 'Failed to receive' error. However, even if I place a breakpoint on the first line of the try-except block, it somehow gets skipped and only the error is displayed). If the Server is closed first, the IDE doesn't change back from debug, Client doesn't change its state to disconnected (as it normally does when server disconnects) and after the Client is closed as well, an Access Violation error from the Server app appears. I guess this means that there is a thread of the Server still running and maintaining the connection. But no matter how much time i give it, it never completes the task of receiving the MemoryStream. Note: The server uses IdSchedulerOfThreadDefault and IdAntiFreeze, if that matters. As I can't find any reliable source of help for the revamped Indy 10 (it all appears to apply for the older Indy 10, or even Indy 9), I hope you can tell me what is wrong. Thanks - ANSWER - SERVER procedure TMainForm.IdTCPServerExecute(AContext: TIdContext); var size: integer; ms:TMemoryStream; begin try ms:=TMemoryStream.Create; size:=AContext.Connection.IOHandler.ReadLongInt; AContext.Connection.IOHandler.ReadStream(ms, size); ms.SaveToFile('c:\networked.bmp'); except LogMessage('Failed to receive',clred); end; end; CLIENT procedure TfrmMain.Button1Click(Sender: TObject); var ms: TMemoryStream; bmp: TBitmap; pic: TPicture; begin ms:=TMemoryStream.Create; pic:=TPicture.Create; pic.LoadFromFile('c:\Back.png'); bmp:=TBitmap.Create; bmp.Width:=pic.Width; bmp.Height:=pic.Height; bmp.Canvas.Draw(0,0,pic.Graphic); bmp.SaveToStream(ms); ms.Position:=0; Client.IOHandler.Write(ms, 0, True); ms.Free; end;

Read the article
jQuery: shorten string length to fit a set width.

- by Marius

Hello there, I have a table, and in each cell I want to place strings, but they are much wider than the cell width. To prevent line break, I would like to shorten the strings to fit the cell, and append '...' at end to indicate that the string is much longer. The table has about 40 rows and has to be done to each cell, so its important that its a quick. Should I use JS/jQuery for this? How would I do it? Thank you for your time. Kind regards, Marius

Read the article
WinForms - Localization - UI controls positions different in additional Culture

- by binball

Hi, I did my UI settings.Original language is English. After that I set Localizable property to True. Copied original resx file to frmMain.de-De.resx (for example). Translated all strings. Everything works. But now I would like to change positions of controls. After that changes are visible only for original/primary Culture (En). When I change Culture to de-De then UI controls are on the "old positions"(?!) Is this normal behaviour? :O I'm unable to change controls positions on my form after localization? Can someone explain me this and give some best solution. I really have to change UI design but I don't want to manual copy all translated strings again. If my description is not clear then I can post source code, just please let me know. I use VS 2008. Greetz!

Read the article
how to remove listview items in android

- by kavitha

Hi all,, Can somebody please give me an example code of removing all ListView items and replacing with new items. I tried replacing the adapter items.Still no results. my code is at first i am calling populateList(){ results -populated arraylist with strings ArrayAdapter<String> adapter = new ArrayAdapter<String>(this, android.R.layout.simple_list_item_1, results); listview.setAdapter(adapter); adapter.notifyDataSetChanged(); listview.setOnItemClickListener(this); } // now populating list again repopulateList(){ results1-populated arraylist with strings ArrayAdapter<String> adapter1 = new ArrayAdapter<String>(this, android.R.layout.simple_list_item_1, results1); listview.setAdapter(adapter1); adapter1.notifyDataSetChanged(); listview.setOnItemClickListener(this); } Here replpulateList() method will add to listview items. It doent remove/replace all listview items. Please Help.

Read the article
WPF - Render text in Viewport3D

- by eWolf

I want to present up to 300 strings (just a few words) in a Viewport3D - fast! I want to render them on different Z positions and zoom in and out fluently. The ways I have found so far to render text in a Viewport3D: Put a TextBlock in a Viewport2DVisual3D. This guy's PlanarText class. The same guy's SolidText class. Create my own 2D panel and align TextBlocks on it. Call InvalidateArrange() every time I update the camera position. All of these are extremely slow and far apart from zooming fluently even with 10 strings only. Does anyone have a solution for this handy? It's got to be possible to render some text in a Viewport3D without waiting seconds!

Read the article
pysqlite2: ProgrammingError - You must not use 8-bit bytestrings

- by rfkrocktk

I'm currently persisting filenames in a sqlite database for my own purposes. Whenever I try to insert a file that has a special character (like é etc.), it throws the following error: pysqlite2.dbapi2.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. When I do "switch my application over to Unicode strings" by wrapping the value sent to pysqlite with the unicode method like: unicode(filename), it throws this error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 66: ordinal not in range(128) Is there something I can do to get rid of this? Modifying all of my files to conform isn't an option.

Read the article
Regex: replace inner string

- by AllenG

I'm working with X12 EDI Files (Specifialy 835s for those of you in Health Care), and I have a particular vendor who's using a non-HIPAA compliant version (3090, I think). The problem is that in a particular segment (PLB- again, for those who care) they're sending a code which is no longer supported by the HIPAA Standard. I need to locate the specific code, and update it with a corrected code. I think a Regex would be best for this, but I'm still very new to Regex, and I'm not sure where to begin. My current methodology is to turn the file into an array of strings, find the array that starts with "PLB", break that into an array of strings, find the code, and change it. As you can guess, that's very verbose code for something which should be (I'd think) fairly simple. Here's a sample of what I'm looking for: ~PLB|1902841224|20100228|49KC15X078001104|.08~ And here's what I want to change it to: ~PLB|1902841224|20100228|CSKC15X078001104|.08~ Any suggestions?

Read the article
Searching a large list of words in another large list

- by Christian

I have a list of 1,000,000 strings with a maximum length of 256 with protein names. Every string has an associated ID. I have another list of 4,000,000,000 strings with a maximum length of 256 with words out of articles and every word has an ID. I want to find all matches between the list of protein names and the list of words of the articles. Which algorithm should I use? Should I use some prebuild API? It would be good if the algorithm runs on a normal PC without special hardware.

Read the article
django internationalization doesn't work

- by xRobot

I have: * created translation strings in the template and in the application view. * run this command: django-admin.py makemessages -l it and the file it/LC_MESSAGES/django.po has been created * translated strings in the django.po file. * run this command: django-admin.py compilemessages and I receive: processing file django.po in /home/jobber/Desktop/library/books/locale/it/LC_MESSAGES * set this in settings.py: LANGUAGE_CODE = 'it' TEMPLATE_CONTEXT_PROCESSORS = ( "django.core.context_processors.auth", "django.core.context_processors.debug", "django.core.context_processors.i18n", "django.core.context_processors.media", ) USE_I18N = True MIDDLEWARE_CLASSES = ( 'django.middleware.common.CommonMiddleware', 'django.middleware.locale.LocaleMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', ) but.... translation doesn't work !! I always see english text. Why ?

Read the article

Search Results

Search found 4702 results on 189 pages for 'accented strings'.

Page 57/189 | < Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >

- by Thijs

- by Matt Huggins

- by Steve B.

- by Bono

- by MKO

- by Jonline

- by Nathan Spears

- by Nathan Spears

- by Stephen Walther

- by Simon Cooper

- by Mike

- by Marcin

- by David Schmitt

- by MusiGenesis

- by Ivan Krechetov

- by Eddie Ringle

- by Martin Melka

- by Marius

- by binball

- by kavitha

- by eWolf

- by rfkrocktk

- by AllenG

- by Christian

- by xRobot

< Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >