Search Results

Search found 13534 results on 542 pages for 'python'.

Page 27/542 | < Previous Page | 23 24 25 26 27 28 29 30 31 32 33 34  | Next Page >

  • Learning python in one weekend

    - by permalac
    Hello, I'm planing to learn python this weekend. I have some programing skills and should not be difficult if I take the good path. I'm a sysadmin and I've been looking at python scripts since long ago, but now I would like to make python my scripting language, so I'll do the effort and next monday I will start using it. Any advice on how to trace my own path to good python skills? thanks. PS: I will use vim, ypython and http://diveintopython.org

    Read the article

  • python sqlite fails

    - by lakshmipathi
    My program uses sqlite3 plus python. It works fine with python 2.6.2 I moved it another machine and installed 2.6.4 and running the program gave me this error File "", line 1, in File "/opt/python-2.6.4/lib/python2.6/sqlite3/init.py", line 24, in from dbapi2 import * File "/opt/python-2.6.4/lib/python2.6/sqlite3/dbapi2.py", line 27, in from _sqlite3 import * ImportError: No module named _sqlite3

    Read the article

  • A Question about using jython when run a receving socket in python

    - by abusemind
    Hi, I have not a lot of knowledge of python and network programming. Currently I am trying to implement a simple application which can receive a text message sent by the user, fetch some information from the google search api, and return the results via text message to the user. This application will continue to listening to the users messages and reply immediately. How I get the text short message sent by the user? It's a program named fetion from the mobile supplier in China. The client side fetion, just like a instant communication tool, can send/receive messages to/from other people who are using mobile to receive/send SMS. I am using a open source python program that simulates the fetion program. So basically I can use this python program to communate with others who using cell phone via SMS. My core program is based on java, so I need to take this python program into java environment. I am using jython, and now I am available to send messages to users by some lines of java codes. But the real question is the process of receving from users via SMS. In python code, a new thread is created to continuously listen to the user. It should be OK in Python, but when I run the similar process in Jython, the following exception occurs: Exception in thread Thread:Traceback (most recent call last): File "D:\jython2.5.1\Lib\threading.py", line 178, in _Thread__bootstrap self.run() File "<iostream>", line 1389, in run File "<iostream>", line 1207, in receive File "<iostream>", line 1207, in receive File "<iostream>", line 150, in recv File "D:\jython2.5.1\Lib\select.py", line 223, in native_select pobj.register(fd, POLLIN) File "D:\jython2.5.1\Lib\select.py", line 104, in register raise _map_exception(jlx) error: (20000, 'socket must be in non-blocking mode') The line 150 in the python code is as follows: def recv(self,timeout=False): if self.login_type == "HTTP": time.sleep(10) return self.get_offline_msg() pass else: if timeout: infd,outfd,errfd = select([self.__sock,],[],[],timeout)//<---line 150 here else: infd,outfd,errfd = select([self.__sock,],[],[]) if len(infd) != 0: ret = self.__tcp_recv() num = len(ret) d_print(('num',),locals()) if num == 0: return ret if num == 1: return ret[0] for r in ret: self.queue.put(r) d_print(('r',),locals()) if not self.queue.empty(): return self.queue.get() else: return "TimeOut" Because of I am not very familiar with python, especially the socket part, and also new in Jython use, I really need your help or only advice or explanation. Thank you very much!

    Read the article

  • Why does Python Array Module Process Strings and Lists Differently?

    - by Casey
    I'm having trouble understanding the result of the following statements: >>> from array import array >>> array('L',[0xff,0xff,0xff,0xff]) array('L', [255L, 255L, 255L, 255L]) >>> from array import array >>> array('L','\xff\xff\xff\xff') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: string length not a multiple of item size

    Read the article

  • problem with python script

    - by hidayat
    I want to run a csh file from a python scrip, example, #!/usr/bin/python import os os.system("source path/to/file.csh") and I want this file to run in the same shell as I am running the python script, because the file.csh script is settings some environment variables that I need. Does anyone know how to do this in Python?

    Read the article

  • Python-How to call up a function from another module?

    - by user2691540
    I am making a project where I need several modules which are imported into one main module to make a pizza ordering service. Upon finishing this (to a working standard) I decided to re-code it so that the whole order is completed with one tkinter window, now instead of using input(), I use tkinter.Entry() etc. etc. To do this I had to use functions for each step so it is broken up nicely, e.g. the code asks if the user wants pickup or delivery, the user clicks a button which sets some variables and one of the buttons is then configured to say "continue" and the command leads to the next step of the pizza ordering process e.g. getting the name. The problem I have is that when I get past the last function of the first module the configured button has a command to go to a function in the second module, but it says that the command is not defined??? I have tried my way around this but cannot import the configured button variable into the next module, and anything else I tried gave no result, it simply doesn't go to the next module after the first module is done. I have made the main tkinter window in the main module and have it that it will mainloop after importing the other modules so shouldn't the function I want to call upon be defined? How can I get from one function to the next if the latter is in a seperate module? Is this possible or do I have to rethink my approach and if so how? Ok then, have made some more code to show what my problem is, this isn't what I am actually using but it's a lot shorter and has the same issue: this is the main module: import tkinter mainwindow = tkinter.Tk() # here i set the window to a certain size etc. import mod1 import mod2 mainwindow.mainloop() this is mod1: import tkinter def button1(): label.destroy() button1.destroy() button2.config(text = "continue", command = func2) def button2(): label.destroy() button1.destroy() button2.config(text = "continue", command = func2) label = tkinter.Label(text = "example label") button1 = tkinter.Button(text = "button1", command = button1) button2 = tkinter.Button(text = "button2", command = button2) label.pack() button1.pack() button2.pack() this is mod2: def func2(): button2.destroy() print ("haha it works...") I still get the problem that func2 is not defined? Thanks in advance

    Read the article

  • Problem installing MySQLdb on windows - Can't find python

    - by aldux
    I'm trying to install the module mySQLdb on a windows vista 64 (amd) machine. I've installed python on a different folder other than suggested by Python installer. When I try to install the .exe mySQLdb installer, it can't find python 2.5 and it halts the installation. Is there anyway to supply the installer with the correct python location (even thou the registry and path are right)?

    Read the article

  • Why does Python array moduel handle strings and lists differently?

    - by Casey
    I'm having trouble understanding the result of the following statements: >>> from array import array >>> array('L',[0xff,0xff,0xff,0xff]) array('L', [255L, 255L, 255L, 255L]) >>> from array import array >>> array('L','\xff\xff\xff\xff') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: string length not a multiple of item size

    Read the article

  • call a server side python script from javascript

    - by alex
    how to call a server side python script from javascript. if test.py is the python script file in server, and if the parameter to be passed to python is another url , then how this can be executed from javascript, how the return string from python script is obtained to the javascript.

    Read the article

  • Mutable global variables don't get hide in python functions, right?

    - by aXqd
    Please see the following code: def good(): foo[0] = 9 # why this foo isn't local variable who hides the global one def bad(): foo = [9, 2, 3] # foo is local, who hides the global one for func in [good, bad]: foo = [1,2,3] print('Before "{}": {}'.format(func.__name__, foo)) func() print('After "{}": {}'.format(func.__name__, foo)) The result is as below: # python3 foo.py Before "good": [1, 2, 3] After "good": [9, 2, 3] Before "bad" : [1, 2, 3] After "bad" : [1, 2, 3]

    Read the article

  • python: html writer?

    - by Bin Chen
    With jquery it's very easy to insert some element inside another element using the selector technology, I am wondering if there is any python library that can do things similar with jquery, the reason is I want server side python program to produce the static pages, which needs to parse the html and insert something into it. Or other alternative, not in python language at all? EDIT: To be clear, I want to use python to write below program: h = html.parse('temp.html') h.find('#idnum').html('<b>my html generated</b>') h.close()

    Read the article

  • String replacement on a whole text file in Python 3.x?

    - by SkippyFire
    How can I replace a string with another string, within a given text file. Do I just loop through readline() and run the replacement while saving out to a new file? Or is there a better way? I'm thinking that I could read the whole thing into memory, but I'm looking for a more elegant solution... Thanks in advance

    Read the article

  • Where is python Language used

    - by Mirage
    I am a web developer and usually use php/JS/mysql. I have heard lot about python. I have no idea where is python used and why it is used. Just like php/asp/cold fusion/.net/ are used to build websites C, C++ , Java are used to build software or desktop apps Where does python stands from those langages WHich is the thing whic can be done by python but not with those common languages

    Read the article

  • Carrot (Python) [errno 10054] An existing connection was forcibly closed by the remote host

    - by Meditation
    Hi all, We are using Carrot in our Python project. I wrote a Python script acting as the consumer of the message queue. I invoked this Python script using command line shell in Windows 7 as python consumer.py However, after a while, the running session was aborted and the error is: [errno 10054] An existing connection was forcibly closed by the remote host The producer session is still running fine on the Linux server. Just wondering how can I fix this and have a long running consumer session on Windows Thanks in advance.

    Read the article

  • Is there a way to accept '%' as part of input that works both in python 2.6 & 3.0?

    - by bug11
    In 2.6, if I needed to accept input that allowed a percent sign (such as "foo % bar"), I used raw_input() which worked as expected. In 3.0, input() accomplishes that same (with raw_input() having left the building). As an exercise, I'm hoping that I can have a backward-compatible version that will work with both 2.6 and 3.0. When I use input() in 2.6 and enter "foo % bar", the following error is returned: File "<string>", line 1, in <module> NameError: name "foo" is not defined ...which is expected. Anyway to to accomplish acceptance of input containing a percent sign that works in both 2.6 and 3.0? Thx.

    Read the article

  • Python bindings for a vala library

    - by celil
    I am trying to create python bindings to a vala library using the following IBM tutorial as a reference. My initial directory has the following two files: test.vala using GLib; namespace Test { public class Test : Object { public int sum(int x, int y) { return x + y; } } } test.override %% headers #include <Python.h> #include "pygobject.h" #include "test.h" %% modulename test %% import gobject.GObject as PyGObject_Type %% ignore-glob *_get_type %% and try to build the python module source test_wrap.c using the following code build.sh #/usr/bin/env bash valac test.vala -CH test.h python /usr/share/pygobject/2.0/codegen/h2def.py test.h > test.defs pygobject-codegen-2.0 -o test.override -p test test.defs > test_wrap.c However, the last command fails with an error $ ./build.sh Traceback (most recent call last): File "/usr/share/pygobject/2.0/codegen/codegen.py", line 1720, in <module> sys.exit(main(sys.argv)) File "/usr/share/pygobject/2.0/codegen/codegen.py", line 1672, in main o = override.Overrides(arg) File "/usr/share/pygobject/2.0/codegen/override.py", line 52, in __init__ self.handle_file(filename) File "/usr/share/pygobject/2.0/codegen/override.py", line 84, in handle_file self.__parse_override(buf, startline, filename) File "/usr/share/pygobject/2.0/codegen/override.py", line 96, in __parse_override command = words[0] IndexError: list index out of range Is this a bug in pygobject, or is something wrong with my setup? What is the best way to call code written in vala from python? EDIT: Removing the extra line fixed the current problem, but now as I proceed to build the python module, I am facing another problem. Adding the following C file to the existing two in the directory: test_module.c #include <Python.h> void test_register_classes (PyObject *d); extern PyMethodDef test_functions[]; DL_EXPORT(void) inittest(void) { PyObject *m, *d; init_pygobject(); m = Py_InitModule("test", test_functions); d = PyModule_GetDict(m); test_register_classes(d); if (PyErr_Occurred ()) { Py_FatalError ("can't initialise module test"); } } and building with the following script build.sh #/usr/bin/env bash valac test.vala -CH test.h python /usr/share/pygobject/2.0/codegen/h2def.py test.h > test.defs pygobject-codegen-2.0 -o test.override -p test test.defs > test_wrap.c CFLAGS="`pkg-config --cflags pygobject-2.0` -I/usr/include/python2.6/ -I." LDFLAGS="`pkg-config --libs pygobject-2.0`" gcc $CFLAGS -fPIC -c test.c gcc $CFLAGS -fPIC -c test_wrap.c gcc $CFLAGS -fPIC -c test_module.c gcc $LDFLAGS -shared test.o test_wrap.o test_module.o -o test.so python -c 'import test; exit()' results in an error: $ ./build.sh ***INFO*** The coverage of global functions is 100.00% (1/1) ***INFO*** The coverage of methods is 100.00% (1/1) ***INFO*** There are no declared virtual proxies. ***INFO*** There are no declared virtual accessors. ***INFO*** There are no declared interface proxies. Traceback (most recent call last): File "<string>", line 1, in <module> ImportError: ./test.so: undefined symbol: init_pygobject Where is the init_pygobject symbol defined? What have I missed linking to?

    Read the article

  • SQLite, python, unicode, and non-utf data

    - by Nathan Spears
    I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur Rós' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

    Read the article

  • SQLite, python, unicode, and non-utf data

    - by Nathan Spears
    I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur Rós' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

    Read the article

  • What's the best way to create a static utility class in python? Is using metaclasses code smell?

    - by rsimp
    Ok so I need to create a bunch of utility classes in python. Normally I would just use a simple module for this but I need to be able to inherit in order to share common code between them. The common code needs to reference the state of the module using it so simple imports wouldn't work well. I don't like singletons, and classes that use the classmethod decorator do not have proper support for python properties. One pattern I see used a lot is creating an internal python class prefixed with an underscore and creating a single instance which is then explicitly imported or set as the module itself. This is also used by fabric to create a common environment object (fabric.api.env). I've realized another way to accomplish this would be with metaclasses. For example: #util.py class MetaFooBase(type): @property def file_path(cls): raise NotImplementedError def inherited_method(cls): print cls.file_path #foo.py from util import * import env class MetaFoo(MetaFooBase): @property def file_path(cls): return env.base_path + "relative/path" def another_class_method(cls): pass class Foo(object): __metaclass__ = MetaFoo #client.py from foo import Foo file_path = Foo.file_path I like this approach better than the first pattern for a few reasons: First, instantiating Foo would be meaningless as it has no attributes or methods, which insures this class acts like a true single interface utility, unlike the first pattern which relies on the underscore convention to dissuade client code from creating more instances of the internal class. Second, sub-classing MetaFoo in a different module wouldn't be as awkward because I wouldn't be importing a class with an underscore which is inherently going against its private naming convention. Third, this seems to be the closest approximation to a static class that exists in python, as all the meta code applies only to the class and not to its instances. This is shown by the common convention of using cls instead of self in the class methods. As well, the base class inherits from type instead of object which would prevent users from trying to use it as a base for other non-static classes. It's implementation as a static class is also apparent when using it by the naming convention Foo, as opposed to foo, which denotes a static class method is being used. As much as I think this is a good fit, I feel that others might feel its not pythonic because its not a sanctioned use for metaclasses which should be avoided 99% of the time. I also find most python devs tend to shy away from metaclasses which might affect code reuse/maintainability. Is this code considered code smell in the python community? I ask because I'm creating a pypi package, and would like to do everything I can to increase adoption.

    Read the article

< Previous Page | 23 24 25 26 27 28 29 30 31 32 33 34  | Next Page >