Search Results

Search found 15629 results on 626 pages for 'mod python'.

Page 82/626 | < Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89 | Next Page >

Converting python collaborative filtering code to use Map Reduce

- by Neil Kodner

Using Python, I'm computing cosine similarity across items. given event data that represents a purchase (user,item), I have a list of all items 'bought' by my users. Given this input data (user,item) X,1 X,2 Y,1 Y,2 Z,2 Z,3 I build a python dictionary {1: ['X','Y'], 2 : ['X','Y','Z'], 3 : ['Z']} From that dictionary, I generate a bought/not bought matrix, also another dictionary(bnb). {1 : [1,1,0], 2 : [1,1,1], 3 : [0,0,1]} From there, I'm computing similarity between (1,2) by calculating cosine between (1,1,0) and (1,1,1), yielding 0.816496 I'm doing this by: items=[1,2,3] for item in items: for sub in items: if sub >= item: #as to not calculate similarity on the inverse sim = coSim( bnb[item], bnb[sub] ) I think the brute force approach is killing me and it only runs slower as the data gets larger. Using my trusty laptop, this calculation runs for hours when dealing with 8500 users and 3500 items. I'm trying to compute similarity for all items in my dict and it's taking longer than I'd like it to. I think this is a good candidate for MapReduce but I'm having trouble 'thinking' in terms of key/value pairs. Alternatively, is the issue with my approach and not necessarily a candidate for Map Reduce?

Read the article
Replicating SQL's 'Join' in Python

- by Daniel Mathews

I'm in the process of trying to switch from R to Python (mainly issues around general flexibility). With Numpy, matplotlib and ipython, I've am able to cover all my use cases save for merging 'datasets'. I would like to simulate SQL's join by clause (inner, outer, full) purely in python. R handles this with the 'merge' function. I've tried the numpy.lib.recfunctions join_by, but it critical issues with duplicates along the 'key': join_by(key, r1, r2, jointype='inner', r1postfix='1', r2postfix='2', defaults=None, usemask=True, asrecarray=False) Join arrays r1 and r2 on key key. The key should be either a string or a sequence of string corresponding to the fields used to join the array. An exception is raised if the key field cannot be found in the two input arrays. Neither r1 nor r2 should have any duplicates along key: the presence of duplicates will make the output quite unreliable. Note that duplicates are not looked for by the algorithm. source: http://presbrey.mit.edu:1234/numpy.lib.recfunctions.html Any pointers or help will be most appreciated!

Read the article
Decoding tcp packets using python

- by mikip

Hello I am trying to decode data received over a tcp connection. The packets are small, no more than 100 bytes. However when there is a lot of them I receive some of the the packets joined together. Is there a way to prevent this. I am using python I have tried to separate the packets, my source is below. The packets start with STX byte and end with ETX bytes, the byte following the STX is the packet length, (packet lengths less than 5 are invalid) the checksum is the last bytes before the ETX def decode(data): while True: start = data.find(STX) if start == -1: #no stx in message pkt = '' data = '' break #stx found , next byte is the length pktlen = ord(data[1]) #check message ends in ETX (pktken -1) or checksum invalid if pktlen < 5 or data[pktlen-1] != ETX or checksum_valid(data[start:pktlen]) == False: print "Invalid Pkt" data = data[start+1:] continue else: pkt = data[start:pktlen] data = data[pktlen:] break return data , pkt I use it like this #process reports try: data = sock.recv(256) except: continue else: while data: data, pkt = decode(data) if pkt: process(pkt) Also if there are multiple packets in the data stream, is it best to return the packets as a collection of lists or just return the first packet I am not that familiar with python, only C, is this method OK. Any advice would be most appreciated. Thanks in advance Thanks

Read the article
Reading UTF-8 XML and writing it to a file with Python

- by Harri

I'm trying to parse UTF-8 XML file and save some parts of it to another file. Problem is, that this is my first Python script ever and I'm totally confused about the character encoding problems I'm finding. My script fails immediately when it tries to write non-ascii character to a file, but it can print it to command prompt (at least in some level) Here's the XML (from the parts that matter at least, it's a *.resx file which contains UI strings) <?xml version="1.0" encoding="utf-8"?> <root> <resheader name="foo"> <value>bar</value> </resheader> <data name="lorem" xml:space="preserve"> <value>ipsum öä</value> </data> </root> And here's my python script from xml.dom.minidom import parse names = [] values = [] def getStrings(path): dom = parse(path) data = dom.getElementsByTagName("data") for i in range(len(data)): name = data[i].getAttribute("name") names.append(name) value = data[i].getElementsByTagName("value") values.append(value[0].firstChild.nodeValue.encode("utf-8")) def writeToFile(): with open("uiStrings-fi.py", "w") as f: for i in range(len(names)): line = names[i] + '="'+ values[i] + '"' #varName='varValue' f.write(line) f.write("\n") getStrings("ResourceFile.fi-FI.resx") writeToFile() And here's the traceback: Traceback (most recent call last): File "GenerateLanguageFiles.py", line 24, in writeToFile() File "GenerateLanguageFiles.py", line 19, in writeToFile line = names[i] + '="'+ values[i] + '"' #varName='varValue' UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in ran ge(128) How should I fix my script so it would read and write UTF-8 characters properly? The files I'm trying to generate would be used in test automation with Robots Framework.

Read the article
Python: puzzling behaviour inside httplib

- by Anna

I have added one line ( import pdb; pdb.set_trace() ) to httplib's HTTPConnection.putheader, so I can see what's going on inside. httplib.py, line 489: def putheader(self, header, value): """Send a request header line to the server. For example: h.putheader('Accept', 'text/html') """ import pdb; pdb.set_trace() if self.__state != _CS_REQ_STARTED: raise CannotSendHeader() str = '%s: %s' % (header, value) self._output(str) then ran this from the interpreter import urllib2 urllib2.urlopen('http://www.ioerror.us/ip/headers') ... and as expected PDB kicks in: > c:\python26\lib\httplib.py(858)putheader() -> if self.__state != _CS_REQ_STARTED: (Pdb) in PDB I have the luxury of evaluating expressions on the fly, so I have tried to enter self.__state: (Pdb) self.__state *** AttributeError: HTTPConnection instance has no attribute '__state' Alas, there is no __state of this instance. However when I enter step, the debugger gets past the if self.__state != _CS_REQ_STARTED: line without a problem. Why is this happening? If the self.__state doesn't exist python would have to raise an exception as it did when I entered the expression. Python version: 2.6.4 on win32

Read the article
Base class deleted before subclass during python __del__ processing

- by Oddthinking

Context I am aware that if I ask a question about Python destructors, the standard argument will be to use contexts instead. Let me start by explaining why I am not doing that. I am writing a subclass to logging.Handler. When an instance is closed, it posts a sentinel value to a Queue.Queue. If it doesn't, a second thread will be left running forever, waiting for Queue.Queue.get() to complete. I am writing this with other developers in mind, so I don't want a failure to call close() on a handler object to cause the program to hang. Therefore, I am adding a check in __del__() to ensure the object was closed properly. I understand circular references may cause it to fail in some circumstances. There's not a lot I can do about that. Problem Here is some simple example code: explicit_delete = True class Base: def __del__(self): print "Base class cleaning up." class Sub(Base): def __del__(self): print "Sub-class cleaning up." Base.__del__(self) x = Sub() if explicit_delete: del x print "End of thread" When I run this I get, as expected: Sub-class cleaning up. Base class cleaning up. End of thread If I set explicit_delete to False in the first line, I get: End of thread Sub-class cleaning up. Exception AttributeError: "'NoneType' object has no attribute '__del__'" in <bound method Sub.__del__ of <__main__.Sub instance at 0x00F0B698>> ignored It seems the definition of Base is removed before the x._del_() is called. The Python Documentation on _del_() warns that the subclass needs to call the base-class to get a clean deletion, but here that appears to be impossible. Can you see where I made a bad step?

Read the article
Python template engine

- by jturo

Hello there, Could it be possible if somebody could help me get started in writing a python template engine? I'm new to python and as I learn the language I've managed to write a little MVC framework running in its own light-weight-WSGI-like server. I've managed to write a script that finds and replaces keys for values: (Obviously this is not how my script is structured or implemented. this is just an example) from string import Template html = '<html>\n' html += ' <head>\n' html += ' <title>This is so Cool : In Controller HTML</title>\n' html += ' </head>\n' html += ' <body>\n' html += ' Home | <a href="/hi">Hi ${name}</a>\n' html += ' </body>\n' html += '<html>' Template(html).safe_substitute(dict(name = 'Arturo')) My next goal is to implement custom statements, modifiers, functions, etc (like 'for' loop) but i don't really know if i should use another module that i don't know about. I thought of regular expressions but i kind feel like that wouldn't be an efficient way of doing it Any help is appreciated, and i'm sure this will be of use to other people too. Thank you

Read the article
S3 Backup Memory Usage in Python

- by danpalmer

I currently use WebFaction for my hosting with the basic package that gives us 80MB of RAM. This is more than adequate for our needs at the moment, apart from our backups. We do our own backups to S3 once a day. The backup process is this: dump the database, tar.gz all the files into one backup named with the correct date of the backup, upload to S3 using the python library provided by Amazon. Unfortunately, it appears (although I don't know this for certain) that either my code for reading the file or the S3 code is loading the entire file in to memory. As the file is approximately 320MB (for today's backup) it is using about 320MB just for the backup. This causes WebFaction to quit all our processes meaning the backup doesn't happen and our site goes down. So this is the question: Is there any way to not load the whole file in to memory, or are there any other python S3 libraries that are much better with RAM usage. Ideally it needs to be about 60MB at the most! If this can't be done, how can I split the file and upload separate parts? Thanks for your help. This is the section of code (in my backup script) that caused the processes to be quit: filedata = open(filename, 'rb').read() content_type = mimetypes.guess_type(filename)[0] if not content_type: content_type = 'text/plain' print 'Uploading to S3...' response = connection.put(BUCKET_NAME, 'daily/%s' % filename, S3.S3Object(filedata), {'x-amz-acl': 'public-read', 'Content-Type': content_type})

Read the article
python 'with' statement

- by Stephen

Hi, I'm using Python 2.5. I'm trying to use this 'with' statement. from __future__ import with_statement a = [] with open('exampletxt.txt','r') as f: while True: a.append(f.next().strip().split()) print a The contents of 'exampletxt.txt' are simple: a b In this case, I get the error: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/tmp/python-7036sVf.py", line 5, in <module> a.append(f.next().strip().split()) StopIteration And if I replace f.next() with f.read(), it seems to be caught in an infinite loop. I wonder if I have to write a decorator class that accepts the iterator object as an argument, and define an __exit__ method for it? I know it's more pythonic to use a for-loop for iterators, but I wanted to implement a while loop within a generator that's called by a for-loop... something like def g(f): while True: x = f.next() if test(x): a = x elif test(x): b = f.next() yield [a,x,b] a = [] with open(filename) as f: for x in g(f): a.append(x)

Read the article
deepcopy and python - tips to avoid using it?

- by blackkettle

Hi, I have a very simple python routine that involves cycling through a list of roughly 20,000 latitude,longitude coordinates and calculating the distance of each point to a reference point. def compute_nearest_points( lat, lon, nPoints=5 ): """Find the nearest N points, given the input coordinates.""" points = session.query(PointIndex).all() oldNearest = [] newNearest = [] for n in xrange(nPoints): oldNearest.append(PointDistance(None,None,None,99999.0,99999.0)) newNearest.append(obj2) #This is almost certainly an inappropriate use of deepcopy # but how SHOULD I be doing this?!?! for point in points: distance = compute_spherical_law_of_cosines( lat, lon, point.avg_lat, point.avg_lon ) k = 0 for p in oldNearest: if distance < p.distance: newNearest[k] = PointDistance( point.point, point.kana, point.english, point.avg_lat, point.avg_lon, distance=distance ) break else: newNearest[k] = deepcopy(oldNearest[k]) k += 1 for j in range(k,nPoints-1): newNearest[j+1] = deepcopy(oldNearest[j]) oldNearest = deepcopy(newNearest) #We're done, now print the result for point in oldNearest: print point.station, point.english, point.distance return I initially wrote this in C, using the exact same approach, and it works fine there, and is basically instantaneous for nPoints<=100. So I decided to port it to python because I wanted to use SqlAlchemy to do some other stuff. I first ported it without the deepcopy statements that now pepper the method, and this caused the results to be 'odd', or partially incorrect, because some of the points were just getting copied as references(I guess? I think?) -- but it was still pretty nearly as fast as the C version. Now with the deepcopy calls added, the routine does it's job correctly, but it has incurred an extreme performance penalty, and now takes several seconds to do the same job. This seems like a pretty common job, but I'm clearly not doing it the pythonic way. How should I be doing this so that I still get the correct results but don't have to include deepcopy everywhere?

Read the article
Speed vs security vs compatibility over methods to do string concatenation in Python

- by Cawas

Similar questions have been brought (good speed comparison there) on this same subject. Hopefully this question is different and updated to Python 2.6 and 3.0. So far I believe the faster and most compatible method (among different Python versions) is the plain simple + sign: text = "whatever" + " you " + SAY But I keep hearing and reading it's not secure and / or advisable. I'm not even sure how many methods are there to manipulate strings! I could count only about 4: There's interpolation and all its sub-options such as % and format and then there's the simple ones, join and +. Finally, the new approach to string formatting, which is with format, is certainly not good for backwards compatibility at same time making % not good for forward compatibility. But should it be used for every string manipulation, including every concatenation, whenever we restrict ourselves to 3.x only? Well, maybe this is more of a wiki than a question, but I do wish to have an answer on which is the proper usage of each string manipulation method. And which one could be generally used with each focus in mind (best all around for compatibility, for speed and for security). Thanks.

Read the article
[Python] name 'OptionGroup' is not defined

- by Cawas

Ok, so I made this rookie mistake below, but in my defense I was led to it thanks to how the help about this subject is on python docs, which states how to use optparse. It is actually an error under the gigantic tutorial section. In contrast and to my offense, I may be one of the very few stupid people who can't read very well and pay close attention on what I do. But since this took me so long to discover, I wanted to "document" it here: Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) >>> from optparse import OptionParser >>> outputGroup = OptionGroup(parser, 'Output handling') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'OptionGroup' is not defined This is strictly done with examples found on the docs, and you can't find anything about it anywhere, be it that long long docs page, google or stackoverflow. Plus, reading optparse.py shows OptionGroup is there, so that adds to the confusion. I bet it will take less than 1 minute for someone to spot my error. For that I'll only add proper tags and / or modify the title later on. :)

Read the article
Python OOP and lists

- by Mikk

Hi, I'm new to Python and it's OOP stuff and can't get it to work. Here's my code: class Tree: root = None; data = []; def __init__(self, equation): self.root = equation; def appendLeft(self, data): self.data.insert(0, data); def appendRight(self, data): self.data.append(data); def calculateLeft(self): result = []; for item in (self.getLeft()): if (type(item) == type(self)): data = item.calculateLeft(); else: data = item; result.append(item); return result; def getLeft(self): return self.data; def getRight(self): data = self.data; data.reverse(); return data; tree2 = Tree("*"); tree2.appendRight(44); tree2.appendLeft(20); tree = Tree("+"); tree.appendRight(4); tree.appendLeft(10); tree.appendLeft(tree2); print(tree.calculateLeft()); It looks like tree2 and tree are sharing list "data"? At the moment I'd like it to output something like [[20,44], 10, 4], but when I tree.appendLeft(tree2) I get RuntimeError: maximum recursion depth exceeded, and when i even won't appendLeft(tree2) it outputs [10, 20, 44, 4] (!!!). What am I missing here? I'm using Portable Python 3.0.1. Thank you

Read the article
How to write a custom solution using a python package, modules etc

- by morpheous

I am writing a packacge foobar which consists of the modules alice, bob, charles and david. From my understanding of Python packages and modules, this means I will create a folder foobar, with the following subdirectories and files (please correct if I am wrong) foobar/ __init__.py alice/alice.py bob/bob.py charles/charles.py david/david.py The package should be executable, so that in addition to making the modules alice, bob etc available as 'libraries', I should also be able to use foobar in a script like this: python foobar --args=someargs Question1: Can a package be made executable and used in a script like I described above? Question 2 The various modules will use code that I want to refactor into a common library. Does that mean creating a new sub directory 'foobar/common' and placing common.py in that folder? Question 3 How will the modules foo import the common module ? Is it 'from foobar import common' or can I not use this since these modules are part of the package? Question 4 I want to add logic for when the foobar package is being used in a script (assuming this can be done - I have only seen it done for modules) The code used is something like: if __name__ == "__main__": dosomething() where (in which file) would I put this logic ?

Read the article
python histogram one-liner

- by mykhal

there are many ways, how to code histogram in Python. by histogram, i mean function, counting objects in an interable, resulting in the count table (i.e. dict). e.g.: >>> L = 'abracadabra' >>> histogram(L) {'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2} it can be written like this: def histogram(L): d = {} for x in L: if x in d: d[x] += 1 else: d[x] = 1 return d ..however, there are much less ways, how do this in a single expression. if we had "dict comprehensions" in python, we would write: >>> { x: L.count(x) for x in set(L) } but we don't have them, so we have to write: >>> dict([(x, L.count(x)) for x in set(L)]) however, this approach may yet be readable, but is not efficient - L is walked-through multiple times, so this won't work for single-life generators.. the function should iterate well also through gen(), where: def gen(): for x in L: yield x we can go with reduce (R.I.P.): >>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong! oops, does not work, the key name is 'x', not x :( i ended with: >>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {}) (in py3k, we would have to write list(d.items()) instead of d.items(), but it's hypothethical, since there is no reduce there) please beat me with a better one-liner, more readable! ;)

Read the article
Python DictReader - Skipping rows with missing columns?

- by victorhooi

heya, I have a Excel .CSV file I'm attempting to read in with DictReader. All seems to be well, except it seems to omit rows, specifically those with missing columns. Our input looks like: mail,givenName,sn,lorem,ipsum,dolor,telephoneNumber [email protected],ian,bay,3424,8403,2535,+65(2)34523534545 [email protected],mike,gibson,3424,8403,2535,+65(2)34523534545 [email protected],ross,martin,,,,+65(2)34523534545 [email protected],david,connor,,,,+65(2)34523534545 [email protected],chris,call,3424,8403,2535,+65(2)34523534545 So some of the rows have missing lorem/ipsum/dolor columns, and it's just a string of commas for those. We're reading it in with: def read_gd_dump(input_file="blah 20100423.csv"): gd_extract = csv.DictReader(open('blah 20100423.csv'), restval='missing', dialect='excel') return dict([(row['something'], row) for row in gd_extract]) And I checked that "something" (the key for our dict) isn't one of the missing columns, I had originally suspected it might be that. It's one of the columns after that. However, DictReader seems to completely skip over the rows. I tried setting restval to something, didn't seem to make any difference. I can't seem to find anything in Python's CSV docs (http://docs.python.org/library/csv.html) that would explain this behaviour, but I may have misread something. Any ideas? Thanks, Victor

Read the article
spam and dirty words comment post filtering in python (django)

- by sintaloo

Hi All, My basic question is how to filter spam and dirty words in a comment post system under python (django). I have a collection of phrases (approximately 3000 phrases) to be filtered. Question (1), are there any existing open source python (or django) package/module/plugin which can handle this job? I knew there was one called Akismet. But from what I understood, it will not solve my problem. Akismet is just a web service and filter the words dictionary defined by Akismet. But I have my own collection of words. Please correct me if I am wrong. Question (2), If there is no such open source package I can use, how to create my own one? The only thing I can think of it's to use regular expression and join all the word phrases with 'or' in a regular expression. but I have 3000 phrases, I think it won't work in term of performance and filter every comment post. any suggestions where should I start from? Thank you very much for your help and time.

Read the article
use proxy in python to fetch a webpage

- by carmao

I am trying to write a function in Python to use a public anonymous proxy and fetch a webpage, but I got a rather strange error. The code (I have Python 2.4): import urllib2 def get_source_html_proxy(url, pip, timeout): # timeout in seconds (maximum number of seconds willing for the code to wait in # case there is a proxy that is not working, then it gives up) proxy_handler = urllib2.ProxyHandler({'http': pip}) opener = urllib2.build_opener(proxy_handler) opener.addheaders = [('User-agent', 'Mozilla/5.0')] urllib2.install_opener(opener) req=urllib2.Request(url) sock=urllib2.urlopen(req) timp=0 # a counter that is going to measure the time until the result (webpage) is # returned while 1: data = sock.read(1024) timp=timp+1 if len(data) < 1024: break timpLimita=50000000 * timeout if timp==timpLimita: # 5 millions is about 1 second break if timp==timpLimita: print IPul + ": Connection is working, but the webpage is fetched in more than 50 seconds. This proxy returns the following IP: " + str(data) return str(data) else: print "This proxy " + IPul + "= good proxy. " + "It returns the following IP: " + str(data) return str(data) # Now, I call the function to test it for one single proxy (IP:port) that does not support user and password (a public high anonymity proxy) #(I put a proxy that I know is working - slow, but is working) rez=get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "93.84.221.248:3128", 50) print rez The error: Traceback (most recent call last): File "./public_html/cgi-bin/teste5.py", line 43, in ? rez=get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "93.84.221.248:3128", 50) File "./public_html/cgi-bin/teste5.py", line 18, in get_source_html_proxy sock=urllib2.urlopen(req) File "/usr/lib64/python2.4/urllib2.py", line 130, in urlopen return _opener.open(url, data) File "/usr/lib64/python2.4/urllib2.py", line 358, in open response = self._open(req, data) File "/usr/lib64/python2.4/urllib2.py", line 376, in _open '_open', req) File "/usr/lib64/python2.4/urllib2.py", line 337, in _call_chain result = func(*args) File "/usr/lib64/python2.4/urllib2.py", line 573, in lambda r, proxy=url, type=type, meth=self.proxy_open: \ File "/usr/lib64/python2.4/urllib2.py", line 580, in proxy_open if '@' in host: TypeError: iterable argument required I do not know why the character "@" is an issue (I have no such in my code. Should I have?) Thanks in advance for your valuable help.

Read the article
Ignore case in Python strings

- by Paul Oyster

What is the easiest way to compare strings in Python, ignoring case? Of course one can do (str1.lower() <= str2.lower()), etc., but this created two additional temporary strings (with the obvious alloc/g-c overheads). I guess I'm looking for an equivalent to C's stricmp(). [Some more context requested, so I'll demonstrate with a trivial example:] Suppose you want to sort a looong list of strings. You simply do theList.sort(). This is O(n * log(n)) string comparisons and no memory management (since all strings and list elements are some sort of smart pointers). You are happy. Now, you want to do the same, but ignore the case (let's simplify and say all strings are ascii, so locale issues can be ignored). You can do theList.sort(key=lambda s: s.lower()), but then you cause two new allocations per comparison, plus burden the garbage-collector with the duplicated (lowered) strings. Each such memory-management noise is orders-of-magnitude slower than simple string comparison. Now, with an in-place stricmp()-like function, you do: theList.sort(cmp=stricmp) and it is as fast and as memory-friendly as theList.sort(). You are happy again. The problem is any Python-based case-insensitive comparison involves implicit string duplications, so I was expecting to find a C-based comparisons (maybe in module string). Could not find anything like that, hence the question here. (Hope this clarifies the question).

Read the article
Speedup writing C programs using a subset of the Python syntax

- by psihodelia

I am constantly trying to optimize my time. Writing a C code takes a lot of time and requires much more keyboard touches than say writing a Python program. However, in order to speed up the time required to create a C program, one can automatize many things. I'd like to write my programs using smth. like Python but with C semantics. It means, all keywords are C keywords, but syntax is optimized. For example, this C code: #include "dsplib.h" #include "coeffs.h" #define MODULENAME "dsplib" #define NUM_SAMPLES 320 typedef float t_Vec; typedef struct s_Inter { char *pc_Name; struct s_Inter *px_Next; }t_Inter; typedef struct s_DspLibControl { t_Vec f_Y; }t_DspLibControl; void v_DspLibName(void) { printf("Module: %s", MODULENAME); printf("\n"); } int v_DspLibInitInterControl(t_DspLibControl *px_Con) { int y; px_Con->f_Y = 0.0; for(int i=0;i<10;i++) { y += i * i; } return y; } in optimized pythonized version can look like: include dsplib, coeffs define MODULENAME="dsplib", NUM_SAMPLES=320 typedef float t_Vec typedef struct s_Inter: char *pc_Name struct s_Inter *px_Next t_Inter typedef struct s_DspLibControl: t_Vec f_Y t_DspLibControl v_DspLibName(): printf("Module: %s", MODULENAME); printf("\n") int v_DspLibInitInterControl(t_DspLibControl *px_Con): int y px_Con->f_Y = 0.0 for int i=0;i<10;i++: y += i * i return y My question is: Do you know any VIM script, which allows to translate an original pythonized C code into a standard C code? For example, one is writing a C code but uses pythonized syntax, once she decides to translate pythonized blocks into standard C, she selects such blocks and press some key. And she doesn't save such pythonized code of course, VIM translates it into standard C.

Read the article
Magic Methods in Python

- by dArignac

Howdy, I'm kind of new to Python and I wonder if there is a way to create something like the magic methods in PHP (http://www.php.net/manual/en/language.oop5.overloading.php#language.oop5.overloading.methods) My aim is to ease the access of child classes in my model. I basically have a parent class that has n child classes. These classes have three values, a language key, a translation key and a translation value. The are describing a kind of generic translation handling. The parent class can have translations for different translation key each in different languages. E.g. the key "title" can be translated into german and english and the key "description" too (and so far and so on) I don't want to get the child classes and filter by the set values (at least I want but not explicitly, the concrete implementation behind the magic method would do this). I want to call parent_class.title['de'] # or also possible maybe parent_class.title('de') for getting the translation of title in german (de). So there has to be a magic method that takes the name of the called method and their params (as in PHP). As far as I dug into Python this is only possible with simple attributes (_getattr_, _setattr_) or with setting/getting directly within the class (_getitem_, _setitem_) which both do not fit my needs. Maybe there is a solution for this? Please help! Thanks in advance!

Read the article
Python line file iteration and strange characters

- by muckabout

I have a huge gzipped text file which I need to read, line by line. I go with the following: for i, line in enumerate(codecs.getreader('utf-8')(gzip.open('file.gz'))): print i, line At some point late in the file, the python output diverges from the file. This is because lines are getting broken due to weird special characters that python thinks are newlines. When I open the file in 'vim', they are correct, but the suspect characters are formatted weirdly. Is there something I can do to fix this? I've tried other codecs including utf-16, latin-1. I've also tried with no codec. I looked at the file using 'od'. Sure enough, there are \n characters where they shouldn't be. But, the "wrong" ones are prepended by a weird character. I think there's some encoding here with some characters being 2-bytes, but the trailing byte being a \n if not viewed properly. If I replace: gzip.open('file.gz') With: os.popen('zcat file.gz') It works fine (and actually, quite faster). But, I'd like to know where I'm going wrong.

Read the article
Managing logs/warnings in Python extensions

- by Dimitri Tcaciuc

TL;DR version: What do you use for configurable (and preferably captured) logging inside your C++ bits in a Python project? Details follow. Say you have a a few compiled .so modules that may need to do some error checking and warn user of (partially) incorrect data. Currently I'm having a pretty simplistic setup where I'm using logging framework from Python code and log4cxx library from C/C++. log4cxx log level is defined in a file (log4cxx.properties) and is currently fixed and I'm thinking how to make it more flexible. Couple of choices that I see: One way to control it would be to have a module-wide configuration call. # foo/__init__.py import sys from _foo import import bar, baz, configure_log configure_log(sys.stdout, WARNING) # tests/test_foo.py def test_foo(): # Maybe a custom context to change the logfile for # the module and restore it at the end. with CaptureLog(foo) as log: assert foo.bar() == 5 assert log.read() == "124.24 - foo - INFO - Bar returning 5" Have every compiled function that does logging accept optional log parameters. # foo.c int bar(PyObject* x, PyObject* logfile, PyObject* loglevel) { LoggerPtr logger = default_logger("foo"); if (logfile != Py_None) logger = file_logger(logfile, loglevel); ... } # tests/test_foo.py def test_foo(): with TemporaryFile() as logfile: assert foo.bar(logfile=logfile, loglevel=DEBUG) == 5 assert logfile.read() == "124.24 - foo - INFO - Bar returning 5" Some other way? Second one seems to be somewhat cleaner, but it requires function signature alteration (or using kwargs and parsing them). First one is.. probably somewhat awkward but sets up entire module in one go and removes logic from each individual function. What are your thoughts on this? I'm all ears to alternative solutions as well. Thanks,

Read the article
Coding the Python way

- by Aaron Moodie

I've just spent the last half semester at Uni learning python. I've really enjoyed it, and was hoping for a few tips on how to write more 'pythonic' code. This is the __init__ class from a recent assignment I did. At the time I wrote it, I was trying to work out how I could re-write this using lambdas, or in a neater, more efficient way, but ran out of time. def __init__(self, dir): def _read_files(_, dir, files): for file in files: if file == "classes.txt": class_list = readtable(dir+"/"+file) for item in class_list: Enrol.class_info_dict[item[0]] = item[1:] if item[1] in Enrol.classes_dict: Enrol.classes_dict[item[1]].append(item[0]) else: Enrol.classes_dict[item[1]] = [item[0]] elif file == "subjects.txt": subject_list = readtable(dir+"/"+file) for item in subject_list: Enrol.subjects_dict[item[0]] = item[1] elif file == "venues.txt": venue_list = readtable(dir+"/"+file) for item in venue_list: Enrol.venues_dict[item[0]] = item[1:] elif file.endswith('.roll'): roll_list = readlines(dir+"/"+file) file = os.path.splitext(file)[0] Enrol.class_roll_dict[file] = roll_list for item in roll_list: if item in Enrol.enrolled_dict: Enrol.enrolled_dict[item].append(file) else: Enrol.enrolled_dict[item] = [file] try: os.path.walk(dir, _read_files, None) except: print "There was a problem reading the directory" As you can see, it's a little bulky. If anyone has the time or inclination, I'd really appreciate a few tips on some python best-practices. Thanks.

Read the article
Downloading a picture via urllib and python.

- by Mike

So I'm trying to make a Python script that downloads webcomics and puts them in a folder on my desktop. I've found a few similar programs on here that do something similar, but nothing quite like what I need. The one that I found most similar is right here (http://bytes.com/topic/python/answers/850927-problem-using-urllib-download-images). I tried using this code: >>> import urllib >>> image = urllib.URLopener() >>> image.retrieve("http://www.gunnerkrigg.com//comics/00000001.jpg","00000001.jpg") ('00000001.jpg', <httplib.HTTPMessage instance at 0x1457a80>) I then searched my computer for a file "00000001.jpg", but all I found was the cached picture of it. I'm not even sure it saved the file to my computer. Once I understand how to get the file downloaded, I think I know how to handle the rest. Essentially just use a for loop and split the string at the '00000000'.'jpg' and increment the '00000000' up to the largest number, which I would have to somehow determine. Any reccomendations on the best way to do this or how to download the file correctly? Thanks!

Read the article

< Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89 | Next Page >