Search Results

Search found 13693 results on 548 pages for 'python metaprogramming'.

Page 220/548 | < Previous Page | 216 217 218 219 220 221 222 223 224 225 226 227 | Next Page >

Fast JSON serialization (and comparison with Pickle) for cluster computing in Python?

- by user248237

I have a set of data points, each described by a dictionary. The processing of each data point is independent and I submit each one as a separate job to a cluster. Each data point has a unique name, and my cluster submission wrapper simply calls a script that takes a data point's name and a file describing all the data points. That script then accesses the data point from the file and performs the computation. Since each job has to load the set of all points only to retrieve the point to be run, I wanted to optimize this step by serializing the file describing the set of points into an easily retrievable format. I tried using JSONpickle, using the following method, to serialize a dictionary describing all the data points to file: def json_serialize(obj, filename, use_jsonpickle=True): f = open(filename, 'w') if use_jsonpickle: import jsonpickle json_obj = jsonpickle.encode(obj) f.write(json_obj) else: simplejson.dump(obj, f, indent=1) f.close() The dictionary contains very simple objects (lists, strings, floats, etc.) and has a total of 54,000 keys. The json file is ~20 Megabytes in size. It takes ~20 seconds to load this file into memory, which seems very slow to me. I switched to using pickle with the same exact object, and found that it generates a file that's about 7.8 megabytes in size, and can be loaded in ~1-2 seconds. This is a significant improvement, but it still seems like loading of a small object (less than 100,000 entries) should be faster. Aside from that, pickle is not human readable, which was the big advantage of JSON for me. Is there a way to use JSON to get similar or better speed ups? If not, do you have other ideas on structuring this? (Is the right solution to simply "slice" the file describing each event into a separate file and pass that on to the script that runs a data point in a cluster job? It seems like that could lead to a proliferation of files). thanks.

Read the article
Most efficent way to create all possible combinations of four lists in Python?

- by Baresi

I have four different lists. headers, descriptions, short_descriptions and misc. I want to combine these into all the possible ways to print out: header\n description\n short_description\n misc like if i had (i'm skipping short_description and misc in this example for obvious reasons) headers = ['Hello there', 'Hi there!'] description = ['I like pie', 'Ho ho ho'] ... I want it to print out like: Hello there I like pie ... Hello there Ho ho ho ... Hi there! I like pie ... Hi there! Ho ho ho ... What would you say is the best/cleanest/most efficent way to do this? Is for-nesting the only way to go?

Read the article
Python script, runs well, but not perfectly, debugging help.

- by S1syphus

What it does (sort of)... or is meant to, the script reads from a csv file that contains information on sound files and create a play list exactly 60 minutes long. An example csv, contains: their title, duration (in seconds), minium total time to be played (in minutes) An example is: Soundfoo,120,10 Soundbar,30,6 Sounddev,60,20 Soundrandom,15,8 The script works out the minimum instances of plays, take 'Soundfoo' for example, the length of each sample is 120 seconds and the minimum time to be played is 10 minutes, so basic maths 10*60/120 gives the number of instances the song is to be played, in this case 5. It is meant to take minimum number of instances and spread out equally from each other; so there will never be a period where for example Soundbar is played twice in a row. Then if the minium instances of each song has been used, and there is still time with in the 60 min, how is it possible to tell it to go back and fill the time by selecting each sound and including it till the 60 min is filled while remaining sparsely populated. Heres the issue(s)! The script fails to calculate the actual time require to play all the sounds in a file and the total time of the playlist, the thing is tho it doesn't get it wrong all the time maybe 3/5 times, even if I run it on the same csv file it will give me different answers. Here is the file I shall run the script on e for sake of ease to see the issue: Sound1,60,10 Sound2,60,10 Sound3,60,10 Sound4,60,10 Sound5,60,10 Sound6,60,10 I'll do it three times and post the results: 1 Required playtime in minutes: 60 Actual time in minutes to play all required ads: 62 Total playtime in minutes: 62.0 2 Required playtime in minutes: 60 Actual time in minutes to play all required ads: 71 Total playtime in minutes: 71.0 3 Required playtime in minutes: 60 Actual time in minutes to play all required ads: 60 Total playtime in minutes: 60.0 Relevant Code: pastebin.com/demkBXk6 And finally... in context: http://pastebin.com/demkBXk6 If you made it down to here, thanks for staying and reading, kudos.

Read the article
Python.expat can't parse XML file with bad symbols. How to go around?

- by culebrón

I'm trying to parse an XML file with expat, and here's the line where I get bad token exception: <tag k="name" v="???????????????????????????????????????????????????????????????????" /> xml.parsers.expat.ExpatError: not well-formed (invalid token): line 610127, column 37 The symbols in hex look like: \xd1? Seems like someone wrote this string (Russian alfabet) hitting backspace a few times. I set parser.returns_unicode = True, but this didn't help. The 1st line is <?xml version="1.0" encoding="UTF-8"?>. I work with a bz2 file. (bz2.BZ2File) How can I parse the file?

Read the article
Can I filter a django model with a python list?

- by Rhubarb

Say I have a model object 'Person' defined, which has a field called 'Name'. And I have a list of people: l = ['Bob','Dave','Jane'] I would like to return a list of all Person records where the first name is not in the list of names defined in l. What is the most pythonic way of doing this?

Read the article
How do I sort this list in Python, if my date is in a String?

- by alex

[{'date': '2010-04-01', 'people': 1047, 'hits': 4522}, {'date': '2010-04-03', 'people': 617, 'hits': 2582}, {'date': '2010-04-02', 'people': 736, 'hits': 3277}] Suppose I have this list. How do I sort by "date", which is an item in the dictionary. But, "date" is a string...

Read the article
What is the semantics of 'is' operator in Python?

- by bodacydo

How does is operator determine if two objects are the same? How does it work? I can't find it documented.

Read the article
How to print gantt-charts generated on web using python?

- by ady

I want to print or save gantt-chart(in pdf format). These charts are generated on web after a particular input. Our chart is a plug-in for Trac. I have used Genshi library to generate charts.

Read the article
Which is the best python library to make REST request like PUT, GET, DELETE, POST and how ?

- by Parthiv

Hi, I am bit confuse over set of libraries of pythons to connect with REST enabled web services. I have tried httplib, urllib and urllib2. I want to know how can methods like PUT, GET, POST, DELETE can be achieved using this library. Regards, Parthiv

Read the article
A UnicodeDecodeError that occurs with json in python on Windows, but not Mac.

- by ventolin

On windows, I have the following problem: >>> string = "Don´t Forget To Breathe" >>> import json,os,codecs >>> f = codecs.open("C:\\temp.txt","w","UTF-8") >>> json.dump(string,f) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python26\lib\json\__init__.py", line 180, in dump for chunk in iterable: File "C:\Python26\lib\json\encoder.py", line 294, in _iterencode yield encoder(o) UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-5: invalid data (Notice the non-ascii apostrophe in the string.) However, my friend, on his mac (also using python2.6), can run through this like a breeze: > string = "Don´t Forget To Breathe" > import json,os,codecs > f = codecs.open("/tmp/temp.txt","w","UTF-8") > json.dump(string,f) > f.close(); open('/tmp/temp.txt').read() '"Don\\u00b4t Forget To Breathe"' Why is this? I've also tried using UTF-16 and UTF-32 with json and codecs, but to no avail.

Read the article
Why is my Python OpenGL render2DTexture function so slow?

- by Barakat

SOLVED: The problem was actually using time.time() every CPU cycle to see whether the next frame should be drawn or not. The time it takes to execute time.time() was having an impact on the FPS. I made this function for drawing 2D textures as images in a 2D view in my OpenGL application. After doing some testing I found that it takes up 1-2 fps per texture. I know I am probably doing something wrong in this code. Any ideas? I am limiting the FPS to 60. Edit: When I disable the texture rendering it adds about 15% fps back. When I disabled text rendering it adds about 15% fps back. When i disable both barely any fps is consumed anymore. IE: 20 out of 60 fps with both on. 30 out of 60 when one is disabled. 58 out of 60 when both are disabled. When rendering the text on a button ( the control I'm using to test this ), it only "prepares" the text when the button label is set. Updated code, still running at the same speed but still works the same: def render2DTexture( self, texture, rect, texrect ): glEnable( GL_TEXTURE_2D ) glBindTexture( GL_TEXTURE_2D, texture ) glBegin( GL_QUADS ) glTexCoord2f( texrect.left, texrect.bottom ) glVertex2i( rect.left, self.windowSize[1] - rect.top ) glTexCoord2f( texrect.right, texrect.bottom ) glVertex2i( rect.left + rect.right, self.windowSize[1] - rect.top ) glTexCoord2f( texrect.right, texrect.top ) glVertex2i( rect.left + rect.right, self.windowSize[1] - ( rect.top + rect.bottom ) ) glTexCoord2f( texrect.left, texrect.top ) glVertex2i( rect.left, self.windowSize[1] - ( rect.top + rect.bottom ) ) glEnd() glDisable( GL_TEXTURE_2D ) def prepareText( self, text, fontFace, color ): self.loadFont( fontFace ) bmp = self.fonts[ fontFace ].render( text, 1, color ) return ( pygame.image.tostring( bmp, 'RGBA', 1 ), bmp.get_width(), bmp.get_height() ) def renderText( self, pText, position ): glRasterPos2i( position[0], self.windowSize[1] - ( position[1] + pText[2] ) ) glDrawPixels( pText[1], pText[2], GL_RGBA, GL_UNSIGNED_BYTE, pText[0] )

Read the article
How can I turn off registry redirection on Python?

- by Shady

My program is trying to create an key on the HKLM\Software\Microsoft\Shared Tools\MSCONFIG\startupreg\test\ but instead the key is created on the HKLM\Wow6432node\Software\Microsoft\Shared Tools\MSCONFIG\startupreg\test\ and don't work properly... Why? How can I solve it?

Read the article
Python - compare nested lists and append matches to new list?

- by Seafoid

Hi, I wish to compare to nested lists of unequal length. I am interested only in a match between the first element of each sub list. Should a match exist, I wish to add the match to another list for subsequent transformation into a tab delimited file. Here is an example of what I am working with: x = [['1', 'a', 'b'], ['2', 'c', 'd']] y = [['1', 'z', 'x'], ['4', 'z', 'x']] match = [] def find_match(): for i in x: for j in y: if i[1] == j[1]: match.append(j) return match This results in a series of empty lists. Is it better to use tuples and/or tuples of tuples for the purposes of comparison? Any help is greatly appreciated. Regards, Seafoid.

Read the article
How can I dispatch Firefox or Google Chrome with Python?

- by Shady

How can I do this with Firefox or Google Chrome? ie = win32com.client.Dispatch('InternetExplorer.Application') ie.visible = 1 ie.navigate('http://google.com') Is there a way to do it? ps: I need to use the ReadyState with it... for example while (ie.ReadyState != 4):, or in other words, I need some command that wait until the page loads completely until do the next command, that's why I need the dispatch, that currently work very good with IE

Read the article
How do I insert data from a Python dictionary to MySQL?

- by NJTechie

I manipulated some data from MySQL and the resulting dictionary "data" (print data) displays something like this : {'1': ['1', 'K', abc, 'xyz', None, None, datetime.date(2009, 6, 18)], '2': ['2', 'K', efg, 'xyz', None, None, None, None], '3': ['3', 'K', ijk, 'xyz', None, None, None, datetime.date(2010, 2, 5, 16, 31, 2)]} How do I create a table and insert these values in a MySQL table? In other words, how do I dump them to MySQL or CSV? Not sure how to deal with datetime.date and None values. Any help is appreciated.

Read the article
how to read specific number of floats from file in python?

- by sahel

I am reading a text file from the web. The file starts with some header lines containing the number of data points, followed the actual vertices (3 coordinates each). The file looks like: # comment HEADER TEXT POINTS 6 float 1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 POLYGONS the line starting with the word POINTS contains the number of vertices (in this case we have 3 vertices per line, but that could change) This is how I am reading it right now: ur=urlopen("http://.../file.dat") j=0 contents = [] while 1: line = ur.readline() if not line: break else: line=line.lower() if 'points' in line : myline=line.strip() word=myline.split() node_number=int(word[1]) node_type=word[2] while 'polygons' not in line : line = ur.readline() line=line.lower() myline=line.split() i=0 while(i<len(myline)): contents[j]=float(myline[i]) i=i+1 j=j+1 How can I read a specified number of floats instead of reading line by line as strings and converting to floating numbers? Instead of ur.readline() I want to read the specified number of elements in the file Any suggestion is welcome..

Read the article
Storing simulation results in a persistent manner for Python?

- by Az

Background: I'm running multiple simuations on a set of data. For each session, I'm allocating projects to students. The difference between each session is that I'm randomising the order of the students such that all the students get a shot at being assigned a project they want. I was writing out some of the allocations in a spreadsheet (i.e. Excel) and it basically looked like this (tiny snapshot, actual table extends to a few thousand sessions, roughly 100 students). | | Session 1 | Session 2 | Session 3 | |----------|-----------|-----------|-----------| |Stu1 |Proj_AA |Proj_AB |Proj_AB | |----------|-----------|-----------|-----------| |Stu2 |Proj_AB |Proj_AA |Proj_AC | |----------|-----------|-----------|-----------| |Stu3 |Proj_AC |Proj_AC |Proj_AA | |----------|-----------|-----------|-----------| Now, the code that deals with the allocation currently stores a session in an object. The next time the allocation is run, the object is over-written. Thus what I'd really like to do is to store all the allocation results. This is important since I later need to derive from the data, information such as: which project Stu1 got assigned to the most or perhaps how popular Proj_AC was (how many times it was assigned / number of sessions). Question(s): What methods can I possibly use to basically store such session information persistently? Basically, each session output needs to add itself to the repository after ending and before beginning the next allocation cycle. One solution that was suggested by a friend was mapping these results to a relational database using SQLAlchemy. I kind of like the idea since this does give me an opportunity to delve into databases. Now the database structure I was recommended was: |----------|-----------|-----------| |Session |Student |Project | |----------|-----------|-----------| |1 |Stu1 |Proj_AA | |----------|-----------|-----------| |1 |Stu2 |Proj_AB | |----------|-----------|-----------| |1 |Stu3 |Proj_AC | |----------|-----------|-----------| |2 |Stu1 |Proj_AB | |----------|-----------|-----------| |2 |Stu2 |Proj_AA | |----------|-----------|-----------| |2 |Stu3 |Proj_AC | |----------|-----------|-----------| |3 |Stu1 |Proj_AB | |----------|-----------|-----------| |3 |Stu2 |Proj_AC | |----------|-----------|-----------| |3 |Stu3 |Proj_AA | |----------|-----------|-----------| Here, it was suggested that I make the Session and Student columns a composite key. That way I can access a specific record for a particular student for a particular session. Or I can merely get the entire allocation run for a particular session. Questions: Is the idea a good one? How does one implement and query a composite key using SQLAlchemy? What happens to the database if a particular student is not assigned a project (happens if all projects that he wants are taken)? In the code, if a student is not assigned a project, instead of a proj_id he simply gets None for that field/object. I apologise for asking multiple questions but since these are closely-related, I thought I'd ask them in the same space.

Read the article
How would I merged nested dictionaries in a list in python?

- by Kevin

for example if i had the result [{'Germany': {"Luge - Men's Singles": 'Gold'}}, {'Germany': {"Luge - Men's Singles": 'Silver'}}, {'Italy': {"Luge - Men's Singles": 'Bronze'}}] [{'Germany': {"Luge - Women's Singles": 'Gold'}}, {'Austria': {"Luge - Women's Singles": 'Silver'}}, {'Germany': {"Luge - Women's Singles": 'Bronze'}}] [{'Austria': {'Luge - Doubles': 'Gold'}}, {'Latvia': {'Luge - Doubles': 'Silver'}}, {'Germany': {'Luge - Doubles': 'Bronze'}}] how would I sort this so that all of the events germany and so on had won could be under one single title. i.e germany would be germany:Luge - Men's Singles: Gold, Silver, Luge - Women's Singles: Gold, Bronze, Luge - Doubles: Bronze. thanks for any help

Read the article
What is the fastest way to scale and display an image in Python?

- by Knut Eldhuset

I am required to display a two dimensional numpy.array of int16 at 20fps or so. Using Matplotlib's imshow chokes on anything above 10fps. There obviously are some issues with scaling and interpolation. I should add that the dimensions of the array are not known, but will probably be around thirty by four hundred. These are data from a sensor that are supposed to have a real-time display, so the data has to be re-sampled on the fly.

Read the article
How do you use scripting language (PHP, Python, etc) to improve your productivity?

- by Edwin

Hi, I'm a Delphi developer on the Windows platform, recently read the PHP tutorial at W3CSchools, it looks interesting. We all know scripting languages are very good at web site development, but I also want to utilize it to improve my productivity or get some tedious tasks done quickly, maybe some quick-and-dirty string/file processing? How do you usually do with scripting languages apart from software development? And we need a responsive, decent IDE/editor in order to gain productivity when writing scripts for this purpose? Thanks for in advance!

Read the article
Python. How do you set the image attributes?

- by manny

I'm using PIL. I tries using : img.info = {'Buyer':'Text','Copyright':'Text2'} This is not working. Is there an alternate way to do it?

Read the article
What is the best artificial-intelligence library for Python?

- by alex

I know of NLTK. What else is there that complements this library? Or can do AI? NLTK is great because I can learn it with the book that it came out. Is there a library for AI just like this?

Read the article
Python: saving objects and using pickle. Error using pickle.dump

- by Peterstone

Hello I have an Error and I don´t the reason: >>> class Fruits:pass ... >>> banana = Fruits() >>> banana.color = 'yellow' >>> banana.value = 30 >>> import pickle >>> filehandler = open("Fruits.obj",'w') >>> pickle.dump(banana,filehandler) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python31\lib\pickle.py", line 1354, in dump Pickler(file, protocol, fix_imports=fix_imports).dump(obj) TypeError: must be str, not bytes >>> I don´t know how to solve this error because I don´t understand it. Thank you so much.

Read the article
Python Turtle Graphics, how to plot functions over an interval?

- by TheDragonAce

I need to plot a function over a specified interval. The function is f1, which is shown below in the code, and the interval is [-7, -3]; [-1, 1]; [3, 7] with a step of .01. When I execute the program, nothing is drawn. Any ideas? import turtle from math import sqrt wn = turtle.Screen() wn.bgcolor("white") wn.title("Plotting") mypen = turtle.Turtle() mypen.shape("classic") mypen.color("black") mypen.speed(10) while True: try: def f1(x): return 2 * sqrt((-abs(abs(x)-1)) * abs(3 - abs(x))/((abs(x)-1)*(3-abs(x)))) * \ (1 + abs(abs(x)-3)/(abs(x)-3))*sqrt(1-(x/7)**2)+(5+0.97*(abs(x-0.5)+abs(x+0.5))-\ 3*(abs(x-0.75)+abs(x+0.75)))*(1+abs(1-abs(x))/(1-abs(x))) mypen.penup() step=.01 startf11=-7 stopf11=-3 startf12=-1 stopf12=1 startf13=3 stopf13=7 def f11 (startf11,stopf11,step): rc=[] y = f1(startf11) while y<=stopf11: rc.append(startf11) #y+=step mypen.setpos(f1(startf11)*25,y*25) mypen.dot() def f12 (startf12,stopf12,step): rc=[] y = f1(startf12) while y<=stopf12: rc.append(startf12) #y+=step mypen.setpos(f1(startf12)*25, y*25) mypen.dot() def f13 (startf13,stopf13,step): rc=[] y = f1(startf13) while y<=stopf13: rc.append(startf13) #y+=step mypen.setpos(f1(startf13)*25, y*25) mypen.dot() f11(startf11,stopf11,step) f12(startf12,stopf12,step) f13(startf13,stopf13,step) except ZeroDivisionError: continue

Read the article
python: how to jump to a particular line in a huge text file?

- by photographer

Are there any alternatives to the code below: startFromLine = 141978 # or whatever line I need to jump to urlsfile = open(filename, "rb", 0) linesCounter = 1 for line in urlsfile: if linesCounter > startFromLine: DoSomethingWithThisLine(line) linesCounter += 1 if I'm processing a huge text file (~15MB) with lines of unknown but different length, and need to jump to a particular line which number I know in advance? I feel bad by processing them one by one when I know I could ignore at least first half of the file. Looking for more elegant solution if there is any.

Read the article

< Previous Page | 216 217 218 219 220 221 222 223 224 225 226 227 | Next Page >