Search Results

Search found 16680 results on 668 pages for 'python datetime'.

Page 241/668 | < Previous Page | 237 238 239 240 241 242 243 244 245 246 247 248  | Next Page >

  • Fast JSON serialization (and comparison with Pickle) for cluster computing in Python?

    - by user248237
    I have a set of data points, each described by a dictionary. The processing of each data point is independent and I submit each one as a separate job to a cluster. Each data point has a unique name, and my cluster submission wrapper simply calls a script that takes a data point's name and a file describing all the data points. That script then accesses the data point from the file and performs the computation. Since each job has to load the set of all points only to retrieve the point to be run, I wanted to optimize this step by serializing the file describing the set of points into an easily retrievable format. I tried using JSONpickle, using the following method, to serialize a dictionary describing all the data points to file: def json_serialize(obj, filename, use_jsonpickle=True): f = open(filename, 'w') if use_jsonpickle: import jsonpickle json_obj = jsonpickle.encode(obj) f.write(json_obj) else: simplejson.dump(obj, f, indent=1) f.close() The dictionary contains very simple objects (lists, strings, floats, etc.) and has a total of 54,000 keys. The json file is ~20 Megabytes in size. It takes ~20 seconds to load this file into memory, which seems very slow to me. I switched to using pickle with the same exact object, and found that it generates a file that's about 7.8 megabytes in size, and can be loaded in ~1-2 seconds. This is a significant improvement, but it still seems like loading of a small object (less than 100,000 entries) should be faster. Aside from that, pickle is not human readable, which was the big advantage of JSON for me. Is there a way to use JSON to get similar or better speed ups? If not, do you have other ideas on structuring this? (Is the right solution to simply "slice" the file describing each event into a separate file and pass that on to the script that runs a data point in a cluster job? It seems like that could lead to a proliferation of files). thanks.

    Read the article

  • Most efficent way to create all possible combinations of four lists in Python?

    - by Baresi
    I have four different lists. headers, descriptions, short_descriptions and misc. I want to combine these into all the possible ways to print out: header\n description\n short_description\n misc like if i had (i'm skipping short_description and misc in this example for obvious reasons) headers = ['Hello there', 'Hi there!'] description = ['I like pie', 'Ho ho ho'] ... I want it to print out like: Hello there I like pie ... Hello there Ho ho ho ... Hi there! I like pie ... Hi there! Ho ho ho ... What would you say is the best/cleanest/most efficent way to do this? Is for-nesting the only way to go?

    Read the article

  • Python script, runs well, but not perfectly, debugging help.

    - by S1syphus
    What it does (sort of)... or is meant to, the script reads from a csv file that contains information on sound files and create a play list exactly 60 minutes long. An example csv, contains: their title, duration (in seconds), minium total time to be played (in minutes) An example is: Soundfoo,120,10 Soundbar,30,6 Sounddev,60,20 Soundrandom,15,8 The script works out the minimum instances of plays, take 'Soundfoo' for example, the length of each sample is 120 seconds and the minimum time to be played is 10 minutes, so basic maths 10*60/120 gives the number of instances the song is to be played, in this case 5. It is meant to take minimum number of instances and spread out equally from each other; so there will never be a period where for example Soundbar is played twice in a row. Then if the minium instances of each song has been used, and there is still time with in the 60 min, how is it possible to tell it to go back and fill the time by selecting each sound and including it till the 60 min is filled while remaining sparsely populated. Heres the issue(s)! The script fails to calculate the actual time require to play all the sounds in a file and the total time of the playlist, the thing is tho it doesn't get it wrong all the time maybe 3/5 times, even if I run it on the same csv file it will give me different answers. Here is the file I shall run the script on e for sake of ease to see the issue: Sound1,60,10 Sound2,60,10 Sound3,60,10 Sound4,60,10 Sound5,60,10 Sound6,60,10 I'll do it three times and post the results: 1 Required playtime in minutes: 60 Actual time in minutes to play all required ads: 62 Total playtime in minutes: 62.0 2 Required playtime in minutes: 60 Actual time in minutes to play all required ads: 71 Total playtime in minutes: 71.0 3 Required playtime in minutes: 60 Actual time in minutes to play all required ads: 60 Total playtime in minutes: 60.0 Relevant Code: pastebin.com/demkBXk6 And finally... in context: http://pastebin.com/demkBXk6 If you made it down to here, thanks for staying and reading, kudos.

    Read the article

  • Why is my Python OpenGL render2DTexture function so slow?

    - by Barakat
    SOLVED: The problem was actually using time.time() every CPU cycle to see whether the next frame should be drawn or not. The time it takes to execute time.time() was having an impact on the FPS. I made this function for drawing 2D textures as images in a 2D view in my OpenGL application. After doing some testing I found that it takes up 1-2 fps per texture. I know I am probably doing something wrong in this code. Any ideas? I am limiting the FPS to 60. Edit: When I disable the texture rendering it adds about 15% fps back. When I disabled text rendering it adds about 15% fps back. When i disable both barely any fps is consumed anymore. IE: 20 out of 60 fps with both on. 30 out of 60 when one is disabled. 58 out of 60 when both are disabled. When rendering the text on a button ( the control I'm using to test this ), it only "prepares" the text when the button label is set. Updated code, still running at the same speed but still works the same: def render2DTexture( self, texture, rect, texrect ): glEnable( GL_TEXTURE_2D ) glBindTexture( GL_TEXTURE_2D, texture ) glBegin( GL_QUADS ) glTexCoord2f( texrect.left, texrect.bottom ) glVertex2i( rect.left, self.windowSize[1] - rect.top ) glTexCoord2f( texrect.right, texrect.bottom ) glVertex2i( rect.left + rect.right, self.windowSize[1] - rect.top ) glTexCoord2f( texrect.right, texrect.top ) glVertex2i( rect.left + rect.right, self.windowSize[1] - ( rect.top + rect.bottom ) ) glTexCoord2f( texrect.left, texrect.top ) glVertex2i( rect.left, self.windowSize[1] - ( rect.top + rect.bottom ) ) glEnd() glDisable( GL_TEXTURE_2D ) def prepareText( self, text, fontFace, color ): self.loadFont( fontFace ) bmp = self.fonts[ fontFace ].render( text, 1, color ) return ( pygame.image.tostring( bmp, 'RGBA', 1 ), bmp.get_width(), bmp.get_height() ) def renderText( self, pText, position ): glRasterPos2i( position[0], self.windowSize[1] - ( position[1] + pText[2] ) ) glDrawPixels( pText[1], pText[2], GL_RGBA, GL_UNSIGNED_BYTE, pText[0] )

    Read the article

  • Can I filter a django model with a python list?

    - by Rhubarb
    Say I have a model object 'Person' defined, which has a field called 'Name'. And I have a list of people: l = ['Bob','Dave','Jane'] I would like to return a list of all Person records where the first name is not in the list of names defined in l. What is the most pythonic way of doing this?

    Read the article

  • How do you use scripting language (PHP, Python, etc) to improve your productivity?

    - by Edwin
    Hi, I'm a Delphi developer on the Windows platform, recently read the PHP tutorial at W3CSchools, it looks interesting. We all know scripting languages are very good at web site development, but I also want to utilize it to improve my productivity or get some tedious tasks done quickly, maybe some quick-and-dirty string/file processing? How do you usually do with scripting languages apart from software development? And we need a responsive, decent IDE/editor in order to gain productivity when writing scripts for this purpose? Thanks for in advance!

    Read the article

  • Python.expat can't parse XML file with bad symbols. How to go around?

    - by culebrón
    I'm trying to parse an XML file with expat, and here's the line where I get bad token exception: <tag k="name" v="???????????????????????????????????????????????????????????????????" /> xml.parsers.expat.ExpatError: not well-formed (invalid token): line 610127, column 37 The symbols in hex look like: \xd1? Seems like someone wrote this string (Russian alfabet) hitting backspace a few times. I set parser.returns_unicode = True, but this didn't help. The 1st line is <?xml version="1.0" encoding="UTF-8"?>. I work with a bz2 file. (bz2.BZ2File) How can I parse the file?

    Read the article

  • A UnicodeDecodeError that occurs with json in python on Windows, but not Mac.

    - by ventolin
    On windows, I have the following problem: >>> string = "Don´t Forget To Breathe" >>> import json,os,codecs >>> f = codecs.open("C:\\temp.txt","w","UTF-8") >>> json.dump(string,f) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python26\lib\json\__init__.py", line 180, in dump for chunk in iterable: File "C:\Python26\lib\json\encoder.py", line 294, in _iterencode yield encoder(o) UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-5: invalid data (Notice the non-ascii apostrophe in the string.) However, my friend, on his mac (also using python2.6), can run through this like a breeze: > string = "Don´t Forget To Breathe" > import json,os,codecs > f = codecs.open("/tmp/temp.txt","w","UTF-8") > json.dump(string,f) > f.close(); open('/tmp/temp.txt').read() '"Don\\u00b4t Forget To Breathe"' Why is this? I've also tried using UTF-16 and UTF-32 with json and codecs, but to no avail.

    Read the article

  • Python - compare nested lists and append matches to new list?

    - by Seafoid
    Hi, I wish to compare to nested lists of unequal length. I am interested only in a match between the first element of each sub list. Should a match exist, I wish to add the match to another list for subsequent transformation into a tab delimited file. Here is an example of what I am working with: x = [['1', 'a', 'b'], ['2', 'c', 'd']] y = [['1', 'z', 'x'], ['4', 'z', 'x']] match = [] def find_match(): for i in x: for j in y: if i[1] == j[1]: match.append(j) return match This results in a series of empty lists. Is it better to use tuples and/or tuples of tuples for the purposes of comparison? Any help is greatly appreciated. Regards, Seafoid.

    Read the article

  • Storing simulation results in a persistent manner for Python?

    - by Az
    Background: I'm running multiple simuations on a set of data. For each session, I'm allocating projects to students. The difference between each session is that I'm randomising the order of the students such that all the students get a shot at being assigned a project they want. I was writing out some of the allocations in a spreadsheet (i.e. Excel) and it basically looked like this (tiny snapshot, actual table extends to a few thousand sessions, roughly 100 students). | | Session 1 | Session 2 | Session 3 | |----------|-----------|-----------|-----------| |Stu1 |Proj_AA |Proj_AB |Proj_AB | |----------|-----------|-----------|-----------| |Stu2 |Proj_AB |Proj_AA |Proj_AC | |----------|-----------|-----------|-----------| |Stu3 |Proj_AC |Proj_AC |Proj_AA | |----------|-----------|-----------|-----------| Now, the code that deals with the allocation currently stores a session in an object. The next time the allocation is run, the object is over-written. Thus what I'd really like to do is to store all the allocation results. This is important since I later need to derive from the data, information such as: which project Stu1 got assigned to the most or perhaps how popular Proj_AC was (how many times it was assigned / number of sessions). Question(s): What methods can I possibly use to basically store such session information persistently? Basically, each session output needs to add itself to the repository after ending and before beginning the next allocation cycle. One solution that was suggested by a friend was mapping these results to a relational database using SQLAlchemy. I kind of like the idea since this does give me an opportunity to delve into databases. Now the database structure I was recommended was: |----------|-----------|-----------| |Session |Student |Project | |----------|-----------|-----------| |1 |Stu1 |Proj_AA | |----------|-----------|-----------| |1 |Stu2 |Proj_AB | |----------|-----------|-----------| |1 |Stu3 |Proj_AC | |----------|-----------|-----------| |2 |Stu1 |Proj_AB | |----------|-----------|-----------| |2 |Stu2 |Proj_AA | |----------|-----------|-----------| |2 |Stu3 |Proj_AC | |----------|-----------|-----------| |3 |Stu1 |Proj_AB | |----------|-----------|-----------| |3 |Stu2 |Proj_AC | |----------|-----------|-----------| |3 |Stu3 |Proj_AA | |----------|-----------|-----------| Here, it was suggested that I make the Session and Student columns a composite key. That way I can access a specific record for a particular student for a particular session. Or I can merely get the entire allocation run for a particular session. Questions: Is the idea a good one? How does one implement and query a composite key using SQLAlchemy? What happens to the database if a particular student is not assigned a project (happens if all projects that he wants are taken)? In the code, if a student is not assigned a project, instead of a proj_id he simply gets None for that field/object. I apologise for asking multiple questions but since these are closely-related, I thought I'd ask them in the same space.

    Read the article

  • Are there any libraries to allow Python or Ruby to get info from SVN?

    - by Mike Trpcic
    I'm looking for plugins that will allow my codebase to interact with, browse, and poll an SVN server for information about a repository. Trac can do this, but I was hoping there was an easy-to-use library available to accomplish the task, rather than trolling through the Trac codebase. Googling for this returns mostly vague results about storing your code in and SVN repository, which is far from what I'm looking for.

    Read the article

  • How would I merged nested dictionaries in a list in python?

    - by Kevin
    for example if i had the result [{'Germany': {"Luge - Men's Singles": 'Gold'}}, {'Germany': {"Luge - Men's Singles": 'Silver'}}, {'Italy': {"Luge - Men's Singles": 'Bronze'}}] [{'Germany': {"Luge - Women's Singles": 'Gold'}}, {'Austria': {"Luge - Women's Singles": 'Silver'}}, {'Germany': {"Luge - Women's Singles": 'Bronze'}}] [{'Austria': {'Luge - Doubles': 'Gold'}}, {'Latvia': {'Luge - Doubles': 'Silver'}}, {'Germany': {'Luge - Doubles': 'Bronze'}}] how would I sort this so that all of the events germany and so on had won could be under one single title. i.e germany would be germany:Luge - Men's Singles: Gold, Silver, Luge - Women's Singles: Gold, Bronze, Luge - Doubles: Bronze. thanks for any help

    Read the article

  • How can I convert data encoded in WE8MSWIN1252 to utf8 for use in Python scripts?

    - by James Dean
    This data comes from an Oracle database and is extracted to flatfiles in encoding 'WE8MSWIN1252'. I want to parse the data and do some analysis. I want to see the text fields but do not need to publish the results to any other system so if some characters do not get converted perfectly I do not have a problem with that. I just do not want my parsing to fail with a decode error which is what I get if I use: inputFile = codecs.open( dataFileName, "r", "utf-8'")

    Read the article

  • How can I dispatch Firefox or Google Chrome with Python?

    - by Shady
    How can I do this with Firefox or Google Chrome? ie = win32com.client.Dispatch('InternetExplorer.Application') ie.visible = 1 ie.navigate('http://google.com') Is there a way to do it? ps: I need to use the ReadyState with it... for example while (ie.ReadyState != 4):, or in other words, I need some command that wait until the page loads completely until do the next command, that's why I need the dispatch, that currently work very good with IE

    Read the article

  • how to read specific number of floats from file in python?

    - by sahel
    I am reading a text file from the web. The file starts with some header lines containing the number of data points, followed the actual vertices (3 coordinates each). The file looks like: # comment HEADER TEXT POINTS 6 float 1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 POLYGONS the line starting with the word POINTS contains the number of vertices (in this case we have 3 vertices per line, but that could change) This is how I am reading it right now: ur=urlopen("http://.../file.dat") j=0 contents = [] while 1: line = ur.readline() if not line: break else: line=line.lower() if 'points' in line : myline=line.strip() word=myline.split() node_number=int(word[1]) node_type=word[2] while 'polygons' not in line : line = ur.readline() line=line.lower() myline=line.split() i=0 while(i<len(myline)): contents[j]=float(myline[i]) i=i+1 j=j+1 How can I read a specified number of floats instead of reading line by line as strings and converting to floating numbers? Instead of ur.readline() I want to read the specified number of elements in the file Any suggestion is welcome..

    Read the article

  • What is the fastest way to scale and display an image in Python?

    - by Knut Eldhuset
    I am required to display a two dimensional numpy.array of int16 at 20fps or so. Using Matplotlib's imshow chokes on anything above 10fps. There obviously are some issues with scaling and interpolation. I should add that the dimensions of the array are not known, but will probably be around thirty by four hundred. These are data from a sensor that are supposed to have a real-time display, so the data has to be re-sampled on the fly.

    Read the article

  • python: how to jump to a particular line in a huge text file?

    - by photographer
    Are there any alternatives to the code below: startFromLine = 141978 # or whatever line I need to jump to urlsfile = open(filename, "rb", 0) linesCounter = 1 for line in urlsfile: if linesCounter > startFromLine: DoSomethingWithThisLine(line) linesCounter += 1 if I'm processing a huge text file (~15MB) with lines of unknown but different length, and need to jump to a particular line which number I know in advance? I feel bad by processing them one by one when I know I could ignore at least first half of the file. Looking for more elegant solution if there is any.

    Read the article

< Previous Page | 237 238 239 240 241 242 243 244 245 246 247 248  | Next Page >