Search Results

Search found 16680 results on 668 pages for 'python datetime'.

Page 135/668 | < Previous Page | 131 132 133 134 135 136 137 138 139 140 141 142  | Next Page >

  • Python: Catching / blocking SIGINT during system call

    - by danben
    I've written a web crawler that I'd like to be able to stop via the keyboard. I don't want the program to die when I interrupt it; it needs to flush its data to disk first. I also don't want to catch KeyboardInterruptedException, because the persistent data could be in an inconsistent state. My current solution is to define a signal handler that catches SIGINT and sets a flag; each iteration of the main loop checks this flag before processing the next url. However, I've found that if the system happens to be executing socket.recv() when I send the interrupt, I get this: ^C Interrupted; stopping... // indicates my interrupt handler ran Traceback (most recent call last): File "crawler_test.py", line 154, in <module> main() ... File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/socket.py", line 397, in readline data = recv(1) socket.error: [Errno 4] Interrupted system call and the process exits completely. Why does this happen? Is there a way I can prevent the interrupt from affecting the system call?

    Read the article

  • python mongokit Connection() AssertionError

    - by zalew
    just installed mongokit and can't figure out why I get AssertionError python console: >>> from mongokit import Connection >>> c = Connection() Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/local/lib/python2.6/dist-packages/mongokit-0.5.3-py2.6.egg/mongokit/connection.py", line 35, in __init__ super(Connection, self).__init__(*args, **kwargs) File "build/bdist.linux-i686/egg/pymongo/connection.py", line 169, in __init__ File "build/bdist.linux-i686/egg/pymongo/connection.py", line 338, in __find_master File "build/bdist.linux-i686/egg/pymongo/connection.py", line 226, in __master File "build/bdist.linux-i686/egg/pymongo/database.py", line 220, in command File "build/bdist.linux-i686/egg/pymongo/collection.py", line 356, in find_one File "build/bdist.linux-i686/egg/pymongo/cursor.py", line 485, in next File "build/bdist.linux-i686/egg/pymongo/cursor.py", line 461, in _refresh File "build/bdist.linux-i686/egg/pymongo/cursor.py", line 429, in __send_message File "build/bdist.linux-i686/egg/pymongo/helpers.py", line 98, in _unpack_response AssertionError >>> mongodb console: Wed Mar 31 10:27:34 connection accepted from 127.0.0.1:60480 #30 Wed Mar 31 10:27:34 end connection 127.0.0.1:60480 db 1.5 pymongo 1.5 (tested also on 1.4.) mongokit 0.5.3 (also 0.5.2)

    Read the article

  • Multiple Unpacking Assignment in Python when you don't know the sequence length

    - by doug
    The textbook examples of multiple unpacking assignment are something like: import numpy as NP M = NP.arange(5) a, b, c, d, e = M # so of course, a = 0, b = 1, etc. M = NP.arange(20).reshape(5, 4) # numpy 5x4 array a, b, c, d, e = M # here, a = M[0,:], b = M[1,:], etc. (ie, a single row of M is assigned each to a through e) (My Q is not numpy specfic; indeed, i would prefer a pure python solution.) W/r/t the piece of code i'm looking at now, i see two complications on that straightforward scenario: i usually won't know the shape of M; and i want to unpack a certain number of items (definitely less than all items) and i want to put the remainder into a single container so back to the 5x4 array above, what i would very much like to be able to do is, for instance, assign the first three rows of M to a, b, and c respectively (exactly as above) and the rest of the rows (i have no idea how many there will be, just some positive integer) to a single container, all_the_rest = []. I'm not sure if i have explained this clearly; in any event, if i get feedback i'll promptly edit my Question.

    Read the article

  • Python: slow read & write for millions of small files

    - by Jami
    I am building directory tree which has tons of subdirectories and files. The total directory count is somewhere along 256^32 subdirectories with 256 files in each end which are only a few bytes long. I did this so I would have fast access to these files (since i'm not searching and i'm just directly accessing then via a known file path) I have a python script that builds this filesystem and reads & writes those files. The problem is that when I reach more than 1Gb of total filesize, the read and write methods become extremely slow. Here's the function I have that reads the contents of a file (the file contains an integer string), adds a certain number to it, then writes it back to the original file. def addInFile(path, scoreToAdd): num = scoreToAdd try: shutil.copyfile(path, '/tmp/tmp.txt') fp = open('/tmp/tmp.txt', 'r') num += int(fp.readlines()[0]) fp.close() except: pass fp = open('/tmp/tmp.txt', 'w') fp.write(str(num)) fp.close() shutil.copyfile('/tmp/tmp.txt', path) I previously tried performing linux console commands but it was slower. I copy the file to a temporary file first then access/modify it then copy it back because i found this was faster than directly accessing the file. I think the cause of the slowdown is because there're tons of files. performing this function 1000 times sometimes reach 1 minute now, but before (when there were only a few files, 1000 calls was performed for only less than 1 second) How do you suggest I fix this?

    Read the article

  • Python and a "time value of money" problem.

    - by jamieb
    (I asked this question earlier today, but I did a poor job of explaining myself. Let me try again) I have a client who is an industrial maintenance company. They sell service agreements that are prepaid 20 hour blocks of a technician's time. Some of their larger customers might burn through that agreement in two weeks while customers with fewer problems might go eight months on that same contract. I would like to use Python to help model projected sales revenue and determine how many billable hours per month that they'll be on the hook for. If each customer only ever bought a single service contract (never renewed) it would be easy to figure sales as monthly_revenue = contract_value * qty_contracts_sold. Billable hours would also be easy: billable_hrs = hrs_per_contract * qty_contracts_sold. However, how do I account for renewals? Assuming that 90% (or some other arbitrary amount) of customers renew, then their monthly revenue ought to grow geometrically. Another important variable is how long the average customer burns through a contract. How do I determine what the revenue and billable hours will be 3, 6, or 12 months from now, based on various renewal and burn rates? I assume that I'd use some type of recursive function but math was never one of my strong points. Any suggestions please? Edit: I'm thinking that the best way to approach this is to think of it as a "time value of money" problem. I've retitled the question as such. The problem is probably a lot more common if you think of "monthly sales" as something similar to annuity payments.

    Read the article

  • Python script to calculate aded combinations from a dictionary

    - by dayde
    I am trying to write a script that will take a dictionary of items, each containing properties of values from 0 - 10, and add the various elements to select which combination of items achieve the desired totals. I also need the script to do this, using only items that have the same "slot" in common. For example: item_list = { 'item_1': {'slot': 'top', 'prop_a': 2, 'prop_b': 0, 'prop_c': 2, 'prop_d': 1 }, 'item_2': {'slot': 'top', 'prop_a': 5, 'prop_b': 0, 'prop_c': 1, 'prop_d':-1 }, 'item_3': {'slot': 'top', 'prop_a': 2, 'prop_b': 5, 'prop_c': 2, 'prop_d':-2 }, 'item_4': {'slot': 'mid', 'prop_a': 5, 'prop_b': 5, 'prop_c':-5, 'prop_d': 0 }, 'item_5': {'slot': 'mid', 'prop_a':10, 'prop_b': 0, 'prop_c':-5, 'prop_d': 0 }, 'item_6': {'slot': 'mid', 'prop_a':-5, 'prop_b': 2, 'prop_c': 3, 'prop_d': 5 }, 'item_7': {'slot': 'bot', 'prop_a': 1, 'prop_b': 3, 'prop_c':-4, 'prop_d': 4 }, 'item_8': {'slot': 'bot', 'prop_a': 2, 'prop_b': 2, 'prop_c': 0, 'prop_d': 0 }, 'item_9': {'slot': 'bot', 'prop_a': 3, 'prop_b': 1, 'prop_c': 4, 'prop_d':-4 }, } The script would then need to select which combinations from the "item_list" dict that using 1 item per "slot" that would achieve a desired result when added. For example, if the desired result was: 'prop_a': 3, 'prop_b': 3, 'prop_c': 8, 'prop_d': 0, the script would select 'item_2', 'item_6', and 'item_9', along with any other combination that worked. 'item_2': {'slot': 'top', 'prop_a': 5, 'prop_b': 0, 'prop_c': 1, 'prop_d':-1 } 'item_6': {'slot': 'mid', 'prop_a':-5, 'prop_b': 2, 'prop_c': 3, 'prop_d': 5 } 'item_9': {'slot': 'bot', 'prop_a': 3, 'prop_b': 1, 'prop_c': 4, 'prop_d':-4 } 'total': 'prop_a': 3, 'prop_b': 3, 'prop_c': 8, 'prop_d': 0 Any ideas how to accomplish this? It does not need to be in python, or even a thorough script, but just an explanation on how to do this in theory would be enough for me. I have tried working out looping through every combination, but that seems to very quickly get our of hand and unmanageable. The actual script will need to do this for about 1,000 items using 20 different "slots", each with 8 properties. Thanks for the help!

    Read the article

  • Python IOError: Not a gzipped file (Gzip and Blowfish Encrypt/Compress)

    - by notbad.jpeg
    I'm having some problems with python's built-in library gzip. Looked through almost every other stack question about it, and none of them seem to work. MY PROBLEM IS THAT WHEN I TRY TO DECOMPRESS I GET THE IOError I'm Getting: Traceback (most recent call last): File "mymodule.py", line 61, in return gz.read() File "/usr/lib/python2.7/gzip.py", line 245, readself._read(readsize) File "/usr/lib/python2.7/gzip.py", line 287, in _readself._read_gzip_header() File "/usr/lib/python2.7/gzip.py", line 181, in _read_gzip_header raise IOError, 'Not a gzipped file'IOError: Not a gzipped file This is my code to send it over SMB, it might not make sense why i do things, but it's normally in a while loop and memory efficient, I just simplified it. buffer = cStringIO.StringIO(output) #output is from a subprocess call small_buffer = cStringIO.StringIO() small_string = buffer.read() #need a string to write to buffer gzip_obj = gzip.GzipFile(fileobj=small_buffer,compresslevel=6, mode='wb') gzip_obj.write(small_string) compressed_str = small_buffer.getvalue() blowfish = Blowfish.new('abcd', Blowfish.MODE_ECB) remainder = '|'*(8 - (len(compressed_str) % 8)) compressed_str += remainder encrypted = blowfish.encrypt(compressed_str) #i send it over smb, then retrieve it later Then this is the code that retrieves it: #buffer is a cStringIO object filled with data from smb retrieval decrypter = Blowfish.new('abcd', Blowfish.MODE_ECB) value = buffer.getvalue() decrypted = decrypter.decrypt(value) buff = cStringIO.StringIO(decrypted) buff.seek(0) gz = gzip.GzipFile(fileobj=buff) return gz.read() Here's the problem return gz.read()

    Read the article

  • Need help running Python app as service in Ubuntu with Upstart

    - by GreeenGuru
    I have written a logging application in Python that is meant to start at boot, but I've been unable to start the app with Ubuntu's Upstart init daemon. When run from the terminal with sudo /usr/local/greeenlog/main.pyw, the application works perfectly. Here is what I've tried for the Upstart job: /etc/init/greeenlog.conf # greeenlog description "I log stuff." start on startup stop on shutdown script exec /usr/local/greeenlog/main.pyw end script My application starts one child thread, in case that is important. I've tried the job with the expect fork stanza without any change in the results. I've also tried this with sudo and without the script statements (just a lone exec statement). In all cases, after boot, running status greeenlog returns greeenlog stop/waiting and running start greeenlog returns: start: Rejected send message, 1 matched rules; type="method_call", sender=":1.61" (uid=1000 pid=2496 comm="start) interface="com.ubuntu.Upstart0_6.Job" member="Start" error name="(unset)" requested_reply=0 destination="com.ubuntu.Upstart" (uid=0 pid=1 comm="/sbin/init")) Can anyone see what I'm doing wrong? I appreciate any help you can give. Thanks.

    Read the article

  • Extracting a .app from a zip file in Python, using ZipFile

    - by Yakattak
    I'm trying to extract new revisions of Chromium.app from their snapshots, and I can download the file fine, but when it comes to extracting it, ZipFile either extracts the chrome-mac folder within as a file, says that directories don't exist, etc. I am very new to python, so these errors make little sense to me. Here is what I have so far. import urllib2 response = urllib2.urlopen('http://build.chromium.org/buildbot/snapshots/chromium-rel-mac/LATEST') latestRev = response.read() print latestRev # we have the revision, now we need to download the zip and extract it latestZip = urllib2.urlopen('http://build.chromium.org/buildbot/snapshots/chromium-rel-mac/%i/chrome-mac.zip' % (int(latestRev)), '~/Desktop/ChromiumUpdate/%i-update' % (int(latestRev))) #declare some vars that hold paths n shit workingDir = '/Users/slehan/Desktop/ChromiumUpdate/' chromiumZipPath = '%s%i-update.zip' % (workingDir, (int(latestRev))) chromiumAppPath = 'chrome-mac/' #the path of the chromium executable within the zip file chromiumAppExtracted = '%s/Chromium.app' % (workingDir) # path of the extracted executable output = open(chromiumZipPath, 'w') #delete any current file there output.write(latestZip.read()) output.close() # we have the .zip now we need to extract the Chromium.app file, it's in ziproot/chrome-mac/Chromium.app import zipfile, os zippedFile = open(chromiumZipPath) zippedChromium = zipfile.ZipFile(zippedFile, 'r') zippedChromium.extract(chromiumAppPath, workingDir) #print zippedChromium.namelist() zippedChromium.close() #zippedChromium.close() Any ideas?

    Read the article

  • Finding specific words in a file (Python language)

    - by Caroline Yi
    I have to write a program in python where the user is given a menu with four different "word games". There is a file called dictionary.txt and one of the games requires the user to input a) the number of letters in a word and b) a letter to exclude from the words being searched in the dictionary (dictionary.txt has the whole dictionary). Then the program prints the words that follow the user's requirements. My question is how on earth do I open the file and search for words with a certain length in that file. I only have a basic code which only asks the user for inputs. I'm am very new at this please help :( this is what I have up to the first option. The others are fine and I know how to break the loop but this specific one is really giving me trouble. I have tried everything and I just keep getting errors. Honestly, I only took this class because someone said it would be fun. It is, but recently I've really been falling behind and I have no idea what to do now. This is an intro level course so please be nice I've never done this before until now :( print print "Choose Which Game You Want to Play" print "a) Find words with only one vowel and excluding a specific letter." print "b) Find words containing all but one of a set of letters." print "c) Find words containing a specific character string." print "d) Find words containing state abbreviations." print "e) Find US state capitals that start with months." print "q) Quit." print choice = raw_input("Enter a choice: ") choice = choice.lower() print choice while choice != "q": if choice == "a": #wordlen = word length user is looking for.s wordlen = raw_input("Please enter the word length you are looking for: ") wordlen = int(wordlen) print wordlen #letterex = letter user wishes to exclude. letterex = raw_input("Please enter the letter you'd like to exclude: ") letterex = letterex.lower() print letterex

    Read the article

  • Python subprocess: 64 bit windows server PIPE doesn't exist :(

    - by Spaceman1861
    I have a GUI that launches selected python scripts and runs it in cmd next to the gui window. I am able to get my launcher to work on my (windows xp 32 bit) laptop but when I upload it to the server(64bit windows iss7) I am running into some issues. The script runs, to my knowledge but spits back no information into the cmd window. My script is a bit of a Frankenstein that I have hacked and slashed together to get it to work I am fairly certain that this is a very bad example of the subprocess module. Just wondering if i could get a hand :). My question is how do i have to alter my code to work on a 64bit windows server. :) from Tkinter import * import pickle,subprocess,errno,time,sys,os PIPE = subprocess.PIPE if subprocess.mswindows: from win32file import ReadFile, WriteFile from win32pipe import PeekNamedPipe import msvcrt else: import select import fcntl def recv_some(p, t=.1, e=1, tr=5, stderr=0): if tr < 1: tr = 1 x = time.time()+t y = [] r = '' pr = p.recv if stderr: pr = p.recv_err while time.time() < x or r: r = pr() if r is None: if e: raise Exception(message) else: break elif r: y.append(r) else: time.sleep(max((x-time.time())/tr, 0)) return ''.join(y) def send_all(p, data): while len(data): sent = p.send(data) if sent is None: raise Exception(message) data = buffer(data, sent) The code above isn't mine def Run(): print filebox.get(0) location = filebox.get(0) location = location.__str__().replace(listbox.get(ANCHOR).__str__(),"") theTime = time.asctime(time.localtime(time.time())) lastbox.delete(0, END) lastbox.insert(END,theTime) for line in CookieCont: if listbox.get(ANCHOR) in line and len(line) > 4: line[4] = theTime else: "Fill In the rip Details to record the time" if __name__ == '__main__': if sys.platform == 'win32' or sys.platform == 'win64': shell, commands, tail = ('cmd', ('cd "'+location+'"',listbox.get(ANCHOR).__str__()), '\r\n') else: return "Please use contact admin" a = Popen(shell, stdin=PIPE, stdout=PIPE) print recv_some(a) for cmd in commands: send_all(a, cmd + tail) print recv_some(a) send_all(a, 'exit' + tail) print recv_some(a, e=0) The Code above is mine :)

    Read the article

  • Internationalizing a Python 2.6 application via Babel

    - by Malcolm
    We're evaluating Babel 0.9.5 [1] under Windows for use with Python 2.6 and have the following questions that we we've been unable to answer through reading the documentation or googling. 1) I would like to use an _ like abbreviation for ungettext. Is there a concencus on whether one should use n_ or N_ for this? n_ does not appear to work. Babel does not extract text. N_ appears to partially work. Babel extracts text like it does for gettext, but does not format for ngettext (missing plural argument and msgstr[ n ].) 2) Is there a way to set the initial msgstr fields like the following when creating a POT file? I suspect there may be a way to do this via Babel cfg files, but I've been unable to find documentation on the Babel cfg file format. "Project-Id-Version: PROJECT VERSION\n" "Language-Team: en_US \n" 3) Is there a way to preserve 'obsolete' msgid/msgstr's in our PO files? When I use the Babel update command, newly created obsolete strings are marked with #~ prefixes, but existing obsolete message strings get deleted. Thanks, Malcolm [1] http://babel.edgewall.org/

    Read the article

  • Python virtualenv questions

    - by orokusaki
    I'm using VirtualEnv on Windows XP. I'm wondering if I have my brain wrapped around it correctly. I ran virtualenv ENV and it created C:\WINDOWS\system32\ENV. I then changed my PATH variable to include C:\WINDOWS\system32\ENV\Scripts instead of C:\Python27\Scripts. Then, I checked out Django into C:\WINDOWS\system32\ENV\Lib\site-packages\django-trunk, updated my PYTHON_PATH variable to point the new Django directory, and continued to easy_install other things (which of course go into my new C:\WINDOWS\system32\ENV\Lib\site-packages directory). I understand why I should use VirtualEnv so I can run multiple versions of Django, and other libraries on the same machine, but does this mean that to switch between environments I have to basically change my PATH and PYTHON_PATH variable? So, I go from developing one Django project which uses Django 1.2 in an environment called ENV and then change my PATH and such so that I can use an environment called ENV2 which has the dev version of Django? Is that basically it, or is there some better way to automatically do all this (I could update my path in Python code, but that would require me to write machine-specific code in my application)? Also, how does this process compare to using VirtualEnv on Linux (I'm quite the beginner at Linux).

    Read the article

  • how to implement a really efficient bitvector sorting in python

    - by xiao
    Hello guys! Actually this is an interesting topic from programming pearls, sorting 10 digits telephone numbers in a limited memory with an efficient algorithm. You can find the whole story here What I am interested in is just how fast the implementation could be in python. I have done a naive implementation with the module bitvector. The code is as following: from BitVector import BitVector import timeit import random import time import sys def sort(input_li): return sorted(input_li) def vec_sort(input_li): bv = BitVector( size = len(input_li) ) for i in input_li: bv[i] = 1 res_li = [] for i in range(len(bv)): if bv[i]: res_li.append(i) return res_li if __name__ == "__main__": test_data = range(int(sys.argv[1])) print 'test_data size is:', sys.argv[1] random.shuffle(test_data) start = time.time() sort(test_data) elapsed = (time.time() - start) print "sort function takes " + str(elapsed) start = time.time() vec_sort(test_data) elapsed = (time.time() - start) print "sort function takes " + str(elapsed) start = time.time() vec_sort(test_data) elapsed = (time.time() - start) print "vec_sort function takes " + str(elapsed) I have tested from array size 100 to 10,000,000 in my macbook(2GHz Intel Core 2 Duo 2GB SDRAM), the result is as following: test_data size is: 1000 sort function takes 0.000274896621704 vec_sort function takes 0.00383687019348 test_data size is: 10000 sort function takes 0.00380706787109 vec_sort function takes 0.0371489524841 test_data size is: 100000 sort function takes 0.0520560741425 vec_sort function takes 0.374383926392 test_data size is: 1000000 sort function takes 0.867373943329 vec_sort function takes 3.80475401878 test_data size is: 10000000 sort function takes 12.9204008579 vec_sort function takes 38.8053860664 What disappoints me is that even when the test_data size is 100,000,000, the sort function is still faster than vec_sort. Is there any way to accelerate the vec_sort function?

    Read the article

  • Catch a thread's exception in the caller thread in Python

    - by Mikee
    Hi Everyone, I'm very new to Python and multithreaded programming in general. Basically, I have a script that will copy files to another location. I would like this to be placed in another thread so I can output "...." to indicate that the script is still running. The problem that I am having is that if the files cannot be copied it will throw an exception. This is ok if running in the main thread; however, having the following code does not work: try: threadClass = TheThread(param1, param2, etc.) threadClass.start() ##### **Exception takes place here** except: print "Caught an exception" In the thread class itself, I tried to re-throw the exception, but it does not work. I have seen people on here ask similar questions, but they all seem to be doing something more specific than what I am trying to do (and I don't quite understand the solutions offered). I have seen people mention the usage of sys.exc_info(), however I do not know where or how to use it. All help is greatly appreciated! EDIT: The code for the thread class is below: class TheThread(threading.Thread): def __init__(self, sourceFolder, destFolder): threading.Thread.__init__(self) self.sourceFolder = sourceFolder self.destFolder = destFolder def run(self): try: shul.copytree(self.sourceFolder, self.destFolder) except: raise

    Read the article

  • Simple continuously running XMPP client in python

    - by tom
    I'm using python-xmpp to send jabber messages. Everything works fine except that every time I want to send messages (every 15 minutes) I need to reconnect to the jabber server, and in the meantime the sending client is offline and cannot receive messages. So I want to write a really simple, indefinitely running xmpp client, that is online the whole time and can send (and receive) messages when required. My trivial (non-working) approach: import time import xmpp class Jabber(object): def __init__(self): server = 'example.com' username = 'bot' passwd = 'password' self.client = xmpp.Client(server) self.client.connect(server=(server, 5222)) self.client.auth(username, passwd, 'bot') self.client.sendInitPresence() self.sleep() def sleep(self): self.awake = False delay = 1 while not self.awake: time.sleep(delay) def wake(self): self.awake = True def auth(self, jid): self.client.getRoster().Authorize(jid) self.sleep() def send(self, jid, msg): message = xmpp.Message(jid, msg) message.setAttr('type', 'chat') self.client.send(message) self.sleep() if __name__ == '__main__': j = Jabber() time.sleep(3) j.wake() j.send('[email protected]', 'hello world') time.sleep(30) The problem here seems to be that I cannot wake it up. My best guess is that I need some kind of concurrency. Is that true, and if so how would I best go about that? EDIT: After looking into all the options concerning concurrency, I decided to go with twisted and wokkel. If I could, I would delete this post.

    Read the article

  • Difference in regex between Python and Rubular?

    - by Rosarch
    In Rubular, I have created a regular expression: (Prerequisite|Recommended): (\w|-| )* It matches the bolded: Recommended: good comfort level with computers and some of the arts. Summer. 2 credits. Prerequisite: pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only. Here is a use of the regex in Python: note_re = re.compile(r'(Prerequisite|Recommended): (\w|-| )*', re.IGNORECASE) def prereqs_of_note(note): match = note_re.match(note) if not match: return None return match.group(0) Unfortunately, the code returns None instead of a match: >>> import prereqs >>> result = prereqs.prereqs_of_note("Summer. 2 credits. Prerequisite: pre-fres hman standing or permission of instructor. Credit may not be applied toward engi neering degree. S-U grades only.") >>> print result None What am I doing wrong here?

    Read the article

  • Python Multiword Index

    - by Manab Chetia
    index = {'Michael': [['mj.com',1], ['Nine.com',9],['i.com', 34]], / 'Jackson': [['One.com',4],['mj.com', 2],['Nine.com', 10], ['i.com', 45]], / 'Thriller' : [['Seven.com', 7], ['Ten.com',10], ['One.com', 5], ['mj.com',3]} # In this dictionary (index), for eg: 'KEYWORD': # [['THE LINK in which KEYWORD is present,'POSITION # of KEYWORD in the page specified by link']] eg: Michael is present in MJ.com, NINE.com, and i.com at positions 1, 9, 34 of respective pages. Please help me with a python procedure which takes index and KEYWORDS as input. When i enter 'MICHAEL'. The result should be: >>['mj.com', 'nine.com', 'i.com'] When I enter 'MICHAEL JACKSON'. The result should be : >>['mj.com', 'Nine.com'] as 'Michael' and 'Jackson' are present at 'mj.com' and 'nine.com' consecutively i.e. in positions (1,2) & (9,10) respectively. The result should not show 'i.com' even though it contains both KEYWORDS but they are not placed consecutively. When I enter 'MICHAEL JACKSON THRILLER', the result should be ['mj.com'] as the 3 words 'MICHAEL', 'JACKSON', 'THRILLER' are placed consecutively in 'mj.com' ie positions (1, 2, 3) respectively. If I enter 'THRILLER JACKSON' or 'THRILLER FEDERER', the result should be NONE.

    Read the article

  • Python: Most efficient way to concatenate and rearrange files

    - by user300890
    Hi, I am reading from several files, each file is divided into 2 pieces, first a header section of a few thousand lines followed by a body of a few thousand. My problem is I need to concatenate these files into one file where all the headers are on the top followed by the body. Currently I am using two loops; one to pull out all the headers and write them, and the second to write the body of each file (I also include a tmp_count variable to limit the number of lines to be loading into memory before dumping to file). This is pretty slow - about 6min for 13gb file. Can anyone tell me how to optimize this or if there is a faster way to do this in python ? Thanks! Here is my code: def cat_files_sam(final_file_name,work_directory_master,file_count): final_file = open(final_file_name,"w") if len(file_count) > 1: file_count=sort_output_files(file_count) # only for @ headers for bowtie_file in file_count: #print bowtie_file tmp_list = [] tmp_count = 0 for line in open(os.path.join(work_directory_master,bowtie_file)): if line.startswith("@"): if tmp_count == 1000000: final_file.writelines(tmp_list) tmp_list = [] tmp_count = 0 tmp_list.append(line) tmp_count += 1 else: final_file.writelines(tmp_list) break for bowtie_file in file_count: #print bowtie_file tmp_list = [] tmp_count = 0 for line in open(os.path.join(work_directory_master,bowtie_file)): if line.startswith("@"): continue if tmp_count == 1000000: final_file.writelines(tmp_list) tmp_list = [] tmp_count = 0 tmp_list.append(line) tmp_count += 1 final_file.writelines(tmp_list) final_file.close()

    Read the article

  • Querying the Datastore in python

    - by Ray
    Greetings! I am trying to work with a single column in the datatstore, I can view and display the contents, like this - q = test.all() q.filter("adjclose =", "adjclose") q = db.GqlQuery("SELECT * FROM test") results = q.fetch(5) for p in results: p1 = p.adjclose print "The value is --> %f" % (p.adjclose) however i need to calculate the historical values with the adjclose column, and I am not able to get over the errors for c in range(len(p1)-1): TypeError: object of type 'float' has no len() here is my code! for c in range(len(p1)-1): p1.append(p1[c+1]-p1[c]/p1[c]) p2 = (p1[c+1]-p1[c]/p1[c]) print "the p1 value<-- %f" % (p2) print "dfd %f" %(p1) new to python, any help will be greatly appreciated! thanks in advance Ray HERE IS THE COMPLETE CODE class CalHandler(webapp.RequestHandler): def get(self): que = db.GqlQuery("SELECT * from test") user_list = que.fetch(limit=100) doRender( self, 'memberscreen2.htm', {'user_list': user_list} ) q = test.all() q.filter("adjclose =", "adjclose") q = db.GqlQuery("SELECT * FROM test") results = q.fetch(5) for p in results: p1 = p.adjclose print "The value is --> %f" % (p.adjclose) for c in range(len(p1)-1): p1.append(p1[c+1]-p1[c]/p1[c]) print "the p1 value<--> %f" % (p2) print "dfd %f" %(p1)

    Read the article

  • how to multithread on a python server

    - by user3732790
    HELP please i have this code import socket from threading import * import time HOST = '' # Symbolic name meaning all available interfaces PORT = 8888 # Arbitrary non-privileged port s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) print ('Socket created') s.bind((HOST, PORT)) print ('Socket bind complete') s.listen(10) print ('Socket now listening') def listen(conn): odata = "" end = 'end' while end == 'end': data = conn.recv(1024) if data != odata: odata = data print(data) if data == b'end': end = "" print("conection ended") conn.close() while True: time.sleep(1) conn, addr = s.accept() print ('Connected with ' + addr[0] + ':' + str(addr[1])) Thread.start_new_thread(listen,(conn)) and i would like it so that when ever a person comes onto the server it has its own thread. but i can't get it to work please someone help me. :_( here is the error code: Socket created Socket bind complete Socket now listening Connected with 127.0.0.1:61475 Traceback (most recent call last): File "C:\Users\Myles\Desktop\test recever - Copy.py", line 29, in <module> Thread.start_new_thread(listen,(conn)) AttributeError: type object 'Thread' has no attribute 'start_new_thread' i am on python version 3.4.0 and here is the users code: import socket #for sockets import time s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) print('Socket Created') host = 'localhost' port = 8888 remote_ip = socket.gethostbyname( host ) print('Ip address of ' + host + ' is ' + remote_ip) #Connect to remote server s.connect((remote_ip , port)) print ('Socket Connected to ' + host + ' on ip ' + remote_ip) while True: message = input("> ") #Set the whole string s.send(message.encode('utf-8')) print ('Message send successfully') data = s.recv(1024) print(data) s.close

    Read the article

  • Python optimization problem?

    - by user342079
    Alright, i had this homework recently (don't worry, i've already done it, but in c++) but I got curious how i could do it in python. The problem is about 2 light sources that emit light. I won't get into details tho. Here's the code (that I've managed to optimize a bit in the latter part): import math, array import numpy as np from PIL import Image size = (800,800) width, height = size s1x = width * 1./8 s1y = height * 1./8 s2x = width * 7./8 s2y = height * 7./8 r,g,b = (255,255,255) arr = np.zeros((width,height,3)) hy = math.hypot print 'computing distances (%s by %s)'%size, for i in xrange(width): if i%(width/10)==0: print i, if i%20==0: print '.', for j in xrange(height): d1 = hy(i-s1x,j-s1y) d2 = hy(i-s2x,j-s2y) arr[i][j] = abs(d1-d2) print '' arr2 = np.zeros((width,height,3),dtype="uint8") for ld in [200,116,100,84,68,52,36,20,8,4,2]: print 'now computing image for ld = '+str(ld) arr2 *= 0 arr2 += abs(arr%ld-ld/2)*(r,g,b)/(ld/2) print 'saving image...' ar2img = Image.fromarray(arr2) ar2img.save('ld'+str(ld).rjust(4,'0')+'.png') print 'saved as ld'+str(ld).rjust(4,'0')+'.png' I have managed to optimize most of it, but there's still a huge performance gap in the part with the 2 for-s, and I can't seem to think of a way to bypass that using common array operations... I'm open to suggestions :D

    Read the article

  • Python and the self parameter

    - by Svend
    I'm having some issues with the self parameter, and some seemingly inconsistent behavior in Python is annoying me, so I figure I better ask some people in the know. I have a class, Foo. This class will have a bunch of methods, m1, through mN. For some of these, I will use a standard definition, like in the case of m1 below. But for others, it's more convinient to just assign the method name directly, like I've done with m2 and m3. import os def myfun(x, y): return x + y class Foo(): def m1(self, y, z): return y + z + 42 m2 = os.access m3 = myfun f = Foo() print f.m1(1, 2) print f.m2("/", os.R_OK) print f.m3(3, 4) Now, I know that os.access does not take a self parameter (seemingly). And it still has no issues with this type of assignment. However, I cannot do the same for my own modules (imagine myfun defined off in mymodule.myfun). Running the above code yields the following output: 3 True Traceback (most recent call last): File "foo.py", line 16, in <module> print f.m3(3, 4) TypeError: myfun() takes exactly 2 arguments (3 given) The problem is that, due to the framework I work in, I cannot avoid having a class Foo at least. But I'd like to avoid having my mymodule stuff in a dummy class. In order to do this, I need to do something ala def m3(self,a1, a2): return mymodule.myfun(a1,a2) Which is hugely redundant when you have like 20 of them. So, the question is, either how do I do this in a totally different and obviously much smarter way, or how can I make my own modules behave like the built-in ones, so it does not complain about receiving 1 argument too many.

    Read the article

  • Creating subtree from tree which is represented in xml - python

    - by Jay
    Hi I have an XML (in the form of tree), I require to create sub-tree out of it. For ex: <a> <b> <c>Hello</c> <d> <e>Hi</e> </a> Subtree would be <root> <a> <b> <c>Hello</c> </b> </a> <a> <d> <e>Hi</e> </d> </a> </root> What is the best XML library in python to do it? Any algorithm that already does this would also be helpful. Note: the XML doc won't be that big, it will easily fit in memory.

    Read the article

  • Validating and filling default values in XML based on XSD in Python

    - by PoltoS
    I have an XML like <a> <b/> <b c="2"/> </a> I have my XSD <xs:element name="a"> <xs:complexType> <xs:sequence> <xs:element name="b" maxOccurs="unbounded"> <xs:attribute name="c" default="1"/> </xs:element> </xs:sequence> </xs:complexType> </xs:element> I want to use my XSD to validate my original XML and fill all default values: <a> <b c="1"/> <b c="2"/> </a> How do I get it in Python? With validation there is no problem (e.g. XMLSchema). The problem are the default values.

    Read the article

< Previous Page | 131 132 133 134 135 136 137 138 139 140 141 142  | Next Page >