Search Results


  • Python: Unpack arbitrary length bits for database storage

    - by sberry2A
    I have a binary data format consisting of 18,000+ packed int64s, ints, shorts, bytes and chars. The data is packed to minimize its size, so fields don't always fall on byte boundaries. For example, a number whose min and max values are 31 and 32 respectively might be stored in a single bit, where the actual value is bitvalue + min: 0 means 31 and 1 means 32. I am looking for the most efficient way to unpack all of these for subsequent processing and database storage. Right now I can read any value using either struct.unpack or BitBuffer: struct.unpack for any field that starts on a bit where (bit-offset % 8 == 0 and data-length % 8 == 0), and BitBuffer for everything else. I know the offset and size of every packed piece of data, so what is the fastest way to completely unpack them? Many thanks.
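    One approach, sketched below under the assumption that the field table can be expressed as (bit_offset, bit_length, minimum) tuples (hypothetical names, not from the question): treat the whole packed record as one big integer and pull every field out with shifts and masks, avoiding the byte-alignment special cases entirely.

        def unpack_fields(data, fields):
            # data: the packed record as a byte string
            # fields: iterable of (bit_offset, bit_length, minimum) tuples,
            #         a hypothetical stand-in for the real field table
            as_int = 0
            for ch in data:
                as_int = (as_int << 8) | ord(ch)
            total_bits = len(data) * 8
            values = []
            for bit_offset, bit_length, minimum in fields:
                shift = total_bits - bit_offset - bit_length
                mask = (1 << bit_length) - 1
                values.append(((as_int >> shift) & mask) + minimum)
            return values

        # e.g. one 1-bit field at offset 0 with minimum 31:
        # unpack_fields('\x80', [(0, 1, 31)]) -> [32]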

  • Python metaprogramming help

    - by Timmy
    I'm looking into mongoengine, and I wanted to make a class an "EmbeddedDocument" dynamically, so I do this:

        def custom(cls):
            cls = type(cls.__name__, (EmbeddedDocument,), cls.__dict__.copy())
            cls.a = FloatField(required=True)
            cls.b = FloatField(required=True)
            return cls

        A = custom(A)

    I tried it on some classes, but it skips some of the base class's initialization. BaseDocument has:

        def __init__(self, **values):
            self._data = {}
            # Assign initial values to instance
            for attr_name, attr_value in self._fields.items():
                if attr_name in values:
                    setattr(self, attr_name, values.pop(attr_name))
                else:
                    # Use default value if present
                    value = getattr(self, attr_name, None)
                    setattr(self, attr_name, value)

    but this never gets called, so ._data is never set, and I get errors. How do I do this?
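    One hedged direction, since it depends on mongoengine internals: EmbeddedDocument's fields and __init__ wiring are normally set up by its metaclass, which a bare type(...) call bypasses. A sketch that builds the subclass through the same metaclass:

        def custom(cls):
            # Use EmbeddedDocument's own metaclass so mongoengine can collect
            # the FloatFields into _fields (an assumption about its internals).
            attrs = dict(cls.__dict__)
            attrs.pop('__dict__', None)     # per-class slots, not copyable
            attrs.pop('__weakref__', None)
            attrs['a'] = FloatField(required=True)
            attrs['b'] = FloatField(required=True)
            metaclass = type(EmbeddedDocument)
            return metaclass(cls.__name__, (EmbeddedDocument,), attrs)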

  • Caching result of setUp() using Python unittest

    - by dbr
    I currently have a unittest.TestCase that looks like:

        class test_appletrailer(unittest.TestCase):
            def setUp(self):
                self.all_trailers = Trailers(res = "720", verbose = True)

            def test_has_trailers(self):
                self.failUnless(len(self.all_trailers) > 1)

            # ..more tests..

    This works fine, but the Trailers() call takes about 2 seconds to run. Given that setUp() is called before each test, the tests now take almost 10 seconds to run (with only 3 test functions). What is the correct way of caching the self.all_trailers variable between tests? Removing the setUp function and doing:

        class test_appletrailer(unittest.TestCase):
            all_trailers = Trailers(res = "720", verbose = True)

    works, but then it claims "Ran 3 tests in 0.000s", which is incorrect. The only other way I could think of is to have a cache_trailers global variable (which works correctly, but is rather horrible):

        cache_trailers = None

        class test_appletrailer(unittest.TestCase):
            def setUp(self):
                global cache_trailers
                if cache_trailers is None:
                    cache_trailers = self.all_trailers = Trailers(res = "720", verbose = True)
                else:
                    self.all_trailers = cache_trailers
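    One tidy option, assuming Python 2.7+ (or the unittest2 backport): setUpClass runs the expensive call once per TestCase class rather than once per test, and the test runner still counts the tests normally.

        class test_appletrailer(unittest.TestCase):
            @classmethod
            def setUpClass(cls):
                # runs once for the whole class, not before every test
                cls.all_trailers = Trailers(res="720", verbose=True)

            def test_has_trailers(self):
                self.failUnless(len(self.all_trailers) > 1)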

  • OpenMeetings + Python + Suds

    - by user366774
    Trying to integrate OpenMeetings with a Django website, but I can't understand how to properly configure the ImportDoctor:

        url = 'http://sovershenstvo.com.ua:5080/openmeetings/services/UserService?wsdl'
        imp = Import('http://schemas.xmlsoap.org/soap/encoding/')
        imp.filter.add('http://services.axis.openmeetings.org')
        imp.filter.add('http://basic.beans.hibernate.app.openmeetings.org/xsd')
        imp.filter.add('http://basic.beans.data.app.openmeetings.org/xsd')
        imp.filter.add('http://services.axis.openmeetings.org')
        d = ImportDoctor(imp)
        client = Client(url, doctor = d)
        client.service.getSession()

        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/usr/lib/python2.6/site-packages/suds/client.py", line 539, in __call__
            return client.invoke(args, kwargs)
          File "/usr/lib/python2.6/site-packages/suds/client.py", line 598, in invoke
            result = self.send(msg)
          File "/usr/lib/python2.6/site-packages/suds/client.py", line 627, in send
            result = self.succeeded(binding, reply.message)
          File "/usr/lib/python2.6/site-packages/suds/client.py", line 659, in succeeded
            r, p = binding.get_reply(self.method, reply)
          File "/usr/lib/python2.6/site-packages/suds/bindings/binding.py", line 159, in get_reply
            resolved = rtypes[0].resolve(nobuiltin=True)
          File "/usr/lib/python2.6/site-packages/suds/xsd/sxbasic.py", line 63, in resolve
            raise TypeNotFound(qref)
        suds.TypeNotFound: Type not found: '(Sessiondata, http://basic.beans.hibernate.app.openmeetings.org/xsd, )'

    What am I doing wrong? Please help, and sorry for my English; I need webinars working by morning (it is 2.26 am now).
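    The traceback says suds cannot resolve the Sessiondata reply type, which lives in the basic.beans.hibernate namespace. A guess, sketched below and not verified against OpenMeetings: doctor that namespace into the service namespace as well, so the reply type becomes resolvable.

        imp = Import('http://schemas.xmlsoap.org/soap/encoding/')
        imp.filter.add('http://services.axis.openmeetings.org')
        # assumption: the beans schema also needs importing into the service namespace
        imp2 = Import('http://basic.beans.hibernate.app.openmeetings.org/xsd')
        imp2.filter.add('http://services.axis.openmeetings.org')
        client = Client(url, doctor=ImportDoctor(imp, imp2))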

  • Dynamic Operator Overloading on dict classes in Python

    - by Ishpeck
    I have a class that dynamically overloads basic arithmetic operators, like so:

        import operator

        class IshyNum:
            def __init__(self, n):
                self.num = n
                self.buildArith()

            def arithmetic(self, other, o):
                return o(self.num, other)

            def buildArith(self):
                map(lambda o: setattr(self, "__%s__" % o,
                                      lambda f: self.arithmetic(f, getattr(operator, o))),
                    ["add", "sub", "mul", "div"])

        if __name__ == "__main__":
            number = IshyNum(5)
            print number + 5
            print number / 2
            print number * 3
            print number - 3

    But if I change the class to inherit from the dictionary (class IshyNum(dict):), it doesn't work. I need to explicitly def __add__(self, other) or whatever in order for this to work. Why?
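    The likely cause: inheriting from dict makes IshyNum a new-style class, and new-style classes look special methods up on the type, not the instance, so the __add__ set on self is never consulted. A sketch that installs the generated methods on the class instead:

        import operator

        class IshyNum(dict):
            def __init__(self, n):
                super(IshyNum, self).__init__()
                self.num = n

            def _arithmetic(self, other, op):
                return op(self.num, other)

        for name in ["add", "sub", "mul", "div"]:
            # op=... freezes the right operator inside each generated lambda
            setattr(IshyNum, "__%s__" % name,
                    lambda self, other, op=getattr(operator, name): self._arithmetic(other, op))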

  • Compute the mean of a generator in Python

    - by nmaxwell
    Hi, I'm doing some statistics work and I have a (large) collection of random numbers to compute the mean of. I'd like to work with generators, because I only need to compute the mean, so I don't need to store the numbers. The problem is that numpy.mean breaks if you pass it a generator. I can write a simple function to do what I want, but I'm wondering if there's a proper, built-in way to do this? It would be nice if I could say "sum(values)/len(values)", but len doesn't work for generators, and sum has already consumed values. Here's an example:

        import numpy

        def my_mean(values):
            n = 0
            Sum = 0.0
            try:
                while True:
                    Sum += next(values)
                    n += 1
            except StopIteration:
                pass
            return float(Sum) / n

        X = [k for k in range(1, 7)]
        Y = (k for k in range(1, 7))
        print numpy.mean(X)
        print my_mean(Y)

    These both give the same, correct answer, but my_mean doesn't work for lists and numpy.mean doesn't work for generators. I really like the idea of working with generators, but details like this seem to spoil things. Thanks for any help. -nick
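    For what it's worth, a single function can cover both cases: a plain for loop iterates lists and generators alike, in one pass and constant memory.

        def mean(iterable):
            n, total = 0, 0.0
            for value in iterable:  # works for lists and generators
                total += value
                n += 1
            if n == 0:
                raise ValueError("mean of an empty iterable")
            return total / n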

  • Restart logging to a new file (Python)

    - by compie
    I'm using the following code to initialize logging in my application.

        logger = logging.getLogger()
        logger.setLevel(logging.DEBUG)

        # log to a file
        directory = '/reserved/DYPE/logfiles'
        now = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = os.path.join(directory, 'dype_%s.log' % now)
        file_handler = logging.FileHandler(filename)
        file_handler.setLevel(logging.DEBUG)
        formatter = logging.Formatter("%(asctime)s %(filename)s, %(lineno)d, %(funcName)s: %(message)s")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

        # log to the console
        console_handler = logging.StreamHandler()
        level = logging.INFO
        console_handler.setLevel(level)
        logger.addHandler(console_handler)

        logging.debug('logging initialized')

    How can I close the current logging file and restart logging to a new file? Note: I don't want to use RotatingFileHandler, because I want full control over all the filenames and the moment of rotation.
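    A minimal sketch of such a restart: detach and close the old FileHandler, then attach a fresh one; removeHandler() and close() are the standard logging calls for this.

        def restart_logging(logger, old_handler, new_filename, formatter):
            logger.removeHandler(old_handler)
            old_handler.close()
            new_handler = logging.FileHandler(new_filename)
            new_handler.setLevel(logging.DEBUG)
            new_handler.setFormatter(formatter)
            logger.addHandler(new_handler)
            return new_handler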

  • Python RegExp exception

    - by Jasie
    How do I split on all non-alphanumeric characters, EXCEPT the apostrophe?

        re.split(r'\W+', text)

    works, but it also splits on apostrophes. How do I add an exception to this rule? Thanks!
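    A sketch using an explicit character class instead of \W: split on runs of anything that is neither a word character nor an apostrophe.

        import re

        words = re.split(r"[^\w']+", "don't split-me, please")
        # -> ["don't", 'split', 'me', 'please']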

  • Supplying inputs to Python unit tests

    - by zubin71
    I'm relatively new to the concept of unit testing and have very little experience with it. I have been looking at lots of articles on how to write unit tests; however, I still have difficulty writing tests where conditions like the following arise:

    - Test user input.
    - Test input read from a file.
    - Test input read from an environment variable.

    It'd be great if someone could show me how to approach the above scenarios; it'd be even better if you could point me to a few docs/articles/blog posts that I could read.
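    The common pattern for all three is the same: substitute a controlled value for the external source during the test. A sketch (Python 2 StringIO; the names are illustrative, not from the question):

        import os
        import sys
        import unittest
        from StringIO import StringIO

        class InputTests(unittest.TestCase):
            def test_user_input(self):
                # swap sys.stdin for an in-memory file during the test
                saved, sys.stdin = sys.stdin, StringIO("yes\n")
                try:
                    self.assertEqual(raw_input(), "yes")
                finally:
                    sys.stdin = saved

            def test_file_input(self):
                # any code expecting a file object accepts a StringIO
                fake_file = StringIO("line1\nline2\n")
                self.assertEqual(fake_file.readline(), "line1\n")

            def test_env_var(self):
                os.environ['APP_MODE'] = 'testing'  # hypothetical variable
                # ... call the code under test here and assert on its behaviour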

  • Python 4-step setup with progress bars

    - by Samuel Taylor
    I'm having a problem with the code below. When I run it, the progress bar pulses for around 10 seconds as intended, then moves on to downloading and shows the progress, but when the download finishes it does not move on to the next step; it just locks up.

        import sys
        import time
        import pygtk
        import gtk
        import gobject
        import threading
        import urllib
        import urlparse

        class WorkerThread(threading.Thread):
            def __init__(self, function, parent, arg = None):
                threading.Thread.__init__(self)
                self.function = function
                self.parent = parent
                self.arg = arg
                self.parent.still_working = True

            def run(self):  # when does "run" get executed?
                self.parent.still_working = True
                if self.arg == None:
                    self.function()
                else:
                    self.function(self.arg)
                self.parent.still_working = False

            def stop(self):
                self = None

        class MainWindow:
            def __init__(self):
                gtk.gdk.threads_init()
                self.wTree = gtk.Builder()
                self.wTree.add_from_file("gui.glade")
                self.mainWindows()

            def mainWindows(self):
                self.mainWindow = self.wTree.get_object("frmMain")
                dic = {
                    "on_btnNext_clicked" : self.mainWindowNext,
                }
                self.wTree.connect_signals(dic)
                self.mainWindow.show()
                self.installerStep = 0  # 0 = none, 1 = preinstall, 2 = download, 3 = install info, 4 = install
                #gtk.main()
                self.mainWindowNext()

            def pulse(self):
                self.wTree.get_object("progress").pulse()
                if self.still_working == False:
                    self.mainWindowNext()
                return self.still_working

            def preinstallStep(self):
                self.wTree.get_object("progress").set_fraction(0)
                self.wTree.get_object("btnNext").set_sensitive(0)
                self.wTree.get_object("notebook1").set_current_page(0)
                self.installerStep = 1
                WT = WorkerThread(self.heavyWork, self)  # would do some heavy setup function here
                WT.start()
                gobject.timeout_add(75, self.pulse)

            def downloadStep(self):
                self.wTree.get_object("progress").set_fraction(0)
                self.wTree.get_object("btnNext").set_sensitive(0)
                self.wTree.get_object("notebook1").set_current_page(0)
                self.installerStep = 2
                urllib.urlretrieve('http://mozilla.mirrors.evolva.ro//firefox/releases/3.6.3/win32/en-US/Firefox%20Setup%203.6.3.exe',
                                   '/tmp/firefox.exe', self.updateHook)
                self.mainWindowNext()

            def updateHook(self, blocks, blockSize, totalSize):
                percentage = float(blocks * blockSize) / totalSize
                if percentage > 1:
                    percentage = 1
                self.wTree.get_object("progress").set_fraction(percentage)
                while gtk.events_pending():
                    gtk.main_iteration()

            def installInfoStep(self):
                self.wTree.get_object("btnNext").set_sensitive(1)
                self.wTree.get_object("notebook1").set_current_page(1)
                self.installerStep = 3

            def installStep(self):
                self.wTree.get_object("progress").set_fraction(0)
                self.wTree.get_object("btnNext").set_sensitive(0)
                self.wTree.get_object("notebook1").set_current_page(0)
                self.installerStep = 4
                WT = WorkerThread(self.heavyWork, self)  # would do some heavy setup function here
                WT.start()
                gobject.timeout_add(75, self.pulse)

            def mainWindowNext(self, widget = None):
                if self.installerStep == 0:
                    self.preinstallStep()
                elif self.installerStep == 1:
                    self.downloadStep()
                elif self.installerStep == 2:
                    self.installInfoStep()
                elif self.installerStep == 3:
                    self.installStep()
                elif self.installerStep == 4:
                    sys.exit(0)

            def heavyWork(self):
                time.sleep(10)

        if __name__ == '__main__':
            MainWindow()
            gtk.main()

    I have a feeling that it's something to do with:

        while gtk.events_pending():
            gtk.main_iteration()

    Is there a better way of doing this?
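    Possibly a cleaner shape, sketched under the assumption that gtk.gdk.threads_init() has already run: download in a worker thread and marshal UI updates back to the main loop with gobject.idle_add, instead of pumping gtk.main_iteration() from inside the urlretrieve callback.

        def downloadStep(self):
            url = ('http://mozilla.mirrors.evolva.ro//firefox/releases/3.6.3/'
                   'win32/en-US/Firefox%20Setup%203.6.3.exe')

            def hook(blocks, block_size, total_size):
                fraction = min(float(blocks * block_size) / total_size, 1.0)
                # hand the widget update to the GTK main loop
                gobject.idle_add(self.wTree.get_object("progress").set_fraction, fraction)

            def work():
                urllib.urlretrieve(url, '/tmp/firefox.exe', hook)
                gobject.idle_add(self.mainWindowNext)  # advance once the download is done

            threading.Thread(target=work).start()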

  • Strip spaces in Python

    - by Richard
    OK, I know this should be simple... anyway, say:

        line = "$W5M5A,100527,142500,730301c44892fd1c,2,686.5 4,333.96,0,0,28.6,123,75,-0.4,1.4*49"

    I want to strip out the spaces. I thought you would just do:

        line = line.strip()

    but line is still

        '$W5M5A,100527,142500,730301c44892fd1c,2,686.5 4,333.96,0,0,28.6,123,75,-0.4,1.4*49'

    instead of

        '$W5M5A,100527,142500,730301c44892fd1c,2,686.54,333.96,0,0,28.6,123,75,-0.4,1.4*49'

    Any thoughts?
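    str.strip() only trims whitespace from the ends of the string; to delete every space, replace them instead.

        line = line.replace(" ", "")        # removes all spaces
        # or: line = ''.join(line.split())  # removes all whitespace, tabs included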

  • Python - pickling fails for numpy.void objects

    - by I82Much
        >>> idmapfile = open("idmap", mode="w")
        >>> pickle.dump(idMap, idmapfile)
        >>> idmapfile.close()
        >>> idmapfile = open("idmap")
        >>> unpickled = pickle.load(idmapfile)
        >>> unpickled == idMap
        False
        >>> idMap[1]
        {1537: (552, 1, 1537, 17.793827056884766, 3), 1540: (4220, 1, 1540, 19.31205940246582, 3), 1544: (592, 1, 1544, 18.129131317138672, 3), 1675: (529, 1, 1675, 18.347782135009766, 3), 1550: (4048, 1, 1550, 19.31205940246582, 3), 1424: (1528, 1, 1424, 19.744396209716797, 3), 1681: (1265, 1, 1681, 19.596025466918945, 3), 1560: (3457, 1, 1560, 20.530569076538086, 3), 1690: (477, 1, 1690, 17.395542144775391, 3), 1691: (554, 1, 1691, 13.446117401123047, 3), 1436: (3010, 1, 1436, 19.596025466918945, 3), 1434: (3183, 1, 1434, 19.744396209716797, 3), 1441: (3570, 1, 1441, 20.589576721191406, 3), 1435: (476, 1, 1435, 19.640911102294922, 3), 1444: (527, 1, 1444, 17.98480224609375, 3), 1478: (1897, 1, 1478, 19.596025466918945, 3), 1575: (614, 1, 1575, 19.371648788452148, 3), 1586: (2189, 1, 1586, 19.31205940246582, 3), 1716: (3470, 1, 1716, 19.158674240112305, 3), 1590: (2278, 1, 1590, 19.596025466918945, 3), 1463: (991, 1, 1463, 19.31205940246582, 3), 1594: (1890, 1, 1594, 19.596025466918945, 3), 1467: (1087, 1, 1467, 19.31205940246582, 3), 1596: (3759, 1, 1596, 19.744396209716797, 3), 1602: (3011, 1, 1602, 20.530569076538086, 3), 1547: (490, 1, 1547, 17.994071960449219, 3), 1605: (658, 1, 1605, 19.31205940246582, 3), 1606: (1794, 1, 1606, 16.964881896972656, 3), 1719: (1826, 1, 1719, 19.596025466918945, 3), 1617: (583, 1, 1617, 11.894925117492676, 3), 1492: (3441, 1, 1492, 20.500667572021484, 3), 1622: (3215, 1, 1622, 19.31205940246582, 3), 1628: (2761, 1, 1628, 19.744396209716797, 3), 1502: (1563, 1, 1502, 19.596025466918945, 3), 1632: (1108, 1, 1632, 15.457141876220703, 3), 1468: (3779, 1, 1468, 19.596025466918945, 3), 1642: (3970, 1, 1642, 19.744396209716797, 3), 1518: (612, 1, 1518, 18.570245742797852, 3), 1647: (854, 1, 1647, 16.964881896972656, 3), 1650: (2099, 1, 1650, 20.439058303833008, 3), 1651: (540, 1, 1651, 18.552841186523438, 3), 1653: (613, 1, 1653, 19.237197875976563, 3), 1532: (537, 1, 1532, 18.885730743408203, 3)}
        >>> unpickled[1]
        {1537: (64880, 1638, 56700, -1.0808743559293829e+18, 152), 1540: (64904, 1638, 0, 0.0, 0), 1544: (54472, 1490, 0, 0.0, 0), 1675: (6464, 1509, 0, 0.0, 0), 1550: (43592, 1510, 0, 0.0, 0), 1424: (43616, 1510, 0, 0.0, 0), 1681: (0, 0, 0, 0.0, 0), 1560: (400, 152, 400, 2.1299736657737219e-43, 0), 1690: (408, 152, 408, 2.7201111331839077e+26, 34), 1435: (424, 152, 61512, 1.0122952080313192e-39, 0), 1436: (400, 152, 400, 20.250289916992188, 3), 1434: (424, 152, 62080, 1.0122952080313192e-39, 0), 1441: (400, 152, 400, 12.250144958496094, 3), 1691: (424, 152, 42608, 15.813941955566406, 3), 1444: (400, 152, 400, 19.625289916992187, 3), 1606: (424, 152, 42432, 5.2947192852601414e-22, 41), 1575: (400, 152, 400, 6.2537390010262572e-36, 0), 1586: (424, 152, 42488, 1.0122601755697111e-39, 0), 1716: (400, 152, 400, 6.2537390010262572e-36, 0), 1590: (424, 152, 64144, 1.0126357235581501e-39, 0), 1463: (400, 152, 400, 6.2537390010262572e-36, 0), 1594: (424, 152, 32672, 17.002994537353516, 3), 1467: (400, 152, 400, 19.750289916992187, 3), 1596: (424, 152, 7176, 1.0124003054161436e-39, 0), 1602: (400, 152, 400, 18.500289916992188, 3), 1547: (424, 152, 7000, 1.0124003054161436e-39, 0), 1605: (400, 152, 400, 20.500289916992188, 3), 1478: (424, 152, 42256, -6.0222748507426518e+30, 222), 1719: (400, 152, 400, 6.2537390010262572e-36, 0), 1617: (424, 152, 16472, 1.0124283313854301e-39, 0), 1492: (400, 152, 400, 6.2537390010262572e-36, 0), 1622: (424, 152, 35304, 1.0123190301052127e-39, 0), 1628: (400, 152, 400, 6.2537390010262572e-36, 0), 1502: (424, 152, 63152, 19.627988815307617, 3), 1632: (400, 152, 400, 19.375289916992188, 3), 1468: (424, 152, 38088, 1.0124213248931084e-39, 0), 1642: (400, 152, 400, 6.2537390010262572e-36, 0), 1518: (424, 152, 63896, 1.0127436235399031e-39, 0), 1647: (400, 152, 400, 6.2537390010262572e-36, 0), 1650: (424, 152, 53424, 16.752857208251953, 3), 1651: (400, 152, 400, 19.250289916992188, 3), 1653: (424, 152, 50624, 1.0126497365427934e-39, 0), 1532: (400, 152, 400, 6.2537390010262572e-36, 0)}

    The keys come out fine, but the values are screwed up. I tried the same thing loading the file in binary mode; that didn't fix the problem. Any idea what I'm doing wrong?

    Edit: Here's the code with binary mode. Note that the values are different again in the unpickled object.

        >>> idmapfile = open("idmap", mode="wb")
        >>> pickle.dump(idMap, idmapfile)
        >>> idmapfile.close()
        >>> idmapfile = open("idmap", mode="rb")
        >>> unpickled = pickle.load(idmapfile)
        >>> unpickled == idMap
        False
        >>> unpickled[1]
        {1537: (12176, 2281, 56700, -1.0808743559293829e+18, 152), 1540: (0, 0, 15934, 2.7457842047810522e+26, 108), 1544: (400, 152, 400, 4.9518498821046956e+27, 53), 1675: (408, 152, 408, 2.7201111331839077e+26, 34), 1550: (456, 152, 456, -1.1349175514578289e+18, 152), 1424: (432, 152, 432, 4.5939047815653343e-40, 11), 1681: (408, 152, 408, 2.1299736657737219e-43, 0), 1560: (376, 152, 376, 2.1299736657737219e-43, 0), 1690: (376, 152, 376, 2.1299736657737219e-43, 0), 1435: (376, 152, 376, 2.1299736657737219e-43, 0), 1436: (376, 152, 376, 2.1299736657737219e-43, 0), 1434: (376, 152, 376, 2.1299736657737219e-43, 0), 1441: (376, 152, 376, 2.1299736657737219e-43, 0), 1691: (376, 152, 376, 2.1299736657737219e-43, 0), 1444: (376, 152, 376, 2.1299736657737219e-43, 0), 1606: (25784, 2281, 376, -3.2883343074537754e+26, 34), 1575: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1586: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1716: (24240, 2281, 376, -3.0093091599657311e-35, 26), 1590: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1463: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1594: (24240, 2281, 376, -4123208450048.0, 196), 1467: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1596: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1602: (25784, 2281, 376, -5.9963281433905448e+26, 76), 1547: (25784, 2281, 376, -218106240.0, 139), 1605: (25784, 2281, 376, -3.7138649803377281e+27, 56), 1478: (376, 152, 376, 2.1299736657737219e-43, 0), 1719: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1617: (25784, 2281, 376, -1.4411779941597184e+17, 237), 1492: (25784, 2281, 376, 2.8596493694487798e-30, 80), 1622: (25784, 2281, 376, 184686084096.0, 93), 1628: (1336, 152, 1336, 3.1691839245470052e+29, 179), 1502: (1272, 152, 1272, -5.2042207205116645e-17, 99), 1632: (1208, 152, 1208, 2.1299736657737219e-43, 0), 1468: (1144, 152, 1144, 2.1299736657737219e-43, 0), 1642: (1080, 152, 1080, 2.1299736657737219e-43, 0), 1518: (1016, 152, 1016, 4.0240902787680023e+35, 145), 1647: (952, 152, 952, -985172619034624.0, 237), 1650: (888, 152, 888, 12094787289088.0, 66), 1651: (824, 152, 824, 2.1299736657737219e-43, 0), 1653: (760, 152, 760, 0.00018310768064111471, 238), 1532: (696, 152, 696, 8.8978061885676389e+26, 125)}

    OK, I've isolated the problem, but I don't know why it happens. First, apparently what I'm pickling are not tuples (though they look like it) but numpy.void objects. Here is a series to illustrate the problem:

        >>> first = run0.detections[0]
        >>> first
        (1, 19, 1578, 82.637763977050781, 1)
        >>> type(first)
        <type 'numpy.void'>
        >>> firstTuple = tuple(first)
        >>> theFile = open("pickleTest", "w")
        >>> pickle.dump(first, theFile)
        >>> theTupleFile = open("pickleTupleTest", "w")
        >>> pickle.dump(firstTuple, theTupleFile)
        >>> theFile.close()
        >>> theTupleFile.close()
        >>> first
        (1, 19, 1578, 82.637763977050781, 1)
        >>> firstTuple
        (1, 19, 1578, 82.637764, 1)
        >>> theFile = open("pickleTest", "r")
        >>> theTupleFile = open("pickleTupleTest", "r")
        >>> unpickledTuple = pickle.load(theTupleFile)
        >>> unpickledVoid = pickle.load(theFile)
        >>> type(unpickledVoid)
        <type 'numpy.void'>
        >>> type(unpickledTuple)
        <type 'tuple'>
        >>> unpickledTuple
        (1, 19, 1578, 82.637764, 1)
        >>> unpickledTuple == firstTuple
        True
        >>> unpickledVoid == first
        False
        >>> unpickledVoid
        (7936, 1705, 56700, -1.0808743559293829e+18, 152)
        >>> first
        (1, 19, 1578, 82.637763977050781, 1)
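    A workaround sketch, assuming idMap maps keys to dicts of numpy records as shown above: convert every numpy.void record to a plain tuple before pickling, since the session shows that tuples round-trip correctly.

        safe_map = dict(
            (key, dict((k, tuple(v)) for k, v in inner.iteritems()))
            for key, inner in idMap.iteritems())
        pickle.dump(safe_map, open("idmap", "wb"))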

  • Python hash() can't handle long integer?

    - by Xie
    I defined a class:

        class A:
            '''hash test class

            a = A(9, 1196833379, 1, 1773396906)
            hash(a)   # -> -340004569
            This is weird; 12544897317L expected.
            '''
            def __init__(self, a, b, c, d):
                self.a = a
                self.b = b
                self.c = c
                self.d = d

            def __hash__(self):
                return self.a * self.b + self.c * self.d

    Why does hash() give a negative integer in the doctest?
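    The mechanism, with the caveat that the folded value is platform dependent: a hash must fit in a machine word, so when __hash__ returns a long, CPython replaces it with the hash of that long; big values wrap around and can come out negative. Equal values still hash equal, which is all the hash contract requires.

        a = A(9, 1196833379, 1, 1773396906)
        # CPython folds an overlong __hash__ result by hashing it again:
        assert hash(a) == hash(12544897317L)  # negative on 32-bit builds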

  • Python/YACC Lexer: Token priority?

    - by Rosarch
    I'm trying to use reserved words in my grammar:

        reserved = {
            'if': 'IF',
            'then': 'THEN',
            'else': 'ELSE',
            'while': 'WHILE',
        }

        tokens = [
            'DEPT_CODE',
            'COURSE_NUMBER',
            'OR_CONJ',
            'ID',
        ] + list(reserved.values())

        t_DEPT_CODE = r'[A-Z]{2,}'
        t_COURSE_NUMBER = r'[0-9]{4}'
        t_OR_CONJ = r'or'
        t_ignore = ' \t'

        def t_ID(t):
            r'[a-zA-Z_][a-zA-Z_0-9]*'
            if t.value in reserved.values():  # note: probably meant `t.value in reserved`
                t.type = reserved[t.value]
                return t
            return None

    However, the t_ID rule somehow swallows up DEPT_CODE and OR_CONJ. How can I get around this? I'd like those two to take higher precedence than the reserved words.
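    The usual PLY explanation: rules defined as functions are matched in their order of definition, before all string rules, so the t_ID function always beats the t_DEPT_CODE and t_OR_CONJ strings. A sketch that promotes the specific rules to functions defined first:

        def t_DEPT_CODE(t):
            r'[A-Z]{2,}'
            return t

        def t_OR_CONJ(t):
            r'or\b'
            return t

        def t_ID(t):
            r'[a-zA-Z_][a-zA-Z_0-9]*'
            t.type = reserved.get(t.value, 'ID')  # look up the keys, not the values
            return t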

  • How can I randomly print an element from a list in Python

    - by lm
    So far I have this, which prints out every word in my list, but I am trying to print only one word at random. Any suggestions?

        def main():
            # open a file
            wordsf = open('words.txt', 'r')
            word = random.choice('wordsf')  # note: this picks a random *character* of the string 'wordsf'
            words_count = 0
            for line in wordsf:
                word = line.rstrip('\n')
                print(word)
                words_count += 1
            # close the file
            wordsf.close()
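    A sketch of the fix: read the words into a list first, then hand the list itself (not the string 'wordsf') to random.choice.

        import random

        def main():
            with open('words.txt', 'r') as wordsf:
                words = [line.rstrip('\n') for line in wordsf]
            print(random.choice(words))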

  • Python CairoPlot: store previous readings

    - by krisdigitx
    Hi, I am using CairoPlot to make graphs, but the file I read the data from keeps growing and processing it takes a long time. Is there a real-time way to produce Cairo graphs, or at least a way to store the previous readings, like RRD? -krisdigitx
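    Not CairoPlot-specific, but a sketch of the RRD-style idea: cap how much history is kept so each plotting pass touches a bounded window of readings.

        from collections import deque

        window = deque(maxlen=10000)  # keeps only the newest 10,000 readings
        with open('readings.log') as f:
            for line in f:
                window.append(float(line))
        # plot(window)  -- hypothetical call into the charting code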

  • Python: Taking an array and breaking it into subarrays based on some criteria

    - by randombits
    I have an array of files. I'd like to break that array down into one array of multiple subarrays, where each subarray contains files created on the same day. So if the array contains files from March 1 - March 31, I'd like an array with 31 subarrays (assuming there is at least 1 file for each day). In the long run, I'm trying to find the file from each day with the latest creation/modification time. If there is a way to bundle that into the iteration required above to save some CPU cycles, that would be even more ideal. Then I'd have one flat array of 31 files: the latest file created on each individual day.
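    A sketch that folds both steps into one pass, assuming `files` is a list of paths: keep, per calendar day, only the file with the newest modification time.

        import os
        from datetime import date

        def latest_per_day(files):
            newest = {}  # day -> (mtime, path)
            for path in files:
                mtime = os.path.getmtime(path)
                day = date.fromtimestamp(mtime)
                if day not in newest or mtime > newest[day][0]:
                    newest[day] = (mtime, path)
            return [p for _, p in newest.values()]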

  • Python - Subprocess Popen and Thread error

    - by n0idea
    In both functions, record and ftp, I have a subprocess.Popen call.

        if __name__ == '__main__':
            try:
                t1 = threading.Thread(target = record)
                t1.daemon = True
                t1.start()
                t2 = threading.Thread(target = ftp)
                t2.daemon = True
                t2.start()
            except (KeyboardInterrupt, SystemExit):
                sys.exit()

    The error I'm receiving is:

        Exception in thread Thread-1 (most likely raised during interpreter shutdown):
        Traceback (most recent call last):
          File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
          File "/usr/lib/python2.7/threading.py", line 504, in run
          File "./in.py", line 20, in recordaudio
          File "/usr/lib/python2.7/subprocess.py", line 493, in call
          File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
          File "/usr/lib/python2.7/subprocess.py", line 1237, in _execute_child
        <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'close'

    What might the issue be?
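    The parenthetical in the error is the clue: both threads are daemons, so the main thread falls off the end immediately and the interpreter begins tearing modules down while subprocess is still running. A sketch that keeps the main thread alive until the workers finish:

        t1.start()
        t2.start()
        try:
            # join with a timeout so Ctrl-C still reaches the main thread
            while t1.is_alive() or t2.is_alive():
                t1.join(1)
                t2.join(1)
        except (KeyboardInterrupt, SystemExit):
            sys.exit()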

  • Optimizing Python code performance when importing zipped CSV into a Mongo collection

    - by mark
    I need to import a zipped csv into a mongo collection, but there is a catch: every record contains a timestamp in Pacific Time, which must be converted to the local time corresponding to the (longitude, latitude) pair found in the same record. The code looks like so:

        def read_csv_zip(path, timezones):
            with ZipFile(path) as z, z.open(z.namelist()[0]) as input:
                csv_rows = csv.reader(input)
                header = csv_rows.next()
                check, converters = get_aux_stuff(header)
                for csv_row in csv_rows:
                    if check(csv_row):
                        row = {
                            converter[0]: converter[1](value)
                            for converter, value in zip(converters, csv_row)
                            if allow_field(converter)
                        }
                        ts = row['ts']
                        lng, lat = row['loc']
                        found_tz_entry = timezones.find_one(SON({'loc': {'$within': {'$box': [
                            [lng - tz_lookup_radius, lat - tz_lookup_radius],
                            [lng + tz_lookup_radius, lat + tz_lookup_radius]]}}}))
                        if found_tz_entry:
                            tz_name = found_tz_entry['tz']
                            local_ts = ts.astimezone(timezone(tz_name)).replace(tzinfo=None)
                            row['tz'] = tz_name
                        else:
                            local_ts = (ts.astimezone(utc) + timedelta(hours=int(lng / 15))).replace(tzinfo=None)
                        row['local_ts'] = local_ts
                        yield row

        def insert_documents(collection, source, batch_size):
            while True:
                items = list(itertools.islice(source, batch_size))
                if len(items) == 0:
                    break
                try:
                    collection.insert(items)
                except:
                    for item in items:
                        try:
                            collection.insert(item)
                        except Exception as exc:
                            print("Failed to insert record {0} - {1}".format(item['_id'], exc))

        def main(zip_path):
            with Connection() as connection:
                data = connection.mydb.data
                timezones = connection.timezones.data
                insert_documents(data, read_csv_zip(zip_path, timezones), 1000)

    The code proceeds as follows:

    1. Every record read from the csv is checked and converted to a dictionary, where some fields may be skipped, some titles renamed (from those appearing in the csv header) and some values converted (to datetime, to integers, to floats, etc.).
    2. For each record read from the csv, a lookup is made in the timezones collection to map the record location to the respective time zone.
    3. If the mapping is successful, that time zone is used to convert the record timestamp (Pacific Time) to the respective local timestamp.
    4. If no mapping is found, a rough approximation is calculated.

    The timezones collection is appropriately indexed, of course - calling explain() confirms it. The process is slow. Naturally, having to query the timezones collection for every record kills the performance. I am looking for advice on how to improve it. Thanks.

    EDIT

    The timezones collection contains 8176040 records, each containing four values:

        > db.data.findOne()
        { "_id" : 3038814, "loc" : [ 1.48333, 42.5 ], "tz" : "Europe/Andorra" }

    EDIT2

    OK, I have compiled a release build of http://toblerity.github.com/rtree/ and configured the rtree package. Then I created an rtree dat/idx pair of files corresponding to my timezones collection. So, instead of calling collection.find_one I call index.intersection. Surprisingly, not only is there no improvement, it now works even more slowly! Maybe rtree could be fine-tuned to load the entire dat/idx pair (704M) into RAM, but I do not know how to do it. Until then, it is not an alternative. In general, I think the solution should involve parallelization of the task.

    EDIT3

    Profile output when using collection.find_one:

        >>> p.sort_stats('cumulative').print_stats(10)
        Tue Apr 10 14:28:39 2012    ImportDataIntoMongo.profile

                 64549590 function calls (64549180 primitive calls) in 1231.257 seconds

           Ordered by: cumulative time
           List reduced from 730 to 10 due to restriction <10>

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                1    0.012    0.012 1231.257 1231.257 ImportDataIntoMongo.py:1(<module>)
                1    0.001    0.001 1230.959 1230.959 ImportDataIntoMongo.py:187(main)
                1  853.558  853.558  853.558  853.558 {raw_input}
                1    0.598    0.598  370.510  370.510 ImportDataIntoMongo.py:165(insert_documents)
           343407    9.965    0.000  359.034    0.001 ImportDataIntoMongo.py:137(read_csv_zip)
           343408    2.927    0.000  287.035    0.001 c:\python27\lib\site-packages\pymongo\collection.py:489(find_one)
           343408    1.842    0.000  274.803    0.001 c:\python27\lib\site-packages\pymongo\cursor.py:699(next)
           343408    2.542    0.000  271.212    0.001 c:\python27\lib\site-packages\pymongo\cursor.py:644(_refresh)
           343408    4.512    0.000  253.673    0.001 c:\python27\lib\site-packages\pymongo\cursor.py:605(__send_message)
           343408    0.971    0.000  242.078    0.001 c:\python27\lib\site-packages\pymongo\connection.py:871(_send_message_with_response)

    Profile output when using index.intersection:

        >>> p.sort_stats('cumulative').print_stats(10)
        Wed Apr 11 16:21:31 2012    ImportDataIntoMongo.profile

                 41542960 function calls (41542536 primitive calls) in 2889.164 seconds

           Ordered by: cumulative time
           List reduced from 778 to 10 due to restriction <10>

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                1    0.028    0.028 2889.164 2889.164 ImportDataIntoMongo.py:1(<module>)
                1    0.017    0.017 2888.679 2888.679 ImportDataIntoMongo.py:202(main)
                1 2365.526 2365.526 2365.526 2365.526 {raw_input}
                1    0.766    0.766  502.817  502.817 ImportDataIntoMongo.py:180(insert_documents)
           343407    9.147    0.000  491.433    0.001 ImportDataIntoMongo.py:152(read_csv_zip)
           343406    0.571    0.000  391.394    0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:384(intersection)
           343406  379.957    0.001  390.824    0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:435(_intersection_obj)
           686513   22.616    0.000   38.705    0.000 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:451(_get_objects)
           343406    6.134    0.000   33.326    0.000 ImportDataIntoMongo.py:162(<dictcomp>)
              346    0.396    0.001   30.665    0.089 c:\python27\lib\site-packages\pymongo\collection.py:240(insert)

    EDIT4

    I have parallelized the code, but the results are still not very encouraging. I am convinced it could be done better. See my own answer to this question for details.
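    One cheap experiment, sketched under the assumption that nearby records tend to share a time zone: memoize the lookups on a rounded (lng, lat) grid so repeated neighbourhoods skip the MongoDB round trip (the grid step is a guess to tune, not a measured value).

        tz_cache = {}

        def lookup_tz(timezones, lng, lat):
            key = (round(lng, 1), round(lat, 1))  # ~0.1 degree grid, an assumption
            if key not in tz_cache:
                tz_cache[key] = timezones.find_one(SON({'loc': {'$within': {'$box': [
                    [lng - tz_lookup_radius, lat - tz_lookup_radius],
                    [lng + tz_lookup_radius, lat + tz_lookup_radius]]}}}))
            return tz_cache[key]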

  • Optimizing BeautifulSoup (Python) code

    - by user283405
    I have code that uses the BeautifulSoup library for parsing, but it is very slow. The code is written in such a way that threads cannot be used. Can anyone help me with this? I am using BeautifulSoup for parsing and then saving into a DB. If I comment out the save statement, it still takes a long time, so there is no problem with the database.

        def parse(self, text):
            soup = BeautifulSoup(text)
            arr = soup.findAll('tbody')
            for i in range(0, len(arr) - 1):
                data = Data()
                soup2 = BeautifulSoup(str(arr[i]))
                arr2 = soup2.findAll('td')
                c = 0
                for j in arr2:
                    if str(j).find("<a href=") > 0:
                        data.sourceURL = self.getAttributeValue(str(j), '<a href="')
                    else:
                        if c == 2:
                            data.Hits = j.renderContents()
                    # and a few others...
                    c = c + 1
                data.save()

    Any suggestions? Note: I already asked this question before, but it was closed due to incomplete information.
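    A sketch of the main fix: parse once and walk the existing tree; re-stringifying every <tbody> into a fresh BeautifulSoup re-parses the document over and over, which likely dominates the runtime.

        def parse(self, text):
            soup = BeautifulSoup(text)
            # iterates every tbody; re-add the [:-1] slice if skipping the
            # last one was intentional in the original
            for tbody in soup.findAll('tbody'):
                data = Data()
                for c, td in enumerate(tbody.findAll('td')):
                    link = td.find('a', href=True)
                    if link is not None:
                        data.sourceURL = link['href']
                    elif c == 2:
                        data.Hits = td.renderContents()
                data.save()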
