Search Results

Search found 16680 results on 668 pages for 'python datetime'.

Page 397/668 | < Previous Page | 393 394 395 396 397 398 399 400 401 402 403 404  | Next Page >

  • Technique to remove common words(and their plural versions) from a string

    - by Jake M
    I am attempting to find tags(keywords) for a recipe by parsing a long string of text. The text contains the recipe ingredients, directions and a short blurb. What do you think would be the most efficient way to remove common words from the tag list? By common words, I mean words like: 'the', 'at', 'there', 'their' etc. I have 2 methodologies I can use, which do you think is more efficient in terms of speed and do you know of a more efficient way I could do this? Methodology 1: - Determine the number of times each word occurs(using the library Collections) - Have a list of common words and remove all 'Common Words' from the Collection object by attempting to delete that key from the Collection object if it exists. - Therefore the speed will be determined by the length of the variable delims import collections from Counter delim = ['there','there\'s','theres','they','they\'re'] # the above will end up being a really long list! word_freq = Counter(recipe_str.lower().split()) for delim in set(delims): del word_freq[delim] return freq.most_common() Methodology 2: - For common words that can be plural, look at each word in the recipe string, and check if it partially contains the non-plural version of a common word. Eg; For the string "There's a test" check each word to see if it contains "there" and delete it if it does. delim = ['this','at','them'] # words that cant be plural partial_delim = ['there','they',] # words that could occur in many forms word_freq = Counter(recipe_str.lower().split()) for delim in set(delims): del word_freq[delim] # really slow for delim in set(partial_delims): for word in word_freq: if word.find(delim) != -1: del word_freq[delim] return freq.most_common()

    Read the article

  • Designing a Tag table that tells how many times it's used

    - by Satoru.Logic
    Hi, all. I am trying to design a tagging system with a model like this: Tag: content = CharField creator = ForeignKey used = IntergerField It is a many-to-many relationship between tags and what's been tagged. Everytime I insert a record into the assotication table, Tag.used is incremented by one, and decremented by one in case of deletion. Tag.used is maintained because I want to speed up answering the question 'How many times this tag is used?'. However, this seems to slow insertion down obviously. Please tell me how to improve this design. Thanks in advance.

    Read the article

  • How to set up Atana Studio 3 Themes in Pydev

    - by willy1234x1
    I've installed the Aptana Studio 3 preview and noticed it has support for themes (such as a bespin style or Ruby envy) and I'd love to use the Bespin one in Pydev but so far I've had no luck getting it to work, anyone have a clue as to how to get it to work? Video showing the themes in action.

    Read the article

  • How to turn a list of tuples into a string?

    - by matt
    I have a list of tuples that I'm trying to incorporate into a SQL query but I can't figure out how to join them together without adding slashes. My like this: list = [('val', 'val'), ('val', 'val'), ('val', 'val')] If I turn each tuple into a string and try to join them with a a comma I'll get something like ' (\'val\, \'val\'), ... ' What's the right way to do this, so I can get the list (without brackets) as a string?

    Read the article

  • Matplotlib: plotting discrete values

    - by Arkapravo
    I am trying to plot the following ! from numpy import * from pylab import * import random for x in range(1,500): y = random.randint(1,25000) print(x,y) plot(x,y) show() However, I keep getting a blank graph (?). Just to make sure that the program logic is correct I added the code print(x,y), just the confirm that (x,y) pairs are being generated. (x,y) pairs are being generated, but there is no plot, I keep getting a blank graph. Any help ?

    Read the article

  • Why is the destructor called when the CPython garbage collector is disabled?

    - by Frederik
    I'm trying to understand the internals of the CPython garbage collector, specifically when the destructor is called. So far, the behavior is intuitive, but the following case trips me up: Disable the GC. Create an object, then remove a reference to it. The object is destroyed and the __del__ method is called. I thought this would only happen if the garbage collector was enabled. Can someone explain why this happens? Is there a way to defer calling the destructor? import gc import unittest _destroyed = False class MyClass(object): def __del__(self): global _destroyed _destroyed = True class GarbageCollectionTest(unittest.TestCase): def testExplicitGarbageCollection(self): gc.disable() ref = MyClass() ref = None # The next test fails. # The object is automatically destroyed even with the collector turned off. self.assertFalse(_destroyed) gc.collect() self.assertTrue(_destroyed) if __name__=='__main__': unittest.main() Disclaimer: this code is not meant for production -- I've already noted that this is very implementation-specific and does not work on Jython.

    Read the article

  • Why is win32com so much slower than xlrd?

    - by Josh
    I have the same code, written using win32com and xlrd. xlrd preforms the algorithm in less than a second, while win32com takes minutes. Here is the win32com: def makeDict(ws): """makes dict with key as header name, value as tuple of column begin and column end (inclusive)""" wsHeaders = {} # key is header name, value is column begin and end inclusive for cnum in xrange(9, find_last_col(ws)): if ws.Cells(7, cnum).Value: wsHeaders[str(ws.Cells(7, cnum).Value)] = (cnum, find_last_col(ws)) for cend in xrange(cnum + 1, find_last_col(ws)): #finds end column if ws.Cells(7, cend).Value: wsHeaders[str(ws.Cells(7, cnum).Value)] = (cnum, cend - 1) break return wsHeaders And the xlrd def makeDict(ws): """makes dict with key as header name, value as tuple of column begin and column end (inclusive)""" wsHeaders = {} # key is header name, value is column begin and end inclusive for cnum in xrange(8, ws.ncols): if ws.cell_value(6, cnum): wsHeaders[str(ws.cell_value(6, cnum))] = (cnum, ws.ncols) for cend in xrange(cnum + 1, ws.ncols):#finds end column if ws.cell_value(6, cend): wsHeaders[str(ws.cell_value(6, cnum))] = (cnum, cend - 1) break return wsHeaders

    Read the article

  • how to import a.py not a folder

    - by zjm1126
    zjm_code |-----a.py |-----a |----- __init__.py |-----b.py in a.py is : c='ccc' in b.py is : import a print dir(a) when i execute b.py ,it show (it import 'a' folder): ['__builtins__', '__doc__', '__file__', '__name__', '__path__'] and when i delete a folder, it show ,(it import a.py): ['__builtins__', '__doc__', '__file__', '__name__', 'c'] so my question is : how to import a.py via not delete a folder thanks

    Read the article

  • Error -3 while decompressing data: incorrect header check

    - by Rahul99
    I have .zip file which contain csv data. I am reading .zip file using <input type = "file" name = "select_file"/> I want to decompress that .zip file and read csv data. file_data = self.request.get('select_file') file_str = zlib.decompress(file_data) #file_data_list = file_str.split('\n') #file_Reader = csv.reader(file_data_list,quoting=csv.QUOTE_NONE ) I am expecting csv data in file_str but I am getting error. error :: Error -3 while decompressing data: incorrect header check What I have to use?

    Read the article

  • How can I display multiple django modelformset forms together?

    - by JT
    I have a problem with needing to provide multiple model backed forms on the same page. I understand how to do this with single forms, i.e. just create both the forms call them something different then use the appropriate names in the template. Now how exactly do you expand that solution to work with modelformsets? The wrinkle, of course, is that each 'form' must be rendered together in the appropriate fieldset. For example I want my template to produce something like this: <fieldset> <label for="id_base-0-desc">Home Base Description:</label> <input id="id_base-0-desc" type="text" name="base-0-desc" maxlength="100" /> <label for="id_likes-0-icecream">Want ice cream?</label> <input type="checkbox" name="likes-0-icecream" id="id_likes-0-icecream" /> </fieldset> <fieldset> <label for="id_base-1-desc">Home Base Description:</label> <input id="id_base-1-desc" type="text" name="base-1-desc" maxlength="100" /> <label for="id_likes-1-icecream">Want ice cream?</label> <input type="checkbox" name="likes-1-icecream" id="id_likes-1-icecream" /> </fieldset> I am using a loop like this to process the results for base_form, likes_form in map(None, base_forms, likes_forms): which works as I'd expect (I'm using map because the # of forms can be different). The problem is that I can't figure out a way to do the same thing with the templating engine. The system does work if I layout all the base models together then all the likes models after wards, but it doesn't meet the layout requirements.

    Read the article

  • Loading url with cyrillic symbols

    - by Ockonal
    Hi guys, I have to load some url with cyrillic symbols. My script should work with this: http://wincode.org/%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D0%B5/ If I'll use this in browser it would replaced into normal symbols, but urllib code fails with 404 error. How to decode correctly this url? When I'm using that url directly in code, like address = 'that address', it works perfect. But I used parsing page for getting this url. I have a list of urls which contents cyrillic. Maybe they have uncorrect encoding? Here is more code: requestData = urllib2.Request( %SOME_ADDRESS%, None, {"User-Agent": user_agent}) requestHandler = pageHandler.open(requestData) pageData = requestHandler.read().decode('utf-8') soupHandler = BeautifulSoup(pageData) topicLinks = [] for postBlock in soupHandler.findAll('a', href=re.compile('%SOME_REGEXP%')): topicLinks.append(postBlock['href']) postAddress = choice(topicLinks) postRequestData = urllib2.Request(postAddress, None, {"User-Agent": user_agent}) postHandler = pageHandler.open(postRequestData) postData = postHandler.read() File "/usr/lib/python2.6/urllib2.py", line 518, in http_error_default raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) urllib2.HTTPError: HTTP Error 404: Not Found

    Read the article

  • Huge amount of time sending data with suds and proxy

    - by Roman
    Hi everyone, I have the following code to send data through a proxy using suds: import suds t = suds.transport.http.HttpTransport() proxy = urllib2.ProxyHandler({'http': 'http://192.168.3.217:3128'}) opener = urllib2.build_opener(proxy) t.urlopener = opener ws = suds.client.Client('http://xxxxxxx/web.asmx?WSDL', transport=t) req = ws.factory.create('ActionRequest.request') req.SerialNumber = 'asdf' req.HostName = 'hola' res = ws.service.ActionRequest(req) I don't know why, but it can be sending data above 2 or 3 minutes, or even more and it raises a "Gateway timeout" exception sometimes. If I don't use the proxy, the amount of time used is above 2 seconds or less. Here is the SOAP reply: (ActionResponse){ Id = None Action = "Action.None" Objects = "" } The proxy is running right with other requests through urllib2, or using normal web browsers like firefox. Does anyone have any idea what's happening here with suds? Thanks a lot in advance!!!

    Read the article

  • How do I launch background jobs w/ paramiko?

    - by sophacles
    Here is my scenario: I am trying to automate some tasks using Paramiko. The tasks need to be started in this order (using the notation (host, task)): (A, 1), (B, 2), (C, 2), (A,3), (B,3) -- essentially starting servers and clients for some testing in the correct order. Further, because in the tests networking may get mucked up, and because I need some of the output from the tests, I would like to just redirect output to a file. In similar scenarios the common response is to use 'screen -m -d' or to use 'nohup'. However with paramiko's exec_cmd, nohup doesn't actually exit. Using: bash -c -l nohup test_cmd & doesnt work either, exec_cmd still blocks to process end. In the screen case, output redirection doesn't work very well, (actually, doesnt work at all the best I can figure out). So, after all that explanation, my question is: is there an easy elegant way to detach processes and capture output in such a way as to end paramiko's exec_cmd blocking? Update The dtach command works nicely for this!

    Read the article

  • Scraping paginated items from a website using scrapy

    - by Mridang Agarwalla
    I'm using scrapy to scrape items from a site. I'm not being able to implement this scraping pattern. The site I'm trying to scrape is a forum and I scrape the site once a day. Each page has a table containing posts. New posts are added to the top of the table and as more and more posts are posted to the site, the older posts go further into the pages due to pagination. This is a very simple scenario and we will assume that the order of the posts never change. I would like to scrape this site and scrape all the "new" records until the last scraped post from yesterday is encountered. I have configured my spider to paginate endlessly and when it encounters yesterday's last scraped post, it should stop. How can implement this? (My Scrapy installation works with my Django installation using django-dynamic-scraper )

    Read the article

  • Combinatorial optimisation of a distance metric

    - by Jose
    I have a set of trajectories, made up of points along the trajectory, and with the coordinates associated with each point. I store these in a 3d array ( trajectory, point, param). I want to find the set of r trajectories that have the maximum accumulated distance between the possible pairwise combinations of these trajectories. My first attempt, which I think is working looks like this: max_dist = 0 for h in itertools.combinations ( xrange(num_traj), r): for (m,l) in itertools.combinations (h, 2): accum = 0. for ( i, j ) in itertools.izip ( range(k), range(k) ): A = [ (my_mat[m, i, z] - my_mat[l, j, z])**2 \ for z in xrange(k) ] A = numpy.array( numpy.sqrt (A) ).sum() accum += A if max_dist < accum: selected_trajectories = h This takes forever, as num_traj can be around 500-1000, and r can be around 5-20. k is arbitrary, but can typically be up to 50. Trying to be super-clever, I have put everything into two nested list comprehensions, making heavy use of itertools: chunk = [[ numpy.sqrt((my_mat[m, i, :] - my_mat[l, j, :])**2).sum() \ for ((m,l),i,j) in \ itertools.product ( itertools.combinations(h,2), range(k), range(k)) ]\ for h in itertools.combinations(range(num_traj), r) ] Apart from being quite unreadable (!!!), it is also taking a long time. Can anyone suggest any ways to improve on this?

    Read the article

  • Django: Summing values

    - by Anry
    I have a two Model - Project and Cost. class Project(models.Model): title = models.CharField(max_length=150) url = models.URLField() manager = models.ForeignKey(User) class Cost(models.Model): project = models.ForeignKey(Project) cost = models.FloatField() date = models.DateField() I must return the sum of costs for each project. view.py: from mypm.costs.models import Project, Cost from django.shortcuts import render_to_response from django.db.models import Avg, Sum def index(request): #... return render_to_response('index.html',... How?

    Read the article

< Previous Page | 393 394 395 396 397 398 399 400 401 402 403 404  | Next Page >