Search Results

Search found 15187 results on 608 pages for 'boost python'.


  • Algorithm to detect repeating/similar strings in a corpus of data -- say email subjects, in Python

    - by RizwanK
    I'm downloading a long list of my email subject lines, with the intent of finding email lists that I was a member of years ago so that I can purge them from my Gmail account (which is getting pretty slow). I'm specifically thinking of newsletters that often come from the same address and repeat the product/service/group's name in the subject. I'm aware that I could search/sort by the common occurrence of items from a particular email address (and I intend to), but I'd like to correlate that data with repeating subject lines.

    Now, many subject lines would fail an exact string match, but "Google Friends : Our latest news" and "Google Friends : What we're doing today" are more similar to each other than to a random subject line, as are "Virgin Airlines has a great sale today" and "Take a flight with Virgin Airlines". So, how can I start to automagically extract trends/examples of strings that may be more similar?

    Approaches I've considered and discarded ('because there must be some better way'): extracting all the possible substrings, ordering them by how often they show up, and manually selecting the relevant ones; stripping off the first word or two and then counting the occurrences of each substring; comparing Levenshtein distance between entries; some sort of string similarity index. Most of these were rejected for massive inefficiency or the likelihood of a vast amount of manual intervention. I guess I need some sort of fuzzy string matching?

    In the end, I can think of kludgy ways of doing this, but I'm looking for something more generic, so that I add to my set of tools rather than special-casing for this data set. After this, I'd be matching the occurrence of particular subject strings with 'From' addresses. I'm not sure if there's a good way of building a data structure that represents how likely two messages are to be part of the 'same email list', or of filtering all my email subjects/from addresses into pools of likely 'related' emails and the rest, but that's a problem to solve after this one. Any guidance would be appreciated.

    Read the article
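One possible starting point for the subject-line question above, offered as a rough sketch rather than a full fuzzy matcher: count the word n-grams that recur across subjects, so that names such as "Google Friends" surface on their own. The function name `common_phrases` and its parameters `n` and `min_count` are made up for this illustration.

```python
import re
from collections import Counter


def common_phrases(subjects, n=2, min_count=2):
    """Return (phrase, count) pairs for word n-grams shared by several subjects."""
    counts = Counter()
    for subject in subjects:
        words = re.findall(r"[a-z0-9']+", subject.lower())
        # Use a set so each n-gram is counted at most once per subject.
        ngrams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(ngrams)
    return [(" ".join(gram), count)
            for gram, count in counts.most_common()
            if count >= min_count]


subjects = [
    "Google Friends : Our latest news",
    "Google Friends : What we're doing today",
    "Virgin Airlines has a great sale today",
    "Take a flight with Virgin Airlines",
]

for phrase, count in common_phrases(subjects):
    print(phrase, count)
```

The recurring phrases could then be used as keys for grouping messages before correlating them with 'From' addresses; a character-level similarity measure such as `difflib.SequenceMatcher` from the standard library is another option when the wording varies more.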

  • sorting in python

    - by tipu
    I have a hashmap like so:

        results[tweet_id] = {"score": float(dot(query, doc) / (norm(query) * norm(doc))), "tweet": tweet}

    What I'd like to do is sort results by the inner "score" key. I don't know whether this is possible; I saw many sorting tutorials, but they were for simple (not nested) data structures.

    Read the article
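For the nested-dictionary sorting question above, a minimal sketch: `sorted` accepts a `key` function, which can reach into the inner dictionary. The sample tweet IDs and scores below are invented for illustration.

```python
# Invented sample data in the same shape as the question's `results`.
results = {
    101: {"score": 0.42, "tweet": "first tweet"},
    102: {"score": 0.91, "tweet": "second tweet"},
    103: {"score": 0.17, "tweet": "third tweet"},
}

# Sort the (tweet_id, info) pairs by the inner "score", highest first.
ranked = sorted(results.items(), key=lambda item: item[1]["score"], reverse=True)
ranked_ids = [tweet_id for tweet_id, _ in ranked]

for tweet_id, info in ranked:
    print(tweet_id, info["score"], info["tweet"])
```

A dictionary itself has no sort order to rearrange in place, so the sketch produces a sorted list of pairs instead.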

  • How to concat a string in Python

    - by alex
    query = "SELECT * FROM mytable WHERE time=%s", (mytime)

    Then, I want to add a limit %s to it. How can I do that without messing up the %s in mytime? Edit: I want to concat query2, which has "LIMIT %s, %s".

    Read the article
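One way to approach the query question above, sketched under the assumption that the query is later passed to a DB-API driver that uses the `%s` parameter style: keep the SQL text and the parameter tuple separate, and concatenate each with its counterpart. The values of `mytime`, `offset` and `count` are placeholders invented for the example.

```python
# Placeholder values; in the question these come from the application.
mytime = "2010-05-01"
offset, count = 0, 10

base_sql = "SELECT * FROM mytable WHERE time=%s"
base_params = (mytime,)          # note the trailing comma: a one-element tuple

limit_sql = " LIMIT %s, %s"
limit_params = (offset, count)

# Concatenate the SQL strings and the parameter tuples in the same order,
# so every %s still lines up with its value.
sql = base_sql + limit_sql
params = base_params + limit_params

# With a real connection this would be: cursor.execute(sql, params)
print(sql, params)
```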

  • Match HTML tags in two strings using regex in Python

    - by jack
    I want to verify that the HTML tags present in a source string are also present in a target string. For example:

        >>> source = "<em>Hello</em><label>What's your name</label>"
        >>> verify_target('<em>Hi</em><label>My name is Jim</label>')
        True
        >>> verify_target('<label>My name is Jim</label><em>Hi</em>')
        True
        >>> verify_target('<em>Hi<label>My name is Jim</label></em>')
        False

    Read the article

  • How to remove commas etc from a matrix in python

    - by robert
    Say I've got a matrix that looks like:

        [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

    How can I put it on separate lines:

        [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]

    and then remove the commas etc.:

        0 0 0 0 0

    I'd also like to show blanks instead of 0s, so that numbers can be put in later. In the end it would look like:

        _ 1 2 _ 1 _ 1

    (spaces, not underscores). Thanks.

    Read the article
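For the matrix-printing question above, a small sketch that joins each row with spaces and shows a space wherever the cell is 0. The helper name `format_grid` and its `blank` parameter are invented for this example.

```python
def format_grid(matrix, blank=0):
    """Render one text line per row, with `blank` cells shown as a space."""
    lines = []
    for row in matrix:
        cells = [" " if cell == blank else str(cell) for cell in row]
        lines.append(" ".join(cells))
    return "\n".join(lines)


matrix = [[0, 1, 2, 0],
          [1, 0, 1, 0],
          [0, 0, 0, 0]]

print(format_grid(matrix))
```

The matrix itself keeps its zeros, so numbers can still be assigned later; only the printed form changes.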

  • ListCtrl - wxPython / Python

    - by Francisco Aleixo
    Hello everyone. My question is whether we can assign/bind some value to a certain item and hide that value (or do the same thing in another way). Example: let's say the columns on the ListCtrl are "Name" and "Description":

        self.lc = wx.ListCtrl(self, -1, style=wx.LC_REPORT)
        self.lc.InsertColumn(0, 'Name')
        self.lc.InsertColumn(1, 'Description')

    And when I add an item I want it to show the Name parameter and the description:

        num_items = self.lc.GetItemCount()
        self.lc.InsertStringItem(num_items, "Randomname")
        self.lc.SetStringItem(num_items, 1, "Some description here")

    Now what I want to do is basically assign something to that item that is not shown, so I can access it later in the app. So I would like to add something that is not shown in the app but is part of the item's value, like:

        hiddendescription = "Somerandomthing"

    Still don't understand? Well, let's say I add a button that adds an item, along with some TextCtrls to set the parameters, and the TextCtrls are "Name", "Description" and "Hiddendescription". The user fills these TextCtrls out and clicks the button to create the item, and I basically want to show only the Name and Description and hide the "HiddenDescription", but in a way that lets me use it later. Sorry for explaining this more than once, but I want to make sure you understand what I intend to do.

    Read the article

  • python threading and performance?

    - by kumar
    I had to do a heavy I/O-bound operation, i.e. parsing large files and converting them from one format to another. Initially I did it serially, i.e. parsing one file after another. Performance was very poor (it used to take 90+ seconds), so I decided to use threading to improve the performance. I created one thread for each file (4 threads):

        for file in file_list:
            t = threading.Thread(target=self.convertfile, args=(file,))
            t.start()
            ts.append(t)
        for t in ts:
            t.join()

    But to my astonishment, there is no performance improvement whatsoever. It still takes around 90+ seconds to complete the task. As this is an I/O-bound operation, I had expected the performance to improve. What am I doing wrong?

    Read the article
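A hedged note on the threading question above: in CPython, threads only overlap while they wait on I/O or run code that releases the global interpreter lock, and pure-Python parsing holds that lock, so one thread per file may not speed up a parsing-heavy job. The sketch below uses `concurrent.futures` (Python 3.2+, which is an assumption, since the question does not state a version); `convert_file` is a made-up stand-in for the real conversion.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_file(path):
    """Hypothetical stand-in for the real parse-and-convert step."""
    return path.upper()


def convert_all(paths, max_workers=4):
    """Run convert_file over all paths with a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(convert_file, paths))


if __name__ == "__main__":
    print(convert_all(["a.txt", "b.txt", "c.txt", "d.txt"]))
```

If profiling shows the time goes into parsing rather than reading, swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` from the same module is one way to spread the work across CPU cores.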

  • Python ctypes argument errors

    - by Patrick Moriarty
    Hello. I wrote a test DLL in C++ to make sure things work before I start using a more important DLL that I need. Basically it takes two doubles and adds them, then returns the result. I've been playing around, and with other test functions I've gotten returns to work; I just can't pass an argument due to errors. My code is:

        import ctypes
        import string
        nDLL = ctypes.WinDLL('test.dll')
        func = nDLL['haloshg_add']
        func.restype = ctypes.c_double
        func.argtypes = (ctypes.c_double, ctypes.c_double)
        print(func(5.0, 5.0))

    It returns this error for the line that calls "func":

        ValueError: Procedure probably called with too many arguments (8 bytes in excess)

    What am I doing wrong? Thanks.

    Read the article

  • Get Path of Uploaded File using Python

    - by Ali
    Is it possible to get the full path of the file on the user's computer being uploaded to my site? Using os.path.abspath(fileitem.filename) simply gets me the address of where my script is executing from on my shared hosting server. FYI: fileitem = form['file'] and form = cgi.FieldStorage()

    Read the article

  • How do I calculate percentiles with python/numpy?

    - by Uri
    Is there a convenient way to calculate percentiles for a sequence or single-dimensional numpy array? I am looking for something similar to Excel's percentile function. I looked in NumPy's statistics reference, and couldn't find this. All I could find is the median (50th percentile), but not something more specific.

    Read the article
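For the percentile question above: NumPy provides `numpy.percentile(a, q)`. To show what it computes by default without depending on NumPy, here is a plain-Python sketch of linear interpolation between the closest ranks, which, as far as I know, is also how Excel's PERCENTILE function behaves. The function name `percentile` is simply chosen for this example.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted data.
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


print(percentile([1, 2, 3, 4, 5], 50))
print(percentile([1, 2, 3, 4], 25))
```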

  • Small Python optional arguments question

    - by ooboo
    I have two functions:

        def f(a,b,c=g(b)):
            blabla

        def g(n):
            blabla

    c is an optional argument in function f. If the user does not specify its value, the program should compute g(b), and that would be the value of c. But the code does not compile; it says name 'b' is not defined. How do I fix that? Someone suggested:

        def g(b):
            blabla

        def f(a,b,c=None):
            if c is None:
                c = g(b)
            blabla

    But this doesn't work, because maybe the user intended c to be None, and then c will have another value.

    Read the article
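One common way around the `None` problem in the question above is a private sentinel object that no caller can pass by accident. This is a sketch; the name `_MISSING` and the body of `g` are invented for the example.

```python
_MISSING = object()  # unique marker meaning "c was not supplied"


def g(n):
    # Placeholder computation standing in for the real g.
    return n * 2


def f(a, b, c=_MISSING):
    if c is _MISSING:
        # Defaults are evaluated once at definition time, so a default
        # that depends on b has to be computed inside the function body.
        c = g(b)
    return a, b, c


print(f(1, 2))        # c is computed from b
print(f(1, 2, None))  # an explicit None is kept as-is
```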

  • Python beautiful soup arguments

    - by scott
    Hi, I have this code that fetches some text from a page using BeautifulSoup:

        soup = BeautifulSoup(html)
        body = soup.find('div', {'id': 'body'})
        print body

    I would like to make this a reusable function that takes in some HTML text and the tags to match, like the following:

        def parse(html, atrs):
            soup = BeautifulSoup(html)
            body = soup.find(atrs)
            return body

    But if I make a call like parse(htmlpage, ('div', {'id': 'body'})) or parse(htmlpage, ['div', {'id': 'body'}]), I get only the div element; the body attribute seems to get ignored. Is there a way to fix this?

    Read the article

  • Parsing text file in python

    - by Ockonal
    Hello, I have an HTML file. I have to replace all text that looks like this: [%anytext%]. As I understand it, parsing HTML is very easy with BeautifulSoup. But what regular expression should I use, and how do I remove the text and write the data back?

    Read the article
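For the placeholder question above, a minimal sketch assuming the markers never nest and always have the form `[%...%]`: a non-greedy regular expression can replace them without an HTML parser. The names `PLACEHOLDER` and `replace_placeholders` are invented here.

```python
import re

# \[% and %\] match the literal delimiters; .*? is non-greedy so each
# marker is matched separately. DOTALL lets a marker span several lines.
PLACEHOLDER = re.compile(r"\[%.*?%\]", re.DOTALL)


def replace_placeholders(text, replacement=""):
    """Replace every [%...%] marker in text with the replacement string."""
    # A function is used so backslashes in `replacement` are taken literally.
    return PLACEHOLDER.sub(lambda match: replacement, text)


html = "<p>Hello [%name%], your order [%order id%] is ready.</p>"
print(replace_placeholders(html, "X"))
```

To process a file, read its contents with `open(path).read()`, pass the text through `replace_placeholders`, and write the result back with a second `open(path, "w")`.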

  • Take the intersection of an arbitrary number of lists in python

    - by thepandaatemyface
    Suppose I have a list of lists of elements which are all of the same type (I'll use ints in this example):

        [range(100)[::4], range(100)[::3], range(100)[::2], range(100)[::1]]

    What would be a nice and/or efficient way to take the intersection of these lists (so you would get every element that is in each of the lists)? For the example that would be:

        [0, 12, 24, 36, 48, 60, 72, 84, 96]

    Read the article
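For the intersection question above, one short option is `set.intersection`, which accepts any number of iterables. This sketch assumes the elements are hashable and that the outer list is not empty; the result is sorted because sets do not keep order.

```python
lists = [
    list(range(0, 100, 4)),
    list(range(0, 100, 3)),
    list(range(0, 100, 2)),
    list(range(0, 100, 1)),
]

# Turn the first list into a set, then intersect it with all the others.
common = sorted(set(lists[0]).intersection(*lists[1:]))
print(common)
```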

  • how to build good python web application

    - by Moayyad Yaghi
    Hello, I have never worked with web programming, and I've recently been asked to write web-based software to manage assets and tasks, to be used by more than 900 people. What are the recommended modules, frameworks and libraries for this task? It would also be highly appreciated if you could recommend some books and articles that might help me. Thanks in advance.

    Read the article

  • Dynamic dispatch and inheritance in python

    - by Bill Zimmerman
    Hi, I'm trying to modify Guido's multimethod (dynamic dispatch code): http://www.artima.com/weblogs/viewpost.jsp?thread=101605 to handle inheritance and possibly out-of-order arguments.

    e.g. (inheritance problem)

        class A(object): pass
        class B(A): pass

        @multimethod(A,A)
        def foo(arg1,arg2):
            print 'works'

        foo(A(),A()) #works
        foo(A(),B()) #fails

    Is there a better way than iteratively checking for the super() of each item until one is found?

    e.g. (argument ordering problem) I was thinking of this from a collision detection standpoint: foo(Car(),Truck()) and foo(Truck(),Car()) should both trigger foo(Car,Truck). (Note: should @multimethod(Truck,Car) throw an exception if @multimethod(Car,Truck) was registered first?)

    I'm looking specifically for an 'elegant' solution. I know that I could just brute force my way through all the possibilities, but I'm trying to avoid that. I just wanted to get some input/ideas before sitting down and pounding out a solution. Thanks.

    Read the article
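For the inheritance half of the multimethod question above, one possible direction, sketched here and not taken from the linked article: test each registered signature with `issubclass` and prefer the one closest to the argument types in their method resolution order. Every name below is written for this illustration, it assumes ordinary classes (no virtual subclasses registered through ABCs), and it does not address argument ordering.

```python
class MultiMethod(object):
    def __init__(self, name):
        self.name = name
        self.registry = {}  # tuple of types -> implementation

    def register(self, types, func):
        self.registry[types] = func

    def __call__(self, *args):
        arg_types = tuple(type(arg) for arg in args)
        best, best_cost = None, None
        for types, func in self.registry.items():
            if len(types) != len(arg_types):
                continue
            if not all(issubclass(a, t) for a, t in zip(arg_types, types)):
                continue
            # Cost: how far up each argument's MRO the registered type sits.
            cost = sum(a.__mro__.index(t) for a, t in zip(arg_types, types))
            if best_cost is None or cost < best_cost:
                best, best_cost = func, cost  # ties keep the first match
        if best is None:
            raise TypeError("no match for %s%r" % (self.name, arg_types))
        return best(*args)


_methods = {}


def multimethod(*types):
    def decorator(func):
        mm = _methods.setdefault(func.__name__, MultiMethod(func.__name__))
        mm.register(types, func)
        return mm
    return decorator


class A(object):
    pass


class B(A):
    pass


@multimethod(A, A)
def foo(arg1, arg2):
    return "A, A"


@multimethod(B, B)
def foo(arg1, arg2):
    return "B, B"


print(foo(A(), B()))  # falls back to the (A, A) implementation
print(foo(B(), B()))  # picks the more specific (B, B) implementation
```

The lookup scans every registered signature on each call; caching the chosen implementation per tuple of argument types would be a natural refinement.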

  • How to show why "try" failed in python

    - by calccrypto
    Is there any way to show why a "try" failed and skipped to "except", without writing out all the possible errors by hand, and without ending the program? Example:

        try:
            1/0
        except:
            someway to show "Traceback (most recent call last): File "<pyshell#0>", line 1, in <module> 1/0 ZeroDivisionError: integer division or modulo by zero"

    I don't want to do if: print error 1, elif: print error 2, elif: etc. I want to see the error that would have been shown had the try not been there.

    Read the article
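For the traceback question above, the standard library's `traceback` module can report the active exception from inside an `except` block without ending the program. A minimal sketch:

```python
import traceback

try:
    1 / 0
except Exception:
    # format_exc() returns the same text the interpreter would have printed;
    # traceback.print_exc() writes it straight to stderr instead.
    error_text = traceback.format_exc()
    print(error_text)

print("still running")
```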

  • python conditional list creation from 2D lists

    - by dls
    Say I've got a list of lists. Say the inner list is three elements in size and looks like this:

        ['apple', 'fruit', 1.23]

    The outer list looks like this:

        data = [['apple', 'fruit', 1.23], ['pear', 'fruit', 2.34], ['lettuce', 'vegetable', 3.45]]

    I want to iterate through the outer list and cull data for a temporary list only in the case that element 1 matches some keyword (aka: 'fruit'). So, if I'm matching fruit, I would end up with this:

        tempList = [('apple', 1.23), ('pear', 2.34)]

    This is one way to accomplish this:

        tempList = []
        for i in data:
            if i[1] == 'fruit':
                tempList.append((i[0], i[2]))

    Is there some 'Pythonic' way to do this in fewer lines?

    Read the article
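For the conditional list question above, a list comprehension with tuple unpacking expresses the same loop in one line. This is a sketch using the data from the question; the variable names are chosen for readability.

```python
data = [
    ['apple', 'fruit', 1.23],
    ['pear', 'fruit', 2.34],
    ['lettuce', 'vegetable', 3.45],
]

# Unpack each inner list and keep only the rows whose kind is 'fruit'.
temp_list = [(name, value) for name, kind, value in data if kind == 'fruit']
print(temp_list)
```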

  • sorting content of a text file in python

    - by rabidmachine9
    I have this small script that sorts the content of a text file:

        # The built-in function `open` opens a file and returns a file object.
        # Read mode opens a file for reading only.
        try:
            f = open("tracks.txt", "r")
            try:
                # Read the entire contents of a file at once.
                # string = f.read()
                # OR read one line at a time.
                # line = f.readline()
                # OR read all the lines into a list.
                lines = f.readlines()
                lines.sort()
                f = open('tracks.txt', 'w')
                f.writelines(lines)  # Write a sequence of strings to a file
            finally:
                f.close()
        except IOError:
            pass

    The only problem is that the text is displayed at the bottom of the text file every time it's sorted. I assume it also sorts the blank lines. Does anybody know why? Thanks in advance.

    Read the article
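On the file-sorting question above: a blank line is just "\n", which sorts before any letter, so blank lines move to the top and push the text down. A sketch that drops blank lines before sorting follows; the helper names `sorted_nonblank` and `sort_file` are invented for this example.

```python
def sorted_nonblank(lines):
    """Return the non-blank lines sorted, each ending in one newline."""
    kept = [line.rstrip("\n") for line in lines if line.strip()]
    return [line + "\n" for line in sorted(kept)]


def sort_file(path):
    """Rewrite the file at path with its non-blank lines sorted."""
    # Read and close the file before reopening it for writing.
    with open(path, "r") as f:
        lines = sorted_nonblank(f)
    with open(path, "w") as f:
        f.writelines(lines)


print(sorted_nonblank(["b\n", "\n", "a\n", "   \n", "c"]))
```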

  • How to loop over nodes with xmlfeed using scrapy python

    - by Kour ipm
    Hi, I am working on Scrapy and trying XML feeds for the first time. Below is my code:

        class TestxmlItemSpider(XMLFeedSpider):
            name = "TestxmlItem"
            allowed_domains = {"http://www.nasinteractive.com"}
            start_urls = [
                "http://www.nasinteractive.com/jobexport/advance/hcantexasexport.xml"
            ]
            iterator = 'iternodes'
            itertag = 'job'

            def parse_node(self, response, node):
                title = node.select('title/text()').extract()
                job_code = node.select('job-code/text()').extract()
                detail_url = node.select('detail-url/text()').extract()
                category = node.select('job-category/text()').extract()
                print title,";;;;;;;;;;;;;;;;;;;;;"
                print job_code,";;;;;;;;;;;;;;;;;;;;;"
                item = TestxmlItem()
                item['title'] = node.select('title/text()').extract()
                .......
                return item

    Result:

        File "/usr/lib/python2.7/site-packages/Scrapy-0.14.3-py2.7.egg/scrapy/item.py", line 56, in __setitem__
          (self.__class__.__name__, key))
        exceptions.KeyError: 'TestxmlItem does not support field: title'

    In total there are 200+ items, so I need to loop over them and assign the node text to the item, but here all the results are displayed at once when we print. How can we actually loop over the nodes when scraping XML files with XMLFeedSpider?

    Read the article
