Search Results

Search found 13596 results on 544 pages for 'mechanize python'.

Page 159/544 | < Previous Page | 155 156 157 158 159 160 161 162 163 164 165 166 | Next Page >

Optimizing BeautifulSoup (Python) code

- by user283405

I have code that uses the BeautifulSoup library for parsing, but it is very slow. The code is written in such a way that threads cannot be used. Can anyone help me with this? I am using BeautifulSoup for parsing and than save into a DB. If I comment out the save statement, it still takes a long time, so there is no problem with the database. def parse(self,text): soup = BeautifulSoup(text) arr = soup.findAll('tbody') for i in range(0,len(arr)-1): data=Data() soup2 = BeautifulSoup(str(arr[i])) arr2 = soup2.findAll('td') c=0 for j in arr2: if str(j).find("<a href=") > 0: data.sourceURL = self.getAttributeValue(str(j),'<a href="') else: if c == 2: data.Hits=j.renderContents() #and few others... c = c+1 data.save() Any suggestions? Note: I already ask this question here but that was closed due to incomplete information.

Read the article
Efficient way in Python to remove an element from a comma-separated string

- by ensnare

I'm looking for the most efficient way to add an element to a comma-separated string while maintaining alphabetical order for the words: For example: string = 'Apples, Bananas, Grapes, Oranges' subtraction = 'Bananas' result = 'Apples, Grapes, Oranges' Also, a way to do this but while maintaining IDs: string = '1:Apples, 4:Bananas, 6:Grapes, 23:Oranges' subtraction = '4:Bananas' result = '1:Apples, 6:Grapes, 23:Oranges' Sample code is greatly appreciated. Thank you so much.

Read the article
Python/YACC Lexer: Token priority?

- by Rosarch

I'm trying to use reserved words in my grammar: reserved = { 'if' : 'IF', 'then' : 'THEN', 'else' : 'ELSE', 'while' : 'WHILE', } tokens = [ 'DEPT_CODE', 'COURSE_NUMBER', 'OR_CONJ', 'ID', ] + list(reserved.values()) t_DEPT_CODE = r'[A-Z]{2,}' t_COURSE_NUMBER = r'[0-9]{4}' t_OR_CONJ = r'or' t_ignore = ' \t' def t_ID(t): r'[a-zA-Z_][a-zA-Z_0-9]*' if t.value in reserved.values(): t.type = reserved[t.value] return t return None However, the t_ID rule somehow swallows up DEPT_CODE and OR_CONJ. How can I get around this? I'd like those two to take higher precedence than the reserved words.

Read the article
Caching result of setUp() using Python unittest

- by dbr

I currently have a unittest.TestCase that looks like.. class test_appletrailer(unittest.TestCase): def setup(self): self.all_trailers = Trailers(res = "720", verbose = True) def test_has_trailers(self): self.failUnless(len(self.all_trailers) > 1) # ..more tests.. This works fine, but the Trailers() call takes about 2 seconds to run.. Given that setUp() is called before each test is run, the tests now take almost 10 seconds to run (with only 3 test functions) What is the correct way of caching the self.all_trailers variable between tests? Removing the setUp function, and doing.. class test_appletrailer(unittest.TestCase): all_trailers = Trailers(res = "720", verbose = True) ..works, but then it claims "Ran 3 tests in 0.000s" which is incorrect.. The only other way I could think of is to have a cache_trailers global variable (which works correctly, but is rather horrible): cache_trailers = None class test_appletrailer(unittest.TestCase): def setUp(self): global cache_trailers if cache_trailers is None: cache_trailers = self.all_trailers = all_trailers = Trailers(res = "720", verbose = True) else: self.all_trailers = cache_trailers

Read the article
strip spaces in python.

- by Richard

ok I know that this should be simple... anyways say: line = "$W5M5A,100527,142500,730301c44892fd1c,2,686.5 4,333.96,0,0,28.6,123,75,-0.4,1.4*49" I want to strip out the spaces. I thought you would just do this line = line.strip() but now line is still '$W5M5A,100527,142500,730301c44892fd1c,2,686.5 4,333.96,0,0,28.6,123,75,-0.4,1.4*49' instead of '$W5M5A,100527,142500,730301c44892fd1c,2,686.54,333.96,0,0,28.6,123,75,-0.4,1.4*49' any thoughts?

Read the article
Python - pickling fails for numpy.void objects

- by I82Much

>>> idmapfile = open("idmap", mode="w") >>> pickle.dump(idMap, idmapfile) >>> idmapfile.close() >>> idmapfile = open("idmap") >>> unpickled = pickle.load(idmapfile) >>> unpickled == idMap False idMap[1] {1537: (552, 1, 1537, 17.793827056884766, 3), 1540: (4220, 1, 1540, 19.31205940246582, 3), 1544: (592, 1, 1544, 18.129131317138672, 3), 1675: (529, 1, 1675, 18.347782135009766, 3), 1550: (4048, 1, 1550, 19.31205940246582, 3), 1424: (1528, 1, 1424, 19.744396209716797, 3), 1681: (1265, 1, 1681, 19.596025466918945, 3), 1560: (3457, 1, 1560, 20.530569076538086, 3), 1690: (477, 1, 1690, 17.395542144775391, 3), 1691: (554, 1, 1691, 13.446117401123047, 3), 1436: (3010, 1, 1436, 19.596025466918945, 3), 1434: (3183, 1, 1434, 19.744396209716797, 3), 1441: (3570, 1, 1441, 20.589576721191406, 3), 1435: (476, 1, 1435, 19.640911102294922, 3), 1444: (527, 1, 1444, 17.98480224609375, 3), 1478: (1897, 1, 1478, 19.596025466918945, 3), 1575: (614, 1, 1575, 19.371648788452148, 3), 1586: (2189, 1, 1586, 19.31205940246582, 3), 1716: (3470, 1, 1716, 19.158674240112305, 3), 1590: (2278, 1, 1590, 19.596025466918945, 3), 1463: (991, 1, 1463, 19.31205940246582, 3), 1594: (1890, 1, 1594, 19.596025466918945, 3), 1467: (1087, 1, 1467, 19.31205940246582, 3), 1596: (3759, 1, 1596, 19.744396209716797, 3), 1602: (3011, 1, 1602, 20.530569076538086, 3), 1547: (490, 1, 1547, 17.994071960449219, 3), 1605: (658, 1, 1605, 19.31205940246582, 3), 1606: (1794, 1, 1606, 16.964881896972656, 3), 1719: (1826, 1, 1719, 19.596025466918945, 3), 1617: (583, 1, 1617, 11.894925117492676, 3), 1492: (3441, 1, 1492, 20.500667572021484, 3), 1622: (3215, 1, 1622, 19.31205940246582, 3), 1628: (2761, 1, 1628, 19.744396209716797, 3), 1502: (1563, 1, 1502, 19.596025466918945, 3), 1632: (1108, 1, 1632, 15.457141876220703, 3), 1468: (3779, 1, 1468, 19.596025466918945, 3), 1642: (3970, 1, 1642, 19.744396209716797, 3), 1518: (612, 1, 1518, 18.570245742797852, 3), 1647: (854, 1, 1647, 16.964881896972656, 3), 1650: (2099, 1, 1650, 20.439058303833008, 3), 1651: (540, 1, 1651, 18.552841186523438, 3), 1653: (613, 1, 1653, 19.237197875976563, 3), 1532: (537, 1, 1532, 18.885730743408203, 3)} >>> unpickled[1] {1537: (64880, 1638, 56700, -1.0808743559293829e+18, 152), 1540: (64904, 1638, 0, 0.0, 0), 1544: (54472, 1490, 0, 0.0, 0), 1675: (6464, 1509, 0, 0.0, 0), 1550: (43592, 1510, 0, 0.0, 0), 1424: (43616, 1510, 0, 0.0, 0), 1681: (0, 0, 0, 0.0, 0), 1560: (400, 152, 400, 2.1299736657737219e-43, 0), 1690: (408, 152, 408, 2.7201111331839077e+26, 34), 1435: (424, 152, 61512, 1.0122952080313192e-39, 0), 1436: (400, 152, 400, 20.250289916992188, 3), 1434: (424, 152, 62080, 1.0122952080313192e-39, 0), 1441: (400, 152, 400, 12.250144958496094, 3), 1691: (424, 152, 42608, 15.813941955566406, 3), 1444: (400, 152, 400, 19.625289916992187, 3), 1606: (424, 152, 42432, 5.2947192852601414e-22, 41), 1575: (400, 152, 400, 6.2537390010262572e-36, 0), 1586: (424, 152, 42488, 1.0122601755697111e-39, 0), 1716: (400, 152, 400, 6.2537390010262572e-36, 0), 1590: (424, 152, 64144, 1.0126357235581501e-39, 0), 1463: (400, 152, 400, 6.2537390010262572e-36, 0), 1594: (424, 152, 32672, 17.002994537353516, 3), 1467: (400, 152, 400, 19.750289916992187, 3), 1596: (424, 152, 7176, 1.0124003054161436e-39, 0), 1602: (400, 152, 400, 18.500289916992188, 3), 1547: (424, 152, 7000, 1.0124003054161436e-39, 0), 1605: (400, 152, 400, 20.500289916992188, 3), 1478: (424, 152, 42256, -6.0222748507426518e+30, 222), 1719: (400, 152, 400, 6.2537390010262572e-36, 0), 1617: (424, 152, 16472, 1.0124283313854301e-39, 0), 1492: (400, 152, 400, 6.2537390010262572e-36, 0), 1622: (424, 152, 35304, 1.0123190301052127e-39, 0), 1628: (400, 152, 400, 6.2537390010262572e-36, 0), 1502: (424, 152, 63152, 19.627988815307617, 3), 1632: (400, 152, 400, 19.375289916992188, 3), 1468: (424, 152, 38088, 1.0124213248931084e-39, 0), 1642: (400, 152, 400, 6.2537390010262572e-36, 0), 1518: (424, 152, 63896, 1.0127436235399031e-39, 0), 1647: (400, 152, 400, 6.2537390010262572e-36, 0), 1650: (424, 152, 53424, 16.752857208251953, 3), 1651: (400, 152, 400, 19.250289916992188, 3), 1653: (424, 152, 50624, 1.0126497365427934e-39, 0), 1532: (400, 152, 400, 6.2537390010262572e-36, 0)} The keys come out fine, the values are screwed up. I tried same thing loading file in binary mode; didn't fix the problem. Any idea what I'm doing wrong? Edit: Here's the code with binary. Note that the values are different in the unpickled object. >>> idmapfile = open("idmap", mode="wb") >>> pickle.dump(idMap, idmapfile) >>> idmapfile.close() >>> idmapfile = open("idmap", mode="rb") >>> unpickled = pickle.load(idmapfile) >>> unpickled==idMap False >>> unpickled[1] {1537: (12176, 2281, 56700, -1.0808743559293829e+18, 152), 1540: (0, 0, 15934, 2.7457842047810522e+26, 108), 1544: (400, 152, 400, 4.9518498821046956e+27, 53), 1675: (408, 152, 408, 2.7201111331839077e+26, 34), 1550: (456, 152, 456, -1.1349175514578289e+18, 152), 1424: (432, 152, 432, 4.5939047815653343e-40, 11), 1681: (408, 152, 408, 2.1299736657737219e-43, 0), 1560: (376, 152, 376, 2.1299736657737219e-43, 0), 1690: (376, 152, 376, 2.1299736657737219e-43, 0), 1435: (376, 152, 376, 2.1299736657737219e-43, 0), 1436: (376, 152, 376, 2.1299736657737219e-43, 0), 1434: (376, 152, 376, 2.1299736657737219e-43, 0), 1441: (376, 152, 376, 2.1299736657737219e-43, 0), 1691: (376, 152, 376, 2.1299736657737219e-43, 0), 1444: (376, 152, 376, 2.1299736657737219e-43, 0), 1606: (25784, 2281, 376, -3.2883343074537754e+26, 34), 1575: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1586: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1716: (24240, 2281, 376, -3.0093091599657311e-35, 26), 1590: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1463: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1594: (24240, 2281, 376, -4123208450048.0, 196), 1467: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1596: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1602: (25784, 2281, 376, -5.9963281433905448e+26, 76), 1547: (25784, 2281, 376, -218106240.0, 139), 1605: (25784, 2281, 376, -3.7138649803377281e+27, 56), 1478: (376, 152, 376, 2.1299736657737219e-43, 0), 1719: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1617: (25784, 2281, 376, -1.4411779941597184e+17, 237), 1492: (25784, 2281, 376, 2.8596493694487798e-30, 80), 1622: (25784, 2281, 376, 184686084096.0, 93), 1628: (1336, 152, 1336, 3.1691839245470052e+29, 179), 1502: (1272, 152, 1272, -5.2042207205116645e-17, 99), 1632: (1208, 152, 1208, 2.1299736657737219e-43, 0), 1468: (1144, 152, 1144, 2.1299736657737219e-43, 0), 1642: (1080, 152, 1080, 2.1299736657737219e-43, 0), 1518: (1016, 152, 1016, 4.0240902787680023e+35, 145), 1647: (952, 152, 952, -985172619034624.0, 237), 1650: (888, 152, 888, 12094787289088.0, 66), 1651: (824, 152, 824, 2.1299736657737219e-43, 0), 1653: (760, 152, 760, 0.00018310768064111471, 238), 1532: (696, 152, 696, 8.8978061885676389e+26, 125)} OK I've isolated the problem, but don't know why it's so. First, apparently what I'm pickling are not tuples (though they look like it), but instead numpy.void types. Here is a series to illustrate the problem. first = run0.detections[0] >>> first (1, 19, 1578, 82.637763977050781, 1) >>> type(first) <type 'numpy.void'> >>> firstTuple = tuple(first) >>> theFile = open("pickleTest", "w") >>> pickle.dump(first, theFile) >>> theTupleFile = open("pickleTupleTest", "w") >>> pickle.dump(firstTuple, theTupleFile) >>> theFile.close() >>> theTupleFile.close() >>> first (1, 19, 1578, 82.637763977050781, 1) >>> firstTuple (1, 19, 1578, 82.637764, 1) >>> theFile = open("pickleTest", "r") >>> theTupleFile = open("pickleTupleTest", "r") >>> unpickledTuple = pickle.load(theTupleFile) >>> unpickledVoid = pickle.load(theFile) >>> type(unpickledVoid) <type 'numpy.void'> >>> type(unpickledTuple) <type 'tuple'> >>> unpickledTuple (1, 19, 1578, 82.637764, 1) >>> unpickledTuple == firstTuple True >>> unpickledVoid == first False >>> unpickledVoid (7936, 1705, 56700, -1.0808743559293829e+18, 152) >>> first (1, 19, 1578, 82.637763977050781, 1)

Read the article
how to send some data to the Thread module on python and google-map-engine

- by zjm1126

from google.appengine.ext import db class Log(db.Model): content = db.StringProperty(multiline=True) class MyThread(threading.Thread): def run(self,request): #logs_query = Log.all().order('-date') #logs = logs_query.fetch(3) log=Log() log.content=request.POST.get('content',None) log.put() def Log(request): thr = MyThread() thr.start(request) return HttpResponse('') error is : Exception in thread Thread-1: Traceback (most recent call last): File "D:\Python25\lib\threading.py", line 486, in __bootstrap_inner self.run() File "D:\zjm_code\helloworld\views.py", line 33, in run log.content=request.POST.get('content',None) NameError: global name 'request' is not defined

Read the article
Python RegExp exception

- by Jasie

How do I split on all nonalphanumeric characters, EXCEPT the apostrophe? re.split('\W+',text) works, but will also split on apostrophes. How do I add an exception to this rule? Thanks!

Read the article
I want to design a html form in python

- by VaIbHaV-JaIn

when user will enter details in the text box on the html from <h1>Please enter new password</h1> <form method="POST" enctype="application/json action="uid"> Password<input name="passwd"type="password" /><br> Retype Password<input name="repasswd" type="password" /><br> <input type="Submit" /> </form> </body> i want to post the data in json format through http post request and also i want to set content-type = application/json

Read the article
How to remove commas etc from a matrix in python

- by robert

say ive got a matrix that looks like: [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] how can i make it on seperate lines: [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] and then remove commas etc: 0 0 0 0 0 And also to make it blank instead of 0's, so that numbers can be put in later, so in the end it will be like: _ 1 2 _ 1 _ 1 (spaces not underscores) thanks

Read the article
Django models & Python class attributes

- by Geo

The tutorial on the django website shows this code for the models: from django.db import models class Poll(models.Model): question = models.CharField(max_length=200) pub_date = models.DateTimeField('date published') class Choice(models.Model): poll = models.ForeignKey(Poll) choice = models.CharField(max_length=200) votes = models.IntegerField() Now, each of those attribute, is a class attribute, right? So, the same attribute should be shared by all instances of the class. A bit later, they present this code: class Poll(models.Model): # ... def __unicode__(self): return self.question class Choice(models.Model): # ... def __unicode__(self): return self.choice How did they turn from class attributes into instance attributes? Did I get class attributes wrong?

Read the article
Python unicode search not giving correct answer

- by user1318912

I am trying to search hindi words contained one line per file in file-1 and find them in lines in file-2. I have to print the line numbers with the number of words found. This is the code: import codecs hypernyms = codecs.open("hindi_hypernym.txt", "r", "utf-8").readlines() words = codecs.open("hypernyms_en2hi.txt", "r", "utf-8").readlines() count_arr = [] for counter, line in enumerate(hypernyms): count_arr.append(0) for word in words: if line.find(word) >=0: count_arr[counter] +=1 for iterator, count in enumerate(count_arr): if count>0: print iterator, ' ', count This is finding some words, but ignoring some others The input files are: File-1: ???? ??????? File-2: ???????, ????-???? ?????-???, ?????-???, ?????_???, ?????_??? ????_????, ????-????, ???????_???? ????-???? This gives output: 0 1 3 1 Clearly, it is ignoring ??????? and searching for ???? only. I have tried with other inputs as well. It only searches for one word. Any idea how to correct this?

Read the article
[Python] OR in regular expression?

- by www.yegorov-p.ru

Hello. I have text file with several thousands lines. I want to parse this file into database and decided to write a regexp. Here's part of file: blablabla checked=12 unchecked=1 blablabla unchecked=13 blablabla checked=14 As a result, I would like to get something like (12,1) (0,13) (14,0) Is it possible?

Read the article
Dynamic Operator Overloading on dict classes in Python

- by Ishpeck

I have a class that dynamically overloads basic arithmetic operators like so... import operator class IshyNum: def __init__(self, n): self.num=n self.buildArith() def arithmetic(self, other, o): return o(self.num, other) def buildArith(self): map(lambda o: setattr(self, "__%s__"%o,lambda f: self.arithmetic(f, getattr(operator, o))), ["add", "sub", "mul", "div"]) if __name__=="__main__": number=IshyNum(5) print number+5 print number/2 print number*3 print number-3 But if I change the class to inherit from the dictionary (class IshyNum(dict):) it doesn't work. I need to explicitly def __add__(self, other) or whatever in order for this to work. Why?

Read the article
Adding a small photo/image to a large graph in Matplotlib/Python

- by Mark

I have a large graph that I am generating in matplotlib. I'd like to add a number of icons to this graph at certain (x,y) coordinates. I am wondering if there is any way to do that in matplotlib Thank you

Read the article
Python debugging in Eclipse+PyDev

- by Gökhan Sever

Hello, I try Eclipse+PyDev pair for some of my work. (Eclipse v3.5.0 + PyDev v1.5.6) I couldn't find a way to expose all of my variables to the PyDev console (Through PyDev console - Console for current active editor option) I use a simple code to describe the issue. When I step-by-step go through the code I can't access my "x" variable from the console. It is viewed on Variables tab, but that's not really what I want. Any help is appreciate. See my screenshot for better description:

Read the article
Python combinations no repeat by constraint

- by user2758113

I have a tuple of tuples (Name, val 1, val 2, Class) tuple = (("Jackson",10,12,"A"), ("Ryan",10,20,"A"), ("Michael",10,12,"B"), ("Andrew",10,20,"B"), ("McKensie",10,12,"C"), ("Alex",10,20,"D")) I need to return all combinations using itertools combinations that do not repeat classes. How can I return combinations that dont repeat classes. For example, the first returned statement would be: tuple0, tuple2, tuple4, tuple5 and so on.

Read the article
Look for match in a nested list in Python

- by elfuego1

Hello everybody, I have two nested lists of different sizes: A = [[1, 7, 3, 5], [5, 5, 14, 10]] B = [[1, 17, 3, 5], [1487, 34, 14, 74], [1487, 34, 3, 87], [141, 25, 14, 10]] I'd like to gather all nested lists from list B if A[2:4] == B[2:4] and put it into list L: L = [[1, 17, 3, 5], [141, 25, 14, 10]] Additionally if the match occurs then I want to change last element of sublist B into first element of sublist A so the final solution would look like this: L1 = [[1, 17, 3, 1], [141, 25, 14, 5]]

Read the article
supply inputs to python unittests

- by zubin71

I`m relatively new to the concept of unit-testing and have very little experience in the same. I have been looking at lots of articles on how to write unit-tests; however, I still have difficulty in writing tests where conditions like the following arise:- Test user Input. Test input read from a file. Test input read from an environment variable. Itd be great if someone could show me how to approach the above mentioned scenarios; itd still be awesome if you could point me to a few docs/articles/blog posts which I could read.

Read the article
Extra characters Extracted with XPath and Python (html)

- by Nacari

I have been using XPath with scrapy to extract text from html tags online, but when I do I get extra characters attached. An example is trying to extract a number, like "204" from a <td> tag and getting [u'204']. In some cases its much worse. For instance trying to extract "1 - Mathoverflow" and instead getting [u'\r\n\t\t 1 \u2013 MathOverflow\r\n\t\t ']. Is there a way to prevent this, or trim the strings so that the extra characters arent a part of the string? (using items to store the data). It looks like it has something to do with formatting, so how do I get xpath to not pick up that stuff?

Read the article
How to change the stdin encoding on python

- by user210481

Hi, I'm using windows and linux machines for the same project. The default encoding for stdin on windows is cp1252 and on linux is utf-8. I would like to change everything to uft-8. Is it possible? How can I do it? Thanks Eduardo

Read the article
Python module being reloaded for each request with django and mod_wsgi

- by Vishal

I have a variable in init of a module which get loaded from the database and takes about 15 seconds. For django development server everything is working fine but looks like with apache2 and mod_wsgi the module is loaded with every request (taking 15 seconds). Any idea about this behavior? Update: I have enabled daemon mode in mod wsgi, looks like its not reloading the modules now! needs more testing and I will update.

Read the article
Sorting Python list based on the length of the string

- by prosseek

I want to sort a list of strings based on the string length. I tried to use sort as follows, but it doesn't seem to give me correct result. xs = ['dddd','a','bb','ccc'] print xs xs.sort(lambda x,y: len(x) < len(y)) print xs ['dddd', 'a', 'bb', 'ccc'] ['dddd', 'a', 'bb', 'ccc'] What might be wrong?

Read the article
match strings in python

- by mesun

Write a function, called constrainedMatchPair which takes three arguments: a tuple representing starting points for the first substring, a tuple representing starting points for the second substring, and the length of the first substring. The function should return a tuple of all members (call it n) of the first tuple for which there is an element in the second tuple (call it k) such that n+m+1 = k, where m is the length of the first substring. Complete the definition def constrainedMatchPair(firstMatch,secondMatch,length):

Read the article
has any simply way to delete a value in list of python

- by zjm1126

a=[1,2,3,4] b=a.index(6) del a[b] print a it show error: Traceback (most recent call last): File "D:\zjm_code\a.py", line 6, in <module> b=a.index(6) ValueError: list.index(x): x not in list so i have to do this : a=[1,2,3,4] try: b=a.index(6) del a[b] except: pass print a but this is not simple,has any simply way ? thanks

Read the article

< Previous Page | 155 156 157 158 159 160 161 162 163 164 165 166 | Next Page >