Search Results

Search found 144 results on 6 pages for 'urlopen'.

Page 5 of 6

  • Partial Upload With storbinary in python

    - by brian
    I've written some Python code to download an image using urllib.urlopen().read() and then upload it to an FTP site using ftplib.FTP().storbinary(), but I'm having a problem: sometimes the image file is only partially uploaded, so I get images with the bottom 20% or so cut off. I've checked the locally downloaded version and the entire image downloads successfully, which leads me to believe the problem is with storbinary. I believe I am opening and closing all of the files correctly. Does anyone have any clues as to why I'm getting a partial upload with storbinary? Update: when I run through the commands in the Python shell, the upload completes successfully. I don't know why it would behave differently when run as a script.
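
    A truncated upload of this kind is often caused by handing storbinary a file that hasn't been flushed to disk yet, or one opened in text mode. A minimal sketch, assuming Python 2 and placeholder URLs and credentials, that closes the local file before re-opening it in binary mode for the upload:

        import urllib
        import ftplib

        # download the image and write it out in binary mode; the with-block
        # guarantees the file is flushed and closed before the upload starts
        data = urllib.urlopen('http://example.com/image.jpg').read()
        with open('image.jpg', 'wb') as local:
            local.write(data)

        # re-open in binary mode ('rb') so storbinary streams the raw bytes
        ftp = ftplib.FTP('ftp.example.com', 'user', 'password')
        with open('image.jpg', 'rb') as local:
            ftp.storbinary('STOR image.jpg', local)
        ftp.quit()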

  • How to find values between XML tags in Python?

    - by Harshit Sharma
    I am using Google's weather API to retrieve weather information, and I want to find values between XML tags. The following code gives me the weather condition for a city, but I am unable to obtain other parameters such as temperature. If possible, please also explain how the split calls in this code work:

        import urllib

        def getWeather(city):
            # create google weather api url
            url = "http://www.google.com/ig/api?weather=" + urllib.quote(city)
            try:
                # open google weather api url
                f = urllib.urlopen(url)
            except:
                # if there was an error opening the url, return
                return "Error opening url"
            # read contents to a string
            s = f.read()
            # extract weather condition data from xml string
            weather = s.split("<current_conditions><condition data=\"")[-1].split("\"")[0]
            # if there was an error getting the condition, the city is invalid
            if weather == "<?xml version=":
                return "Invalid city"
            # return the weather condition
            return weather

        def main():
            while True:
                city = raw_input("Give me a city: ")
                weather = getWeather(city)
                print(weather)

        if __name__ == "__main__":
            main()

    Thank you.
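
    The split chain works like this: s.split(marker)[-1] keeps everything after the last occurrence of the marker string, and the following split("\"")[0] keeps the text up to the next double quote, i.e. the attribute value. Rather than slicing strings, an XML parser exposes every attribute directly. A sketch using the standard library's ElementTree; the element names follow the XML this API is known to return, but treat them as assumptions to verify against a real response:

        import urllib
        import xml.etree.ElementTree as ET

        def get_conditions(city):
            url = "http://www.google.com/ig/api?weather=" + urllib.quote(city)
            tree = ET.parse(urllib.urlopen(url))
            current = tree.find('weather/current_conditions')
            if current is None:
                return None  # invalid city or an error response
            # each child element stores its value in a 'data' attribute
            return {
                'condition': current.find('condition').get('data'),
                'temp_f': current.find('temp_f').get('data'),
                'temp_c': current.find('temp_c').get('data'),
            }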

  • How can I get the last-modified time with python3 urllib?

    - by Daenyth
    I'm porting a program of mine from Python 2 to Python 3, and I'm hitting the following error:

        AttributeError: 'HTTPMessage' object has no attribute 'getdate'

    Here's the code:

        conn = urllib.request.urlopen(fileslist, timeout=30)
        last_modified = conn.info().getdate('last-modified')

    This section worked under Python 2.7, and so far I haven't been able to find the correct way to get this information in Python 3.1. The full context is an update method: it pulls new files from a server down to its local database, but only if the file on the server is newer than the local file. If there's a smarter way to achieve this than comparing local and remote file timestamps, I'm open to that as well.
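
    In Python 3, conn.info() returns an http.client.HTTPMessage, which is an email.message.Message without the old getdate() helper. One way to get the same value, as a sketch: read the raw header and parse it with email.utils from the standard library:

        import urllib.request
        import email.utils

        conn = urllib.request.urlopen(fileslist, timeout=30)
        raw = conn.info()['Last-Modified']            # e.g. 'Wed, 28 Apr 2010 02:37:53 GMT'
        last_modified = email.utils.parsedate(raw)    # 9-tuple compatible with time.mktime()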

  • Backup Google Calendar programmatically: https://www.google.com/calendar/exporticalzip

    - by Michael
    I'm struggling with a Python script that automatically grabs the zip file containing all my Google calendars and stores it (as a backup) on my hard disk. I'm using ClientLogin to get an authentication token (and can successfully obtain the token). Unfortunately, I'm unable to retrieve the file at https://www.google.com/calendar/exporticalzip; it always asks me for the login credentials again by returning a login page as HTML (instead of the zip). Here's the critical code:

        post_data = urllib.urlencode({'auth': token, 'continue': zip_url})
        request = urllib2.Request('https://www.google.com/calendar', post_data, header)
        try:
            f = urllib2.urlopen(request)
            result = f.read()
        except:
            print "Error"

    Anyone have any ideas, or done this before? Or an alternative idea for how to back up all my calendars (automatically!)?
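
    With ClientLogin, the token normally travels in an Authorization header rather than as a form field, which would explain the login page coming back. A sketch under that assumption; the header format is the documented ClientLogin convention, so verify it against the Calendar API docs:

        import urllib2

        zip_url = 'https://www.google.com/calendar/exporticalzip'
        request = urllib2.Request(zip_url)
        request.add_header('Authorization', 'GoogleLogin auth=%s' % token)
        response = urllib2.urlopen(request)
        with open('calendars_backup.zip', 'wb') as out:
            out.write(response.read())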

  • Python: find <title>

    - by Peter
    I have this:

        response = urllib2.urlopen(url)
        html = response.read()
        begin = html.find('<title>')
        end = html.find('</title>', begin)
        title = html[begin + len('<title>'):end].strip()

    If url = http://www.google.com, the title comes out fine as "Google", but if url = "http://www.britishcouncil.org/learning-english-gateway" then the title becomes:

        <!doctype html public "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML> <HEAD> <base href="http://www.britishcouncil.org/" /> <META http-equiv="Content-Type" Content="text/html;charset=utf-8"> <meta name="WT.sp" content="Learning;Home Page Smart View" /> <meta name="WT.cg_n" content="Learn English Gateway" /> <META NAME="DCS.dcsuri" CONTENT="/learning-english-gateway.htm">...

    What is actually happening? Why can't I get the title?
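
    str.find() is case-sensitive: that page uses uppercase <TITLE> and </TITLE>, so both find() calls return -1 and the slice becomes html[6:-1], i.e. nearly the entire document. A sketch of a case-insensitive version using a regular expression:

        import re
        import urllib2

        html = urllib2.urlopen(url).read()
        # IGNORECASE matches <title>, <TITLE>, etc.; DOTALL lets the title span lines
        match = re.search(r'<title[^>]*>(.*?)</title>', html, re.IGNORECASE | re.DOTALL)
        title = match.group(1).strip() if match else None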

  • Google Search API - Only returning 4 results

    - by user353829
    After much experimenting and googling, the following Python code successfully calls Google's Search API, but only returns 4 results. After reading the Google Search API docs, I thought 'start=' would return additional results, but that does not happen. Can anyone give pointers? Thanks. Python code:

        #!/usr/bin/python
        import urllib
        import simplejson

        query = urllib.urlencode({'q': 'site:example.com'})
        url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s&start=50' \
            % (query)
        search_results = urllib.urlopen(url)
        json = simplejson.loads(search_results.read())
        results = json['responseData']['results']
        for i in results:
            print i['title'] + ": " + i['url']
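
    The AJAX Search API pages in small chunks: 4 results per request by default, 8 with rsz=large, and it refuses to page beyond a fairly low cap (around 64 results total), so start=50 with the default page size yields nothing extra. A sketch that walks the pages; treat the exact caps as assumptions to check against the API docs:

        import urllib
        import simplejson

        params = urllib.urlencode({'q': 'site:example.com', 'v': '1.0', 'rsz': 'large'})
        base = 'http://ajax.googleapis.com/ajax/services/search/web?%s' % params

        for start in range(0, 32, 8):  # 8 results per page with rsz=large
            page = simplejson.load(urllib.urlopen(base + '&start=%d' % start))
            for result in (page.get('responseData') or {}).get('results', []):
                print result['title'] + ': ' + result['url']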

  • Python: replace URLs in a string with page titles

    - by Hellnar
    Hello, I would like to replace URLs in a string with the titles of the pages they point to. For example:

        mystring = "Ah I like this site: http://www.stackoverflow.com. Also I must say I like http://www.digg.com"
        sanitize(mystring)
        # it becomes "Ah I like this site: Stack Overflow. Also I must say I like Digg - The Latest News Headlines, Videos and Images"

    For resolving a URL to its title, I have written this snippet:

        # get_title: string -> string
        def get_title(url):
            """Returns the title of the input URL"""
            output = BeautifulSoup.BeautifulSoup(urllib.urlopen(url))
            return output.title.string
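
    What's missing is the substitution step. re.sub accepts a function as the replacement, so each matched URL can be swapped for its fetched title. A sketch; the URL pattern is a deliberate simplification that stops before trailing punctuation, not a full URL grammar:

        import re
        import urllib
        import BeautifulSoup

        def get_title(url):
            soup = BeautifulSoup.BeautifulSoup(urllib.urlopen(url))
            return soup.title.string

        def sanitize(text):
            # the trailing \w keeps the final '.' of "stackoverflow.com." out of the match
            return re.sub(r'https?://[\w.\-/]*\w',
                          lambda m: get_title(m.group(0)), text)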

  • Python BeautifulSoup Print Info in CSV

    - by Codin
    I can print the information I am pulling from a site with no problem. But when I try to place the street names in one column and the zip codes in another column of a CSV file, that is when I run into problems. All I get in the CSV is the two column names, and everything else lands in its own column across the page. Here is my code (I am using Python 2.7.5 and Beautiful Soup 4):

        from bs4 import BeautifulSoup
        import csv
        import urllib2

        url = "http://www.conakat.com/states/ohio/cities/defiance/road_maps/"
        page = urllib2.urlopen(url)
        soup = BeautifulSoup(page.read())
        f = csv.writer(open("Defiance Steets1.csv", "w"))
        f.writerow(["Name", "ZipCodes"])  # write column headers as the first line
        links = soup.find_all(['i', 'a'])
        for link in links:
            names = link.contents[0]
            print unicode(names)
            f.writerow(names)
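
    The give-away is f.writerow(names): writerow expects a sequence of fields, and a bare string is a sequence of characters, so each character gets its own column. Collect each name/zip pair into a list instead. A sketch, assuming the page alternates name tags and zip-code tags (verify against the actual markup):

        links = soup.find_all(['i', 'a'])
        for name_tag, zip_tag in zip(links[::2], links[1::2]):
            # one list per row: two columns, not one column per character
            f.writerow([unicode(name_tag.contents[0]),
                        unicode(zip_tag.contents[0])])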

  • Replacing backslashes in Python strings

    - by user323659
    I have some code to encrypt strings in Python. The encrypted text is used as a parameter in some URLs, but after encrypting there are backslashes in the string, and I cannot use a single backslash in urllib2.urlopen. I cannot replace a single backslash with a double one. For example:

        print cipherText
        '\t3-@\xab7+\xc7\x93H\xdc\xd1\x13G\xe1\xfb'
        print cipherText.replace('\\','\\\\')
        '\t3-@\xab7+\xc7\x93H\xdc\xd1\x13G\xe1\xfb'

    Putting an r in front of the \ in the replace statement did not work either. All I want to do is call this kind of URL:

        http://awebsite.me/main?param="\t3-@\xab7+\xc7\x93H\xdc\xd1\x13G\xe1\xfb"

    And this URL can be successfully called:

        http://awebsite.me/main?param="\\t3-@\\xab7+\\xc7\\x93H\\xdc\\xd1\\x13G\\xe1\\xfb"

    Any idea will be appreciated.
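
    The replace finds nothing because there are no literal backslashes in the string: '\t' and '\xab' are how repr() displays single bytes (a tab and byte 0xAB). The usual way to put raw ciphertext bytes into a URL is to percent-encode them. A sketch, with the site URL taken from the question as a placeholder:

        import urllib
        import urllib2

        # quote with safe='' percent-encodes every byte that isn't URL-safe
        param = urllib.quote(cipherText, safe='')
        response = urllib2.urlopen('http://awebsite.me/main?param=' + param)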

  • Convert a key to MIME encoded form in Python

    - by jaysh
    This is the code:

        # retrieve the public key from PKS
        f = urllib.urlopen('http://pool.sks-keyservers.net:11371/pks/lookup?op=get&search=0x58e9390daf8c5bf3')
        data = f.read()
        decoded_bytes = base64.b64decode(data)
        print decoded_bytes

    I need to convert the key into MIME encoded form; it currently comes in ASCII-armored radix-64 format. For that I have to get the radix-64 data in its binary form, and I also need to remove its header and checksum before the conversion to MIME format, but I didn't find any method that can do this. I used the base64.b64decode method and it gives me this error:

        Traceback (most recent call last):
          File "RetEnc.py", line 12, in ?
            decoded_bytes = base64.b64decode(data)
          File "/usr/lib/python2.4/base64.py", line 76, in b64decode
            raise TypeError(msg)
        TypeError: Incorrect padding

    I don't know what to do. Can anybody suggest something related to this? Thanks!
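
    b64decode chokes because the keyserver response isn't bare base64: it's a page wrapping an ASCII-armored block, complete with BEGIN/END lines, armor headers, and a trailing '=' checksum line. A sketch that cuts the armor down to just the radix-64 body before decoding; the marker lines follow the standard OpenPGP armor format, but verify them against a real response:

        import base64
        import urllib

        url = ('http://pool.sks-keyservers.net:11371/pks/lookup'
               '?op=get&search=0x58e9390daf8c5bf3')
        lines = urllib.urlopen(url).read().splitlines()

        start = lines.index('-----BEGIN PGP PUBLIC KEY BLOCK-----') + 1
        end = lines.index('-----END PGP PUBLIC KEY BLOCK-----')
        body = [l for l in lines[start:end]
                if l                        # skip the blank separator line
                and ':' not in l            # skip armor headers like 'Version:'
                and not l.startswith('=')]  # skip the CRC24 checksum line
        key_bytes = base64.b64decode(''.join(body))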

  • Escaping query strings with special characters in Python

    - by that_guy
    I have some pretty messy URLs that I got via scraping. The problem is that they contain spaces and other special characters in the path and query string, for example:

        http://www.example.com/some path/to the/file.html
        http://www.example.com/some path/?file=path to/file name.png&name=name.me

    Is there an easy and robust way to escape these URLs so that I can pass them to urlopen? I tried urllib.quote, but it escapes the '?', '&', and '=' in the query string, and it mangles the protocol part as well. Currently I am trying to use a regex to separate the protocol, path, and query string and escape them separately, but there are cases where they aren't separated properly. Any advice is appreciated.
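
    One workable approach is to let urlparse do the separating, then escape each piece with the quoting rules appropriate to it. A sketch (Python 2 module names); it assumes the messy URLs at least have their '?', '&', and '=' in the right places:

        import urllib
        import urlparse

        def escape_url(url):
            parts = urlparse.urlsplit(url)
            # quote() keeps '/' in the path; quote_plus() encodes spaces in values
            path = urllib.quote(parts.path)
            query = '&'.join('%s=%s' % (urllib.quote_plus(k), urllib.quote_plus(v))
                             for k, v in urlparse.parse_qsl(parts.query))
            return urlparse.urlunsplit((parts.scheme, parts.netloc, path,
                                        query, parts.fragment))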

  • How do I print the Images?

    - by user1477539
    I want to print the images of the 30 NBA teams drafting in the first round. However, when I tell it to print, it prints out the link instead of the image. How do I get it to print out the image instead of giving me the image link? Here's my code:

        import urllib2
        from BeautifulSoup import BeautifulSoup
        # or if you're using BeautifulSoup4:
        # from bs4 import BeautifulSoup

        soup = BeautifulSoup(urllib2.urlopen('http://www.cbssports.com/nba/draft/mock-draft').read())
        rows = soup.findAll("table", attrs={'class': 'data borderTop'})[0].tbody.findAll("tr")[2:]
        for row in rows:
            fields = row.findAll("td")
            if len(fields) >= 3:
                anchor = row.findAll("td")[1].find("a")
                if anchor:
                    print anchor
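
    Printing a tag can only ever show markup; a console script has no way to render an image, so the realistic option is to download each logo file and open it with an image viewer or GUI toolkit. A sketch that pulls each row's <img> and saves it; the assumption that every relevant row contains an <img> tag with an absolute 'src' needs checking against the actual page:

        import urllib

        for row in rows:
            img = row.find('img')
            if img and img.get('src'):
                src = img['src']
                # save the logo under its own file name
                urllib.urlretrieve(src, src.split('/')[-1])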

  • Correct way to protect a private API key when versioning a python application on a public git repo

    - by systempuntoout
    I would like to open-source a Python project on GitHub, but it contains an API key that should not be distributed. I guess there's something better than removing the key each time a push is made to the repo. Imagine a simplified foomodule.py:

        import urllib2

        API_KEY = 'XXXXXXXXX'
        urllib2.urlopen("http://example.com/foo?id=123%s" % API_KEY).read()

    What I'm thinking is:

    1. Move API_KEY into a second module, key.py, import it in foomodule.py, and add key.py to the .gitignore file.
    2. Same as 1, but using ConfigParser.

    Do you know a good programmatic way to handle this scenario?
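
    A sketch of option 2, with an environment-variable fallback so the repo never holds the secret; the file path, section, and variable names here are hypothetical:

        import os
        import ConfigParser

        def get_api_key(path='secrets.cfg'):
            parser = ConfigParser.ConfigParser()
            if parser.read(path):  # read() returns the list of files it parsed
                return parser.get('api', 'key')
            return os.environ['FOO_API_KEY']  # hypothetical variable name

    secrets.cfg (listed in .gitignore) would contain:

        [api]
        key = XXXXXXXXX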

  • Python: Problems finding string in website source code

    - by j00niner
    I open a website with urlopen, then put the website source code into a variable like so:

        source = website.read()

    When I just print source, it comes out formatted correctly; however, when I try to iterate through it, each character appears on its own line. For example, when I just print it, it looks like this:

        <HTML>
        title</html>

    When I do this:

        for line in source:
            print line

    it looks like this:

        <
        H
        T
        M
        L
        ...

    I need to find a string that starts with "var" and then print that entire line.
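
    read() returns one big string, and iterating over a string yields individual characters; splitting on newlines first gives the line-by-line loop that was intended. A sketch:

        for line in source.splitlines():
            # lstrip() so indented 'var' declarations match too
            if line.lstrip().startswith('var'):
                print line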

  • Python if statement not working as expected

    - by Chris Esposito
    I'm searching for a string in a website and checking whether the location of this string is in the expected place. I know the string starts at the 182nd character, and if I print temp it even tells me it is 182; however, the if statement says 182 is not 182. Some code:

        f = urllib.urlopen(link)
        # store page contents in 's'
        s = f.read()
        f.close()
        temp = s.find('lettersandnumbers')
        if (htmlsize == "197"):
            #if ((s.find('lettersandnumbers')) == "182"):
            if (temp == "182"):
                print "Glorious"
                doStuff()
            else:
                print "HTML not correct. Aborting."
        else:
            print htmlsize
            print "File size is incorrect. Aborting."
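
    str.find() returns an integer, and in Python an integer never compares equal to the string "182", so that branch can never be taken. Comparing against numbers fixes it; the htmlsize comparison is adjusted here on the assumption that it is numeric too:

        if htmlsize == 197:       # assuming htmlsize is an int as well
            if temp == 182:
                print "Glorious"
                doStuff()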

  • Iterating through a JSON object.

    - by user327508
    [ { "title": "Baby (Feat. Ludacris) - Justin Bieber", "description": "Baby (Feat. Ludacris) by Justin Bieber on Grooveshark", "link": "http://listen.grooveshark.com/s/Baby+Feat+Ludacris+/2Bqvdq", "pubDate": "Wed, 28 Apr 2010 02:37:53 -0400", "pubTime": 1272436673, "TinyLink": "http://tinysong.com/d3wI", "SongID": "24447862", "SongName": "Baby (Feat. Ludacris)", "ArtistID": "1118876", "ArtistName": "Justin Bieber", "AlbumID": "4104002", "AlbumName": "My World (Part II);\nhttp://tinysong.com/gQsw", "LongLink": "11578982", "GroovesharkLink": "11578982", "Link": "http://tinysong.com/d3wI" }, { "title": "Feel Good Inc - Gorillaz", "description": "Feel Good Inc by Gorillaz on Grooveshark", "link": "http://listen.grooveshark.com/s/Feel+Good+Inc/1UksmI", "pubDate": "Wed, 28 Apr 2010 02:25:30 -0400", "pubTime": 1272435930 } ] That is the current JSON object I have. I am now trying to iterate through it to get the import stuff like title and link. This is where I am having trouble I cant seem to get to the content that is past the ":" i tried doing dictionary way couldn't get it. def getLastSong(user,limit): base_url = 'http://gsuser.com/lastSong/' user_url = base_url + str(user) + '/' + str(limit) + "/" raw = urllib.urlopen(user_url) json_raw= raw.readlines() json_object = json.loads(json_raw[0]) #filtering and making it look good. gsongs = [] print json_object for song in json_object[0]: print song This code prints all the information before ":" Please help. ignore the Justin Bieber track :)

  • Trying to grab just absolute links from a webpage using BeautifulSoup

    - by Kevin
    I am reading the contents of a webpage using BeautifulSoup. What I want is to grab just the <a href> entries that start with http://. I know that in BeautifulSoup you can search by attributes; I guess I am just having a syntax issue. I would imagine it would go something like this:

        page = urllib2.urlopen("http://www.linkpages.com")
        soup = BeautifulSoup(page)
        for link in soup.findAll('a'):
            if link['href'].startswith('http://'):
                print link

    But that returns:

        Traceback (most recent call last):
          File "<stdin>", line 2, in <module>
          File "C:\Python26\lib\BeautifulSoup.py", line 598, in __getitem__
            return self._getAttrMap()[key]
        KeyError: 'href'

    Any ideas? Thanks in advance.

    EDIT: This isn't for any site in particular; the script gets the URL from the user. So internal link targets would be an issue, which is also why I only want the <a> tags from the pages. If I turn it towards www.reddit.com, it parses the beginning links and then gets to this:

        <a href="http://www.reddit.com/top/">top</a>
        <a href="http://www.reddit.com/saved/">saved</a>
        Traceback (most recent call last):
          File "<stdin>", line 2, in <module>
          File "C:\Python26\lib\BeautifulSoup.py", line 598, in __getitem__
            return self._getAttrMap()[key]
        KeyError: 'href'
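
    The KeyError comes from an anchor that has no href at all (a named anchor like <a name="top">). Asking findAll for only anchors that carry the attribute avoids it. A sketch in BeautifulSoup 3 syntax:

        for link in soup.findAll('a', href=True):    # skips <a> tags without href
            if link['href'].startswith('http://'):
                print link['href']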

  • Python: fetching SVG file using urllib is returning binary when I need ASCII

    - by Drew Dara-Abrams
    I'm using urllib (in Python) to fetch an SVG file:

        import urllib
        urllib.urlopen('http://alpha.vectors.cloudmade.com/BC9A493B41014CAABB98F0471D759707/-122.2487,37.87588,-122.265823,37.868054?styleid=1&viewport=400x231').read()

    which produces output of this sort:

        xb6\xf6\x00\xb3\xfb2\xff\xda\xc5\xf2\xc2\x14\xef\xcd\x82\x0b\xdbU\xb0\x81\xcaF\xd8\x1a\xf6\xdf[i)\xba\xcf\x80\xab\xd6\x8c\xe3l_\xe7\n\xed2,\xbdm\xa0_|\xbb\x12\xff\xb6\xf8\xda\xd9\xc3\xd9\t\xde\x9a\xf8\xae\xb3T\xa3\r`\x8a\x08!T\xfb8\x92\x95\x0c\xdd\x8b!\x02P\xea@\x98\x1c^\xc7\xda\\\xec\xe3\xe1\xbe,0\xcd\xbeZ~\x92\xb3\xfa\xdd\xfcbyu\xb8\x83\xbb\xbdS\x0f\x82\x0b\xfe\xf5_\xdawn\xff\xef_\xff\xe5\xfa\x1f?\xbf\xffoZ\x0f\x8b\xbfV\xf4\x04\x00'

    when I was expecting something more like this:

        <?xml version='1.0' encoding='UTF-8'?>
        <svg xmlns="http://www.w3.org/2000/svg" xmlns:cm="http://cloudmade.com/" width="400" height="231">
          <rect width="100%" height="100%" fill="#eae8dd" opacity="1"/>
          <g transform="scale(0.209849975856)">
            <g transform="translate(13610569, 4561906)" flood-opacity="0.1" flood-color="grey">
              <path d="M -13610027.720000000670552 -4562403.660000000149012

    I guess this is an issue of binary vs. ASCII. Can anyone help me (a Python newbie) with the appropriate conversion so that I can get on with parsing and manipulating the SVG code?
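
    The byte salad looks like a compressed response rather than an encoding problem; gzip data in particular starts with the bytes \x1f\x8b, which is easy to check. A sketch that decompresses under that assumption:

        import gzip
        import StringIO
        import urllib

        raw = urllib.urlopen(url).read()   # url as in the snippet above
        if raw[:2] == '\x1f\x8b':          # gzip magic number
            raw = gzip.GzipFile(fileobj=StringIO.StringIO(raw)).read()
        print raw[:200]                    # should now start with <?xml ...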

  • Python web scraping involving HTML tags with attributes

    - by rohanbk
    I'm trying to make a web scraper that will parse a web page of publications and extract the authors. The skeletal structure of the web page is the following:

        <html>
        <body>
          <div id="container">
            <div id="contents">
              <table>
                <tbody>
                  <tr>
                    <td class="author">####I want whatever is located here###</td>
                  </tr>
                </tbody>
              </table>
            </div>
          </div>
        </body>
        </html>

    I've been trying to use BeautifulSoup and lxml to accomplish this task, but I'm not sure how to handle the two div tags and the td tag, because they have attributes. In addition, I'm not sure whether I should rely more on BeautifulSoup or lxml, or a combination of both. What should I do? At the moment, my code looks like what is below:

        import re
        import urllib2, sys
        import lxml
        from lxml import etree
        from lxml.html.soupparser import fromstring
        from lxml.etree import tostring
        from lxml.cssselect import CSSSelector
        from BeautifulSoup import BeautifulSoup, NavigableString

        address = 'http://www.example.com/'
        html = urllib2.urlopen(address).read()
        soup = BeautifulSoup(html)
        html = soup.prettify()
        html = html.replace('&nbsp', '&#160')
        html = html.replace('&iacute', '&#237')
        root = fromstring(html)

    I realize that a lot of the import statements may be redundant, but I just copied whatever I currently had in my source file. EDIT: I suppose I didn't make this quite clear, but there are multiple tags in the page that I want to scrape.
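
    Attributes aren't an obstacle here; they are how the search gets pinned down, and either library can do it on its own. Two short sketches against the soup and root objects built above:

        # BeautifulSoup: match <td> cells by their class attribute
        authors = [td.string for td in soup.findAll('td', attrs={'class': 'author'})]

        # lxml: the CSS selector 'td.author' expresses the same thing
        authors = [td.text for td in CSSSelector('td.author')(root)]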

  • Making HTTP POST request

    - by infrared
    I'm trying to make a POST request to retrieve information about a book. Here is the code; it returns HTTP code 302, Moved:

        import httplib, urllib

        params = urllib.urlencode({
            'isbn': '9780131185838',
            'catalogId': '10001',
            'schoolStoreId': '15828',
            'search': 'Search'
        })
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        conn = httplib.HTTPConnection("bkstr.com:80")
        conn.request("POST", "/webapp/wcs/stores/servlet/BuybackSearch", params, headers)
        response = conn.getresponse()
        print response.status, response.reason
        data = response.read()
        conn.close()

    When I try from a browser, from this page: http://www.bkstr.com/webapp/wcs/stores/servlet/BuybackMaterialsView?langId=-1&catalogId=10001&storeId=10051&schoolStoreId=15828 , it works. What am I missing in my code? Thanks.

    EDIT: Here's what I get when I print response.msg:

        302 Moved
        Date: Tue, 07 Sep 2010 16:54:29 GMT
        Vary: Host,Accept-Encoding,User-Agent
        Location: http://www.bkstr.com/webapp/wcs/stores/servlet/BuybackSearch
        X-UA-Compatible: IE=EmulateIE7
        Content-Length: 0
        Content-Type: text/plain; charset=utf-8

    It seems that the Location points to the same URL I'm trying to access in the first place?

    EDIT 2: I've tried using urllib2 as suggested. Here is the code:

        import urllib, urllib2

        url = 'http://www.bkstr.com/webapp/wcs/stores/servlet/BuybackSearch'
        values = {'isbn': '9780131185838',
                  'catalogId': '10001',
                  'schoolStoreId': '15828',
                  'search': 'Search'}

        data = urllib.urlencode(values)
        req = urllib2.Request(url, data)
        response = urllib2.urlopen(req)
        print response.geturl()
        print response.info()
        the_page = response.read()
        print the_page

    And here is the output:

        http://www.bkstr.com/webapp/wcs/stores/servlet/BuybackSearch
        Date: Tue, 07 Sep 2010 16:58:35 GMT
        Pragma: No-cache
        Cache-Control: no-cache
        Expires: Thu, 01 Jan 1970 00:00:00 GMT
        Set-Cookie: JSESSIONID=0001REjqgX2axkzlR6SvIJlgJkt:1311s25dm; Path=/
        Vary: Accept-Encoding,User-Agent
        X-UA-Compatible: IE=EmulateIE7
        Content-Length: 0
        Connection: close
        Content-Type: text/html; charset=utf-8
        Content-Language: en-US
        Set-Cookie: TSde3575=225ec58bcb0fdddfad7332c2816f1f152224db2f71e1b0474c866f3b; Path=/
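
    Two things stand out: httplib doesn't follow redirects (hence the bare 302, bouncing bkstr.com to www.bkstr.com), and the empty 200 from urllib2 arrives with Set-Cookie headers, which suggests the server wants a session established before it will answer the search. A sketch that primes cookies on the browse page first, using only the standard library; whether that alone satisfies the server is an assumption to test:

        import cookielib
        import urllib
        import urllib2

        jar = cookielib.CookieJar()
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

        # visit the page a browser would start from, to pick up session cookies
        opener.open('http://www.bkstr.com/webapp/wcs/stores/servlet/'
                    'BuybackMaterialsView?langId=-1&catalogId=10001'
                    '&storeId=10051&schoolStoreId=15828')

        data = urllib.urlencode({'isbn': '9780131185838', 'catalogId': '10001',
                                 'schoolStoreId': '15828', 'search': 'Search'})
        response = opener.open(
            'http://www.bkstr.com/webapp/wcs/stores/servlet/BuybackSearch', data)
        print response.read()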

  • Download-from-PyPI-and-install script

    - by zubin71
    Hello, I have written a script which fetches a distribution, given the URL. After downloading the distribution, it compares MD5 hashes to verify that the file has been downloaded properly. This is how I do it:

        def download(package_name, url):
            import urllib2
            downloader = urllib2.urlopen(url)
            package = downloader.read()
            package_file_path = os.path.join('/tmp', package_name)
            package_file = open(package_file_path, "w")
            package_file.write(package)
            package_file.close()

    I wonder if there is a better (more Pythonic) way to do what the above snippet does. Also, once the package is downloaded, this is what is done:

        def install_package(package_name):
            if package_name.endswith('.tar'):
                import tarfile
                tarfile.open('/tmp/' + package_name)
                tarfile.extract('/tmp')
            import shlex
            import subprocess
            installation_cmd = 'python %ssetup.py install' % ('/tmp/' + package_name)
            subprocess.Popen(shlex.split(installation_cmd))

    As there are a number of imports in the install_package method, I wonder if there is a better way to do this. I'd love some constructive criticism and suggestions for improvement. Also, I have only implemented install_package for .tar files; is there a way to install .tar.gz and .zip files too, without having to write separate methods for each?
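
    A sketch of one way to tighten both steps: urlretrieve streams straight to disk, and tarfile plus zipfile together cover .tar, .tar.gz, and .zip without separate methods. The unpacked-directory naming is an assumption about how the archive is laid out:

        import os
        import subprocess
        import tarfile
        import urllib
        import zipfile

        def download(package_name, url):
            path = os.path.join('/tmp', package_name)
            urllib.urlretrieve(url, path)  # no read-everything-into-memory step
            return path

        def install_package(path):
            if tarfile.is_tarfile(path):       # handles .tar and .tar.gz alike
                archive = tarfile.open(path)
            elif zipfile.is_zipfile(path):
                archive = zipfile.ZipFile(path)
            else:
                raise ValueError('unsupported archive: %s' % path)
            archive.extractall('/tmp')
            # assumes the archive unpacks to a directory named after itself
            name = os.path.basename(path)
            for suffix in ('.tar.gz', '.tar', '.zip'):
                name = name.replace(suffix, '')
            subprocess.check_call(['python', 'setup.py', 'install'],
                                  cwd=os.path.join('/tmp', name))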

  • Python: puzzling behaviour inside httplib

    - by Anna
    I have added one line (import pdb; pdb.set_trace()) to httplib's HTTPConnection.putheader, so I can see what's going on inside. httplib.py, line 489:

        def putheader(self, header, value):
            """Send a request header line to the server.

            For example: h.putheader('Accept', 'text/html')
            """
            import pdb; pdb.set_trace()
            if self.__state != _CS_REQ_STARTED:
                raise CannotSendHeader()
            str = '%s: %s' % (header, value)
            self._output(str)

    Then I ran this from the interpreter:

        import urllib2
        urllib2.urlopen('http://www.ioerror.us/ip/headers')

    ... and as expected, pdb kicks in:

        > c:\python26\lib\httplib.py(858)putheader()
        -> if self.__state != _CS_REQ_STARTED:
        (Pdb)

    In pdb I have the luxury of evaluating expressions on the fly, so I tried to enter self.__state:

        (Pdb) self.__state
        *** AttributeError: HTTPConnection instance has no attribute '__state'

    Alas, there is no __state on this instance. However, when I enter step, the debugger gets past the if self.__state != _CS_REQ_STARTED: line without a problem. Why is this happening? If self.__state didn't exist, Python would have to raise an exception, as it did when I entered the expression. Python version: 2.6.4 on win32.
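
    This is name mangling: inside the class body, Python silently rewrites self.__state to self._HTTPConnection__state, but an expression typed at the pdb prompt is evaluated outside any class, so no mangling happens and the literal name __state is looked up and missed. Typing the mangled name should work; the value shown is what httplib's _CS_REQ_STARTED constant is expected to be, an assumption worth verifying in your copy:

        (Pdb) self._HTTPConnection__state
        'Request-started'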

  • Why does b' (and sometimes b' ') show up when I split some HTML source? [Python]

    - by Oliver
    I'm fairly new to Python and programming in general. I have done a few tutorials and am about 2/3 of the way through a pretty good book. That being said, I've been trying to get more comfortable with Python by just trying things out in the standard library, and I have run into a weird quirk that I'm sure is the result of my own incorrect or un-"Pythonic" use of the urllib module (with Python 3.2.2):

        import urllib.request
        HTML_source = urllib.request.urlopen("http://www.somelink.com").read()
        print(HTML_source)

    When this bit is run through the active interpreter, it returns the HTML source of somelink; however, it prefixes it with b', for example b'<HTML>\r\n<HEAD> (etc.). If I split the string into a list by whitespace, it prefixes every item with the b'. I'm not really trying to accomplish something specific, just trying to familiarize myself with the standard library. I would like to know why this b' is getting prefixed. Bonus: is there a better way to get HTML source WITHOUT using a third-party module? I know all that jazz about not reinventing the wheel and whatnot, but I'm trying to learn by "building my own tools". Thanks in advance!
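
    The b'' prefix means read() handed back a bytes object: in Python 3 a socket yields raw bytes, and every slice or split of bytes is still bytes. Decoding turns it into a str, after which the prefix disappears. A sketch that guesses UTF-8 where the server doesn't say otherwise:

        import urllib.request

        response = urllib.request.urlopen("http://www.somelink.com")
        charset = response.headers.get_content_charset() or 'utf-8'
        html_source = response.read().decode(charset)   # now a str, no b'' prefix
        print(html_source)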

  • Python thread problem after build

    - by Apache
    Hi experts, my task is to scan wifi at a specific interval and send the results to a server. I have it in Python and it works fine when I run it manually, but after building it into a package there is no progress at all. I already asked this question at http://stackoverflow.com/questions/2735410/python-scritp-problem-once-build-and-package-it. I then re-modified my code as below, and found that the thread is not functioning once built:

        #!/usr/bin/env python
        import subprocess,threading,...

        configFile = open('/opt/Jemapoh_Wifi/config.txt', 'r')
        url = configFile.readline().strip()
        intervalTime = configFile.readline().strip()
        status = configFile.readline().strip()
        print "url " + url
        print "intervalTime " + intervalTime
        print "Status " + status.strip()

        def getMacAddress():
            proc = subprocess.Popen('ifconfig -a wlan0 | grep HWaddr | sed \'/^.*HWaddr */!d; s///;q\'',
                                    shell=True, stdout=subprocess.PIPE)
            macAddress = proc.communicate()[0].strip()
            return macAddress

        def getTimestamp():
            from time import strftime
            timeStamp = strftime("%Y-%m-%d %H:%M:%S")
            return timeStamp

        def scanWifi():
            try:
                print "Scanning..."
                proc = subprocess.Popen('iwlist scan 2>/dev/null', shell=True, stdout=subprocess.PIPE)
                stdout_str = proc.communicate()[0]
                stdout_list = stdout_str.split('\n')
                essid = []
                rssi = []
                preQuality = []
                for line in stdout_list:
                    line = line.strip()
                    match = re.search('ESSID:"(\S+)"', line)
                    if match:
                        essid.append(match.group(1))
                    match = re.search('Quality=(\S+)', line)
                    if match:
                        preQuality.append(match.group(1))
                for qualityConversion in preQuality:
                    qualityConversion = qualityConversion.split()[0].split('/')
                    temp = str(int(round(float(qualityConversion[0]) / float(qualityConversion[1]) * 100))).rjust(2)
                    rssi.append(temp)
                dataToPost = '{"userId":"' + getMacAddress() + '","timestamp":"' + getTimestamp() + '","wifi":['
                for no in range(len(essid)):
                    dataToPost += '{"ssid":"' + essid[no] + '","rssi":"' + rssi[no] + '"}'
                    if no + 1 == len(essid):
                        pass
                    else:
                        dataToPost += ','
                dataToPost += ']}'
                query_args = {"data": dataToPost}
                request = urllib2.Request(url)
                request.add_data(urllib.urlencode(query_args))
                request.add_header('Content-Type', 'application/x-www-form-urlencoded')
                print "Waiting for server response..."
                print urllib2.urlopen(request).read()
                print "Data Sent @ " + getTimestamp()
                print "------------------------------------------------------"
                t = threading.Timer(int(intervalTime), scanWifi).start()
            except Exception, e:
                print e

        t = threading.Timer(int(intervalTime), scanWifi)
        t.start()

    Once built, execution never reaches the thread. Can anyone help with why the thread is not working after the build? Thanks.
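
    One thing worth ruling out before blaming the packaging itself: threading.Timer only fires if the process stays alive, so if the packaged launcher exits after the main script runs, the chain of re-armed timers silently dies with it. A plain loop in the main thread keeps the process alive and needs no timers at all; a sketch, assuming scanWifi is changed to do a single scan instead of re-arming itself:

        import time

        while True:
            scanWifi()                     # one scan per pass, no Timer inside
            time.sleep(int(intervalTime))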

  • BeautifulSoup can't parse a webpage?

    - by JLTChiu
    I am using Beautiful Soup to parse a webpage. I've heard it's very famous and good, but it doesn't seem to work properly for me. Here's what I did:

        import urllib2
        from bs4 import BeautifulSoup

        page = urllib2.urlopen("http://www.cnn.com/2012/10/14/us/skydiver-record-attempt/index.html?hpt=hp_t1")
        soup = BeautifulSoup(page)
        print soup.prettify()

    I think this is kind of straightforward: I open the webpage and pass it to BeautifulSoup. But here's what I got:

        Warning (from warnings module):
          File "C:\Python27\lib\site-packages\bs4\builder\_htmlparser.py", line 149
        "Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help."))
        ...
        HTMLParseError: bad end tag: u'</"+"script>', at line 634, column 94

    I thought the CNN website would be well designed, so I am not sure what's going on. Does anyone have an idea?
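
    The warning spells out the fix: the page's inline JavaScript (that '</"+"script>' trick) defeats Python's built-in HTMLParser, so install a more forgiving parser and name it when constructing the soup. A sketch:

        # pip install lxml   (or: pip install html5lib)
        soup = BeautifulSoup(page, 'lxml')        # or: BeautifulSoup(page, 'html5lib')
        print soup.prettify()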
