Search Results

Search found 166 results on 7 pages for 'urllib'.

Page 3/7

  • How do I modify this download function in Python?

    - by TIMEX
    Right now, it's iffy: gzip, images, sometimes it doesn't work. How do I modify this download function so that it can work with anything, regardless of gzip or any header? How do I automatically detect if it's gzip? I don't want to always pass True/False, like I do right now.

        import socket
        import random
        import gzip
        import urllib2
        from StringIO import StringIO

        def download(source_url, g=False, correct_url=True):
            try:
                socket.setdefaulttimeout(10)
                agents = ['Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)',
                          'Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.1)',
                          'Microsoft Internet Explorer/4.0b1 (Windows 95)',
                          'Opera/8.00 (Windows NT 5.1; U; en)']
                ree = urllib2.Request(source_url)
                ree.add_header('User-Agent', random.choice(agents))
                ree.add_header('Accept-encoding', 'gzip')
                opener = urllib2.build_opener()
                h = opener.open(ree).read()
                if g:
                    compressedstream = StringIO(h)
                    gzipper = gzip.GzipFile(fileobj=compressedstream)
                    data = gzipper.read()
                    return data
                else:
                    return h
            except Exception, e:
                return ""
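
    One way to stop passing the flag, as a minimal sketch: let the response itself say whether it is gzipped, first via the Content-Encoding header, then via the gzip magic bytes as a fallback. The function name download_auto is made up for illustration:

        import gzip
        import urllib2
        from StringIO import StringIO

        def download_auto(source_url):
            req = urllib2.Request(source_url)
            req.add_header('Accept-encoding', 'gzip')
            resp = urllib2.urlopen(req)
            body = resp.read()
            # Well-behaved servers set Content-Encoding when they compress.
            if resp.info().get('Content-Encoding', '') == 'gzip':
                body = gzip.GzipFile(fileobj=StringIO(body)).read()
            # Fallback: gzip streams always start with the magic bytes \x1f\x8b.
            elif body[:2] == '\x1f\x8b':
                body = gzip.GzipFile(fileobj=StringIO(body)).read()
            return body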

    Read the article

  • Log into Launchpad from python script

    - by jack
    How can I log into my Launchpad account from a python script? Any sample code would be appreciated. The login url is https://launchpad.net/+login, which then redirects to something like https://login.launchpad.net/fJLVSRbxPfKTpVDr/+decide Thanks in advance!
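
    A minimal sketch of a scripted form login with the standard library, assuming a plain POST form; the field names email and password are guesses, and Launchpad's SSO flow may require following the +decide redirect as well:

        import urllib
        import urllib2
        import cookielib

        # Keep cookies across the redirect chain.
        cj = cookielib.CookieJar()
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

        creds = urllib.urlencode({'email': 'you@example.com',   # assumed field name
                                  'password': 'secret'})        # assumed field name
        resp = opener.open('https://launchpad.net/+login', creds)
        print resp.geturl()  # shows where the login flow redirected us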

    Read the article

  • escaping query string with special characters with python

    - by that_guy
    I have some pretty messy URLs that I got via scraping. The problem is that they contain spaces or other special characters in the path and query string. Here are some examples:

        http://www.example.com/some path/to the/file.html
        http://www.example.com/some path/?file=path to/file name.png&name=name.me

    So, is there an easy and robust way to escape the URLs so that I can pass them to urlopen? I tried urllib.quote, but it seems to escape the '?', '&', and '=' in the query string as well, and it escapes the protocol too. Currently, what I am trying to do is use a regex to separate the protocol, path name, and query string and escape them separately, but there are cases where they aren't separated properly. Any advice is appreciated.
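
    A sketch of one approach, assuming Python 2.6+ (where urlparse.parse_qsl exists) and URLs well-formed enough for urlparse to split: quote the path and each query key/value separately, then reassemble, so '?', '&', '=', and the scheme are never re-quoted:

        import urllib
        import urlparse

        def escape_url(url):
            parts = urlparse.urlsplit(url)
            path = urllib.quote(parts.path)
            query = '&'.join(
                '%s=%s' % (urllib.quote_plus(k), urllib.quote_plus(v))
                for k, v in urlparse.parse_qsl(parts.query))
            return urlparse.urlunsplit(
                (parts.scheme, parts.netloc, path, query, parts.fragment))

        print escape_url('http://www.example.com/some path/?file=path to/file name.png&name=name.me')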

    Read the article

  • python FancyURLopener timeout

    - by j3nc3k
    Hi, is there a way to set a connection timeout for FancyURLopener()? I'm using FancyURLopener.retrieve() to download a file, but sometimes it just gets stuck, and that's all... I think this is because it is still trying to connect and cannot. So, is there a way to set that timeout? Thanks for every reply.
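
    A common workaround, assuming a process-wide setting is acceptable: FancyURLopener takes no timeout argument, but the sockets it creates honour socket.setdefaulttimeout, so set that before calling retrieve():

        import socket
        import urllib

        # Applies to every socket created afterwards in this process.
        socket.setdefaulttimeout(15)  # seconds; pick what fits

        urllib.FancyURLopener().retrieve('http://example.com/big.file', 'big.file')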

    Read the article

  • Manually extracting portions of strings contained in a list (parsing)

    - by user1652011
    I'm aware that there are modules that fully simplify this task, but given that I am running from a base install of python (standard modules only), how would I extract the following? I have a list. This list is the contents, line by line, of a webpage. Here is a mock-up (unformatted) for informative purposes:

        <script> link = "/scripts/playlists/1/" + a.id + "/0-5417069212.asx";
        <script> "<a href="/apps/audio/?feedId=11065"><span class="px13">Eastern Metro Area Fire</span>"

    From the above string, I need the following extracted: the feedId (11065), which is incidentally a.id in the code above, "/scripts/playlists/1/", and "/0-5417069212.asx". Remembering that each of these lines is just the contents of objects in a list, how would I go about extracting that data? Here is the full list:

        contents = urllib2.urlopen("http://www.radioreference.com/apps/audio/?ctid=5586")

    Pseudo:

        from urllib2 import urlopen as getpage
        page_contents = getpage("http://www.radioreference.com/apps/audio/?ctid=5586")
        feedID = % in (page_contents.search() for "/apps/audio/?feedId=%")
        titleID = % in (page_contents.search() for "<span class="px13">%</span>")
        playlistID = % in (page_contents.search() for "link = "%" + a.id + "*.asx";")
        asxID = * in (page_contents.search() for "link = "*" + a.id + "%.asx";")
        streamURL = "http://www.radioreference.com/" + playlistID + feedID + asxID + ".asx"

    I plan to format it such that streamURL should = : http://www.radioreference.com/scripts/playlists/1/11065/0-5417067072.asx
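
    A sketch of that extraction with the standard re module, assuming the live page matches the mock-up above (the patterns are guesses against that mock-up, not the real markup):

        import re
        import urllib2

        page = urllib2.urlopen("http://www.radioreference.com/apps/audio/?ctid=5586").read()

        feed_id = re.search(r'/apps/audio/\?feedId=(\d+)', page).group(1)
        title = re.search(r'<span class="px13">([^<]+)</span>', page).group(1)
        m = re.search(r'link = "([^"]+)" \+ a\.id \+ "([^"]+\.asx)"', page)
        playlist_path, asx_suffix = m.group(1), m.group(2)

        # playlist_path already starts with '/', asx_suffix already ends in '.asx'
        stream_url = "http://www.radioreference.com" + playlist_path + feed_id + asx_suffix
        print stream_url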

    Read the article

  • dpkg stuck downloading font files

    - by Bob Bowles
    I have been reinstalling Ubuntu 12.04. The install from USB works fine, and I could update everything OK, but when I got to re-installing my application software I hit a snag. One of the packages I tried to re-install was ttf-mscorefonts-installer. dpkg stalled during this setup, downloading a font file (it had tried to download it all night). I stopped dpkg and attempted to re-start downloading something else, but it would not let me. The commands I typed are as follows:

        bob@bobStudio:~$ sudo rm /var/lib/dpkg/lock

    This unlocks dpkg, but if I try to do something I get the following message (eg):

        bob@bobStudio:~$ sudo apt-get install synaptic
        E: dpkg was interrupted, you must manually run 'sudo dpkg --configure -a' to correct the problem

    So, I did just that:

        bob@bobStudio:~$ sudo dpkg --configure -a

    whereupon it started the previously failed download all over again. I went round the loop here a few times, and each time after the configure command it re-started the failing download, but then I got this:

        bob@bobStudio:~$ sudo dpkg --configure -a
        Setting up update-notifier-common (0.119ubuntu8.4) ...
        ttf-mscorefonts-installer: downloading http://downloads.sourceforge.net/corefonts/andale32.exe
        Traceback (most recent call last):
          File "/usr/lib/update-notifier/package-data-downloader", line 234, in process_download_requests
            dest_file = urllib.urlretrieve(files[i])[0]
          File "/usr/lib/python2.7/urllib.py", line 93, in urlretrieve
            return _urlopener.retrieve(url, filename, reporthook, data)
          File "/usr/lib/python2.7/urllib.py", line 239, in retrieve
            fp = self.open(url, data)
          File "/usr/lib/python2.7/urllib.py", line 207, in open
            return getattr(self, name)(url)
          File "/usr/lib/python2.7/urllib.py", line 344, in open_http
            h.endheaders(data)
          File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
            self._send_output(message_body)
          File "/usr/lib/python2.7/httplib.py", line 814, in _send_output
            self.send(msg)
          File "/usr/lib/python2.7/httplib.py", line 776, in send
            self.connect()
          File "/usr/lib/python2.7/httplib.py", line 757, in connect
            self.timeout, self.source_address)
          File "/usr/lib/python2.7/socket.py", line 553, in create_connection
            for res in getaddrinfo(host, port, 0, SOCK_STREAM):
        IOError: [Errno socket error] [Errno -2] Name or service not known
        Setting up ttf-mscorefonts-installer (3.4ubuntu3) ...

        bob@bobStudio:~$ sudo apt-get update
        E: Could not get lock /var/lib/apt/lists/lock - open (11: Resource temporarily unavailable)
        E: Unable to lock directory /var/lib/apt/lists/
        bob@bobStudio:~$ sudo rm /var/lib/dpkg/lock
        bob@bobStudio:~$ sudo apt-get update
        E: Could not get lock /var/lib/apt/lists/lock - open (11: Resource temporarily unavailable)
        E: Unable to lock directory /var/lib/apt/lists/

    The good news is that, once I sorted out the file locks, this seems to have permanently aborted the setup of the font package, so at least I can do something else with dpkg. That leaves two questions: 1) How could I have broken the loop without actually crashing out of dpkg? 2) How can I set up the ttf-mscorefonts-installer package in the future? Is this download really broken, or is it 'just' a bad Internet connection?

    Read the article

  • Best practise when using httplib2.Http() object

    - by tomaz
    I'm writing a pythonic web API wrapper with a class like this:

        import httplib2
        import urllib

        class apiWrapper:
            def __init__(self):
                self.http = httplib2.Http()

            def _http(self, url, method, dict):
                '''I'm using this wrapper around the http object all the time inside the class'''
                params = urllib.urlencode(dict)
                # note: httplib2's request() expects (uri, method, body)
                response, content = self.http.request(url, method, body=params)

    As you can see, I'm using the _http() method to simplify the interaction with the httplib2.Http() object. This method is called quite often inside the class, and I'm wondering what's the best way to interact with this object: create the object in __init__ and then reuse it when the _http() method is called (as shown in the code above), or create the httplib2.Http() object inside the method for every call of the _http() method (as shown in the sample below)?

        import httplib2
        import urllib

        class apiWrapper:
            def __init__(self):
                pass

            def _http(self, url, method, dict):
                '''I'm using this wrapper around the http object all the time inside the class'''
                http = httplib2.Http()
                params = urllib.urlencode(dict)
                response, content = http.request(url, method, body=params)

    Read the article

  • stopping a cherrypy server over http

    - by d.c
    I have a cherrypy app that I'm controlling over http with a wxpython ui. I want to kill the server when the ui closes, but I don't know how to do that. Right now I'm just doing a sys.exit() on the window close event, but that's resulting in:

        Traceback (most recent call last):
          File "ui.py", line 67, in exitevent
            urllib.urlopen("http://"+server+"/?sigkill=1")
          File "c:\python26\lib\urllib.py", line 87, in urlopen
            return opener.open(url)
          File "c:\python26\lib\urllib.py", line 206, in open
            return getattr(self, name)(url)
          File "c:\python26\lib\urllib.py", line 354, in open_http
            'got a bad status line', None)
        IOError: ('http protocol error', 0, 'got a bad status line', None)

    Is that because I'm not stopping cherrypy properly?
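
    A sketch of a cleaner shutdown, assuming CherryPy 3.1 or later where the engine bus API exists: expose a handler that asks the engine to stop, so the server can still send a valid response before exiting (which would avoid the bad status line above):

        import cherrypy

        class Root(object):
            def index(self):
                return "running"
            index.exposed = True

            def shutdown(self):
                # Ask the engine to stop once this response has been sent.
                cherrypy.engine.exit()
                return "shutting down"
            shutdown.exposed = True

        cherrypy.quickstart(Root())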

    Read the article

  • Can't parse XML effectively using Python

    - by Harshit Sharma
        import urllib
        import xml.etree.ElementTree as ET

        def getWeather(city):
            # create google weather api url
            url = "http://www.google.com/ig/api?weather=" + urllib.quote(city)
            try:
                # open google weather api url
                f = urllib.urlopen(url)
            except:
                # if there was an error opening the url, return
                return "Error opening url"
            # read contents to a string
            s = f.read()
            tree = ET.parse(s)
            current = tree.find("current_condition/condition")
            condition_data = current.get("data")
            weather = condition_data
            if weather == "<?xml version=":
                return "Invalid city"
            # return the weather condition
            # return weather

        def main():
            while True:
                city = raw_input("Give me a city: ")
                weather = getWeather(city)
                print(weather)

        if __name__ == "__main__":
            main()

    This gives an error; I actually wanted to find values from the tags of Google's weather XML.
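
    Two things stand out, for what it's worth: ET.parse() expects a filename or file object, not a string (ET.fromstring() takes a string), and the element path likely needs to start below the reply root. A sketch of the parsing step, assuming the response root is <xml_api_reply> with a <weather> child, as the API returned at the time:

        import urllib
        import xml.etree.ElementTree as ET

        s = urllib.urlopen("http://www.google.com/ig/api?weather=" + urllib.quote("London")).read()
        root = ET.fromstring(s)  # fromstring, because s is a string, not a file
        condition = root.find("weather/current_conditions/condition")
        if condition is not None:
            print condition.get("data")
        else:
            print "Invalid city"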

    Read the article

  • [Python] Download an image embedded in a MIME multipart message

    - by michele
    Hi, I have to download some images from links. These links return me a file in which a multipart MIME message and a TIFF image are embedded. I have written this code, but it downloads the file with the MIME wrapper. How can I remove the MIME wrapper from this file and get the image back? Can I do this with wget or curl? My code:

        import urllib

        def download(url, local):
            urllib.urlretrieve(url, local)
            urllib.urlcleanup()

    Thanks a lot.
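
    A sketch of stripping the wrapper with the standard email module, assuming the downloaded file really is a well-formed multipart message containing a TIFF part:

        import email

        def extract_tiff(mime_path, out_path):
            msg = email.message_from_file(open(mime_path, 'rb'))
            for part in msg.walk():
                # Find the image part and write its decoded payload.
                if part.get_content_type() == 'image/tiff':
                    open(out_path, 'wb').write(part.get_payload(decode=True))
                    return True
            return False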

    Read the article

  • Error while trying to parse a website url using python. How to debug it?

    - by mekasperasky
        #!/usr/bin/python

        import json
        import urllib
        from BeautifulSoup import BeautifulSoup

        def showsome(searchfor):
            query = urllib.urlencode({'q': searchfor})
            url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query
            search_response = urllib.urlopen(url)
            search_results = search_response.read()
            results = json.loads(search_results)
            data = results['responseData']
            print 'Total results: %s' % data['cursor']['estimatedResultCount']
            hits = data['results']
            print 'Top %d hits:' % len(hits)
            for h in hits:
                print ' ', h['url']
                resp = urllib.urlopen(h['url'])
                res = resp.read()
                soup = BeautifulSoup(res)
                print soup.prettify()
            print 'For more results, see %s' % data['cursor']['moreResultsUrl']

        showsome('sachin')

    What is wrong in this code? Note that I am feeding all 4 links that I get out of the search back in, to extract the contents out of them, and then using BeautifulSoup to parse them. How should I go about it?

    Read the article

  • Google app engine error when I login.

    - by zjm1126
    I am using http://code.google.com/p/gaema/source/browse/#hg/demos/webapp, and this is my traceback:

        Traceback (most recent call last):
          File "D:\Program Files\Google\google_appengine\google\appengine\ext\webapp\__init__.py", line 510, in __call__
            handler.get(*groups)
          File "D:\gaema\demos\webapp\main.py", line 31, in get
            google_auth.get_authenticated_user(self._on_auth)
          File "D:\gaema\demos\webapp\gaema\auth.py", line 641, in get_authenticated_user
            OpenIdMixin.get_authenticated_user(self, callback)
          File "D:\gaema\demos\webapp\gaema\auth.py", line 83, in get_authenticated_user
            url = self._OPENID_ENDPOINT + "?" + urllib.urlencode(args)
          File "D:\Python25\lib\urllib.py", line 1250, in urlencode
            v = quote_plus(str(v))
        UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

    How do I fix this? Thanks. Update: I changed the code from

        args = dict((k, v[-1]) for k, v in self.request.arguments.iteritems())
        args["openid.mode"] = u"check_authentication"
        url = self._OPENID_ENDPOINT + "?" + urllib.urlencode(args)

    to

        args = dict((k, v[-1].encode('utf-8')) for k, v in self.request.arguments.iteritems())
        args["openid.mode"] = u"check_authentication"
        url = self._OPENID_ENDPOINT + "?" + urllib.urlencode(args)

    but I still get the error.
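
    For what it's worth, urllib.urlencode calls str() on each value, and that is what raises UnicodeEncodeError on non-ASCII unicode. A sketch of a defensive fix, assuming all values should go out as UTF-8: encode every value, including any assigned after the dict comprehension:

        import urllib

        def encode_args(args):
            # str() inside urlencode blows up on non-ASCII unicode,
            # so convert every unicode value to UTF-8 bytes first.
            out = {}
            for k, v in args.iteritems():
                if isinstance(v, unicode):
                    v = v.encode('utf-8')
                out[k] = v
            return out

        # url = self._OPENID_ENDPOINT + "?" + urllib.urlencode(encode_args(args))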

    Read the article

  • Basic Google search using a shell script

    - by Lri
    Something like this, but using just basic shell scripting:

        #!/usr/bin/env python
        import urllib
        import json

        base = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&'
        query = urllib.urlencode({'q': "something"})
        response = urllib.urlopen(base + query).read()
        data = json.loads(response)
        print data['responseData']['results'][0]['url']

    Any more convenient alternatives to ajax.googleapis.com? If not, how should you encode the URL and parse JSON?

    Read the article

  • urlencode an array of values

    - by Ikke
    I'm trying to urlencode a dictionary in python with urllib.urlencode. The problem is, I have to encode an array. The result needs to be:

        criterias%5B%5D=member&criterias%5B%5D=issue
        #unquoted: criterias[]=member&criterias[]=issue

    But the result I get is:

        criterias=%5B%27member%27%2C+%27issue%27%5D
        #unquoted: criterias=['member',+'issue']

    I have tried several things, but I can't seem to get the right result.

        import urllib

        criterias = ['member', 'issue']
        params = {
            'criterias[]': criterias,
        }
        print urllib.urlencode(params)

    If I use cgi.parse_qs to decode a correct query string, I get this as a result:

        {'criterias[]': ['member', 'issue']}

    But if I encode that result, I get a wrong result back. Is there a way to produce the expected result?
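
    For what it's worth, urlencode has a second argument for exactly this case: with doseq true, each element of a sequence value becomes its own key=value pair instead of the sequence being str()-ed whole. A quick sketch:

        import urllib

        params = {'criterias[]': ['member', 'issue']}
        print urllib.urlencode(params, doseq=True)
        # criterias%5B%5D=member&criterias%5B%5D=issue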

    Read the article

  • How to make a POST request with python-webkit?

    - by shakaran
    Hi, I'm new to python + webkit. I need to make a POST request with webkit, but I don't know how to do it. I use python-webkit because my app loads a form in the GUI (for votes, comments, and sending more data), and I need to post all these data with a POST request and load the html result the server sends back into my GUI app with python-webkit. I have only this example with urllib:

        #!/usr/bin/python
        import urllib
        import httplib

        server = 'server.somesite.com'
        data = {'name': 'shakaran', 'password': 'Only_I_know'}
        d = urllib.urlencode(data)
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        conn = httplib.HTTPConnection(server)
        conn.request("POST", "/login.php", d, headers)
        response = conn.getresponse()
        if response.status == 200:
            print response.status, response.reason
            print response.getheaders()
            data = response.read()
            print data
        conn.close()

    I need a simple example with webkit. I looked in the documentation for WebKit.HTTPRequest: http://www.webwareforpython.org/WebKit/Docs/Source/Docs/WebKit.HTTPRequest.html I tried with webkit.NetworkRequest() but I don't know how to do it. Any help? Thanks

    Read the article

  • Google App Engine python - Self is not defined

    - by sdasdas
    I have a request that maps to this class, ChatMsg. It takes in 3 GET variables: username, roomname, and msg. But it fails on the last line here:

        class ChatMsg(webapp.RequestHandler):  # this is line 239
            def get(self):
                username = urllib.unquote(self.request.get('username'))
                roomname = urllib.unquote(self.request.get('roomname'))  # this is line 242

    When it tries to assign roomname, it tells me:

        <type 'exceptions.NameError'>: name 'self' is not defined
        Traceback (most recent call last):
          File "/base/data/home/apps/chatboxes/1.341998073649951735/chatroom.py", line 239, in <module>
            class ChatMsg(webapp.RequestHandler):
          File "/base/data/home/apps/chatboxes/1.341998073649951735/chatroom.py", line 242, in ChatMsg
            roomname = urllib.unquote(self.request.get('roomname'))

    What the hell is going on to make self not defined?
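
    A reading of the traceback, for what it's worth: the failing frame is in <module>/ChatMsg rather than inside get(), which suggests line 242 is actually indented at class level in the real file (mixed tabs and spaces can do this invisibly). At class scope there is no self, hence the NameError. A sketch of the layout that should work, with every assignment at the same indent inside get():

        import urllib
        from google.appengine.ext import webapp

        class ChatMsg(webapp.RequestHandler):
            def get(self):
                # All three assignments must sit inside get(), equally indented.
                username = urllib.unquote(self.request.get('username'))
                roomname = urllib.unquote(self.request.get('roomname'))
                msg = urllib.unquote(self.request.get('msg'))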

    Read the article

  • How to implement python to find value between xml tags?

    - by Harshit Sharma
    I am using Google's site to retrieve weather information. I want to find values between XML tags. The following code gives me the weather condition of a city, but I am unable to obtain other parameters, such as temperature. If possible, please also explain the workings of the split function used in the code:

        import urllib

        def getWeather(city):
            # create google weather api url
            url = "http://www.google.com/ig/api?weather=" + urllib.quote(city)
            try:
                # open google weather api url
                f = urllib.urlopen(url)
            except:
                # if there was an error opening the url, return
                return "Error opening url"
            # read contents to a string
            s = f.read()
            # extract weather condition data from xml string
            weather = s.split("<current_conditions><condition data=\"")[-1].split("\"")[0]
            # if there was an error getting the condition, the city is invalid
            if weather == "<?xml version=":
                return "Invalid city"
            # return the weather condition
            return weather

        def main():
            while True:
                city = raw_input("Give me a city: ")
                weather = getWeather(city)
                print(weather)

        if __name__ == "__main__":
            main()

    Thank You
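
    A note on that split chain, for what it's worth: split(marker)[-1] takes everything after the marker (or the whole string if the marker is absent, which is why the '<?xml version=' check catches invalid cities), and the second split('"')[0] then cuts at the next double quote, leaving just the attribute value. The same trick extracts other fields; a self-contained sketch, where the temp_c and humidity tag names are assumptions about the API's output at the time:

        def extract(s, marker):
            # Everything after the marker, cut at the next double quote.
            return s.split(marker)[-1].split('"')[0]

        s = '<current_conditions><condition data="Clear"/><temp_c data="21"/>'
        print extract(s, '<condition data="')  # Clear
        print extract(s, '<temp_c data="')     # 21 (assumed tag name)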

    Read the article

  • Google Search API - Only returning 4 results

    - by user353829
    After much experimenting and googling, the following Python code successfully calls Google's Search API, but only returns 4 results. After reading the Google Search API docs, I thought 'start=' would return additional results, but that did not happen. Can anyone give pointers? Thanks. Python code:

        #!/usr/bin/python
        import urllib
        import simplejson

        query = urllib.urlencode({'q': 'site:example.com'})
        url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s&start=50' \
            % (query)
        search_results = urllib.urlopen(url)
        json = simplejson.loads(search_results.read())
        results = json['responseData']['results']
        for i in results:
            print i['title'] + ": " + i['url']
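
    For what it's worth, the AJAX Search API returned 4 results per page by default; as far as I recall, rsz=large bumps a page to 8, and start then pages through in steps of that size. A sketch of paging under those assumptions:

        import urllib
        import simplejson

        def search_pages(q, pages=4):
            hits = []
            for start in range(0, pages * 8, 8):
                query = urllib.urlencode({'q': q, 'v': '1.0',
                                          'rsz': 'large',  # 8 per page (assumed max)
                                          'start': start})
                url = 'http://ajax.googleapis.com/ajax/services/search/web?' + query
                data = simplejson.loads(urllib.urlopen(url).read())
                if not data['responseData']:
                    break  # no more results
                hits.extend(data['responseData']['results'])
            return hits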

    Read the article

  • Does python's httplib.HTTPConnection block?

    - by python_noob
    Hello, I am unsure whether or not the following code is a blocking operation in python:

        import httplib
        import urllib

        query_dictionary = {'key': 'value'}  # assumed; defined elsewhere in the real code

        def do_request(server, port, timeout, remote_url):
            conn = httplib.HTTPConnection(server, port, timeout=timeout)
            conn.request("POST", remote_url, urllib.urlencode(query_dictionary, True))
            conn.close()
            return True

        do_request("www.example.org", 80, 30, "/foo/bar")
        print "hi!"

    And if it is, how would one go about creating a non-blocking asynchronous http request in python? Thanks from a python noob.
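
    For what it's worth, httplib calls do block: request() (and any subsequent getresponse()) waits on the socket until data arrives or the timeout fires. A minimal sketch of moving the request off the main flow with the standard threading module, reusing do_request from the snippet above:

        import threading

        t = threading.Thread(target=do_request,
                             args=("www.example.org", 80, 30, "/foo/bar"))
        t.daemon = True  # don't keep the process alive just for this request
        t.start()
        print "hi!"      # prints immediately; the request runs in the background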

    Read the article

  • Python form POST using urllib2 (also question on saving/using cookies)

    - by morpheous
    I am trying to write a function to post form data and save returned cookie info in a file so that the next time the page is visited, the cookie information is sent to the server (i.e. normal browser behavior). I wrote this relatively easily in C++ using curlib, but have spent almost an entire day trying to write this in Python, using urllib2, and still no success. This is what I have so far:

        import sys
        import urllib
        import urllib2
        import logging

        # the path and filename to save your cookies in
        COOKIEFILE = 'cookies.lwp'

        cj = None
        ClientCookie = None
        cookielib = None
        logger = logging.getLogger(__name__)

        # Let's see if cookielib is available
        try:
            import cookielib
        except ImportError:
            logger.debug('importing cookielib failed. Trying ClientCookie')
            try:
                import ClientCookie
            except ImportError:
                logger.debug('ClientCookie isn\'t available either')
                urlopen = urllib2.urlopen
                Request = urllib2.Request
            else:
                logger.debug('imported ClientCookie successfully')
                urlopen = ClientCookie.urlopen
                Request = ClientCookie.Request
                cj = ClientCookie.LWPCookieJar()
        else:
            logger.debug('Successfully imported cookielib')
            urlopen = urllib2.urlopen
            Request = urllib2.Request
            # This is a subclass of FileCookieJar
            # that has useful load and save methods
            cj = cookielib.LWPCookieJar()

        login_params = {'name': 'anon', 'password': 'pass'}

        def login(theurl, login_params):
            init_cookies()  # not shown here; presumably loads cj from COOKIEFILE
            data = urllib.urlencode(login_params)
            txheaders = {'User-agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
            try:
                # create a request object
                req = Request(theurl, data, txheaders)
                # and open it to return a handle on the url
                handle = urlopen(req)
            except IOError, e:
                logger.debug('Failed to open "%s".' % theurl)
                if hasattr(e, 'code'):
                    logger.debug('Failed with error code - %s.' % e.code)
                elif hasattr(e, 'reason'):
                    logger.debug("The error object has the following 'reason' attribute :" + e.reason)
                sys.exit()
            else:
                if cj is None:
                    logger.debug('We don\'t have a cookie library available - sorry.')
                else:
                    print 'These are the cookies we have received so far :'
                    for index, cookie in enumerate(cj):
                        print index, ' : ', cookie
                    # save the cookies again
                    cj.save(COOKIEFILE)
                # return the data
                return handle.read()

        # FIXME: I need to fix this so that it takes into account any cookie data we may have stored
        def get_page(*args, **query):
            if len(args) != 1:
                raise ValueError(
                    "post_page() takes exactly 1 argument (%d given)" % len(args))
            url = args[0]
            query = urllib.urlencode(list(query.iteritems()))
            if not url.endswith('/') and query:
                url += '/'
            if query:
                url += "?" + query
            resource = urllib.urlopen(url)
            logger.debug('GET url "%s" => "%s", code %d' % (url, resource.url, resource.code))
            return resource.read()

    When I attempt to log in, I pass the correct username and password, yet the login fails, and no cookie data is saved. My two questions are: can anyone see what's wrong with the login() function, and how may I fix it? How may I modify the get_page() function to make use of any cookie info I have saved?
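
    One likely culprit, for what it's worth: plain urllib2.urlopen never sees the LWPCookieJar, so cookies are neither captured nor replayed, and get_page() goes through urllib, which bypasses the jar entirely. A sketch of wiring the jar into one shared opener for both functions, assuming cookielib is available:

        import urllib
        import urllib2
        import cookielib

        COOKIEFILE = 'cookies.lwp'

        cj = cookielib.LWPCookieJar()
        try:
            cj.load(COOKIEFILE)          # reuse cookies from an earlier run
        except IOError:
            pass                         # no cookie file yet

        # Every request through this opener sends and stores cookies in cj.
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

        def login(theurl, login_params):
            data = urllib.urlencode(login_params)
            handle = opener.open(theurl, data)   # POST; cookies captured in cj
            cj.save(COOKIEFILE)
            return handle.read()

        def get_page(url):
            return opener.open(url).read()       # cookies from cj sent automatically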

    Read the article

  • please help turn a simple Python2 code to PHP

    - by user296516
    Hi guys, sorry to bother you again, but I really need help transforming this Python 2 code into PHP.

        import urllib

        net, cid, lac = 25002, 9164, 4000
        a = '000E00000000000000000000000000001B0000000000000000000000030000'
        b = hex(cid)[2:].zfill(8) + hex(lac)[2:].zfill(8)
        c = hex(divmod(net,100)[1])[2:].zfill(8) + hex(divmod(net,100)[0])[2:].zfill(8)
        string = (a + b + c + 'FFFFFFFF00000000').decode('hex')
        data = urllib.urlopen('http://www.google.com/glm/mmap', string)
        r = data.read().encode('hex')
        print float(int(r[14:22],16))/1000000, float(int(r[22:30],16))/1000000

    Would be great if someone could help, thanks in advance!

    Read the article

  • why does b' (and sometimes b' ') show up when I split some HTML source [Python]

    - by Oliver
    I'm fairly new to Python and programming in general. I have done a few tutorials and am about two-thirds of the way through a pretty good book. That being said, I've been trying to get more comfortable with Python and programming by just trying things from the standard library. I have recently run into a weird quirk that I'm sure is the result of my own incorrect or un-"pythonic" use of the urllib module (with Python 3.2.2):

        import urllib.request

        HTML_source = urllib.request.urlopen("http://www.somelink.com").read()
        print(HTML_source)

    When this bit is run through the interactive interpreter, it returns the HTML source of somelink, but prefixed with b', for example:

        b'<HTML>\r\n<HEAD> (etc.) . . .

    If I split the string into a list by whitespace, it prefixes every item with the b'. I'm not really trying to accomplish something specific, just trying to familiarize myself with the standard library. I would like to know why this b' is getting prefixed. Also, as a bonus: is there a better way to get HTML source WITHOUT using a third-party module? I know all that jazz about not reinventing the wheel and whatnot, but I'm trying to learn by "building my own tools". Thanks in advance!
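
    For what it's worth, the b' prefix means urlopen().read() returned a bytes object, not a str: HTTP bodies arrive as raw bytes, and splitting bytes yields more bytes. Decoding with the page's charset turns them into text; a sketch that falls back to UTF-8 when the server doesn't declare one:

        import urllib.request

        response = urllib.request.urlopen("http://www.example.com")
        raw = response.read()                                  # bytes
        charset = response.headers.get_content_charset() or "utf-8"
        html = raw.decode(charset)                             # now an ordinary str
        print(html[:80])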

    Read the article
