Search Results

Search found 144 results on 6 pages for 'urlopen'.

Page 4/6 | < Previous Page | 1 2 3 4 5 6  | Next Page >

  • Downloading files from an http server in python

    - by cellular
    Using urllib2, we can get the HTTP response from a web server. If that server simply serves a listing of files, we could parse the listing and download each file individually. However, I'm not sure what the easiest, most pythonic way to do that would be. When you have the whole HTTP response for such a generic file-server listing, obtained through urllib2's urlopen() method, how can we neatly download each file?
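
    A possible approach (a sketch, not taken from the article): if the server returns a plain index page, the file links can be pulled out with a regular expression and fetched one by one. The base URL and the href pattern below are illustrative assumptions about what such a listing looks like.

      import re
      import urllib2
      import urlparse

      base = 'http://example.com/files/'            # hypothetical listing URL
      listing = urllib2.urlopen(base).read()
      for name in re.findall(r'href="([^"?/]+)"', listing):   # skip dirs and sort links
          data = urllib2.urlopen(urlparse.urljoin(base, name)).read()
          out = open(name, 'wb')
          out.write(data)
          out.close()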

    Read the article

  • Can I upload an object in memory to FTP using Python?

    - by fsckin
    Here's what I'm doing now:

      mysock = urllib.urlopen('http://localhost/image.jpg')
      fileToSave = mysock.read()
      oFile = open(r"C:\image.jpg", 'wb')
      oFile.write(fileToSave)
      oFile.close()
      f = file('image.jpg', 'rb')
      ftp.storbinary('STOR ' + os.path.basename('image.jpg'), f)
      os.remove('image.jpg')

    Writing files to disk and then immediately deleting them seems like extra work on the system that should be avoided. Can I upload an object in memory to FTP using Python?
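
    A sketch of the in-memory route (not taken from the linked article): ftplib's storbinary accepts any file-like object, so a StringIO buffer can stand in for the temporary file. The FTP host and credentials below are placeholders.

      import ftplib
      import urllib
      from StringIO import StringIO

      data = urllib.urlopen('http://localhost/image.jpg').read()
      buf = StringIO(data)                                      # in-memory file-like object
      ftp = ftplib.FTP('ftp.example.com', 'user', 'password')   # placeholder login details
      ftp.storbinary('STOR image.jpg', buf)
      buf.close()
      ftp.quit()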

    Read the article

  • pip install fails on guest Linux Mint 15

    - by synergetic
    On my Windows 7 PC, I installed a VMware VM running Linux Mint 15. The Windows PC is behind a corporate firewall/proxy server. Inside Linux I issued:

      sudo apt-get install python-virtualenv

    Then I created a ~/projects folder and a Python virtual environment:

      mkdir projects
      cd projects
      virtualenv venv

    Then I activated my virtual env:

      . venv/bin/activate

    So far no problem. Then I tried to install Python libraries, for example markupsafe:

      pip install markupsafe

    It throws an error:

      Cannot fetch index base URL https://pypi.python.org/simple/
      Could not find any downloads that satisfy the requirement markupsafe
      No distributions at all found for markupsafe
      Storing complete log in /home/me/.pip/pip.log

    Inside pip.log I found:

      <urlopen error [Errno 104] Connection reset by peer>

    Installing any other library throws a similar error. What's wrong here?
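
    A hedged suggestion rather than a confirmed fix: behind a corporate proxy, pip has to be told about it, either through the http_proxy/https_proxy environment variables or the --proxy option. The proxy address below is a placeholder; the same thing can be done straight from the shell with pip install --proxy=http://proxy.example:8080 markupsafe.

      import os
      import subprocess

      proxy = 'http://proxy.mycorp.example:8080'    # placeholder proxy address
      os.environ['http_proxy'] = proxy
      os.environ['https_proxy'] = proxy
      subprocess.call(['pip', 'install', '--proxy', proxy, 'markupsafe'])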

    Read the article

  • Can PuTTY be configured to display the following UTF-8 characters?

    - by Stuart Powers
    I'd like to be able to render the characters as seen in this tweet. I saved the tweet's JSON data and wrote a one-liner Python script for testing:

      python -c 'import json,urllib; print json.load(urllib.urlopen("http://c.sente.cc/BUCq/tweet.json"))["text"]'

    The next image shows the output of this command in two different PuTTY sessions, one using the Bitstream Vera Sans Mono font and the other using Courier New. Next is an example of correct output (I wasn't using PuTTY). The original JSON is at this link, using Twitter's API. How can I get PuTTY to display those characters?

    Read the article

  • Python script won't write data when run from cron

    - by Ruud
    When I run a Python script in a terminal it works as expected: it downloads a file and saves it in the desired spot.

      sudo python script.py

    I've added the script to the root crontab, and it runs as scheduled, except it does not write the file.

      $ sudo crontab -l
      * * * * * python /home/test/script.py >> /var/log/test.log 2>&1

    Below is a simplified script that still has the problem:

      #!/usr/bin/python
      scheduleUrl = 'http://test.com/schedule.xml'
      schedule = '/var/test/schedule.xml'

      # Download url and save as filename
      def wget(url, filename):
          import urllib2
          try:
              response = urllib2.urlopen(url)
          except Exception:
              import traceback
              logging.exception('generic exception: ' + traceback.format_exc())
          else:
              print('writing:' + filename + ';')
              output = open(filename, 'wb')
              output.write(response.read())
              output.close()

      # Download the schedule
      wget(scheduleUrl, schedule)

    I do get the message "writing:name of file;" in the log the cron entry writes to, but the actual file is nowhere to be found. The directory /var/test is chmodded to 777, and whatever user I use, I am allowed to add and change files in it as I please.
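
    One hedged guess at the cause, not a confirmed diagnosis: cron runs the script with a minimal environment and its own working directory, and the simplified script also uses logging without importing or configuring it. A sketch that keeps every path absolute and configures logging explicitly:

      #!/usr/bin/python
      import logging
      import urllib2

      logging.basicConfig(filename='/var/log/test.log', level=logging.INFO)

      def wget(url, filename):
          try:
              response = urllib2.urlopen(url)
          except Exception:
              logging.exception('download failed: %s', url)
              return
          output = open(filename, 'wb')
          output.write(response.read())
          output.close()
          logging.info('wrote %s', filename)

      wget('http://test.com/schedule.xml', '/var/test/schedule.xml')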

    Read the article

  • Getting BeautifulSoup to find a specific <p>

    - by Ryan
    I'm trying to put together a basic HTML scraper for a variety of scientific journal websites, specifically trying to get the abstract or introductory paragraph. The current journal I'm working on is Nature, and the article I've been using as my sample can be seen at http://www.nature.com/nature/journal/v463/n7284/abs/nature08715.html. I can't get the abstract out of that page, however. I'm searching for everything between the <p class="lead">...</p> tags, but I can't seem to figure out how to isolate them. I thought it would be something simple like

      from BeautifulSoup import BeautifulSoup
      import re
      import urllib2

      address = "http://www.nature.com/nature/journal/v463/n7284/full/nature08715.html"
      html = urllib2.urlopen(address).read()
      soup = BeautifulSoup(html)
      abstract = soup.find('p', attrs={'class': 'lead'})
      print abstract

    Using Python 2.5 and BeautifulSoup 3.0.8, running this returns None. I have no option of using anything else that needs to be compiled/installed (like lxml). Is BeautifulSoup confused, or am I?
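
    One hedged debugging step, not the article's answer: confirm that the downloaded markup actually contains the "lead" class before blaming the parser, since some publisher pages serve different HTML to scripts than to browsers.

      import urllib2

      address = "http://www.nature.com/nature/journal/v463/n7284/full/nature08715.html"
      html = urllib2.urlopen(address).read()
      print 'class="lead"' in html          # False would explain the None result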

    Read the article

  • Extracting an attribute value with beautifulsoup

    - by Barnabe
    I am trying to extract the content of a single "value" attribute in a specific "input" tag on a webpage. I use the following code:

      import urllib
      f = urllib.urlopen("http://58.68.130.147")
      s = f.read()
      f.close()

      from BeautifulSoup import BeautifulStoneSoup
      soup = BeautifulStoneSoup(s)
      inputTag = soup.findAll(attrs={"name": "stainfo"})
      output = inputTag['value']
      print str(output)

    I get a TypeError: list indices must be integers, not str, even though from the BeautifulSoup documentation I understand that strings should not be a problem here... but I am no specialist and I may have misunderstood. Any suggestion is greatly appreciated! Thanks in advance.
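
    The likely fix (a sketch continuing the question's soup object): findAll returns a list of tags, so index into it, or use find for a single match, before asking for the attribute.

      inputTag = soup.findAll(attrs={"name": "stainfo"})
      if inputTag:
          print inputTag[0]['value']        # value attribute of the first match
      # or, equivalently:
      tag = soup.find(attrs={"name": "stainfo"})
      if tag is not None:
          print tag['value']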

    Read the article

  • Python urllib2 Basic Auth Problem

    - by Simon
    I'm having a problem sending basic auth over urllib2. I took a look at this article and followed the example. My code:

      passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
      passman.add_password(None, "api.foursquare.com", username, password)
      urllib2.install_opener(urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman)))
      req = urllib2.Request("http://api.foursquare.com/v1/user")
      f = urllib2.urlopen(req)
      data = f.read()

    I'm seeing the following on the wire via Wireshark:

      GET /v1/user HTTP/1.1
      Host: api.foursquare.com
      Connection: close
      Accept-Encoding: gzip
      User-Agent: Python-urllib/2.5

    You can see that the Authorization header is not sent, versus when I send a request via curl: curl -u user:password http://api.foursquare.com/v1/user

      GET /v1/user HTTP/1.1
      Authorization: Basic =SNIP=
      User-Agent: curl/7.19.4 (universal-apple-darwin10.0) libcurl/7.19.4 OpenSSL/0.9.8k zlib/1.2.3
      Host: api.foursquare.com
      Accept: */*

    For some reason my code does not seem to send the authentication - anyone see what I'm missing? thanks -simon
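
    One common workaround (hedged; it reuses the question's username and password variables): urllib2's basic-auth handler only sends credentials after a 401 challenge, and some APIs never issue one, so the Authorization header can be added by hand to send it preemptively.

      import base64
      import urllib2

      req = urllib2.Request("http://api.foursquare.com/v1/user")
      credentials = base64.b64encode('%s:%s' % (username, password))
      req.add_header("Authorization", "Basic %s" % credentials)
      data = urllib2.urlopen(req).read()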

    Read the article

  • Python: Getting INVALID response from PayPal's Sandbox IPN, slowly going insane...

    - by thepeanut
    Hi all, I am trying to implement a simple online payment system using PayPal; however, I have tried everything I know and am still getting an INVALID response. I know it's nothing too simple, because I get a VERIFIED response when using the IPN simulator. I have tried putting the items into a dict first, I have tried fixing the encoding, and still nothing. PayPal says the reasons for an INVALID response could be: sending wrong items or in the wrong order (pretty sure it's not this), sending to the wrong address (definitely not this), or encoding items incorrectly (I don't think it's this; I set the encoding to UTF-8 on both PayPal and my script). The following is the snippet concerned:

      f = cgi.FieldStorage()
      newparams = 'cmd=_notify-validate'
      for key in f.keys():
          val = f[key].value
          newparams += '&' + urlencode({key: val.encode('utf-8')})
      req = urllib2.Request(PP_URL, newparams)
      req.add_header("Content-type", "application/x-www-form-urlencoded")
      http = urllib2.urlopen(req)
      ret = http.read()
      fi.write(ret + '\n')
      if ret == 'VERIFIED':
          # *do stuff*

    Can anyone suggest anything I can do to fix this?! Cheers, Sam
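
    One approach sometimes suggested (a hedged sketch, not a confirmed fix): echo back the raw, unmodified POST body instead of re-encoding the parsed fields, which sidesteps ordering and encoding differences. In a plain CGI script that means reading stdin directly, in place of the FieldStorage loop, since stdin can only be consumed once; PP_URL is the question's own constant.

      import os
      import sys
      import urllib2

      length = int(os.environ.get('CONTENT_LENGTH', 0))
      raw_body = sys.stdin.read(length)
      req = urllib2.Request(PP_URL, 'cmd=_notify-validate&' + raw_body)
      req.add_header("Content-type", "application/x-www-form-urlencoded")
      ret = urllib2.urlopen(req).read()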

    Read the article

  • ':' in node causing KeyError in XML parsing using ElementTree

    - by kguckian
    Hi, I'm using ElementTree to parse an XML feed from Kuler. I'm only beginning in Python but am stuck here. The parsing works fine until I attempt to retrieve any nodes containing ':', e.g. kuler:swatchHexColor. Below is a cut-down version of the full feed, with the same structure:

      <rss xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:kuler="http://kuler.adobe.com/kuler/API/rss/" xmlns:rss="http://blogs.law.harvard.edu/tech/rss" version="2.0">
        <channel>
          <title>kuler popular themes</title>
          <item>
            <title>Theme Title: Fresh Money</title>
            <description>
              &lt;img src="http://kuler-api.adobe.com/kuler/themeImages/theme_808366.png" /&gt;&lt;br /&gt;
              Artist: thesylph005&lt;br /&gt;
              ThemeID: 808366&lt;br /&gt;
              Posted: 03/02/2010&lt;br /&gt;
              Hex: 2F400D, 8CBF26, A8CA65, E8E5B0, 419184
            </description>
            <kuler:themeItem>
              <kuler:themeID>808366</kuler:themeID>
              <kuler:themeTitle>Fresh Money</kuler:themeTitle>
              <kuler:themeImage>http://kuler-api.adobe.com/kuler/themeImages/theme_808366.png</kuler:themeImage>
              <kuler:themeAuthor>
                <kuler:authorID>370750</kuler:authorID>
                <kuler:authorLabel>thesylph005</kuler:authorLabel>
              </kuler:themeAuthor>
              <kuler:themeTags/>
              <kuler:themeRating>4</kuler:themeRating>
              <kuler:themeDownloadCount>708</kuler:themeDownloadCount>
              <kuler:themeCreatedAt>20100302</kuler:themeCreatedAt>
              <kuler:themeEditedAt>20100302</kuler:themeEditedAt>
              <kuler:themeSwatches>
                <kuler:swatch> <kuler:swatchHexColor>2F400D</kuler:swatchHexColor> <kuler:swatchColorMode>rgb</kuler:swatchColorMode> <kuler:swatchChannel1>0.183333</kuler:swatchChannel1> <kuler:swatchChannel2>0.25</kuler:swatchChannel2> <kuler:swatchChannel3>0.05</kuler:swatchChannel3> <kuler:swatchChannel4>0.0</kuler:swatchChannel4> <kuler:swatchIndex>0</kuler:swatchIndex> </kuler:swatch>
                <kuler:swatch> <kuler:swatchHexColor>8CBF26</kuler:swatchHexColor> <kuler:swatchColorMode>rgb</kuler:swatchColorMode> <kuler:swatchChannel1>0.55</kuler:swatchChannel1> <kuler:swatchChannel2>0.75</kuler:swatchChannel2> <kuler:swatchChannel3>0.15</kuler:swatchChannel3> <kuler:swatchChannel4>0.0</kuler:swatchChannel4> <kuler:swatchIndex>1</kuler:swatchIndex> </kuler:swatch>
                <kuler:swatch> <kuler:swatchHexColor>A8CA65</kuler:swatchHexColor> <kuler:swatchColorMode>rgb</kuler:swatchColorMode> <kuler:swatchChannel1>0.659722</kuler:swatchChannel1> <kuler:swatchChannel2>0.791667</kuler:swatchChannel2> <kuler:swatchChannel3>0.395833</kuler:swatchChannel3> <kuler:swatchChannel4>0.0</kuler:swatchChannel4> <kuler:swatchIndex>2</kuler:swatchIndex> </kuler:swatch>
                <kuler:swatch> <kuler:swatchHexColor>E8E5B0</kuler:swatchHexColor> <kuler:swatchColorMode>rgb</kuler:swatchColorMode> <kuler:swatchChannel1>0.91</kuler:swatchChannel1> <kuler:swatchChannel2>0.898047</kuler:swatchChannel2> <kuler:swatchChannel3>0.688705</kuler:swatchChannel3> <kuler:swatchChannel4>0.0</kuler:swatchChannel4> <kuler:swatchIndex>3</kuler:swatchIndex> </kuler:swatch>
                <kuler:swatch> <kuler:swatchHexColor>419184</kuler:swatchHexColor> <kuler:swatchColorMode>rgb</kuler:swatchColorMode> <kuler:swatchChannel1>0.254901</kuler:swatchChannel1> <kuler:swatchChannel2>0.57</kuler:swatchChannel2> <kuler:swatchChannel3>0.519034</kuler:swatchChannel3> <kuler:swatchChannel4>0.0</kuler:swatchChannel4> <kuler:swatchIndex>4</kuler:swatchIndex> </kuler:swatch>
              </kuler:themeSwatches>
      Tue, 30 Mar 2010 11:27:12 PST

    So if I do a findall on, say, each item's description, I get that back fine. But the minute I try to retrieve anything with a ':' in the node name I get

      Exception Type: KeyError
      Exception Value: ':'

    So this works:

      from elementtree.ElementTree import Element, SubElement, dump, parse

      def xml():
          kulerurl = 'http://kuler-api.adobe.com/rss/get.cfm?listType=popular&startIndex=0&itemsPerPage=5&timeSpan=30&key=mykey'
          rss = parse(urllib.urlopen(kulerurl)).getroot()
          for element in rss.findall('channel/item'):
              print(element.findtext('description'))
          dump(rss)

    but this doesn't:

      def xml():
          kulerurl = 'http://kuler-api.adobe.com/rss/get.cfm?listType=popular&startIndex=0&itemsPerPage=5&timeSpan=30&key=mykey'
          rss = parse(urllib.urlopen(kulerurl)).getroot()
          for element in rss.findall('channel/item/kuler:themeItem'):
              print(element.findtext('kuler:themeID'))
          dump(rss)

    I'm sure it's something simple; if anyone could point me to what I'm doing wrong here I'd be most grateful. Thanks, Kieran
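
    A hedged note on the likely cause: ElementTree does not resolve "kuler:" prefixes in find/findall paths; the namespace has to be spelled out in Clark notation ({uri}tag). A sketch reusing the rss element from the code above:

      KULER = '{http://kuler.adobe.com/kuler/API/rss/}'

      for element in rss.findall('channel/item/' + KULER + 'themeItem'):
          print(element.findtext(KULER + 'themeID'))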

    Read the article

  • Backup Google Calendar programmatically: http://www.google.com/reader/subscriptions/export

    - by Michael
    I'm struggling with writing a Python script that automatically grabs the zip file containing all my Google calendars and stores it (as a backup) on my hard disk. I'm using ClientLogin to get an authentication token (and can successfully obtain the token). Unfortunately, I'm unable to retrieve the file at https://www.google.com/calendar/exporticalzip - it always asks me for the login credentials again by returning a login page as HTML (instead of the zip). Here's the critical code:

      post_data = urllib.urlencode({'auth': token, 'continue': zip_url})
      request = urllib2.Request('https://www.google.com/calendar', post_data, header)
      try:
          f = urllib2.urlopen(request)
          result = f.read()
      except:
          print "Error"

    Anyone any ideas, or done this before? Or an alternative idea for how to back up all my calendars (automatically!)?
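
    A hedged sketch, not a verified recipe: with ClientLogin the token usually goes into an Authorization header on a GET of the export URL itself, rather than into POSTed form fields. It reuses the token variable from the question; the output filename is a placeholder.

      import urllib2

      zip_url = 'https://www.google.com/calendar/exporticalzip'
      request = urllib2.Request(zip_url)
      request.add_header('Authorization', 'GoogleLogin auth=%s' % token)
      backup = urllib2.urlopen(request).read()
      out = open('calendar_backup.zip', 'wb')       # placeholder output path
      out.write(backup)
      out.close()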

    Read the article

  • encryption decryption function in php

    - by parthav
      import gnupg, urllib

      retk = urllib.urlopen("http://keyserver.pramberger.at/pks/"
                            "lookup?op=get&search=userid for the key is required")
      pub_key = retk.read()
      #print pub_key
      gpg = gnupg.GPG(gnupghome="/tmp/foldername", verbose=True)
      print "Import the Key :", gpg.import_keys(pub_key).summary()
      print "Encrypt the Message:"
      msg = "Hellllllllllo"
      uid = "userid that has the key on public key server"
      enc = gpg.encrypt(msg, uid, always_trust=True)
      print "*The enc content***************************== ", enc

    This function, written in Python, gives me an encrypted message. The encryption is done using the public key which I am getting from a public key server (pramberger.at). Now how can I implement the same functionality (getting the key from any public key server and using that key to encrypt the message) in PHP?

    Read the article

  • Encoding in python with lxml - complex solution

    - by Vojtech R.
    Hi, I need to download and parse a webpage with lxml and build UTF-8 XML output. I think a schema in pseudocode is more illustrative:

      from lxml import etree
      webfile = urllib2.urlopen(url)
      root = etree.parse(webfile.read(), parser=etree.HTMLParser(recover=True))
      txt = my_process_text(etree.tostring(root.xpath('/html/body'), encoding=utf8))
      output = etree.Element("out")
      output.text = txt
      outputfile.write(etree.tostring(output, encoding=utf8))

    So webfile can be in any encoding (lxml should handle this). outputfile has to be in UTF-8. I'm not sure where to use encoding/coding. Is this schema OK? (I can't find a good tutorial about lxml and encoding, but I can find many problems with it...) I need a robust, approved solution, so I ask you seniors. Many thanks
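
    A hedged reworking of that pseudocode (url, my_process_text and outputfile are the question's own placeholders): hand the file object to lxml so it can honour the page's declared encoding, work with unicode in between, and only force UTF-8 on the way out.

      import urllib2
      from lxml import etree

      webfile = urllib2.urlopen(url)
      tree = etree.parse(webfile, parser=etree.HTMLParser(recover=True))
      body = tree.xpath('/html/body')[0]
      txt = my_process_text(etree.tostring(body, encoding=unicode))

      output = etree.Element("out")
      output.text = txt
      outputfile.write(etree.tostring(output, encoding='utf-8', xml_declaration=True))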

    Read the article

  • How come I get a timeout when I try to download something off my own domain?

    - by alex
      def download(source_url):
          socket.setdefaulttimeout(10)
          agents = ['Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)',
                    'Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.1)',
                    'Microsoft Internet Explorer/4.0b1 (Windows 95)',
                    'Opera/8.00 (Windows NT 5.1; U; en)']
          ree = urllib2.Request(source_url)
          ree.add_header('User-Agent', random.choice(agents))
          resp = urllib2.urlopen(ree)
          htmlSource = resp.read()
          return htmlSource

      url = "http://myIP/details/?id=4"
      result_html = download(url)

    It shouldn't time out... even with the 10-second timeout.

    Read the article

  • python urllib2 thread safety

    - by jldupont
    Is urllib2 thread-safe? I find that using urllib2.urlopen sometimes causes the thread issuing the call to stop functioning. Yes, I have used the timeout option, to no avail. Is there a thread-safe HTTP GET functionality I could use? NOTE: I am not interested in using Twisted to solve this problem. I have used Twisted in the past and I love it, but this time I need a simpler solution. NOTE 2: I also tried httplib, with the same result (blocking).

    Read the article

  • how to read specific number of floats from file in python?

    - by sahel
    I am reading a text file from the web. The file starts with some header lines containing the number of data points, followed by the actual vertices (3 coordinates each). The file looks like:

      # comment
      HEADER TEXT
      POINTS 6 float
      1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9
      1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9
      POLYGONS

    The line starting with the word POINTS contains the number of vertices (in this case we have 3 vertices per line, but that could change). This is how I am reading it right now:

      ur = urlopen("http://.../file.dat")
      j = 0
      contents = []
      while 1:
          line = ur.readline()
          if not line:
              break
          else:
              line = line.lower()
              if 'points' in line:
                  myline = line.strip()
                  word = myline.split()
                  node_number = int(word[1])
                  node_type = word[2]
                  while 'polygons' not in line:
                      line = ur.readline()
                      line = line.lower()
                      myline = line.split()
                      i = 0
                      while i < len(myline):
                          contents[j] = float(myline[i])
                          i = i + 1
                          j = j + 1

    How can I read a specified number of floats instead of reading line by line as strings and converting them to floating-point numbers? Instead of ur.readline() I want to read the specified number of elements from the file. Any suggestion is welcome.
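
    A sketch of a token-based reading loop (an illustration, not the accepted answer): after the POINTS header, split every line into floats until the POLYGONS line is reached, regardless of how many values sit on each line.

      from urllib2 import urlopen

      ur = urlopen("http://.../file.dat")   # placeholder URL, as in the question
      contents = []
      in_points = False
      for line in ur:
          words = line.lower().split()
          if not words:
              continue
          if words[0] == 'points':
              node_number = int(words[1])   # declared number of vertices
              in_points = True
          elif words[0] == 'polygons':
              break
          elif in_points:
              contents.extend(float(w) for w in words)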

    Read the article

  • How to set timeout with python-mechanize?

    - by Michal Cihar
    I'm using python-mechanize to scrape some web sites, which sometimes simply don't respond to requests, and these requests then stay open too long, so I need to limit the timeout for them. When using the urlopen method, the timeout can be set with the timeout parameter, but I have not found an easy way of doing this with the high-level API such as the submit or click methods. Ideally the timeout would be set just once for the whole browser class and all calls would honor it. It would probably be possible to customize this by passing a custom request_class to every click and submit call, but that would just pollute the code, so I'm looking for a nicer solution for setting the timeout on mechanize's browser class (and no, I don't want to change the default socket timeout using socket.setdefaulttimeout).

    Read the article

  • How can I set controls for a web page?

    - by Rami Jarrar
    I have this login page with HTTPS, and I have gotten this far:

      import ClientForm
      import urllib2

      request = urllib2.Request("http://ritaj.birzeit.edu")
      response = urllib2.urlopen(request)
      forms = ClientForm.ParseResponseEx(response)
      response.close()
      f = forms[0]
      username = str(raw_input("Username: "))
      password = str(raw_input("Password: "))
      ## Here What To Do
      request2 = form.click()

    I can get the controls of that page:

      >>> f = forms[0]
      >>> [c.name for c in f.controls]
      ['q', 'sitesearch', 'sa', 'domains', 'form:mode', 'form:id', '__confirmed_p', '__refreshing_p', 'return_url', 'time', 'token_id', 'hash', 'username', 'password', 'persistent_p', 'formbutton:ok']

    So how can I set the username and password controls of the "non-form form" f? And I have another problem: how do I know whether the username and password were right?
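
    A sketch, assuming the control names listed above: ClientForm lets you assign to controls by name, and click() builds the request for the filled-in form. Whether the credentials were accepted usually has to be inferred from the response itself.

      f['username'] = username
      f['password'] = password
      request2 = f.click()                  # submit the filled-in form
      response2 = urllib2.urlopen(request2)
      page = response2.read()
      # e.g. check response2.geturl() or search page for an error message,
      # since a failed login typically still returns an HTML page.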

    Read the article

  • Parsing JSON file with Python -> google map api

    - by Hannes
    Hi all, I am trying to get started with JSON in Python, but it seems that I misunderstand something in the JSON concept. I followed the Google API example, which works fine. But when I change the code to go one level lower in the JSON response (as shown below, where I try to get access to the location), I get the following error message:

      Traceback (most recent call last):
        File "geoCode.py", line 11, in <module>
          test = json.dumps([s['location'] for s in jsonResponse['results']], indent=3)
      KeyError: 'location'

    How can I get access to a lower information level in the JSON file in Python? Do I have to go to a higher level and search the result string? That seems very weird to me. Here is the code I have tried to run:

      import urllib, json

      URL2 = "http://maps.googleapis.com/maps/api/geocode/json?address=1600+Amphitheatre+Parkway,+Mountain+View,+CA&sensor=false"
      googleResponse = urllib.urlopen(URL2)
      jsonResponse = json.loads(googleResponse.read())
      test = json.dumps([s['location'] for s in jsonResponse['results']], indent=3)
      print test

    Thank you for your responses.
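
    The probable fix (a sketch): in the geocoding response the coordinates live one level deeper, under each result's "geometry" key.

      test = json.dumps([s['geometry']['location'] for s in jsonResponse['results']], indent=3)
      print test
      # a single result yields something like {"lat": 37.42..., "lng": -122.08...}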

    Read the article

  • python lxml problem

    - by David ???
    I'm trying to print/save a certain element's HTML from a web page. I've retrieved the requested element's XPath from Firebug. All I wish is to save this element to a file. I don't seem to succeed in doing so. (I tried the XPath with and without a /text() at the end.) I would appreciate any help, or past experience. 10x, David

      import urllib2, StringIO
      from lxml import etree

      url = 'http://www.tutiempo.net/en/Climate/Londres_Heathrow_Airport/12-2009/37720.htm'
      seite = urllib2.urlopen(url)
      html = seite.read()
      seite.close()

      parser = etree.HTMLParser()
      tree = etree.parse(StringIO.StringIO(html), parser)
      xpath = "/html/body/table/tbody/tr/td[2]/div/table/tbody/tr[6]/td/table/tbody/tr/td[3]/table/tbody/tr[3]/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/text()"
      elem = tree.xpath(xpath)
      print elem[0].strip().encode("utf-8")
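
    One hedged explanation: Firebug shows <tbody> elements that the browser inserts on its own, while the HTML lxml actually parses usually has none, so an XPath copied from Firebug matches nothing. Dropping the tbody steps is worth trying (a sketch reusing the tree object above):

      xpath = ("/html/body/table/tr/td[2]/div/table/tr[6]/td/table/tr/td[3]"
               "/table/tr[3]/td/table/tr/td/table/tr/td/table/text()")
      elem = tree.xpath(xpath)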

    Read the article

  • Python: using a regular expression to match one line of HTML

    - by skylarking
    This simple Python method I put together just checks to see whether Tomcat is running on one of our servers.

      import urllib2
      import re
      import sys

      def tomcat_check():
          tomcat_status = urllib2.urlopen('http://10.1.1.20:7880')
          results = tomcat_status.read()
          pattern = re.compile('<body>Tomcat is running...</body>', re.M | re.DOTALL)
          q = pattern.search(results)
          if q == []:
              notify_us()
          else:
              print("Tomcat appears to be running")
              sys.exit()

    If this line is not found:

      <body>Tomcat is running...</body>

    it calls notify_us(), which uses SMTP to send an email message to myself and another admin saying that Tomcat is no longer running on the server. I have not used the re module in Python before, so I am assuming there is a better way to do this. I am also open to a more graceful solution with Beautiful Soup, but I haven't used that either. Just trying to keep this as simple as possible.
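
    A hedged simplification: a plain substring test avoids the regex entirely, and also sidesteps the fact that re.search returns None (not []) when nothing matches, so the q == [] branch can never fire. It reuses the question's notify_us().

      import urllib2

      def tomcat_check():
          results = urllib2.urlopen('http://10.1.1.20:7880').read()
          if 'Tomcat is running' in results:
              print("Tomcat appears to be running")
          else:
              notify_us()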

    Read the article

  • Play Shoutcast MP3 radio stream with Python?

    - by Zachary Brown
    I have managed to create an online radio station using Shoutcast and SAM Broadcaster. Now I want to build my own player for that radio station. I am not sure where to begin; I have googled, but no luck. I am using Python 2.6 on Microsoft Windows. I have managed to capture the stream and save it as an MP3 on the hard disk, just not sure what to do with it next. I tried playback of the file, but it always pulls up errors. This is the code I have so far:

      import urllib

      target = open("broadcast.mp3")
      conn = urllib.urlopen("http://78.159.104.175:80")
      while True:
          target.write(con.read(5200))

    Any help would be greatly appreciated!
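
    A hedged correction of the capture loop (playback itself is a separate problem): the output file needs binary write mode, the connection variable name has to match, and the loop needs a way to stop; the 5 MB cap below is just an illustration.

      import urllib

      target = open("broadcast.mp3", 'wb')
      conn = urllib.urlopen("http://78.159.104.175:80")
      saved = 0
      while saved < 5 * 1024 * 1024:        # stop after roughly 5 MB, for example
          chunk = conn.read(5200)
          if not chunk:
              break
          target.write(chunk)
          saved += len(chunk)
      target.close()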

    Read the article

  • float change from python 3.0.1 to 3.1.2

    - by Jeremy
    I'm trying to learn Python. I am using 3.1.2 and the O'Reilly book is using 3.0.1. Here is my code:

      import urllib.request

      price = 99.99
      while price > 4.74:
          page = urllib.request.urlopen("http://www.beans-r-us.biz/prices-loyalty.html")
          text = page.read().decode("utf8")
          where = text.find('>$')
          start_of_price = where + 2
          end_of_price = start_of_price + 6
          price = float(text[start_of_price:end_of_price])
      print("Buy!")

    Here is my error:

      Traceback (most recent call last):
        File "/Users/odin/Desktop/Coffe.py", line 14, in <module>
          price = float(text[start_of_price:end_of_price])
      ValueError: invalid literal for float(): 4.59

    What is wrong? Please help!!
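
    A hedged alternative to the fixed-width slice (assuming the price appears in the page as something like >$4.59<): pull it out with a regular expression, which tolerates stray characters that make float() fail. It reuses the text variable from the loop above.

      import re

      match = re.search(r'>\$(\d+\.\d+)<', text)
      if match:
          price = float(match.group(1))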

    Read the article

  • Python's urllib2 doesn't work on some sites

    - by Binny V A
    I found that you can't read from some sites using Python's urllib2 (or urllib). An example:

      urllib2.urlopen("http://www.dafont.com/").read()   # Returns ''

    These sites work when you visit them in a browser. I can even scrape them using PHP (didn't try other languages). I have seen other sites with the same issue, but can't remember the URLs at the moment. My questions are: What is the cause of this issue? Is there any workaround for it?
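
    A common workaround (an assumption about the cause, not a confirmed one): some sites return an empty or error page to the default Python user agent, so sending a browser-like User-Agent header often helps.

      import urllib2

      req = urllib2.Request("http://www.dafont.com/",
                            headers={'User-Agent': 'Mozilla/5.0'})
      html = urllib2.urlopen(req).read()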

    Read the article

  • Url open encoding

    - by badc0re
    I have the following code using urllib and BeautifulSoup:

      getSite = urllib.urlopen(pageName)              # open current site
      getSitesoup = BeautifulSoup(getSite.read())     # read the site content
      print getSitesoup.originalEncoding
      for value in getSitesoup.find_all('link'):      # extract all <link> tags
          defLinks.append(value.get('href'))

    The result of it:

      /usr/lib/python2.6/site-packages/bs4/dammit.py:231: UnicodeWarning: Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
        "Some characters could not be decoded, and were "

    And when I try to read the site I get:

      ?7?e????0*"I??G?H????F??????9-??????;??E?YÞBs????????????4i???)?????^W?????`w?Ke??%??*9?.'OQB???V??@?????]???(P??^??q?$?S5???tT*?Z
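
    A hedged suggestion: pass the encoding declared in the HTTP headers to BeautifulSoup instead of letting it guess (the keyword is from_encoding in bs4, fromEncoding in the older 3.x series).

      getSite = urllib.urlopen(pageName)
      charset = getSite.headers.getparam('charset')       # e.g. 'utf-8', if declared
      getSitesoup = BeautifulSoup(getSite.read(), from_encoding=charset)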

    Read the article

< Previous Page | 1 2 3 4 5 6  | Next Page >