Search Results

Search found 159 results on 7 pages for 'urllib2'.

Page 7/7 | < Previous Page | 3 4 5 6 7

How to Select Items in Dropdown in Selenium

- by Marcus Gladir

Firstly, I have been trying to get the dropdown from this web page: http://solutions.3m.com/wps/portal/3M/en_US/Interconnect/Home/Products/ProductCatalog/Catalog/?PC_Z7_RJH9U5230O73D0ISNF9B3C3SI1000000_nid=RFCNF5FK7WitWK7G49LP38glNZJXPCDXLDbl This is the code I have: import urllib2 from bs4 import BeautifulSoup import re from pprint import pprint import sys from selenium import common from selenium import webdriver import selenium.webdriver.support.ui as ui from boto.s3.key import Key import requests url = 'http://solutions.3m.com/wps/portal/3M/en_US/Interconnect/Home/Products/ProductCatalog/Catalog/?PC_Z7_RJH9U5230O73D0ISNF9B3C3SI1000000_nid=RFCNF5FK7WitWK7G49LP38glNZJXPCDXLDbl' element_xpath = '//*[@id="Component1"]' driver = webdriver.PhantomJS() driver.get(url) element = driver.find_element_by_xpath(element_xpath) element_xpath = '/option[@value="02"]' all_options = element.find_elements_by_tag_name("option") for option in all_options: print("Value is: %s" % option.get_attribute("value")) option.click() source = driver.page_source.encode('utf-8', 'ignore') driver.quit() source = str(source) soup = BeautifulSoup(source, 'html.parser') print soup What prints out is this: Traceback (most recent call last): File "../../../../test.py", line 58, in <module> Value is: XX main() File "../../../../test.py", line 46, in main option.click() File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webelement.py", line 54, in click self._execute(Command.CLICK_ELEMENT) File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webelement.py", line 228, in _execute return self._parent.execute(command, params) File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webdriver.py", line 165, in execute self.error_handler.check_response(response) File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/errorhandler.py", line 158, in check_response raise exception_class(message, screen, stacktrace) selenium.common.exceptions.ElementNotVisibleException: Message: u'{"errorMessage":"Element is not currently visible and may not be manipulated","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"81","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:51413","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\\"sessionId\\": \\"30e4fd50-f0e4-11e3-8685-6983e831d856\\", \\"id\\": \\":wdc:1402434863875\\"}","url":"/click","urlParsed":{"anchor":"","query":"","file":"click","directory":"/","path":"/click","relative":"/click","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/click","queryKey":{},"chunks":["click"]},"urlOriginal":"/session/30e4fd50-f0e4-11e3-8685-6983e831d856/element/%3Awdc%3A1402434863875/click"}}' ; Screenshot: available via screen And the weirdest most infuriating bit of it all is that sometimes it actually all works out. I have no clue what's going on here.

Read the article
Slow Python HTTP server on localhost

- by Abiel

I am experiencing some performance problems when creating a very simple Python HTTP server. The key issue is that performance is varying depending on which client I use to access it, where the server and all clients are being run on the local machine. For instance, a GET request issued from a Python script (urllib2.urlopen('http://localhost/').read()) takes just over a second to complete, which seems slow considering that the server is under no load. Running the GET request from Excel using MSXML2.ServerXMLHTTP also feels slow. However, requesting the data Google Chrome or from RCurl, the curl add-in for R, yields an essentially instantaneous response, which is what I would expect. Adding further to my confusion is that I do not experience any performance problems for any client when I am on my computer at work (the performance problems are on my home computer). Both systems run Python 2.6, although the work computer runs Windows XP instead of 7. Below is my very simple server example, which simply returns 'Hello world' for any get request. from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer class MyHandler(BaseHTTPRequestHandler): def do_GET(self): print("Just received a GET request") self.send_response(200) self.send_header("Content-type", "text/html") self.end_headers() self.wfile.write('Hello world') return def log_request(self, code=None, size=None): print('Request') def log_message(self, format, *args): print('Message') if __name__ == "__main__": try: server = HTTPServer(('localhost', 80), MyHandler) print('Started http server') server.serve_forever() except KeyboardInterrupt: print('^C received, shutting down server') server.socket.close() Note that in MyHandler I override the log_request() and log_message() functions. The reason is that I read that a fully-qualified domain name lookup performed by one of these functions might be a reason for a slow server. Unfortunately setting them to just print a static message did not solve my problem. Also, notice that I have put in a print() statement as the first line of the do_GET() routine in MyHandler. The slowness occurs prior to this message being printed, meaning that none of the stuff that comes after it is causing a delay.

Read the article
App-Engine Parse a UrlFetch UTF-8 encoded stream

- by Davidrd91

I am trying to parse an XML from a URL using the xml.sax parser. I know there are other libraries to use but coming from Java this is the one I am most familiar with and seems the least complicated to me. The code I'm using to parse is as follows: parser = xml.sax.make_parser() handler = MangaHandler() parser.setContentHandler(handler) url = urlfetch.Fetch('http://www.mangapanda.com/alphabetical', allow_truncated = False, follow_redirects = False, deadline = False) xml.sax.parseString(url.content, handler) This returns a SaxException (invalid token) once the parser reaches the first & sign: SAXParseException: <unknown>:582:34: not well-formed (invalid token) Because urlfetch returns a string and not a stream I cannot use the parse() (which only works with streams) and am left to use parseString() instead. To see if parsing as a stream would fix this I tried: parser.parse(io.StringIO(url.content).encode('utf-8')) but this returns: TypeError: initial_value must be unicode or None, not str I have also tried to use the urllib2 libraries which do return a stream instead of urlfetch but the file is too large and is automatically truncated, leaving me with missing data. Any Sort of work-around for this would be greatly appreciated as I've spent days getting around one obstacle just to be stopped by another.

Read the article
django test client trouble

- by Anton Koval'

I've got a problem... we're writing project using django, and i'm trying to use django.test.client with nose test-framework for tests. Our code is like this: from simplejson import loads from urlparse import urljoin from django.test.client import Client TEST_URL = "http://smakly.localhost:9090/" def test_register(): cln = Client() ref_data = {"email": "[email protected]", "name": "???????", "website": "http://hot.bear.com", "xhr": "true"} print urljoin(TEST_URL, "/accounts/register/") response = loads(cln.post(urljoin(TEST_URL, "/accounts/register/"), ref_data)) print response["message"] and in nose output I catch: Traceback (most recent call last): File "/home/psih/work/svn/smakly/eggs/nose-0.11.1-py2.6.egg/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/psih/work/svn/smakly/src/smakly.tests/smakly/tests/frontend/test_profile.py", line 25, in test_register response = loads(cln.post(urljoin(TEST_URL, "/accounts/register/"), ref_data)) File "/home/psih/work/svn/smakly/parts/django/django/test/client.py", line 313, in post response = self.request(**r) File "/home/psih/work/svn/smakly/parts/django/django/test/client.py", line 225, in request response = self.handler(environ) File "/home/psih/work/svn/smakly/parts/django/django/test/client.py", line 69, in __call__ response = self.get_response(request) File "/home/psih/work/svn/smakly/parts/django/django/core/handlers/base.py", line 78, in get_response urlconf = getattr(request, "urlconf", settings.ROOT_URLCONF) File "/home/psih/work/svn/smakly/parts/django/django/utils/functional.py", line 273, in __getattr__ return getattr(self._wrapped, name) AttributeError: 'Settings' object has no attribute 'ROOT_URLCONF' My settings.py file does have this attribute. If I get the data from the server with standard urllib2.urllopen().read() it works in the proper way. Any ideas how I can solve this case?

Read the article
Python - Problems using mechanize to log into a difficult website

- by user1781599

× 139886 I am trying to log in to betfair.com by using mechanize. I have tried several ways but it always fail. This is the code I have developed so far, can anyone help me to identify what is wrong with it and how I can improve it to log into my betfair account? Thanks, import cookielib import urllib import urllib2 from BeautifulSoup import BeautifulSoup import mechanize from mechanize import Browser import re bf_username_name = "username" bf_password_name = "password" bf_form_name = "loginForm" bf_username = "xxxxx" bf_password = "yyyyy" urlLogIn = "http://www.betfair.com/" accountUrl = "https://myaccount.betfair.com/account/home?rlhm=0&" # This url I will use to verify if log in has been successful br = mechanize.Browser(factory=mechanize.RobustFactory()) br.addheaders = [("User-Agent","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_8) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.90 Safari/537.1")] br.open(urlLogIn) br.select_form(nr=0) print br.form br.form[bf_username_name] = bf_username br.form[bf_password_name] = bf_password print br.form #just to check username and psw have been recorded correctly responseSubmit = br.submit() response = br.open(accountUrl) text_file = open("LogInResponse.html", "w") text_file.write(responseSubmit.read()) #this file should show the home page with me logged in, but it show home page as if I was not logged it text_file.close() text_file = open("Account.html", "w") text_file.write(response.read()) #this file should show my account page, but it should a pop up with an error text_file.close()

Read the article
Paginating requests to an API

- by user332912

I'm consuming (via urllib/urllib2) an API that returns XML results. The API always returns the total_hit_count for my query, but only allows me to retrieve results in batches of, say, 100 or 1000. The API stipulates I need to specify a start_pos and end_pos for offsetting this, in order to walk through the results. Say the urllib request looks like "http://someservice?query='test'&start_pos=X&end_pos=Y". If I send an initial 'taster' query with lowest data transfer such as http://someservice?query='test'&start_pos=1&end_pos=1 in order to get back a result of, for conjecture, total_hits = 1234, I'd like to work out an approach to most cleanly request those 1234 results in batches of, again say, 100 or 1000 or... This is what I came up with so far, and it seems to work, but I'd like to know if you would have done things differently or if I could improve upon this: hits_per_page=1000 # or 1000 or 200 or whatever, adjustable total_hits = 1234 # retreived with BSoup from 'taster query' base_url = "http://someservice?query='test'" startdoc_positions = [n for n in range(1, total_hits, hits_per_page)] enddoc_positions = [startdoc_position + hits_per_page - 1 for startdoc_position in startdoc_positions] for start, end in zip(startdoc_positions, enddoc_positions): if end total_hits: end = total_hits print "url to request is:\n ", print "%s&start_pos=%s&end_pos=%s" % (base_url, start, end) p.s. I'm a long time consumer of StackOverflow, especially the Python questions, but this is my first question posted. You guys are just brilliant.

Read the article
Python script web service timeout

- by Robert

We have had a Python script running for many months now that simply scans through a directory of files, and posts each file to our web site via a web service call. The web site is also written in Python. For no apparent reason, this morning this script started throwing the following error: urllib2.URLError: <urlopen error (10060, 'Operation timed out')> The site itself is up and running just fine. There are no indications of any errors. The developer that was working on this site is no longer with us, and we do not have a strong Python developer on staff as we are moving away from that. Before I do an all nighter and rewrite this thing in C#, I wanted to see if anyone had any experience dealing with this issue. I do know that the script is connecting to a secure site (HTTPS), so I am not sure if something has come up with that, and I honestly dont know where to look to determine that. As I said before, the site itself isn't showing any signs of error, including SSL. Any thoughts?

Read the article
How to get a html elements with python lxml

- by Damiano

Hello! I have this html code: <table> <tr> <td class="test"><b><a href="">aaa</a></b></td> <td class="test">bbb</td> <td class="test">ccc</td> <td class="test"><small>ddd</small></td> </tr> <tr> <td class="test"><b><a href="">eee</a></b></td> <td class="test">fff</td> <td class="test">ggg</td> <td class="test"><small>hhh</small></td> </tr> </table> I use this Python code to extract all <td class="test"> with lxml module. import urllib2 import lxml.html code = urllib.urlopen("http://www.example.com/page.html").read() html = lxml.html.fromstring(code) result = html.xpath('//td[@class="test"][position() = 1 or position() = 4]') It works good! The result is: <td class="test"><b><a href="">aaa</a></b></td> <td class="test"><small>ddd</small></td> <td class="test"><b><a href="">eee</a></b></td> <td class="test"><small>hhh</small></td> (so the first and the fourth column of each <tr>) Now, I have to extract: aaa (the title of the link) ddd (text between <small> tag) eee (the title of the link) hhh (text between <small> tag) How could I extract these values? (the problem is that I have to remove <b> tag and get the title of the anchor on the first column and remove <small> tag on the forth column) Thank you!

Read the article
python/pip error on osx

- by Ibrahim Chawa

I've recently purchased a new hard drive and installed a clean copy of OS X Mavericks. I installed python using homebrew and i need to create a python virtual environment. But when ever i try to run any command using pip, I get this error. I haven't been able to find a solution online for this problem. Any reference would be appreciated. Here is the error I'm getting. ERROR:root:code for hash md5 was not found. Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 139, in <module> globals()[__func_name] = __get_hash(__func_name) File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor raise ValueError('unsupported hash type ' + name) ValueError: unsupported hash type md5 ERROR:root:code for hash sha1 was not found. Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 139, in <module> globals()[__func_name] = __get_hash(__func_name) File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor raise ValueError('unsupported hash type ' + name) ValueError: unsupported hash type sha1 ERROR:root:code for hash sha224 was not found. Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 139, in <module> globals()[__func_name] = __get_hash(__func_name) File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor raise ValueError('unsupported hash type ' + name) ValueError: unsupported hash type sha224 ERROR:root:code for hash sha256 was not found. Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 139, in <module> globals()[__func_name] = __get_hash(__func_name) File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor raise ValueError('unsupported hash type ' + name) ValueError: unsupported hash type sha256 ERROR:root:code for hash sha384 was not found. Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 139, in <module> globals()[__func_name] = __get_hash(__func_name) File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor raise ValueError('unsupported hash type ' + name) ValueError: unsupported hash type sha384 ERROR:root:code for hash sha512 was not found. Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 139, in <module> globals()[__func_name] = __get_hash(__func_name) File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor raise ValueError('unsupported hash type ' + name) ValueError: unsupported hash type sha512 Traceback (most recent call last): File "/usr/local/bin/pip", line 9, in <module> load_entry_point('pip==1.5.6', 'console_scripts', 'pip')() File "build/bdist.macosx-10.9-x86_64/egg/pkg_resources.py", line 356, in load_entry_point File "build/bdist.macosx-10.9-x86_64/egg/pkg_resources.py", line 2439, in load_entry_point File "build/bdist.macosx-10.9-x86_64/egg/pkg_resources.py", line 2155, in load File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/__init__.py", line 10, in <module> from pip.util import get_installed_distributions, get_prog File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/util.py", line 18, in <module> from pip._vendor.distlib import version File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/_vendor/distlib/version.py", line 14, in <module> from .compat import string_types File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/_vendor/distlib/compat.py", line 31, in <module> from urllib2 import (Request, urlopen, URLError, HTTPError, ImportError: cannot import name HTTPSHandler If you need any extra information from me let me know, this is my first time posting a question here. Thanks.

Read the article

< Previous Page | 3 4 5 6 7