urllib - Page 7 - Developer IT

GAE AttributeError

- by awegawef

My GAE app runs fine from my computer, but when I upload it, I start getting an AttributeError, specifically: AttributeError: 'dict' object has no attribute 'item' I am using the pylast interface (an API for last.fm--link). Specifically, I am accessing a list of variables of this type: SimilarItem = _namedtuple("SimilarItem", ["item", "match"]) I have a variable of this type, call it sim, and I am trying to access sim.item when I get the attribute error. I should note that I am using Python 2.6 on my computer, and I understand that GAE runs on Python 2.5. Would that make a difference here? I thought they were backwards-compatible. Lastly, I think it could be a possible problem with the modules that pylast imports--maybe they don't work with GAE or something? I did some research but I didn't get any results. Here are the imports: import hashlib import httplib import urllib import threading from xml.dom import minidom import xml.dom import time import shelve import tempfile import sys import htmlentitydefs I would appreciate any help with this frustrating issue. Thanks in advance.

Read the article

Unknown error when submit a REST request to Liferay json API

- by r.rodriguez

I'm writing an script in Python to automatically update the structures in my Liferay portal and I want to do it via the json REST API. I make a request to get an structure (method getStructure), and it worked. But when I try to do an structure update in the portal it shows me the following error: ValueError: Content-Length should be specified for iterable data of type class 'dict' {'serviceContext': "{'prueba'}", 'serviceClassName': 'com.liferay.portlet.journal.service.JournalStructureServiceUtil', 'name': 'FOO', 'xsd': '... THE XSD OBTAINED VIA JSON ...', 'serviceParameters': '[groupId,structureId,parentStructureId,name,description,xsd,serviceContext]', 'description': 'FOO Structure', 'serviceMethodName': 'updateStructure', 'groupId': '10133'} What I'm doing is the next: urllib.request.Request(url = URL, data = data_update, headers = headers) URL is http://localhost:8080/tunnel-web/secure/json The headers are configured with basic authentication (it works, it is tested with the getStructure method). Data is: data_update = { "serviceClassName" : "com.liferay.portlet.journal.service.JournalStructureServiceUtil", "serviceMethodName" : "updateStructure", "serviceParameters" : "[groupId,structureId,parentStructureId,name,description,xsd,serviceContext]", "groupId" : 10133, "name" : FOO, "description" : FOO Structure, "xsd" : ... THE XSD OBTAINED VIA JSON ..., "serviceContext" : "{}" } Does anybody know the solution? Have I to specify the length for the dictionary and how? Or this is a bug?

Read the article

"image contains error", trying to create and display images using google app engine

- by bert

Hello all the general idea is to create a galaxy-like map. I run into problems when I try to display a generated image. I used Python Image library to create the image and store it in the datastore. when i try to load the image i get no error on the log console and no image on the browser. when i copy/paste the image link (including datastore key) i get a black screen and the following message: The image “view-source:/localhost:8080/img?img_id=ag5kZXZ-c3BhY2VzaW0xMnINCxIHTWFpbk1hcBgeDA” cannot be displayed because it contains errors. the firefox error console: Error: Image corrupt or truncated: /localhost:8080/img?img_id=ag5kZXZ-c3BhY2VzaW0xMnINCxIHTWFpbk1hcBgeDA import cgi import datetime import urllib import webapp2 import jinja2 import os import math import sys from google.appengine.ext import db from google.appengine.api import users from PIL import Image #SNIP #class to define the map entity class MainMap(db.Model): defaultmap = db.BlobProperty(default=None) #SNIP class Generator(webapp2.RequestHandler): def post(self): #SNIP test = Image.new("RGBA",(100, 100)) dMap=MainMap() dMap.defaultmap = db.Blob(str(test)) dMap.put() #SNIP result = db.GqlQuery("SELECT * FROM MainMap LIMIT 1").fetch(1) if result: print"item found<br>" #debug info if result[0].defaultmap: print"defaultmap found<br>" #debug info string = "<div><img src='/img?img_id=" + str(result[0].key()) + "' width='100' height='100'></img>" print string else: print"nothing found<br>" else: self.redirect('/?=error') self.redirect('/') class Image_load(webapp2.RequestHandler): def get(self): self.response.out.write("started Image load") defaultmap = db.get(self.request.get("img_id")) if defaultmap.defaultmap: try: self.response.headers['Content-Type'] = "image/png" self.response.out.write(defaultmap.defaultmap) self.response.out.write("Image found") except: print "Unexpected error:", sys.exc_info()[0] else: self.response.out.write("No image") #SNIP app = webapp2.WSGIApplication([('/', MainPage), ('/generator', Generator), ('/img', Image_load)], debug=True) the browser shows the "item found" and "defaultmap found" strings and a broken imagelink the exception handling does not catch any errors Thanks for your help Regards Bert

Read the article

python thread prob after build

- by Apache

hi expert, i'm having task to scan wifi at specific interval and send it to the server, i've it in python and its works fine when i run manually, then build it to package and when run there is no progress at all, i already ask this question before at http://stackoverflow.com/questions/2735410/python-scritp-problem-once-build-and-package-it, then, i re-modify my code as below, then i found that thread is not functioning once i build, #!/usr/bin/env python import subprocess,threading,... configFile = open('/opt/Jemapoh_Wifi/config.txt', 'r') url = configFile.readline().strip() intervalTime = configFile.readline().strip() status = configFile.readline().strip() print "url "+url print "intervalTime "+intervalTime print "Status "+status.strip() def getMacAddress(): proc = subprocess.Popen('ifconfig -a wlan0 | grep HWaddr | sed \'/^.*HWaddr */!d; s///;q\'', shell=True, stdout=subprocess.PIPE, ) macAddress = proc.communicate()[0].strip() return macAddress def getTimestamp(): from time import strftime timeStamp = strftime("%Y-%m-%d %H:%M:%S") return timeStamp def scanWifi(): try: print "Scanning..." proc = subprocess.Popen('iwlist scan 2>/dev/null', shell=True, stdout=subprocess.PIPE, ) stdout_str = proc.communicate()[0] stdout_list=stdout_str.split('\n') essid=[] rssi=[] preQuality=[] for line in stdout_list: line=line.strip() match=re.search('ESSID:"(\S+)"',line) if match: essid.append(match.group(1)) match=re.search('Quality=(\S+)',line) if match: preQuality.append(match.group(1)) for qualityConversion in preQuality: qualityConversion = qualityConversion.split()[0].split('/') temp = str(int(round(float(qualityConversion[0]) / float(qualityConversion[1]) * 100))).rjust(2) rssi.append(temp) dataToPost = '{"userId":"' + getMacAddress() + '","timestamp":"' + getTimestamp() + '","wifi":[' for no in range(len(essid)): dataToPost += '{"ssid":"' + essid[no] + '","rssi":"' + rssi[no] + '"}' if no+1 == len(essid): pass else: dataToPost += ',' dataToPost += ']}' query_args = {"data":dataToPost} request = urllib2.Request(url) request.add_data(urllib.urlencode(query_args)) request.add_header('Content-Type', 'application/x-www-form-urlencoded') print "Waiting for server response..." print urllib2.urlopen(request).read() print "Data Sent @ " + getTimestamp() print "------------------------------------------------------" t = threading.Timer(int(intervalTime), scanWifi).start() except Exception, e: print e t = threading.Timer(int(intervalTime), scanWifi) t.start() once build, its not reaching the thread, do can anyone help, why the thread is not working after build thanks

Read the article

Authentication using cookie key with asynchronous callback

- by greg

I need to write authentication function with asynchronous callback from remote Auth API. Simple authentication with login is working well, but authorization with cookie key, does not work. It should checks if in cookies present key "lp_login", fetch API url like async and execute on_response function. The code almost works, but I see two problems. First, in on_response function I need to setup secure cookie for authorized user on every page. In code user_id returns correct ID, but line: self.set_secure_cookie("user", user_id) does't work. Why it can be? And second problem. During async fetch API url, user's page has loaded before on_response setup cookie with key "user" and the page will has an unauthorized section with link to login or sign on. It will be confusing for users. To solve it, I can stop loading page for user who trying to load first page of site. Is it possible to do and how? Maybe the problem has more correct way to solve it? class BaseHandler(tornado.web.RequestHandler): @tornado.web.asynchronous def get_current_user(self): user_id = self.get_secure_cookie("user") user_cookie = self.get_cookie("lp_login") if user_id: self.set_secure_cookie("user", user_id) return Author.objects.get(id=int(user_id)) elif user_cookie: url = urlparse("http://%s" % self.request.host) domain = url.netloc.split(":")[0] try: username, hashed_password = urllib.unquote(user_cookie).rsplit(',',1) except ValueError: # check against malicious clients return None else: url = "http://%s%s%s/%s/" % (domain, "/api/user/username/", username, hashed_password) http = tornado.httpclient.AsyncHTTPClient() http.fetch(url, callback=self.async_callback(self.on_response)) else: return None def on_response(self, response): answer = tornado.escape.json_decode(response.body) username = answer['username'] if answer["has_valid_credentials"]: author = Author.objects.get(email=answer["email"]) user_id = str(author.id) print user_id # It returns needed id self.set_secure_cookie("user", user_id) # but session can's setup

Read the article

Django/jQuery - read file and pass to browser as file download prompt

- by danspants

I've previously asked a question regarding passing files to the browser so a user receives a download prompt. However these files were really just strings creatd at the end of a function and it was simple to pass them to an iframe's src attribute for the desired effect. Now I have a more ambitious requirement, I need to pass pre existing files of any format to the browser. I have attempted this using the following code: def return_file(request): try: bob=open(urllib.unquote(request.POST["file"]),"rb") response=HttpResponse(content=bob,mimetype="application/x-unknown") response["Content-Disposition"] = "attachment; filename=nothing.xls" return HttpResponse(response) except: return HttpResponse(sys.exc_info()) With my original setup the following jQuery was sufficient to give the desired download prompt: jQuery('#download').attr("src","/return_file/"); However this won't work anymore as I need to pass POST values to the function. my attempt to rectify that is below, but instead of a download prompt I get the file displayed as text. jQuery.get("/return_file/",{"file":"c:/filename.xls"},function(data) { jQuery(thisButton).children("iframe").attr("src",data); }); Any ideas as to where I'm going wrong? Thanks!

Read the article

My python program always brings down my internet connection after several hours running, how do I debug and fix this problem?

- by Shane

I'm writing a python script checking/monitoring several server/websites status(response time and similar stuff), it's a GUI program and I use separate thread to check different server/website, and the basic structure of each thread is using an infinite while loop to request that site every random time period(15 to 30 seconds), once there's changes in website/server each thread will start a new thread to do a thorough check(requesting more pages and similar stuff). The problem is, my internet connection always got blocked/jammed/messed up after several hours running of this script, the situation is, from my script side I got urlopen error timed out each time it's requesting a page, and from my FireFox browser side I cannot open any site. But the weird thing is, the moment I close my script my Internet connection got back on immediately which means now I can surf any site through my browser, so it must be the script causing all the problem. I've checked the program carefully and even use del to delete any connection once it's used, still get the same problem. I only use urllib2, urllib, mechanize to do network requests. Anybody knows why such thing happens? How do I debug this problem? Is there a tool or something to check my network status once such situation occurs? It's really bugging me for a while... By the way I'm behind a VPN, does it have something to do with this problem? Although I don't think so because my network always get back on once the script closed, and the VPN connection never drops(as it appears) during the whole process.

Read the article

Python - Problems using mechanize to log into a difficult website

- by user1781599

× 139886 I am trying to log in to betfair.com by using mechanize. I have tried several ways but it always fail. This is the code I have developed so far, can anyone help me to identify what is wrong with it and how I can improve it to log into my betfair account? Thanks, import cookielib import urllib import urllib2 from BeautifulSoup import BeautifulSoup import mechanize from mechanize import Browser import re bf_username_name = "username" bf_password_name = "password" bf_form_name = "loginForm" bf_username = "xxxxx" bf_password = "yyyyy" urlLogIn = "http://www.betfair.com/" accountUrl = "https://myaccount.betfair.com/account/home?rlhm=0&" # This url I will use to verify if log in has been successful br = mechanize.Browser(factory=mechanize.RobustFactory()) br.addheaders = [("User-Agent","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_8) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.90 Safari/537.1")] br.open(urlLogIn) br.select_form(nr=0) print br.form br.form[bf_username_name] = bf_username br.form[bf_password_name] = bf_password print br.form #just to check username and psw have been recorded correctly responseSubmit = br.submit() response = br.open(accountUrl) text_file = open("LogInResponse.html", "w") text_file.write(responseSubmit.read()) #this file should show the home page with me logged in, but it show home page as if I was not logged it text_file.close() text_file = open("Account.html", "w") text_file.write(response.read()) #this file should show my account page, but it should a pop up with an error text_file.close()

Read the article

ActiveMQ 5.2.0 + REST + HTTP POST = java.lang.OutOfMemoryError

- by Bruce Loth

First off, I am a newbie when it comes to JMS & ActiveMQ. I have been looking into a messaging solution to serve as middleware for a message producer that will insert XML messages into a queue via HTTP POST. The producer is an existing system written in C++ that cannot be modified (so Java and the C++ API are out). Using the "demo" examples and some trial and error, I have cobbled together a working example of what I want to do (on a windows box). The web.xml I configured in a test directory under "webapps" specifies that the HTTP POST messages received from the producer are to be handled by the MessageServlet. I added a line for the text app in "activemq.xml" ('ow' is the test app dir): I created a test script to "insert" messages into the queue which works well. The problem I am running into is that it as I continue to insert messages via REST/HTTP POST, the memory consumption and thread count used by ActiveMQ continues to rise (It happens when I have timely consumers as well as slow or non-existent consumers). When memory consumption gets around 250MB's and the thread count exceeds 5000 (as shown in windows task manager), ActiveMQ crashes and I see this in the log: Exception in thread "ActiveMQ Transport Initiator: vm://localhost#3564" java.lang.OutOfMemoryError: unable to create new native thread It is as if Jetty is spawning a new thread to handle each HTTP POST and the thread never dies. I did look at this page: http://activemq.apache.org/javalangoutofmemory.html and tried but that didn't fix the problem (although I didn't fully understand the implications of the change either). Does anyone have any ideas? Thanks! Bruce Loth PS - I included the "test message producer" python script below for what it is worth. I created batches of 100 messages and continued to run the script manually from the command line while watching the memory consumption and thread count of ActiveMQ in task manager. def foo(): import httplib, urllib body = "<?xml version='1.0' encoding='UTF-8'?>\n \ <ROOT>\n \ [snip: xml deleted to save space] </ROOT>" headers = {"content-type": "text/xml", "content-length": str(len(body))} conn = httplib.HTTPConnection("127.0.0.1:8161") conn.request("POST", "/ow/message/RDRCP_Inbox?type=queue", body, headers) response = conn.getresponse() print response.status, response.reason data = response.read() conn.close() ## end method definition ## Begin test code count = 0; while(count < 100): # Test with batches of 100 msgs count += 1 foo()

Read the article

A Python Wrapper for Shutterfly. Uploading an Image

- by iJames

I'm working on a Django app in which I want to order prints through Shutterfly's Open API: http://www.shutterfly.com/documentation/start.sfly So far I've been able to build the appropriate POSTs and GETs using the modules and suggested code snippets including httplib, httplib2, urllib, urllib2, mimetype, etc. But I'm stuck on the image uploading when placing an order (the ordering process is not the same process as uploading images to albums which I haven't tried.) From what I can tell, I'm supposed to basically create the multipart form data by concatenating the HTTP request body together with the binary data of the image. I take the strings: --myuniqueboundary1273149960.175.1 Content-Disposition: form-data; name="AuthenticationID" auniqueauthenticationid --myuniqueboundary1273149960.175.1 Content-Disposition: file; name="Image.Data"; filename="1_41_orig.jpg" Content-Type: image/jpeg and I put this data into it and end with the final boundary: ...\xb5|\xf88\x1dj\t@\xd9\'\x1f\xc6j\x88{\x8a\xc0\x18\x8eGaJG\x03\xe9J-\xd8\x96[\x91T\xc3\x0eTu\xf4\xaa\xa5Ty\x80\x01\x8c\x9f\xe9Z\xad\x8cg\xba# g\x18\xe2\xaa:\x829\x02\xb4["\x17Q\xe7\x801\xea?\xad7j\xfd\xa2\xdf\x81\xd2\x84D\xb6)\xa8\xcb\xc8O\\\x9a\xaf(\x1cqM\x98\x8d*\xb8\'h\xc8+\x8e:u\xaa\xf3*\x9b\x95\x05F8\xedN%\xcb\xe1B2\xa9~Tw\xedF\xc4\xfe\xe8\xfc\xa9\x983\xff\xd9... That ends up making it look like this (when I use print to debug): ... --myuniqueboundary1273149960.175.1 Content-Disposition: file; name="Image.Data"; filename="1_41_orig.jpg" Content-Type: image/jpeg ????q?ExifMM* ? ??(1?2?<??i?b?NIKON CORPORATIONNIKON D40HHQuickTime 7.62009:02:17 13:05:25Mac OS X 10.5.6%??????"?'??0220?????? ???? ? ?|_???,b???50??5 ... --myuniqueboundary1273149960.175.1-- My code for grabbing the binary data is pretty much this: filedata = open('myjpegfile.jpeg','rb').read() Which I then add to the rest of the body. I've see something like this code everywhere. I'm then using this to post the full request (with the headers too): response = urllib2.urlopen(request).read() This seems to me to be the standard way that form POSTS with files happens. Am I missing something here? At some point I might be able to make this into a library worth posting up on github, but this problem has stopped me cold in my tracks. Thanks for any insight!

Read the article

python getelementbyid from string

- by matthewgall

Hey, I have the following program, that is trying to upload a file (or files) to an image upload site, however I am struggling to find out how to parse the returned HTML to grab the direct link (contained in a ). I have the code below: #!/usr/bin/python # -*- coding: utf-8 -*- import pycurl import urllib import urlparse import xml.dom.minidom import StringIO import sys import gtk import os import imghdr import locale import gettext try: import pynotify except: print "Please install pynotify." APP="Uploadir Uploader" DIR="locale" locale.setlocale(locale.LC_ALL, '') gettext.bindtextdomain(APP, DIR) gettext.textdomain(APP) _ = gettext.gettext ##STRINGS uploading = _("Uploading image to Uploadir.") oneimage = _("1 image has been successfully uploaded.") multimages = _("images have been successfully uploaded.") uploadfailed = _("Unable to upload to Uploadir.") class Uploadir: def __init__(self, args): self.images = [] self.urls = [] self.broadcasts = [] self.username="" self.password="" if len(args) == 1: return else: for file in args: if file == args[0] or file == "": continue if file.startswith("-u"): self.username = file.split("-u")[1] #print self.username continue if file.startswith("-p"): self.password = file.split("-p")[1] #print self.password continue self.type = imghdr.what(file) self.images.append(file) for file in self.images: self.upload(file) self.setClipBoard() self.broadcast(self.broadcasts) def broadcast(self, l): try: str = '\n'.join(l) n = pynotify.Notification(str) n.set_urgency(pynotify.URGENCY_LOW) n.show() except: for line in l: print line def upload(self, file): #Try to login cookie_file_name = "/tmp/uploadircookie" if ( self.username!="" and self.password!=""): print "Uploadir authentication in progress" l=pycurl.Curl() loginData = [ ("username",self.username),("password", self.password), ("login", "Login") ] l.setopt(l.URL, "http://uploadir.com/user/login") l.setopt(l.HTTPPOST, loginData) l.setopt(l.USERAGENT,"User-Agent: Uploadir (Python Image Uploader)") l.setopt(l.FOLLOWLOCATION,1) l.setopt(l.COOKIEFILE,cookie_file_name) l.setopt(l.COOKIEJAR,cookie_file_name) l.setopt(l.HEADER,1) loginDataReturnedBuffer = StringIO.StringIO() l.setopt( l.WRITEFUNCTION, loginDataReturnedBuffer.write ) if l.perform(): self.broadcasts.append("Login failed. Please check connection.") l.close() return loginDataReturned = loginDataReturnedBuffer.getvalue() l.close() #print loginDataReturned if loginDataReturned.find("<li>Your supplied username or password is invalid.</li>")!=-1: self.broadcasts.append("Uploadir authentication failed. Username/password invalid.") return else: self.broadcasts.append("Uploadir authentication successful.") #cookie = loginDataReturned.split("Set-Cookie: ")[1] #cookie = cookie.split(";",0) #print cookie c = pycurl.Curl() values = [ ("file", (c.FORM_FILE, file)) ] buf = StringIO.StringIO() c.setopt(c.URL, "http://uploadir.com/file/upload") c.setopt(c.HTTPPOST, values) c.setopt(c.COOKIEFILE, cookie_file_name) c.setopt(c.COOKIEJAR, cookie_file_name) c.setopt(c.WRITEFUNCTION, buf.write) if c.perform(): self.broadcasts.append(uploadfailed+" "+file+".") c.close() return self.result = buf.getvalue() #print self.result c.close() doc = urlparse.urlparse(self.result) self.urls.append(doc.getElementsByTagName("download")[0].childNodes[0].nodeValue) def setClipBoard(self): c = gtk.Clipboard() c.set_text('\n'.join(self.urls)) c.store() if len(self.urls) == 1: self.broadcasts.append(oneimage) elif len(self.urls) != 0: self.broadcasts.append(str(len(self.urls))+" "+multimages) if __name__ == '__main__': uploadir = Uploadir(sys.argv) Any help would be gratefully appreciated. Warm regards,

Read the article

How to get a html elements with python lxml

- by Damiano

Hello! I have this html code: <table> <tr> <td class="test"><b><a href="">aaa</a></b></td> <td class="test">bbb</td> <td class="test">ccc</td> <td class="test"><small>ddd</small></td> </tr> <tr> <td class="test"><b><a href="">eee</a></b></td> <td class="test">fff</td> <td class="test">ggg</td> <td class="test"><small>hhh</small></td> </tr> </table> I use this Python code to extract all <td class="test"> with lxml module. import urllib2 import lxml.html code = urllib.urlopen("http://www.example.com/page.html").read() html = lxml.html.fromstring(code) result = html.xpath('//td[@class="test"][position() = 1 or position() = 4]') It works good! The result is: <td class="test"><b><a href="">aaa</a></b></td> <td class="test"><small>ddd</small></td> <td class="test"><b><a href="">eee</a></b></td> <td class="test"><small>hhh</small></td> (so the first and the fourth column of each <tr>) Now, I have to extract: aaa (the title of the link) ddd (text between <small> tag) eee (the title of the link) hhh (text between <small> tag) How could I extract these values? (the problem is that I have to remove <b> tag and get the title of the anchor on the first column and remove <small> tag on the forth column) Thank you!

Read the article

Apache access.log interpretation

- by Pantelis Sopasakis

In the log file of apache (access.log) I find log entries like the following: 10.20.30.40 - - [18/Mar/2011:02:12:44 +0200] "GET /index.php HTTP/1.1" 404 505 "-" "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.7.62 Version/11.01" Whose meaning is clear: The client with IP 10.20.30.40 applied a GET HTTP method on /index.php (that is to say http://mysite.org/index.php) receiving a status code 404 using Opera as client/browser. What I don't understand is entries like the following: 174.34.231.19 - - [18/Mar/2011:02:24:56 +0200] "GET http://www.siasatema.com HTTP/1.1" 200 469 "-" "Python-urllib/2.4" So here what I see is that someone (client with IP 174.34.231.19) accessed http://www.siasatema.com and got a 200 HTTP status code(?). It doesn't make sense to me... the only interpretation I can think of is that my apache server acts like proxy! Here are some other requests that don't have my site as destination... 187.35.50.61 - - [18/Mar/2011:01:28:20 +0200] "POST http://72.26.198.222:80/log/normal/ HTTP/1.0" 404 491 "-" "Octoshape-sua/1010120" 87.117.203.177 - - [18/Mar/2011:01:29:59 +0200] "CONNECT 64.12.244.203:80 HTTP/1.0" 405 556 "-" "-" 87.117.203.177 - - [18/Mar/2011:01:29:59 +0200] "open 64.12.244.203 80" 400 506 "-" "-" 87.117.203.177 - - [18/Mar/2011:01:30:04 +0200] "telnet 64.12.244.203 80" 400 506 "-" "-" 87.117.203.177 - - [18/Mar/2011:01:30:09 +0200] "64.12.244.203 80" 400 301 "-" "-" I believe that all these are related to some kind of attack or abuse of the server. Could someone explain to may what is going on and how to cope with this situation? Update 1: I disabled mod_proxy to make sure that I don't have an open proxy: # a2dismod proxy Where from I got the message: Module proxy already disabled I made sure that there is no file proxy.conf under $APACHE/mods-enabled. Finally, I set on my browser (Mozzila) my IP as a proxy and tried to access http://google.com. I was not redirected to google.com but instead my web page appeared. The same happened with trying to access http://a.b (!). So my server does not really work as a proxy since it does not forward the requests... But I think it would be better if somehow I could configure it to return a status code 403. Here is my apache configuration file: <VirtualHost *:80> ServerName mysite.org ServerAdmin webmaster@localhost DocumentRoot /var/www/ <Directory /> Options FollowSymLinks AllowOverride None </Directory> <Directory /var/www/> Options Indexes FollowSymLinks MultiViews AllowOverride None Order allow,deny allow from all </Directory> ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/ <Directory "/usr/lib/cgi-bin"> AllowOverride None Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch Order allow,deny Allow from all </Directory> ErrorLog /var/log/apache2/error.log LogLevel warn CustomLog /var/log/apache2/access.log combined Alias /doc/ "/usr/share/doc/" <Directory "/usr/share/doc/"> Options Indexes MultiViews FollowSymLinks AllowOverride None Order deny,allow Deny from all Allow from 127.0.0.0/255.0.0.0 ::1/128 </Directory> </VirtualHost> Update 2: Using a block, I restrict the use of other methods than GET and POST... <Limit POST PUT CONNECT HEAD OPTIONS DELETE PATCH PROPFIND PROPPATCH MKCOL COPY MOVE LOCK UNLOCK> Order deny,allow Deny from all </Limit> <LimitExcept GET> Order deny,allow Deny from all </LimitExcept> Now methods other that GET are forbidden (403). My only question now is whether there is some trick to boot those how try to use my server as a proxy out...

Read the article

Twitter traffic might not be what it seems

- by Piet

Are you using bit.ly stats to measure interest in the links you post on twitter? I’ve been hearing for a while about people claiming to get the majority of their traffic originating from twitter these days. Now, I’ve been playing with the twitter ruby gem recently, doing various experiments which I’ll not go into detail here because they could be regarded as spamming… if I’d conduct them on a large scale, that is. It’s scary to see people actually engaging with @replies crafted with some regular expressions and eliza-like trickery on status updates found using the twitter api. I’m wondering how Twitter is going to contain the coming spam-flood. When posting links I used bit.ly as url shortener, since this one seems to be the de-facto standard on twitter. A nice thing about bit.ly is that it shows some basic stats about the redirects it performs for your shortened links. To my surprise, most links posted almost immediately resulted in several visitors. Now, seeing that I was posting the links together with some information concerning what the link is about, I concluded that the people who were actually clicking the links should be very targeted visitors. This felt a bit like free adwords, and I suddenly started to understand why everyone was raving about getting traffic from twitter. How wrong I was! (and I think several 1000 online marketers with me) On the destination site I used a traffic logging solution that works by including a little javascript snippet in your pages. It seemed that somehow all visitors disappeared after the bit.ly redirect and before getting to the site, because I was hardly seeing any visitors there. So I started investigating what was happening: by looking at the logfiles of the destination site, and by making my own ’shortened’ urls by doing redirects using a very short domain name I own. This way, I could check the apache access_log before the redirects. Most user agents turned out to be bots without a doubt. Here’s an excerpt of user-agents awk’ed from apache’s access_log for a time period of about one hour, right after posting some links: AideRSS 2.0 (postrank.com) Java/1.6.0_13 Java/1.6.0_14 libwww-perl/5.816 MLBot (www.metadatalabs.com/mlbot) Mozilla/4.0 (compatible;MSIE 5.01; Windows -NT 5.0 - real-url.org) Mozilla/5.0 (compatible; Twitturls; +http://twitturls.com) Mozilla/5.0 (compatible; Viralheat Bot/1.0; +http://www.viralheat.com/) Mozilla/5.0 (Danger hiptop 4.6; U; rv:1.7.12) Gecko/20050920 Mozilla/5.0 (X11; U; Linux i686; en-us; rv:1.9.0.2) Gecko/2008092313 Ubuntu/9.04 (jaunty) Firefox/3.5 OpenCalaisSemanticProxy PycURL/7.18.2 PycURL/7.19.3 Python-urllib/1.17 Twingly Recon twitmatic Twitturly / v0.6 Wget/1.10.2 (Red Hat modified) Wget/1.11.1 (Red Hat modified) Of the few user-agents that seem ‘real’ at first, half are originating from an ip-address used by Amazon EC2. And I doubt people are setting op proxies on there. Oh yeah, Googlebot (the real deal, from a legit google owned address) is sucking up posted links like fresh oysters. I guess google is trying to make sure in advance to never be beaten by twitter in the ‘realtime search’ department. Actually, I think it’d be almost stupid NOT to post any new pages/posts/websites on Twitter, it must be one of the fastest ways to get a Googlebot visit. Same experiment with a real, established twitter account Now, because I was posting the url’s either as ’status’ messages or directed @people, on a test-account with hardly any (human) followers, I checked again using the twitter accounts from a commercial site I’m involved with. These accounts all have between 500 and 1000 targeted (I think) followers. I checked the destination access_logs and also added ‘my’ redirect after the bit.ly redirect: same results, although seemingly a bit higher real visitor/bot ratio. Btw: one of these account was ‘punished’ with a 1 week lock recently because the same (1 one!) status update was sent that was sent right before using another account. They got an email explaining the lock because the account didn’t act according to their TOS. I can’t find anything in their TOS about it, can you? I don’t think Twitter is on the right track punishing a legit account, knowing the trickery I had been doing with it’s api went totally unpunished. I might be wrong though, I often am. On the other hand: this commercial site reported targeted traffic and actual signups from visitors coming from Twitter. The ones that are really real visitors are also very targeted. I’m just not sure if the amount of work involved could hold up against an adwords campaign. Reposting the same link over and over again helps On thing I noticed: It helps to keep on reposting the same links with regular intervals. I guess most people only look at their first page when checking out recent posts of the ones they’re following, or don’t look too far back when performing a search. Now, this probably isn’t according to the twitter TOS. Actually, it might be spamming but no-one is obligated to follow anyone else of course. This way, I was getting more real visitors and less bots. To my surprise (when my programmer’s hat is on) there were still repeated visits from the same bots coming from the same ip-addresses. Did they expect to find something else when visiting for a 2nd or 3rd time? (actually,this gave me an idea: you can’t change a link once it’s posted, but you can change where it redirects to) Most bots were smart enough not to follow the same link again though. Are you successful in getting real visitors from Twitter? Are you only relying on bit.ly to provide traffic stats?

Read the article

How does this decorator make a call to the 'register' method?

- by BryanWheelock

I'm trying to understand what is going on in the decorator @not_authenticated. The next step in the TraceRoute is to the method 'register' which is also located in django_authopenid/views.py which I just don't understand because I don't see anywhere that register is even mentioned in signin() How is the method 'register' called? def not_authenticated(func): """ decorator that redirect user to next page if he is already logged.""" def decorated(request, *args, **kwargs): if request.user.is_authenticated(): next = request.GET.get("next", "/") return HttpResponseRedirect(next) return func(request, *args, **kwargs) return decorated @not_authenticated def signin(request,newquestion=False,newanswer=False): """ signin page. It manage the legacy authentification (user/password) and authentification with openid. url: /signin/ template : authopenid/signin.htm """ request.encoding = 'UTF-8' on_failure = signin_failure next = clean_next(request.GET.get('next')) form_signin = OpenidSigninForm(initial={'next':next}) form_auth = OpenidAuthForm(initial={'next':next}) if request.POST: if 'bsignin' in request.POST.keys() or 'openid_username' in request.POST.keys(): form_signin = OpenidSigninForm(request.POST) if form_signin.is_valid(): next = clean_next(form_signin.cleaned_data.get('next')) sreg_req = sreg.SRegRequest(optional=['nickname', 'email']) redirect_to = "%s%s?%s" % ( get_url_host(request), reverse('user_complete_signin'), urllib.urlencode({'next':next}) ) return ask_openid(request, form_signin.cleaned_data['openid_url'], redirect_to, on_failure=signin_failure, sreg_request=sreg_req) elif 'blogin' in request.POST.keys(): # perform normal django authentification form_auth = OpenidAuthForm(request.POST) if form_auth.is_valid(): user_ = form_auth.get_user() login(request, user_) next = clean_next(form_auth.cleaned_data.get('next')) return HttpResponseRedirect(next) question = None if newquestion == True: from forum.models import AnonymousQuestion as AQ session_key = request.session.session_key qlist = AQ.objects.filter(session_key=session_key).order_by('-added_at') if len(qlist) > 0: question = qlist[0] answer = None if newanswer == True: from forum.models import AnonymousAnswer as AA session_key = request.session.session_key alist = AA.objects.filter(session_key=session_key).order_by('-added_at') if len(alist) > 0: answer = alist[0] return render('authopenid/signin.html', { 'question':question, 'answer':answer, 'form1': form_auth, 'form2': form_signin, 'msg': request.GET.get('msg',''), 'sendpw_url': reverse('user_sendpw'), }, context_instance=RequestContext(request)) Looking at the request, it seems that account/register/ does reference the register method with 'PATH_INFO': u'/account/register/' Here is the request: <WSGIRequest GET:<QueryDict: {}>, POST:<QueryDict: {u'username': [u'BryanWheelock'], u'email': [u'[email protected]'], u'bnewaccount': [u'Signup']}>, COOKIES:{'__utma': '127460431.1218630960.1266769637.1266769637.1266864494.2', '__utmb': '127460431.3.10.1266864494', '__utmc': '127460431', '__utmz': '127460431.1266769637.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)', 'sessionid': 'fb15ee538320170a22d3a3a324aad968'}, META:{'CONTENT_LENGTH': '74', 'CONTENT_TYPE': 'application/x-www-form-urlencoded', 'DOCUMENT_ROOT': '/usr/local/apache2/htdocs', 'GATEWAY_INTERFACE': 'CGI/1.1', 'HTTP_ACCEPT': 'application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5', 'HTTP_ACCEPT_CHARSET': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3', 'HTTP_ACCEPT_ENCODING': 'gzip,deflate,sdch', 'HTTP_ACCEPT_LANGUAGE': 'en-US,en;q=0.8', 'HTTP_CACHE_CONTROL': 'max-age=0', 'HTTP_CONNECTION': 'close', 'HTTP_COOKIE': '__utmz=127460431.1266769637.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=127460431.1218630960.1266769637.1266769637.1266864494.2; __utmc=127460431; __utmb=127460431.3.10.1266864494; sessionid=fb15ee538320170a22d3a3a324aad968', 'HTTP_HOST': 'workproject.com', 'HTTP_ORIGIN': 'http://workproject.com', 'HTTP_REFERER': 'http://workproject.com/account/signin/complete/?next=%2F&janrain_nonce=2010-02-22T18%3A49%3A53ZG2KXci&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.mode=id_res&openid.op_endpoint=https%3A%2F%2Fwww.google.com%2Faccounts%2Fo8%2Fud&openid.response_nonce=2010-02-22T18%3A49%3A53Znxxxxxxxxxw&openid.return_to=http%3A%2F%2Fworkproject.com%2Faccount%2Fsignin%2Fcomplete%2F%3Fnext%3D%252F%26janrain_nonce%3D2010-02-22T18%253A49%253A53ZG2KXci&openid.assoc_handle=AOQobUepU4xs-kGg5LiyLzfN3RYv0I0Jocgjf_1odT4RR9zfMFpQVpMg&openid.signed=op_endpoint%2Cclaimed_id%2Cidentity%2Creturn_to%2Cresponse_nonce%2Cassoc_handle&openid.sig=Jf76i2RNhqpLTJMjeQ0nnQz6fgA%3D&openid.identity=https%3A%2F%2Fwww.google.com%2Faccounts%2Fo8%2Fid%3Fid%3DAItxxxxxxxxxs9CxHQ3PrHw_N5_3j1HM&openid.claimed_id=https%3A%2F%2Fwww.google.com%2Faccounts%2Fo8%2Fid%3Fid%3DAItOaxxxxxxxxxxx4s9CxHQ3PrHw_N5_3j1HM', 'HTTP_USER_AGENT': 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-US) AppleWebKit/532.9 (KHTML, like Gecko) Chrome/5.0.307.7 Safari/532.9', 'HTTP_X_FORWARDED_FOR': '96.8.31.235', 'PATH': '/usr/bin:/bin', 'PATH_INFO': u'/account/register/', 'PATH_TRANSLATED': '/home/spirituality/webapps/work/spirit_app.wsgi/account/register/', 'QUERY_STRING': '', 'REMOTE_ADDR': '127.0.0.1', 'REMOTE_PORT': '59956', 'REQUEST_METHOD': 'POST', 'REQUEST_URI': '/account/register/', 'SCRIPT_FILENAME': '/home/spirituality/webapps/spirituality/spirit_app.wsgi', 'SCRIPT_NAME': u'', 'SERVER_ADDR': '127.0.0.1', 'SERVER_ADMIN': '[no address given]', 'SERVER_NAME': 'workproject.com', 'SERVER_PORT': '80', 'SERVER_PROTOCOL': 'HTTP/1.0', 'SERVER_SIGNATURE': '', 'SERVER_SOFTWARE': 'Apache/2.2.12 (Unix) mod_wsgi/2.5 Python/2.5.4', 'mod_wsgi.application_group': 'www.workProject.com|', 'mod_wsgi.callable_object': 'application', 'mod_wsgi.listener_host': '', 'mod_wsgi.listener_port': '25931', 'mod_wsgi.process_group': '', 'mod_wsgi.reload_mechanism': '0', 'mod_wsgi.script_reloading': '1', 'mod_wsgi.version': (2, 5), 'wsgi.errors': <mod_wsgi.Log object at 0xb7ce0038>, 'wsgi.file_wrapper': <built-in method file_wrapper of mod_wsgi.Adapter object at 0xb7e94b18>, 'wsgi.input': <mod_wsgi.Input object at 0x999cc78>, 'wsgi.multiprocess': True, 'wsgi.multithread': False, 'wsgi.run_once': False, 'wsgi.url_scheme': 'http', 'wsgi.version': (1, 0)}>

Read the article

Python and mechanize login script

- by Perun

Hi fellow programmers! I am trying to write a script to login into my universities "food balance" page using python and the mechanize module... This is the page I am trying to log into: http://www.wcu.edu/11407.asp The website has the following form to login: <FORM method=post action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx><INPUT value=https://cf.wcu.edu/busafrs/catcard/idsearch.cfm type=hidden name=wcuirs_uri> <P><B>WCU ID Number<BR></B><INPUT maxLength=12 size=12 type=password name=id> </P> <P><B>PIN<BR></B><INPUT maxLength=20 type=password name=PIN> </P> <P></P> <P><INPUT value="Request Access" type=submit name=submit> </P></FORM> From this we know that I need to fill in the following fields: 1. name=id 2. name=PIN With the action: action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx This is the script I have written thus far: #!/usr/bin/python2 -W ignore import mechanize, cookielib from time import sleep url = 'http://www.wcu.edu/11407.asp' myId = '11111111111' myPin = '22222222222' # Browser #br = mechanize.Browser() #br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)) br = mechanize.Browser(factory=mechanize.RobustFactory()) # Use this because of bad html tags in the html... # Cookie Jar cj = cookielib.LWPCookieJar() br.set_cookiejar(cj) # Browser options br.set_handle_equiv(True) br.set_handle_gzip(True) br.set_handle_redirect(True) br.set_handle_referer(True) br.set_handle_robots(False) # Follows refresh 0 but not hangs on refresh > 0 br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1) # User-Agent (fake agent to google-chrome linux x86_64) br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'), ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'), ('Accept-Encoding', 'gzip,deflate,sdch'), ('Accept-Language', 'en-US,en;q=0.8'), ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')] # The site we will navigate into br.open(url) # Go though all the forms (for debugging only) for f in br.forms(): print f # Select the first (index two) form br.select_form(nr=2) # User credentials br.form['id'] = myId br.form['PIN'] = myPin br.form.action = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx' # Login br.submit() # Wait 10 seconds sleep(10) # Save to a file f = file('mycatpage.html', 'w') f.write(br.response().read()) f.close() Now the problem... For some odd reason the page I get back (in mycatpage.html) is the login page and not the expected page that displays my "cat cash balance" and "number of block meals" left... Does anyone have any idea why? Keep in mind that everything is correct with the header files and while the id and pass are not really 111111111 and 222222222, the correct values do work with the website (using a browser...) Thanks in advance EDIT Another script I tried: from urllib import urlopen, urlencode import urllib2 import httplib url = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx' myId = 'xxxxxxxx' myPin = 'xxxxxxxx' data = { 'id':myId, 'PIN':myPin, 'submit':'Request Access', 'wcuirs_uri':'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm' } opener = urllib2.build_opener() opener.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'), ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'), ('Accept-Encoding', 'gzip,deflate,sdch'), ('Accept-Language', 'en-US,en;q=0.8'), ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')] request = urllib2.Request(url, urlencode(data)) open("mycatpage.html", 'w').write(opener.open(request)) This has the same behavior...

Search Results

Search found 166 results on 7 pages for 'urllib'.

Page 7/7 | < Previous Page | 3 4 5 6 7

- by awegawef

- by r.rodriguez

- by bert

- by Apache

- by greg

- by danspants

- by Shane

- by user1781599

- by Bruce Loth

- by iJames

- by matthewgall

- by Damiano

- by Pantelis Sopasakis

- by Piet

- by BryanWheelock

- by Perun

< Previous Page | 3 4 5 6 7