Having trouble scraping an ASP.NET web page
Posted by Seth on Stack Overflow
Published on 2010-03-24T11:42:47Z
Indexed on 2010/03/27 21:53 UTC
I am trying to scrape an ASP.NET website but am having trouble getting the results from a POST. I have the following Python code, which uses httplib2 and BeautifulSoup:
from urllib.parse import urlencode
from httplib2 import Http

url = "http://somepage.com/Search.aspx"
conn = Http()
# do a GET first to retrieve important values
resp, page = conn.request(url, "GET")
# event_validation and viewstate variables retrieved from GET here...
body = {"__EVENTARGUMENT": "",
        "__EVENTTARGET": "",
        "__EVENTVALIDATION": event_validation,
        "__VIEWSTATE": viewstate,
        "ctl00_ContentPlaceHolder1_GovernmentCheckBox": "On",
        "ctl00_ContentPlaceHolder1_NonGovernmentCheckBox": "On",
        "ctl00_ContentPlaceHolder1_SchoolKeyValue": "",
        "ctl00_ContentPlaceHolder1_SchoolNameTextBox": "",
        "ctl00_ContentPlaceHolder1_ScriptManager1": "ctl00_ContentPlaceHolder1_UpdatePanel1|ctl00_ContentPlaceHolder1_SearchImageButton",
        "ctl00_ContentPlaceHolder1_SearchImageButton.x": "375",
        "ctl00_ContentPlaceHolder1_SearchImageButton.y": "11",
        "ctl00_ContentPlaceHolder1_SuburbTownTextBox": "Adelaide,SA,5000",
        "hiddenInputToUpdateATBuffer_CommonToolkitScripts": 1}
headers = {"Content-type": "application/x-www-form-urlencoded"}
# POST the form fields back to the same page
resp, content = conn.request(url, "POST", headers=headers, body=urlencode(body))
When I print content, I still seem to get the same results as the GET. Is there a fundamental concept I'm missing to retrieve the result values of an ASP.NET POST?
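One thing that may be worth checking: ASP.NET WebForms posts are keyed by each control's name attribute, which uses $ separators (for example ctl00$ContentPlaceHolder1$SuburbTownTextBox), whereas the id attribute uses underscores. The sketch below is a minimal illustration of collecting the hidden fields from a GET response and building the POST body from name attributes. It uses the standard library's html.parser instead of BeautifulSoup so that it stays self-contained. The sample markup, the HiddenFieldParser class, and the hidden_fields helper are all hypothetical and are not taken from the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        name = attr.get("name")
        if is_hidden and name:  # inputs without a name are never submitted
            self.fields[name] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical markup standing in for the content returned by the GET.
sample = """
<form method="post" action="Search.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTA4" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWAg" />
  <input type="text" name="ctl00$ContentPlaceHolder1$SuburbTownTextBox"
         id="ctl00_ContentPlaceHolder1_SuburbTownTextBox" value="" />
</form>
"""

fields = hidden_fields(sample)
# Visible fields are added by hand, keyed by the name attribute (with $), not the id.
fields["ctl00$ContentPlaceHolder1$SuburbTownTextBox"] = "Adelaide,SA,5000"
body = urlencode(fields)
print(body)
```

The resulting body string could then be passed as body= in the conn.request(url, "POST", ...) call. Checkbox values and image button coordinates would need to be added the same way, under whatever name attributes the real page uses.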