Having trouble scraping an ASP.NET web page

Posted by Seth on Stack Overflow
Published on 2010-03-24T11:42:47Z Indexed on 2010/03/27 21:53 UTC


I am trying to scrape an ASP.NET website but am having trouble getting the results from a POST. I have the following Python code, which uses httplib2 and BeautifulSoup:

from urllib.parse import urlencode

from httplib2 import Http

url = "http://somepage.com/Search.aspx"

conn = Http()
# do a GET first to retrieve the hidden form values
resp, page = conn.request(url, "GET")

# event_validation and viewstate variables retrieved from GET here...

# form fields sent back to the server with the search criteria
body = {"__EVENTARGUMENT": "",
        "__EVENTTARGET": "",
        "__EVENTVALIDATION": event_validation,
        "__VIEWSTATE": viewstate,
        "ctl00_ContentPlaceHolder1_GovernmentCheckBox": "On",
        "ctl00_ContentPlaceHolder1_NonGovernmentCheckBox": "On",
        "ctl00_ContentPlaceHolder1_SchoolKeyValue": "",
        "ctl00_ContentPlaceHolder1_SchoolNameTextBox": "",
        "ctl00_ContentPlaceHolder1_ScriptManager1": "ctl00_ContentPlaceHolder1_UpdatePanel1|ctl00_ContentPlaceHolder1_SearchImageButton",
        "ctl00_ContentPlaceHolder1_SearchImageButton.x": "375",
        "ctl00_ContentPlaceHolder1_SearchImageButton.y": "11",
        "ctl00_ContentPlaceHolder1_SuburbTownTextBox": "Adelaide,SA,5000",
        "hiddenInputToUpdateATBuffer_CommonToolkitScripts": 1}

# post the form as a regular URL-encoded submission
headers = {"Content-type": "application/x-www-form-urlencoded"}
resp, content = conn.request(url, "POST", headers=headers, body=urlencode(body))
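
The code above leaves out the step that pulls `__VIEWSTATE` and `__EVENTVALIDATION` from the GET response. Here is a minimal sketch of that step, for illustration only. It uses Python's standard-library `html.parser` instead of BeautifulSoup so that it runs without extra dependencies. The names `HiddenFieldParser` and `extract_hidden_fields` and the sample markup are made up for this example and are not taken from the actual page.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <input ... /> tags are also routed here by HTMLParser.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of every hidden input in the given HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical markup standing in for the decoded body of the GET response.
sample = """
<form method="post" action="Search.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTIz" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="AbC/+9=" />
  <input type="text" name="q" value="visible" />
</form>
"""

fields = extract_hidden_fields(sample)
viewstate = fields["__VIEWSTATE"]
event_validation = fields["__EVENTVALIDATION"]
```

Collecting every hidden input, rather than only the two named ones, makes it easier to send back any other hidden fields the page happens to include.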

When I print content, I still seem to get the same page that the GET returned. Is there a fundamental concept I'm missing about retrieving the result values of an ASP.NET POST?
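
One thing that may be worth checking, though it is only a guess about this particular page: ASP.NET Web Forms generally renders a control's `id` attribute with underscores and its `name` attribute with `$` separators, and the server reads posted values by `name`. A rough diagnostic sketch, with a made-up helper `unknown_keys` and made-up sample markup, lists the keys in a POST body that do not match any `name` in the form:

```python
from html.parser import HTMLParser


class InputNameParser(HTMLParser):
    """Collect the name attribute of every form control in a page."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                self.names.add(name)


def unknown_keys(html, body):
    """Return the sorted keys of body that match no control name in html."""
    parser = InputNameParser()
    parser.feed(html)
    unknown = []
    for key in body:
        base = key
        # Image buttons post click coordinates as "<name>.x" and "<name>.y".
        for suffix in (".x", ".y"):
            if key.endswith(suffix):
                base = key[:-len(suffix)]
        if base not in parser.names:
            unknown.append(key)
    return sorted(unknown)


# Hypothetical markup; the real page's control names may differ.
sample = """
<form method="post" action="Search.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="x" />
  <input type="text" name="ctl00$ContentPlaceHolder1$SuburbTownTextBox"
         id="ctl00_ContentPlaceHolder1_SuburbTownTextBox" />
  <input type="image" name="ctl00$ContentPlaceHolder1$SearchImageButton"
         id="ctl00_ContentPlaceHolder1_SearchImageButton" />
</form>
"""

body = {
    "__VIEWSTATE": "x",
    "ctl00_ContentPlaceHolder1_SuburbTownTextBox": "Adelaide,SA,5000",
    "ctl00$ContentPlaceHolder1$SearchImageButton.x": "375",
}

mismatched = unknown_keys(sample, body)
```

With this sample, only the underscore-style key is reported as unmatched. Running the same check against the real GET response would show whether the posted keys line up with the form's actual field names.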

© Stack Overflow or respective owner
