Scraping ASP.NET site with Ruby

Posted by JillianK on Stack Overflow
Published on 2010-05-12T21:36:50Z Indexed on 2010/05/12 21:54 UTC

Filed under: screen-scraping | ASP.NET | ruby

I would like to scrape the search results of this ASP.NET site using Ruby, preferably with just Hpricot (I cannot open an instance of Firefox): http://www.ngosinfo.gov.pk/SearchResults.aspx?name=&foa=0

However, I am having trouble figuring out how to go through each page of results. Basically, I need to simulate clicking on links like these:

<a href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$Pager1$2','')" class="blue_11" id="ctl00_ContentPlaceHolder1_Pager1">2</a>
<a href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$Pager1$3','')" class="blue_11" id="ctl00_ContentPlaceHolder1_Pager1">3</a>

etc.

I tried using Net::HTTP to handle the POST, but while the response was the correct HTML page, it contained no search results (I'm probably not doing it correctly). In addition, the URL of the page does not contain any parameter indicating the page number, so it is not possible to force the results that way.
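
For context, links like the ones above do not navigate anywhere on their own. In ASP.NET Web Forms, `__doPostBack(target, argument)` writes its two arguments into the hidden fields `__EVENTTARGET` and `__EVENTARGUMENT` and then submits the page's form, together with the other hidden fields such as `__VIEWSTATE` and, when the page uses it, `__EVENTVALIDATION`. A POST that leaves those fields out is one plausible reason for getting a page back without results, though that has not been verified against this site. Below is a minimal sketch of the idea using only the standard library. The helper names (`hidden_fields`, `postback_form`, `fetch_results_page`) are hypothetical, the regex scan stands in for a proper HTML parser such as Hpricot, and the cookie handling assumes the site needs at most one session cookie.

```ruby
require 'net/http'
require 'uri'
require 'cgi'

# Collect the name/value pair of every <input type="hidden"> in the page.
# A regex scan keeps the sketch dependency-free; a real HTML parser
# would be more robust against unusual markup.
def hidden_fields(html)
  fields = {}
  html.scan(/<input\b[^>]*>/i) do |tag|
    next unless tag.match?(/\btype=["']hidden["']/i)
    name  = tag[/\bname=(["'])(.*?)\1/i, 2]
    value = tag[/\bvalue=(["'])(.*?)\1/i, 2] || ''
    fields[name] = CGI.unescapeHTML(value) if name
  end
  fields
end

# Build the form data that __doPostBack(target, argument) would submit:
# all hidden fields from the current page plus the two event fields.
def postback_form(html, target, argument = '')
  hidden_fields(html).merge(
    '__EVENTTARGET'   => target,
    '__EVENTARGUMENT' => argument
  )
end

# Hypothetical helper: GET the results page, then POST back to the same
# URL with the pager control as the event target. Returns the HTML body.
def fetch_results_page(uri, target)
  first = Net::HTTP.get_response(uri)

  request = Net::HTTP::Post.new(uri.request_uri)
  request.set_form_data(postback_form(first.body, target))

  # Assumption: forwarding the first cookie is enough to keep the session.
  cookie = first['set-cookie']
  request['Cookie'] = cookie.split(';').first if cookie

  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }.body
end
```

Under these assumptions, page 2 would be requested with `fetch_results_page(URI.parse('http://www.ngosinfo.gov.pk/SearchResults.aspx?name=&foa=0'), 'ctl00$ContentPlaceHolder1$Pager1$2')`, and the returned HTML could then be handed to Hpricot as usual. Each later page should be requested with the hidden fields taken from the most recently fetched page, since `__VIEWSTATE` can change between responses.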

Any help would be greatly appreciated.

© Stack Overflow or respective owner
