Parsing a website
- by Phenom
I want to make a program that takes a website address as user input. The program then goes to that website, downloads the page, and parses the information inside. It outputs a new HTML file built from that information.
Specifically, the program will pick out certain links from the website, put those links in the output HTML file, and discard everything else.
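To make the idea concrete, here is a rough sketch of that download, parse, and output step. It assumes Python and only its standard library, which is just one possible choice. The names `LinkCollector`, `extract_links`, `render_links`, and `fetch`, and the `keep` filter that decides which links count as "certain links", are all made up for illustration.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects (absolute_url, link_text) pairs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None  # URL of the <a> tag currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page address.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip() or self._href
            self.links.append((self._href, text))
            self._href = None


def extract_links(html, base_url, keep):
    """Return the links for which keep(url) is true (keep is a hypothetical filter)."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return [(url, text) for url, text in parser.links if keep(url)]


def render_links(links):
    """Build a minimal HTML page that contains only the chosen links."""
    items = "\n".join(
        f'<li><a href="{escape(url)}">{escape(text)}</a></li>' for url, text in links
    )
    return f"<!DOCTYPE html>\n<html><body><ul>\n{items}\n</ul></body></html>\n"


def fetch(url):
    """Download a page and decode it, falling back to UTF-8 if no charset is given."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A run would then look something like `render_links(extract_links(fetch(address), address, keep))`, with the result written to a file. The filter could be anything, for example `lambda url: url.endswith(".pdf")`.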
Right now I just want it to work for websites that don't require a login, but later on I want it to work for sites where you have to log in, so it will have to be able to deal with cookies.
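For the login case, a sketch of what cookie handling might look like, again assuming Python's standard library. The form field names `username` and `password` and the helper names are hypothetical. A real site will have its own field names and may also need hidden form fields or tokens.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


def make_session():
    """Create an opener that stores cookies and sends them back on later requests."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, username, password):
    """Build a POST request for a login form.

    The field names "username" and "password" are assumptions; check the
    actual form on the site to find the real ones.
    """
    data = urlencode({"username": username, "password": password}).encode("ascii")
    return Request(login_url, data=data)  # supplying data makes this a POST


def fetch_after_login(login_url, page_url, username, password):
    """Log in, then fetch a page with the same opener so the cookies are reused."""
    opener, _jar = make_session()
    with opener.open(build_login_request(login_url, username, password)):
        pass  # the response body is not needed, only the cookies it sets
    with opener.open(page_url) as response:
        return response.read().decode("utf-8", errors="replace")
```

The important part is that the login request and the later page requests go through the same opener, since that is what carries the session cookie from one request to the next.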
Later on I'll also want the program to be able to follow certain links and download information from those other sites.
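Following links could be sketched as a small breadth-first crawl, assuming Python again. Here `fetch` is a hypothetical function that takes a URL and returns the page's HTML as a string, `should_follow` is a hypothetical filter for which links to explore, and `max_pages` is an assumed safety limit so the crawl cannot run forever.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class HrefParser(HTMLParser):
    """Collects the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl(start_url, fetch, should_follow, max_pages=20):
    """Breadth-first crawl; returns {url: html} for every page fetched."""
    pages = {}
    queue = deque([start_url])
    seen = {start_url}  # URLs already queued, so no page is fetched twice
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        pages[url] = fetch(url)
        parser = HrefParser()
        parser.feed(pages[url])
        for href in parser.hrefs:
            # Make the link absolute and drop any "#fragment" part.
            target = urldefrag(urljoin(url, href)).url
            if target not in seen and should_follow(target):
                seen.add(target)
                queue.append(target)
    return pages
```

Passing `fetch` in as a parameter means the same loop works with or without a logged-in session, and it can be tried out on a dictionary of fake pages before pointing it at a real site.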
What are the best programming languages or tools to do this?