Web Crawler C# .Net
Posted by sora0419 on Stack Overflow
Published on 2013-06-27T15:49:35Z
Indexed on 2013/06/27 16:21 UTC
c# | web-crawler
I'm not sure if this is actually called a web crawler, but this is what I'm trying to do.
I'm building a program in Visual Studio 2010 using C# .NET.
I want to find all the URLs that share the same first part.
Say I have a homepage, www.mywebsite.com, and there are several subpages: /tab1, /tab2, /tab3, etc.
Is there a way to get a list of all URLs that begin with www.mywebsite.com?
So, given www.mywebsite.com, the program returns www.mywebsite.com/tab1, www.mywebsite.com/tab2, www.mywebsite.com/tab3, etc.
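To make the expected input and output concrete, here is a minimal sketch of the "begins with" test, assuming the candidate URLs have already been collected somehow and that every URL carries a scheme such as http:// (new Uri throws on a bare www.mywebsite.com). UrlPrefixFilter and UnderBase are hypothetical names for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class UrlPrefixFilter
{
    // Hypothetical helper: returns the candidates that are on the same host as
    // baseUrl and whose path lies below baseUrl's path (the base page itself is excluded).
    public static List<string> UnderBase(string baseUrl, IEnumerable<string> candidates)
    {
        Uri root = new Uri(baseUrl);
        // Normalize to a trailing slash so "/tab" does not match "/tab1".
        string rootPath = root.AbsolutePath.TrimEnd('/') + "/";
        var result = new List<string>();

        foreach (string candidate in candidates)
        {
            Uri uri;
            // Skip anything that is not a well-formed absolute URL.
            if (!Uri.TryCreate(candidate, UriKind.Absolute, out uri)) continue;
            if (!string.Equals(uri.Host, root.Host, StringComparison.OrdinalIgnoreCase)) continue;

            string path = uri.AbsolutePath.TrimEnd('/') + "/";
            if (path != rootPath && path.StartsWith(rootPath, StringComparison.Ordinal))
                result.Add(candidate);
        }
        return result;
    }
}
```

Called with http://www.mywebsite.com as the base, this keeps http://www.mywebsite.com/tab1 and drops both the homepage itself and URLs on other hosts. It only filters a list; it does not discover the URLs.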
P.S. I do not know how many subpages there are in total.
--edit at 12:04pm--
Sorry for the lack of explanation.
I want to know how to write a crawler in C# that does the above task.
All I know is the main URL, www.mywebsite.com, and the goal is to find all its subpages.
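For reference, a link-following crawler of the kind asked about might look roughly like the sketch below. It is an illustration under stated assumptions, not a definitive implementation: SiteCrawler, Crawl, and Download are hypothetical names, links are extracted with a naive regular expression rather than a real HTML parser, and the page fetcher is passed in as a delegate so the traversal can be exercised without a network. A crawler like this can only discover pages that are linked from pages it has already fetched.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class SiteCrawler
{
    // Naive href matcher (an assumption for brevity); a real HTML parser is more robust.
    private static readonly Regex Href =
        new Regex("href\\s*=\\s*[\"']([^\"'#]+)", RegexOptions.IgnoreCase);

    // Breadth-first crawl from startUrl, following only links on the same host.
    // fetch maps a URL to its HTML; maxPages caps how many pages are fetched successfully.
    public static List<string> Crawl(string startUrl, Func<string, string> fetch, int maxPages)
    {
        Uri start = new Uri(startUrl);
        var seen = new HashSet<string> { start.AbsoluteUri };
        var queue = new Queue<Uri>();
        queue.Enqueue(start);
        var found = new List<string>();

        while (queue.Count > 0 && found.Count < maxPages)
        {
            Uri page = queue.Dequeue();
            string html;
            try { html = fetch(page.AbsoluteUri); }
            catch (WebException) { continue; } // skip pages that fail to load
            found.Add(page.AbsoluteUri);

            foreach (Match m in Href.Matches(html))
            {
                Uri link;
                // Resolve relative links against the page they appear on.
                if (!Uri.TryCreate(page, m.Groups[1].Value, out link)) continue;
                if (link.Host != start.Host) continue;          // stay on the same site
                if (seen.Add(link.AbsoluteUri)) queue.Enqueue(link);
            }
        }
        return found;
    }

    // Default fetcher built on WebClient, which is available in .NET 4.
    public static string Download(string url)
    {
        using (var client = new WebClient())
            return client.DownloadString(url);
    }
}
```

A call such as SiteCrawler.Crawl("http://www.mywebsite.com", SiteCrawler.Download, 100) would return the start page plus every same-host page reachable from it by links, up to the cap of 100.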
--edit at 12:16pm--
Also, there are no links on the main page; the HTML is basically blank.
I just know that the subpages exist, but I have no way to reach them except by providing the exact URLs.
© Stack Overflow or respective owner