
How to make a sitemap of any website using a crawler

Posted: Sat Apr 10, 2010 3:49 am
by Neo
There are just a few steps.

Write a function like the one below.

Code:

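FUNCTION ScrapeURL (url)
{
      If url was already visited, return (otherwise pages that link to each other recurse forever)
      Mark url as visited
      Scrape the given page
      Filter all links, keeping only those on the same website
      Write the sitemap XML object for the given url.
      Call ScrapeURL() for every url found on this page (recursive call: a function that calls itself).
}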
Now call this function as ScrapeURL (HomePage);
When execution finishes, you will have a sitemap covering every page that can be reached by following links from the home page.
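
For anyone who wants something concrete to start from, here is a rough sketch of the steps above in Python, using only the standard library. It is one possible way to do it, not a finished crawler. The names scrape_url, fetch_html, LinkParser and build_sitemap are made up for this example. It also assumes two things the pseudocode does not spell out: a "visited" set so that pages linking to each other do not recurse forever, and a same-host check so the crawl stays on one website.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen
from xml.sax.saxutils import escape


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page and return its HTML as text (illustrative helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_url(url, visited, found, fetch=fetch_html):
    """Visit url, record it, then recurse into same-site links."""
    if url in visited:
        return  # assumption: skip pages we have already seen
    visited.add(url)
    try:
        html = fetch(url)
    except (OSError, ValueError):
        return  # page could not be downloaded, leave it out
    found.append(url)

    parser = LinkParser()
    parser.feed(html)
    site = urlparse(url).netloc
    for href in parser.links:
        # Resolve relative links and drop any #fragment part.
        link, _fragment = urldefrag(urljoin(url, href))
        parts = urlparse(link)
        # Assumption: only follow http(s) links on the same host.
        if parts.scheme in ("http", "https") and parts.netloc == site:
            scrape_url(link, visited, found, fetch)


def build_sitemap(urls):
    """Return a sitemap XML document listing the given URLs."""
    entries = "".join(
        "  <url><loc>%s</loc></url>\n" % escape(u) for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + entries
        + "</urlset>\n"
    )
```

To try it, create an empty list, call scrape_url("https://example.com/", set(), found) with your own home page in place of the example address, then print build_sitemap(found). Because the function calls itself, a site with very long chains of links can hit Python's recursion limit; replacing the recursion with a queue of URLs still to visit is the usual way around that.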

The following articles have almost all the information required to write this.

How to scrape links on a web page using PHP
How to scrape web sites as feeds using php

If you have written sitemap code using these articles, please post it here to help our other members.