I am proud to present my latest application, SiteScan. This application will scan your entire web site and extract all the links on it. You just have to provide it with your home page's URL and click the Scan button. The list of links will be written to a file named "sitescan", located in the same directory as the SiteScan program.
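For readers curious about what "extracting all the links" from a page involves, here is a minimal sketch in Python. It is not SiteScan's actual code; the names `LinkCollector` and `extract_links` are made up for illustration, and it assumes that only `<a href="...">` tags count as links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against base_url.

    Hypothetical helper for illustration; not part of SiteScan.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Assumption: only anchor tags are treated as links.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in one page's HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

Calling `extract_links(page_html, "http://example.com/")` returns a list of absolute URLs, which could then be written one per line to an output file.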
Be warned that this application can take a long time to run on large web sites. My biggest blog is hosted on a fast web server, and I have a fast Internet connection. Even so, SiteScan took 15 minutes to scan the whole blog. Part of that time goes to downloading every page and searching it for links. The other time-consuming activity is making sure the scanner does not get stuck in link cycles. If Page 1 links to Page 2, which in turn links back to Page 1, then SiteScan has to detect this and not just bounce back and forth between the two pages forever.
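One common way to avoid getting stuck in link cycles is to remember every URL already seen and never queue it twice. The sketch below shows that idea in Python. It is an illustration under assumptions, not necessarily how SiteScan does it: `scan_site` is a made-up name, and `get_links` stands in for whatever downloads a page and returns the links on it.

```python
from collections import deque


def scan_site(home_url, get_links):
    """Visit every page reachable from home_url exactly once.

    get_links is a hypothetical callable: given a URL, it returns
    the list of URLs that page links to.
    """
    visited = {home_url}       # every URL we have already queued
    queue = deque([home_url])  # pages still waiting to be scanned
    order = []                 # pages in the order they were scanned

    while queue:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            # Skipping URLs we have seen is what breaks the cycle:
            # Page 2's link back to Page 1 is ignored here.
            if link not in visited:
                visited.add(link)
                queue.append(link)
    return order
```

With a tiny fake site where Page 1 links to Page 2 and Page 2 links back to Page 1, the scan stops after visiting each page once instead of looping:

```python
site = {"page1": ["page2"], "page2": ["page1"]}
print(scan_site("page1", lambda url: site.get(url, [])))
```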
This application assumes that the site you are scanning is static. That is, if you change the web site while the program is running, you may get unexpected results. I plan to kick off SiteScan tonight and have it scan all of MSN. Let's hope this job finishes by morning. I will let you know how it goes. In the meantime, enjoy the SiteScan application.
Work Smarter not Harder
-
We have large data sets in my current project. Every year tons more data is
loaded into the system, so we only keep the majority of data for 4 years.
After...