4/21/2007 Sitemap auto discovery via the robots.txt protocol By: Melanie Prough
All the crawlers currently recognize the robots.txt protocol, so auto discovery was the natural evolution. The top three engines, Yahoo, Google, and Ask.com, have announced their support of the sitemap inclusion protocol. So supposedly there is no more submitting sitemaps manually, but I would still submit new sitemaps for a few months to be safe. The 4/11/07 post from Vanessa Fox covers the development and the protocol.
I played around with this for several hours, and to my dismay could not validate the robots file after adding the sitemap. After much searching, posting, and reading I found some help and suggestions. Putting everything I read into practice, below is how to add your sitemap without a syntax error.
Sitemap: http://sitemap
User-agent: *
Disallow: /cgi-bin/
OK, first thing: if your robots.txt file starts with a title line such as
# Robots.txt file for www.your_domain.com
then leave a blank line under it before adding the Sitemap line. The Sitemap line above follows the sitemaps.org protocol. If you do not leave a blank line between the title and the Sitemap line, the file will not validate in Google's Webmaster Tools. To avoid any other possible syntax issues, I also left a blank line after the Sitemap directive. In theory the blank lines mean nothing to a robot.
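For anyone who generates their robots file from a script, here is a minimal Python sketch of the layout described above. It is only an illustration, not how any engine actually reads the file: the function names build_robots_txt and find_sitemaps are made up for this example, and http://www.example.com/sitemap.xml is a placeholder URL, so substitute your own.

```python
def build_robots_txt(title, sitemap_url, rules):
    """Lay out a robots.txt with a blank line before and after the Sitemap line."""
    lines = [f"# {title}", "", f"Sitemap: {sitemap_url}", ""]
    for user_agent, disallowed_paths in rules:
        lines.append(f"User-agent: {user_agent}")
        lines.extend(f"Disallow: {path}" for path in disallowed_paths)
    return "\n".join(lines) + "\n"


def find_sitemaps(robots_txt):
    """Return the URLs named by Sitemap lines, skipping blank lines and comments.

    Simplification: anything after a '#' is treated as a comment, so a
    sitemap URL containing '#' would be cut short.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # blank lines and comment-only lines carry no directive
        field, _, value = line.partition(":")
        if field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


robots_txt = build_robots_txt(
    "Robots.txt file for www.your_domain.com",
    "http://www.example.com/sitemap.xml",  # placeholder, use your own sitemap URL
    [("*", ["/cgi-bin/"])],
)
print(robots_txt)
print(find_sitemaps(robots_txt))
```

The reader half skips blank lines entirely, which is the sense in which the extra spacing should not matter to a robot even though the validator cared about it.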
I went ahead and got on board with this, and I will keep this article up to date as the stats change in either direction. Going forward at this early stage is a risk, but also an opportunity for a lower-PR site to get a leg up.
Melanie Prough [SEOCog.com]