Comments on Google's Sitemap feature

As you might expect, the Google Sitemap feature is being debated among techies. Is it a good thing or bad? Did they create a new format when an existing one would have been better? Should they have used pinging instead of static files? Well, as you might imagine, I have an opinion about some of these things. ;->

1. It's a good thing because it's an API, and apps that have APIs are infinitely more useful than ones that don't. In this case it's an API for a very important application, Google's crawler. So not only is it good, it is very very good. Very.

2. Any badness that comes from having invented a new format is insignificant compared to the goodness in #1. Quit debating it already, and let's figure out where the real problems are, because there are some.

3. Pinging would have been a poor solution, because they're trying to find a way to communicate updates for huge numbers of pages. Pings would be a very inefficient way to do that because a ping requires immediate attention, whereas a static file can be read at your leisure, when you're about to crawl the site. Pings are good for saying this logical entity updated, this blog, this news source, whatever. They're not a good way to communicate changes about massive numbers of things.

4. They make the mistake of confusing a domain with a site. I may have a site on www.myfoo.com, but I can't put a file at the top level of the server. They clearly didn't want to repeat one of the mistakes of robots.txt (a fixed file name) but they went ahead and repeated another.

5. Submitting the page to a Google server is dumb (that doesn't mean they're dumb, btw, just the idea). Why not use the convenient link element that HTML already provides? That would solve the problem in #4 as well. Try this:

<link rel="sitechanges" type="text/xml" href="http://www.myfoo.com/mancuso/mychanges.xml">

This puts the info where it belongs, at the site level, so it's independent of the physical location on the server. And I don't have to submit it to Jeeves, Bloglines, MSN, Tom, Dick and Harry and whatever new search engine might come along that I don't even know about.
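
To make the link element idea concrete, here's a rough sketch of what a crawler might do on its side: fetch a page it already knows about, look for the link element, and follow the href to the changes file. It's only an illustration. "sitechanges" is the rel value proposed above, not a registered link type, and the class and function names (SiteChangesFinder, find_sitechanges) are made up for this example. It uses nothing but Python's standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SiteChangesFinder(HTMLParser):
    """Collect the href of every <link rel="sitechanges"> on a page.

    The rel value "sitechanges" is the one proposed in this post,
    not a standard link type.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens; compare case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "sitechanges" in rels and href:
            # Resolve relative hrefs against the URL of the page itself.
            self.found.append(urljoin(self.base_url, href))


def find_sitechanges(html_text, base_url):
    """Return the absolute URLs of all changes files advertised by a page."""
    finder = SiteChangesFinder(base_url)
    finder.feed(html_text)
    finder.close()
    return finder.found


# A made-up home page for the site that lives under /mancuso/.
page = """<html><head>
<title>Mancuso's site</title>
<link rel="sitechanges" type="text/xml" href="http://www.myfoo.com/mancuso/mychanges.xml">
</head><body>Hello</body></html>"""

print(find_sitechanges(page, "http://www.myfoo.com/mancuso/"))
```

The point of the sketch is that discovery rides along with pages the crawler reads anyway, so the site owner doesn't have to register anything with anyone, and the changes file can live wherever the site does.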

# Posted by Dave Winer on 6/8/05; 9:48:19 AM