Yesterday, QSSI launched a blog dedicated to search engine optimization news and analysis for government web site managers. The weekly blog aims to inform and educate readers about the threats and opportunities that commercial search engines present to government web sites.
The first entry of the QSSI blog expands on my previous comment about the importance of robots.txt for government web sites and shares the results of a survey of 250 federal and state government web sites listed on FirstGov.gov. Of these .gov, .mil, and .us sites:
- 154 do not have a robots.txt file
- 4 have robots.txt files which do not validate
- 24 have robots.txt files which do not completely validate
- 68 have robots.txt files which properly validate
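The four categories above account for all 250 sites. As a quick check on the arithmetic, here is a minimal Python sketch that tallies them; the dictionary keys are shorthand labels chosen for this example, not terms from the QSSI research.

```python
# Counts as reported in the survey of 250 government sites.
# The key names are illustrative labels, not QSSI's own terminology.
counts = {
    "no robots.txt": 154,
    "does not validate": 4,
    "partially validates": 24,
    "validates": 68,
}

total = sum(counts.values())

# Sites with no file, or with a file that fails validation at least in part.
without_valid_file = total - counts["validates"]
share = 100 * without_valid_file / total

print(f"{without_valid_file} of {total} sites ({share:.1f}%)")
# prints: 182 of 250 sites (72.8%)
```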
The blog concludes that roughly 73% of the government web sites (182 of the 250: those with no robots.txt file or with one that does not fully validate) do not prevent commercial search engines from indexing potentially sensitive information. Those interested in the full version of the research are encouraged to contact QSSI directly.
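For site managers who want to see how a crawler interprets these rules, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt content, the host `agency.example.gov`, the paths, and the `ExampleBot` user agent are all hypothetical, made up for illustration; none of them come from the QSSI research.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the parsed rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(allowed("https://agency.example.gov/index.html"))            # True
print(allowed("https://agency.example.gov/internal/memo.html"))    # False
```

Keep in mind that robots.txt is only a request to well-behaved crawlers: it keeps compliant search engines away from the listed paths but does not restrict access to the files themselves.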