Keep Away Robots!

I can’t wait until we have to apply this kind of thinking to keep away humanoid robots:

The robots exclusion standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web spiders and other web robots from accessing all or part of a website which is, otherwise, publicly viewable. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code….

The protocol, however, is purely advisory. It relies on the cooperation of the web robot, so that marking an area of a site out of bounds with robots.txt does not guarantee privacy. Some web site administrators have tried to use the robots file to make private parts of a website invisible to the rest of the world, but the file is necessarily publicly available and its content is easily checked by anyone with a web browser.
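
To make the "purely advisory" part concrete, here is a rough sketch of what a cooperating robot might do before fetching a page, using Python's standard urllib.robotparser module. The sample robots.txt content, the "ExampleBot" name, the example.com URLs, and the is_allowed helper are all made up for illustration; nothing here comes from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch).
# Note that anyone can read this file, so it also advertises
# exactly which paths the site owner would like kept out of view.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/diary.html"):
        url = "https://example.com" + path
        # A polite robot checks first; an impolite one just skips this step.
        print(path, "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", url) else "disallowed")
```

The check lives entirely on the robot's side. Nothing on the server enforces it, which is why a robot that never calls anything like is_allowed can still fetch the "disallowed" page.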



2 Comments

  1. Julia
    Posted October 18, 2007 at 7:21 pm | Permalink

    often used by search engines to categorize and archive web sites

    Do these searches to websites count as views/hits? If they do, is Google/whoever tracking how many robotic hits you’ve had and deleting this from your total?

  2. Posted October 19, 2007 at 2:57 am | Permalink

    Yeah it’s accounted for separately I think. I don’t know a ton of technical details in that area. Would probably be found a few Wikipedia clicks away though…

Public Domain Where Applicable, Copy Left Where Not, Universal Free Realms Everyware Else for 2009 and for forever.
the timboucher experience. No rights reserved.