Robots.txt Guide

ccb056
Site Administrator
Posts: 981
Joined: January 14th, 2004, 11:36 pm
Location: Texas

Post by ccb056 »

This is a guide for newer webmasters who have just started out and have not yet spent much time on Search Engine Optimization (SEO).

Robots.txt is a plain text file that tells well-behaved robots where not to go on your site. Most robots follow the robots.txt file (examples of those that do: Googlebot, Yahoo, Altavista, MSN). However, some robots, especially those used for spam, do not follow the rules placed in robots.txt.

The first rule of robots.txt is that it must be placed in the root directory of the site. This means that you cannot place a robots.txt file in this directory:
http://www.domain.com/thisdirectory/

You can, however, place a robots.txt file in this directory:
http://www.domain.com/
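As a minimal sketch of the location rule above, this Python helper works out the one address where a robot will look for robots.txt, starting from any page URL on the site. The function name `robots_url` is hypothetical and only used for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.domain.com/thisdirectory/page.html"))
```

Whatever subdirectory the page lives in, the result points at the root of the site, which is why a robots.txt placed inside /thisdirectory/ is never read.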

Robots will not fetch your robots.txt file every time they visit your site; they usually re-read the file about once a week.
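To illustrate what that caching means in practice, here is a rough sketch of how a crawler might decide whether to re-fetch robots.txt. The one-week interval comes from the estimate above; the names `MAX_AGE` and `needs_refresh` are assumptions made up for this example, and real crawlers may use different intervals.

```python
from datetime import datetime, timedelta

# Assumed refresh interval, taken from the "about once a week" figure above.
MAX_AGE = timedelta(days=7)


def needs_refresh(last_fetched, now, max_age=MAX_AGE):
    """Return True when the cached robots.txt is missing or older than max_age."""
    if last_fetched is None:
        return True  # never fetched, so there is nothing cached to reuse
    return now - last_fetched >= max_age


fetched = datetime(2004, 1, 14)
print(needs_refresh(fetched, datetime(2004, 1, 16)))  # cached copy is 2 days old
print(needs_refresh(fetched, datetime(2004, 1, 22)))  # cached copy is 8 days old
```

The practical upshot is that a change to robots.txt may take several days to be noticed by a robot that already has a cached copy.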

Let's get down to the basic syntax of the robots.txt file:

To exclude all robots from the entire server

Code:

User-agent: *
Disallow: /

To allow all robots complete access

Code:

User-agent: *
Disallow:
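The two files above differ by a single character, so it can help to check them with a parser. This is a small sketch using Python's standard `urllib.robotparser` module, which is one implementation among many; the helper `allowed` and the robot name `ExampleBot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, agent, url):
    """Parse robots.txt lines and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


block_all = ["User-agent: *", "Disallow: /"]  # every path is off limits
allow_all = ["User-agent: *", "Disallow:"]    # empty value: nothing is off limits
page = "http://www.domain.com/index.html"

print(allowed(block_all, "ExampleBot", page))
print(allowed(allow_all, "ExampleBot", page))
```

An empty Disallow: value excludes nothing, while a lone / excludes every path on the server.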

To exclude all robots from part of the server

Code:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
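Each Disallow: line works as a path prefix. As a rough check, the sketch below feeds the rules above to Python's standard `urllib.robotparser` and tests a few made-up paths; `ExampleBot` and the file names are assumptions for illustration only, and other robots may match slightly differently.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Hypothetical paths: two inside excluded directories, two outside them.
paths = ["/index.html", "/tmp/cache.html", "/private/notes/todo.txt", "/tmpfile.html"]
for path in paths:
    url = "http://www.domain.com" + path
    print(path, parser.can_fetch("ExampleBot", url))
```

Note that /tmpfile.html is still allowed: the rule /tmp/ ends with a slash, so it only matches paths inside that directory.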

To exclude a single robot

Code:

User-agent: BadBot
Disallow: /

To allow a single robot

Code:

User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /
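A robot uses the record that names it and falls back to the * record only when no record does. The sketch below checks the file above with Python's standard `urllib.robotparser`; the robot names other than WebCrawler are hypothetical, and the blank line between the two records is written out explicitly.

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: WebCrawler",
    "Disallow:",
    "",  # blank line separating the two records
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

url = "http://www.domain.com/page.html"
for agent in ["WebCrawler", "BadBot", "ExampleBot"]:
    # Only WebCrawler has its own record; the others fall back to "*".
    print(agent, parser.can_fetch(agent, url))
```

Only WebCrawler is let in here; every other robot that honors the file is kept out by the catch-all record.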
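
Please note that in the original robots.txt standard, the * (wildcard) can only be used in the User-agent: field, and not in the Disallow: field.

The original standard also has no Allow: field. Major crawlers such as Googlebot have since added support for both Allow: and wildcards in paths, and both are now part of RFC 9309, but older or simpler robots may not understand them.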
Tebow2000
Registered User
Posts: 1099
Joined: January 19th, 2004, 7:56 am
Location: New Orleans, Louisiana

Post by Tebow2000 »

The bots can easily get around that script, right...

Post by ccb056 »

Yes, if they are bad bots, they ignore it.

Don't list sensitive directories in the robots.txt file; anyone can read the file, so it only advertises them. It is better to use .htaccess files to block unwanted users.
ponpots
Registered User
Posts: 11
Joined: June 26th, 2004, 9:02 pm
Location: catmandu

Post by ponpots »

Can anyone post a link to a robots.txt tutorial? I mean, how do I make this file? Any help is appreciated.