A robots.txt file is a plain-text file written specifically for search engine robots and is used to block, allow, or restrict folders or files from being indexed by the search engines.
Search engine crawlers (bots, spiders) scan websites and will index every file or folder they come across.
Many websites have supporting folders or files that are used when developing a website but contain no relevant, useful content for your website visitors, e.g. popup forms, XML sitemaps, or folders with sensitive or auto-generated files. These are the sorts of files you would want to block the bots from reading and indexing. The simplest rule blocks an entire site:
User-agent: *
Disallow: /
The first line tells which crawler/bot/spider the rule applies to. In this case, User-agent: * means ALL search engines.
The second line instructs the crawler not to index anything on the entire website.
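If you only need to restrict a single crawler, you can name it in the User-agent line instead of using the * wildcard. A minimal sketch (Googlebot is Google's crawler; the /drafts/ folder is an assumed example):
User-agent: Googlebot
Disallow: /drafts/
Any crawler not matched by a User-agent line is left unrestricted.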
Disallow rules can also match URL patterns. The * wildcard and $ end-anchor shown here are supported by the major crawlers such as Google and Bing, and the # comments are part of standard robots.txt syntax:
Disallow: /*.jpg$          # any URL ending in .jpg
Disallow: /showImg.*       # any URL starting with /showImg.
Disallow: /my-documents/   # everything inside the /my-documents/ folder
Disallow: /*?              # any URL containing a query string
Disallow: /something.html  # one specific page
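The major crawlers, including Google and Bing, also honour an Allow directive, which can re-open a path inside an otherwise blocked folder. A short sketch with assumed folder names:
User-agent: *
Disallow: /my-documents/
Allow: /my-documents/public/   # this subfolder stays indexable
Here the more specific Allow rule wins, so /my-documents/public/ can still be indexed while the rest of /my-documents/ is blocked.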
An example of what your robots.txt file might look like. The file must sit at the root of your domain:
http://yourdomain.com/robots.txt
User-agent: *
Disallow: /favicon.ico
Disallow: /documents/
Disallow: /*?
Disallow: /subscription-form.php
Disallow: /404
Disallow: /blog/tagged/
Disallow: /admin/
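Once uploaded, you can confirm the file is reachable by requesting it in a browser or with curl (assuming your site is live at yourdomain.com):
curl http://yourdomain.com/robots.txt
If your directives print back, crawlers will be able to read them too.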
Remember, you only want the search engines to read and index folders or files that contain content and images your website visitors will find useful and of value.