A robots.txt file is a plain-text file written specifically for search engine robots and is used to block, allow, or restrict folders or files from being indexed by the search engines.
Search engine crawlers (bots, spiders) scan websites and will index every file or folder they come across.
Many websites have supporting folders or files that are used when developing a website but contain no relevant, useful content for your website visitors, e.g. popup forms, XML sitemaps, or folders with sensitive or auto-generated files. These are the sorts of files you would want to block the bots from reading and indexing. The simplest rule blocks an entire site:
User-agent: *
Disallow: /
The first line tells which crawler/bot/spider the rule applies to. In this case, User-agent: * means ALL search engines.
The second line instructs the crawler not to index anything on the entire website.
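If you only need to restrict a single crawler, you can name it in the User-agent line instead of using the * wildcard. A minimal sketch (Googlebot is Google's crawler; the /drafts/ folder is an assumed example):
User-agent: Googlebot
Disallow: /drafts/
Any crawler not matched by a User-agent line is left unrestricted.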
Disallow rules can also match URL patterns. The * wildcard and $ end-anchor shown here are supported by the major crawlers such as Google and Bing, and the # comments are part of standard robots.txt syntax:
Disallow: /*.jpg$          # any URL ending in .jpg
Disallow: /showImg.*       # any URL starting with /showImg.
Disallow: /my-documents/   # everything inside the /my-documents/ folder
Disallow: /*?              # any URL containing a query string
Disallow: /something.html  # one specific page
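The major crawlers, including Google and Bing, also honour an Allow directive, which can re-open a path inside an otherwise blocked folder. A short sketch with assumed folder names:
User-agent: *
Disallow: /my-documents/
Allow: /my-documents/public/   # this subfolder stays indexable
Here the more specific Allow rule wins, so /my-documents/public/ can still be indexed while the rest of /my-documents/ is blocked.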
An example of what your robots.txt file might look like. The file must sit at the root of your domain:
http://yourdomain.com/robots.txt
User-agent: *
Disallow: /favicon.ico
Disallow: /documents/
Disallow: /*?
Disallow: /subscription-form.php
Disallow: /404
Disallow: /blog/tagged/
Disallow: /admin/
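Once uploaded, you can confirm the file is reachable by requesting it in a browser or with curl (assuming your site is live at yourdomain.com):
curl http://yourdomain.com/robots.txt
If your directives print back, crawlers will be able to read them too.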
Remember, you only want the search engines to read and index folders or files that contain content and images your website visitors will find useful and of value.