Feb 23 2010

How to Avoid the Robots.txt Writer’s Block!

*This short but very useful tip was submitted by Charly Wargnier.*

When making SEO changes for large-scale companies, implementing a proper robots.txt is crucial. I will not go back over the robots.txt definition, blah, blah… millions have done that before me.

No, instead, here is a simple formula to use whenever the geek inside you has robots.txt writer’s block! Type “inurl:robots.txt filetype:txt” into a search engine and, ta-daa! See what the big names are doing.

You will find the robots.txt files of Google, Wikipedia, WebmasterWorld, the White House, Microsoft, W3.org, Facebook, IBM, Amazon, eBay, the New York Times, CNN, YouTube, etc.
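Once the search turns up a file worth studying, a few lines of Python can show what its rules actually permit. The sketch below is only an illustration: it relies on Python's standard `urllib.robotparser` module, and the sample rules and the example.com URLs are invented for the demo, not copied from any of the sites above.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url: str) -> str:
    """Build the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def parse_rules(robots_text: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


# Hypothetical rules, invented for this example.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

rules = parse_rules(SAMPLE_RULES)
print(robots_url("https://example.com/blog/post?id=1"))
print(rules.can_fetch("*", "https://example.com/private/report.html"))
print(rules.can_fetch("*", "https://example.com/index.html"))
```

To check a live file instead of a pasted string, `RobotFileParser` also offers `set_url()` and `read()`, which download the file for you.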

Have Geek Fun.

This guest post is by Charly Wargnier, SEO Head at the London digital agency Euston Digital. You can follow their SEO tips and tricks on ED’s Blog or on Twitter.

9 Responses to “How to Avoid the Robots.txt Writer’s Block!”

  1. Cheap Logo Design says:

    I have always used robots.txt to keep some pages of my website from being crawled. This one is really great. Thanks for the info.

  2. Paul Mollin says:

    Hi Charly,

    We’re currently implementing a new site for aalabels and this piece of advice will be a great one for our Devteam! Bookmarked!

  3. Stephen Payne says:

    Nice one! I wouldn’t need to go into that for my site but this is quite fun too!

    And the White House doesn’t seem very concerned about robots.txt!

  4. Ben says:

    I have not used robots.txt much and have had great SEO results. Great reminder to be doing this as standard practice.

  5. David Burnett says:

    Is there an easy way to set up a robots.txt?

  6. Chotrul says:

    You just create the robots.txt file, David, and then upload it to the root folder of your domain. Job done.

  7. John S. Britsios (aka Webnauts) says:

    This article is totally misleading. You should avoid blocking bots from accessing pages via robots.txt: http://www.youtube.com/user/GoogleWebmasterHelp#p/a/u/2/CJMFYpYQZ0c

  8. John S. Britsios (aka Webnauts) says:

    I found the time to go into more detail:

    Don’t block stuff in robots.txt when you want to steer search engine indexing, otherwise you can create PageRank sinks.

    Never ever try to steer indexers in robots.txt until you really, really can’t avoid it. Not even duplicates. Especially not duplicates.

    For Google, Bing and Yahoo, it is better to use the X-Robots-Tag header or meta robots tags with the directives “noindex,nosnippet,noarchive”.

  9. orm says:

    Thanks for the tip. Sometimes it’s easier to learn by example.
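
For readers curious about the alternative raised in the comments, here is a rough sketch of what the “noindex,nosnippet,noarchive” directives look like in practice, once as an `X-Robots-Tag` HTTP response header and once as a robots meta tag. The function names are made up for this illustration, and how the header gets attached to a response depends on your own server or framework.

```python
# Directives quoted in the comment thread above.
DIRECTIVES = ("noindex", "nosnippet", "noarchive")


def x_robots_header(directives=DIRECTIVES):
    """Return the (name, value) pair for an X-Robots-Tag response header."""
    return ("X-Robots-Tag", ", ".join(directives))


def meta_robots_tag(directives=DIRECTIVES):
    """Return a robots meta tag meant for the <head> of an HTML page."""
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


# Hypothetical helper names; print both forms for comparison.
print("{}: {}".format(*x_robots_header()))
print(meta_robots_tag())
```

The header form is handy for non-HTML files such as PDFs, where there is no `<head>` to put a meta tag in.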