March 10, 2009

Crawl Your Site for Broken Links, Errors and Duplicate Content

One often-overlooked part of the SEO mix is making sure that your site has no broken outbound or internal links, meaning links that point to error pages or do not work at all. If your site serves error pages or links to non-existent pages or files on your server, search engines like Google are likely to treat it as “under construction” and therefore as less useful or relevant to human visitors.

Your site can have all of the optimized content, titles and headers in the world, but if it is not functioning correctly, it will not rank to the best of its ability. Working on client SEO accounts, I have seen numerous cases where fixing internal broken links and outbound links, and making sure all of the files on the server are working, boosted a site’s ranking from page 3 or 4 to page one.

My favorite tool for checking this information is Xenu’s Link Sleuth.

Xenu is usually the first step I take in on-site SEO research and in identifying issues. I like to get these issues tackled from the beginning, and when working with an IT team on a client project, handing them a 10-page error report usually puts them in their place and shows them that you’re serious about your technical SEO. :)

Xenu’s Link Sleuth checks Web sites for broken links. Link verification is done on “normal” links, images, frames, plug-ins, backgrounds, local image maps, style sheets, scripts and Java applets. It displays a continuously updated list of URLs which you can sort by different criteria.

Xenu spiders a website much as a search engine would and delivers a report covering:

  • Broken links on the site which send the spider to error messages
  • Duplicate content issues such as similar title tags and URL structure
  • Broken files such as images and multimedia content which are not loading correctly
  • Images which do not include alt attributes (which can be helpful to SEO)
  • Files and images which may affect page load time
  • Links to server-redirected pages (such as 301 redirects), which you can change on your site to point to the real page instead of the redirect
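
To make the checks in that list concrete, here is a minimal Python sketch of a single-page link audit. It is not how Xenu itself works; it only illustrates the idea under stated assumptions. The names `LinkCollector`, `fetch_status`, `classify` and `audit_page` are hypothetical, it uses only the Python standard library, and it checks one page rather than spidering a whole site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlsplit
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collect absolute http(s) targets and note images without an alt attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.images_missing_alt = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # <a> and <link> carry href; images, scripts and frames carry src.
        raw = attrs.get("href") if tag in ("a", "link") else attrs.get("src")
        if not raw:
            return
        url = urldefrag(urljoin(self.base_url, raw))[0]
        if urlsplit(url).scheme not in ("http", "https"):
            return  # skip mailto:, javascript: and similar
        self.links.append(url)
        if tag == "img" and "alt" not in attrs:
            self.images_missing_alt.append(url)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 301/302 codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:
        return None


def classify(status):
    """Map a status code to one of the report buckets."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    return "broken"


def audit_page(page_url, html, status_of=fetch_status):
    """Group every link found in html by the status its target returns."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    report = {"ok": [], "redirect": [], "broken": [], "unreachable": [],
              "missing_alt": collector.images_missing_alt}
    for url in dict.fromkeys(collector.links):  # de-duplicate, keep order
        report[classify(status_of(url))].append(url)
    return report
```

Passing `status_of` as a parameter lets you try the audit against saved HTML and a table of known status codes before pointing it at a live site. Note that some servers reject HEAD requests, so a real checker may need to fall back to GET.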

In addition to checking links with Xenu’s Link Sleuth, I also recommend doing a basic duplicate content and header error diagnosis. Sure, if you are using Google Webmaster Tools, this can be done easily, but we don’t always have access to Google Webmaster Tools, especially when working on third-party sites or performing competitive research.

One free tool which can be used to check for common duplicate content issues is the Virante Duplicate Content tool.

Virante’s tool diagnoses the following:

  • The common www vs. non-www duplicate content issue, by checking the headers returned by both versions of the URL, the current cache in Google, and possible PR dispersion
  • The common default page error, where both / and /index.html (or another default page) return 200/OK headers
  • Incorrect 404 pages which deliver a 200/OK header
  • Supplemental pages in the Google index
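
The header-based checks in that list can be approximated with a few requests. The sketch below is not Virante’s implementation; `http_status` and `diagnose` are hypothetical names, the default page is assumed to be `index.html`, and the Google cache, PR and supplemental index checks are left out because they depend on data from Google rather than on your own server’s headers.

```python
import uuid
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so a 301 is reported as 301."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def http_status(url, timeout=10):
    """Return the status code the server sends for url, or None on failure."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:
        return None


def diagnose(domain, status_of=http_status, default_page="index.html"):
    """Return a list of likely duplicate content and header problems."""
    bare = domain[4:] if domain.startswith("www.") else domain
    root = f"http://{bare}/"
    www_root = f"http://www.{bare}/"
    # A random file name that should not exist on any real site.
    missing = f"{www_root}{uuid.uuid4().hex}.html"

    findings = []
    if status_of(root) == 200 and status_of(www_root) == 200:
        findings.append("www and non-www both return 200; redirect one to the other")
    if status_of(www_root) == 200 and status_of(www_root + default_page) == 200:
        findings.append(f"/ and /{default_page} both return 200")
    if status_of(missing) == 200:
        findings.append("a non-existent page returns 200 instead of 404")
    return findings
```

Calling `diagnose("example.com")` with your own domain in place of the placeholder returns an empty list when none of the three problems is detected. The default page check here only looks at the www host, which is a simplification.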

By using tools like these to identify errors on your website, you can enhance your SEO and rankings substantially, especially if you have any critical errors which are keeping your site from ranking properly in the search engines.

What are some of your favorite tools for checking broken links, files and duplicate content issues? Please feel free to share them in the comments (and get a do-follow link back to your site!)

66 comments already

  1. marcbaumann (Marc Baumann) on 12.31.1969 at 11:59 pm | permalink
  2. Tools for broken links on your site and more (with complete URL this time): http://is.gd/nfqJ

    [Reply]

    Avila Reply:

    Giving good results. sometimes giving weird result

    [Reply]

  3. glenngabe (Glenn Gabe) on 12.31.1969 at 11:59 pm | permalink
  4. Crawl Your Site for Broken Links, Errors and Duplicate Content http://is.gd/nfqJ

    [Reply]

  5. rolandl (Roland Lai) on 12.31.1969 at 11:59 pm | permalink
  6. RT @webaddict @Hakicoma: #Crawl Your Site for Broken #Links, Errors and #Duplicate #Content http://tinyurl.com/bg83jp #webmaster #blogger

    [Reply]

  7. Bravadas (Tracy Hobbs) on 12.31.1969 at 11:59 pm | permalink
  8. RT @webaddict RT @Hakicoma: #Crawl Your Site for Broken #Links, Errors and #Duplicate #Content http://tinyurl.com/bg83jp #webmaster #blogger

    [Reply]

  9. webaddict (Joel Mackey) on 12.31.1969 at 11:59 pm | permalink
  10. RT @Hakicoma: #Crawl Your Site for Broken #Links, Errors and #Duplicate #Content http://tinyurl.com/bg83jp #webmaster #blogger

    [Reply]

  11. ogginnet (Ognian Mladenov) on 12.31.1969 at 11:59 pm | permalink
  12. Crawl Your Site for Broken Links, Errors and Duplicate Content | Daily SEO Tip http://tinyurl.com/bg83jp

    [Reply]

  13. DeathWish808 (DeathWish808) on 12.31.1969 at 11:59 pm | permalink
  14. RT @RandomReTweet: RT @Hakicoma Crawl Your Site for Broken Links, Errors and Duplicate Content http://tinyurl.com/bg83jp

    [Reply]

  15. RandomReTweet (Random Re-Tweet) on 12.31.1969 at 11:59 pm | permalink
  16. RT @Hakicoma Crawl Your Site for Broken Links, Errors and Duplicate Content http://tinyurl.com/bg83jp

    [Reply]

  17. Hakicoma (Catchtheposts.com) on 12.31.1969 at 11:59 pm | permalink
  18. Crawl Your Site for Broken Links, Errors and Duplicate Content http://tinyurl.com/bg83jp

    [Reply]

  19. jen_star (Jen) on 12.31.1969 at 11:59 pm | permalink
  20. RT @TyDowning: When is the last time you checked your site for broken links? http://tinyurl.com/bg83jp

    [Reply]

  21. Brandswag (Brandswag Corp) on 12.31.1969 at 11:59 pm | permalink
  22. RT @TyDowning: When is the last time you checked your site for broken links? http://tinyurl.com/bg83jp

    [Reply]

  23. TyDowning (Ty Downing) on 12.31.1969 at 11:59 pm | permalink
  24. When is the last time you checked your site for broken links? http://tinyurl.com/bg83jp

    [Reply]

  25. cavendochris (Chris) on 12.31.1969 at 11:59 pm | permalink
  26. Checking your site for broken links: http://tinyurl.com/bg83jp Xenu has always been my personal favorite

    [Reply]

  27. jcie (J.Cie Consulting) on 12.31.1969 at 11:59 pm | permalink
  28. Crawl Your Site for Broken Links, Errors and Duplicate Content http://is.gd/nfqJ

    [Reply]

  29. Barry Welford on 03.10.2009 at 1:02 pm | permalink
  30. Yes, Xenu does the trick. Excellent tool.

    To know that you need it, do a regular review of your website via Google Webmaster Tools. It will warn you if you have some of these broken links, incorrect redirects, etc.

    Barry Welford’s last incredible blog post..Since Google Is In Mountain View, CA, Is It A Guru?

    [Reply]

    Loren Baker Reply:

    True Barry, I like using various resources to get a diverse report on the errors and issues with a site, sometimes one will find many more issues than another.

    [Reply]

  31. Steen Ohman on 03.10.2009 at 1:14 pm | permalink
  32. XENU is the best tool I have found for broken links. But must say Webmaster tools are more user friendly for the average webmaster.

    Still it’s a very overlooked topic … keeping your site clean from broken links and duplicate content.

    Normally I review the indexed pages in Google once a month to check for duplicate content - and to check the quality of the snippets/meta description.

    Steen Ohman’s last incredible blog post..Danish Site Search!

    [Reply]

  33. Kevin Boss on 03.10.2009 at 2:17 pm | permalink
  34. I use XENU on a daily basis. Hands down one of the greatest tools in my arsenal.

    [Reply]

    Duane Brown Reply:

    Isn’t checking it on a daily basis overkill?

    Duane Brown’s last incredible blog post..Skittles + Social Media = New Power Couple

    [Reply]

  35. Duane Brown on 03.10.2009 at 2:49 pm | permalink
  36. Thanks for the tip and free tool. I’m going to use it this week, I hope, to check my own sites first and then bookmark it for future client work one day.

    Duane Brown’s last incredible blog post..Skittles Scandal: Tracking the Skittles Experiment [Friday March 6th at 11am]

    [Reply]

  37. Josh Millrod on 03.10.2009 at 4:03 pm | permalink
  38. SEOmoz also has a great crawl tool. It’s one of the only free ones they offer, but it’s super solid.

    http://www.seomoz.org/crawl-test

    Josh Millrod’s last incredible blog post..The 4 Rules of SEO for Online Journalism

    [Reply]

  39. Junaid Ahmed on 03.10.2009 at 4:39 pm | permalink
  40. Thank you so much for this great tool, I’ll see if I can use it on my mac if not then will have to use the windows based tool.

    Junaid Ahmed’s last incredible blog post..Twitter Power

    [Reply]

  41. mihai on 03.10.2009 at 4:51 pm | permalink
  42. Incredible tool. But what can I use to check my position in Google? I’ve tried Advanced Web Ranking, but it’s too expensive (about $600). What can I use? Please email me if you have a good solution. On another note, great blog, interesting posts.

    [Reply]

  43. Josh Millrod on 03.10.2009 at 6:41 pm | permalink
  44. SEOmoz also has a pretty great rank check tool. I swear I don’t work for them (I only dream about it). I just dig the tools.

    Josh Millrod’s last incredible blog post..The 4 Rules of SEO for Online Journalism

    [Reply]

    Loren Baker Reply:

    LOL, you can plug SEOmoz all you want, we love them!

    [Reply]

  45. Steen Öhman on 03.10.2009 at 8:30 pm | permalink
  46. For ranking checks I use SEOmoz and IBP - Internet Business Promoter (I don’t know if a link is OK here, so find it on Google).

    SEOmoz has some good tools, but they are not very strong if you want to dig down into the different European markets.

    But … as always, have more than one tool in the box, and use the best one for the job.

    Steen Öhman’s last incredible blog post..Media status marts 2009

    [Reply]

  47. Dana Lookadoo on 03.10.2009 at 9:08 pm | permalink
  48. A vote for Xenu, Google Webmaster Central and SEOmoz tools.

    Use Google Analytics to track 404 error pages, which can result from broken links. Info for setting up tracking:
    http://www.google.com/support/analytics/bin/answer.py?hl=en&answer=86927

    @mihai, @Steen Ranking varies based on demographics, keywords, and history of your Gmail account if searching while logged in. Searching Google is free, and you can use a proxy server to check ranking from various IPs. (Wouldn’t spend too much time on ranking reports, IMHO.)

    @Loren, many thanks for Virante tip for duplicate content!

    Dana Lookadoo’s last incredible blog post..Twitter Engages Professional Cycling Fans

    [Reply]

    Steen Öhman Reply:

    True … I forgot to mention: you have to log off from Google before checking rankings. It can make quite a difference.

    A good tool will do this automatically, so your ranking results are not biased by your profile.

    But remember … your client could be logged in to a Google account, and may then see a different ranking!

    Steen Öhman’s last incredible blog post..Media status marts 2009

    [Reply]

    Dana Lookadoo Reply:

    Correction: “demographics” should have been “geographics”

    Dana Lookadoo’s last incredible blog post..Twitter Engages Professional Cycling Fans

    [Reply]

    Loren Baker Reply:

    No problem Dana! Thanks for your feedback and additions :)

    [Reply]

  49. Robert Publer on 03.11.2009 at 5:27 am | permalink
  50. Good review, i’m going to use it this week, I hope, to check my own sites first and then bookmark it for future client work one day.

    [Reply]

  51. BG Mahesh on 03.11.2009 at 5:39 am | permalink
  52. Linkscan from elsop.com is a great tool but it is not free.

    The other tool, which we have developed, works on the Google Webmaster Tools error reports; I believe you will find it very useful. Please see http://www.greynium.com/tools/gwt/

    BG Mahesh
    http://www.greynium.com

    BG Mahesh’s last incredible blog post..Is the naming industry in India non-existent?

    [Reply]

    Loren Baker Reply:

    Thanks BG, here’s the direct link to Elsop : http://www.elsop.com/

    [Reply]

  53. Mahesh on 03.11.2009 at 7:14 am | permalink
  54. Thanx for the tool.. Trying it now!

    [Reply]

  55. iCan't Internet on 03.11.2009 at 8:11 am | permalink
  56. Excellent tools that do the job!
    These are links that every webmaster should have bookmarked…

    iCan’t Internet’s last incredible blog post..Advertising on Facebook

    [Reply]

  57. Matt Evans on 03.11.2009 at 2:32 pm | permalink
  58. Xenu is a great tool I’ve used for years. Even when I worked for a top SEM Agency it was a tool we used everyday.

    Matt Evans’s last incredible blog post..Optimizing Local Business Listings Online

    [Reply]

  59. Lisa on 03.11.2009 at 3:11 pm | permalink
  60. XENU for broken links, definitely. For duplicate content I like GSiteCrawler.

    [Reply]

  61. Lohith on 03.12.2009 at 5:55 am | permalink
  62. Thanks for this new duplicate content tool.

    [Reply]

  63. komikya on 03.13.2009 at 2:14 pm | permalink
  64. i like it.))

    Lig tv izle

    komikya’s last incredible blog post..Lig tv izle

    [Reply]

  65. Nate at Plasticprinters on 03.13.2009 at 3:27 pm | permalink
  66. I love Xenu! It’s a great tool, can’t say enough about it.

    [Reply]

  67. AndyW on 03.13.2009 at 7:41 pm | permalink
  68. I use SEOmoz Crawl Test although it is part of their paid package

    AndyW’s last incredible blog post..Review of ezBusinessNeeds DoFollow Searcher Tool

    [Reply]

  69. Online Internet Faxing on 03.13.2009 at 8:52 pm | permalink
  70. Great post. I so love Xenu for both an SEO tool as well as for web development so that I can track all the original pages from a website has been included in the new one. Cheers.

    [Reply]

  71. Bill Cook on 03.13.2009 at 9:23 pm | permalink
  72. copyscape.com is a great tool for checking dup content. I also find it useful when reviewing copy that has been outsourced.

    Bill Cook’s last incredible blog post..The American Sugar Alliance Sweetens its Presence on the Web

    [Reply]

  73. Glenn Crocker on 03.15.2009 at 9:11 pm | permalink
  74. I use Xenu and Web CEO for this, but recently started using a neat tool from Microsys Tools called A1 Website Analyzer. It will spider sites and do much of the same tasks as the others, but will also export title/meta-description (handy for cleaning problems up and sending to IT folks to implement). But the best thing it has is an on-site PageRank simulator that helps identify problems with link structure. Neat stuff!

    Here’s a recent article I wrote about using it for PageRank Sculpting.

    Glenn Crocker’s last incredible blog post..PageRank Sculpting: Why Your Home Page has Low PR

    [Reply]

  75. Nick Stamoulis on 03.16.2009 at 4:51 pm | permalink
  76. You should always take the time to go over your website and make sure all your links are working. It is always the worst when a client brings it to your attention that you have a broken link.

    Nick Stamoulis’s last incredible blog post..Interest-Targeted Content Network Advertising

    [Reply]

  77. Matt on 03.17.2009 at 4:50 am | permalink
  78. here you go folks .. try our broken link checker.

    enjoy

    [Reply]

  79. Anish K.S on 03.17.2009 at 9:08 am | permalink
  80. Xenu is good; I use the W3C Link Checker.

    [Reply]

  81. nette on 03.19.2009 at 4:52 am | permalink
  82. SEO Experts, here’s something very interesting and helpful
    http://www.thesiteflingmafia.com/

    [Reply]

  83. ligtv izle on 04.18.2009 at 6:51 am | permalink
  84. thank you mübarek

    ligtv izle’s last incredible blog post..Ankaraspor Fenerbahçe Maçı Saat 20-00’da Canlı Yayınlanacaktır..

    [Reply]

  85. Oliver on 05.18.2009 at 1:00 pm | permalink
  86. This looks like a really good tool and something I will have to try out soon to check the sites that I put live. Thanks for some great information, this will be handy.

    [Reply]

  87. Matt on 05.18.2009 at 11:21 pm | permalink
  88. nice post - you may also want to try our broken link check tool, it’s free!

    thanks
    Matt

    Matt’s last incredible blog post..Link Building - The Fundamentals

    [Reply]

  89. Submitter on 06.09.2009 at 5:28 pm | permalink
  90. Indeed, using a link scanner is a good idea to scan for broken links. Personally I prefer the more passive approach.

    I start a http sniffer and then go through the web pages of my sites. I click through links and when I am done, I pause the sniffer and check the log. I will immediately see the 404 response headers. I know that this method is less user friendly and could seem more complicated than anything out there, but this is how security experts do it.

    Submitter’s last incredible blog post..GET OUT OF DEBT FAST AND FREE

    [Reply]

  91. tolga on 07.01.2009 at 11:57 pm | permalink
  92. Thank You …

    tolga’s last incredible blog post..Öyle bir geçer zaman ki

    [Reply]

  93. sxe on 07.01.2009 at 11:57 pm | permalink
  94. Thank you admin..

    sxe’s last incredible blog post..Lolipoplar

    [Reply]

  95. aşk şiirleri on 07.07.2009 at 11:18 am | permalink
  96. thank you.

    [Reply]

  97. news on 07.15.2009 at 7:12 am | permalink
  98. i can enhance your SEO and rankings substantially, especially if you have any critical errors which are keeping your site from ranking properly in the search engines.

    [Reply]

  99. Curious George on 07.21.2009 at 10:13 am | permalink
  100. What about 200 redirects?
    Are they ’search engine friendly’ with SEO in mind?

    cheers
    G

    [Reply]

    Ann Smarty Reply:

    200 is the header status that sends “OK” response. It is required for any working URL.

    [Reply]

    Curious George Reply:

    Many thanks Anne,
    That’s sorted me out, cheers.

    [Reply]

  101. tim viec on 08.06.2009 at 8:21 am | permalink
  102. Nice tool, thanks so much for sharing. Good job.

    [Reply]

  103. nazcar on 08.15.2009 at 3:47 am | permalink
  104. thanks.. Xenu is a great tool

    [Reply]

  105. Facebook Connect Developers on 08.17.2009 at 9:12 am | permalink
  106. Hey its really helpful for webmasters. Nice info. Thanks

    [Reply]

  107. canlı maç izle on 09.20.2009 at 7:16 pm | permalink
  108. it’s really good information but i want to know about the effects.

    [Reply]

  109. sxe on 01.09.2010 at 2:26 pm | permalink
  110. Thanks man

    it’s really good information but i want to know about the effects.

    [Reply]

  111. ps2 ?????? ? ??????? on 02.04.2010 at 1:49 pm | permalink
  112. Hallo everybody.

    Very creative ways to use Twitter.
    I´ll do it also.
    Greetings

    [Reply]
