Apr 05 2010

Indexing Barrier Identification

*This is a guest tip by Peter Ulstrup Hansen.*

SEO consists of a mix of disciplines: link building, copywriting and code optimization, to name just a few. At the base of these disciplines is the identification and removal of indexing barriers.

Indexing barriers are issues that prevent the search engines from crawling and indexing a website properly.

1. Use www.URIValet.com to check server headers and make sure every redirect returns HTTP status code 301 (permanent redirect). Example: decide whether to use the www or non-www version of the website, i.e. http://www.domain.com or http://domain.com, and 301-redirect the other version to the canonical domain of your choice.

You can also use URIValet to check loading times, which should be as low as possible: less than 5 seconds on a 1.5 Mbps connection.
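If you want to script the same redirect check yourself, here is a minimal Python sketch using only the standard library. The function names `fetch_hop` and `is_canonical_redirect` are made up for this illustration, and http://domain.com is the placeholder domain from the example above. It is an assumption-laden sketch, not a replacement for URIValet.

```python
import http.client
from urllib.parse import urlsplit


def fetch_hop(url, timeout=10):
    """Request url WITHOUT following redirects.

    Returns (status_code, Location header or None).
    Uses HEAD; some servers answer HEAD differently from GET.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_canonical_redirect(status, location, canonical_host):
    """True only for a 301 whose Location points at the canonical host."""
    if status != 301 or not location:
        return False
    return urlsplit(location).hostname == canonical_host
```

Usage would look like `status, location = fetch_hop("http://domain.com/")` followed by `is_canonical_redirect(status, location, "www.domain.com")`. A 302 or a missing Location header makes the check fail, which is the case you want to catch.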

2. Duplicate content can cause problems for proper indexation. Check for the risk of duplicate content by using Google Search: do a search for site:yourdomain.com, go to the last page and expand the search, then go to the last page again and look for duplicates. Another method, still using Google Search, is to copy a small passage from one of your pages and search for it. If the same content appears at more than one URL, there is a problem. Personally, I don’t believe the rel="canonical" attribute is sufficient to avoid duplicate content.
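As a local complement to the Google searches above, you could compare page texts you have already downloaded. The sketch below is an assumption on my part, not part of the original tip: `fingerprint` and `find_duplicates` are hypothetical helper names, and the check only catches copies that are identical after whitespace and case are normalized.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text):
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages):
    """pages maps URL -> extracted page text.

    Returns a list of URL groups that share the same fingerprint.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Near-duplicates (same article with a different header or footer) will not match here; for those, the passage search in Google described above is still the better test.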

Common duplicate content causes:

  • Print versions of articles (copy) have separate URLs
  • PDF versions of articles (copy)
  • Session IDs, Click IDs, Time stamps and other IDs in URLs
  • Breadcrumb URLs
  • Site search result URLs
  • Paging
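For the session ID and click ID case in the list above, one way to see how many URLs collapse into the same page is to strip those parameters and compare what is left. This is a minimal sketch; the parameter names in `TRACKING_PARAMS` are assumptions chosen for illustration, so replace them with the ones your own site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; adjust to your own site.
TRACKING_PARAMS = {"sid", "sessionid", "phpsessid", "clickid", "ts"}


def normalize_url(url):
    """Drop session/click/timestamp parameters and sort the rest."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    kept.sort()
    # Fragment is dropped: it is never sent to the server.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )
```

If two crawled URLs give the same `normalize_url` result, they are candidates for duplicate content and should point to (or redirect to) one canonical URL.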

3. Use the Web Developer plugin for Firefox to check navigation and internal link structure. Search engine spiders read most JavaScript but don’t execute it. As a result, JavaScript (including AJAX) and Flash menus prevent the spiders from crawling the website in an organic way.
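To get a rough idea of what a non-executing spider can follow, you can list only the links that are present in the raw HTML. The sketch below uses Python's standard `html.parser`; `LinkCollector` and `static_links` are hypothetical names for this illustration, and a real spider is of course more sophisticated than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(html):
    """Return hrefs found without executing any script."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Links that a script writes into the page at runtime never show up in the result, because the contents of a script element are treated as plain text and never run. If your main navigation is missing from this list, it is likely an indexing barrier.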

There might be other indexing barriers, but these are the most common.

3 Responses to “Indexing Barrier Identification”

  1. Cool! Ulstrup has published a guest post on Daily SEO Tip – http://dailyseotip.com/indexing-barrier-identification/727/ – congratulations!

  2. Dave Higgs says:

    Very interesting.

    I knew about the flash/ajax etc but had not thought about PDF as duplicate content. Must check :)

    Another place people often make the mistake is putting text into images.

    Cheers,
    Dave

  3. best plugins for wordpress says:

    Thanks for describing this step by step. The tips will help people who care about SEO a lot.

    We have listed the best SEO plugins for WordPress on our website. A few of them might be very interesting for you too if you use the WordPress platform often.

    Greetings from Germany
    Hendrik