Pagination can cause serious indexing and ranking problems, yet it is sometimes hard (if not impossible) to avoid. For large database-driven sites and web catalogs it creates duplicate content problems (multiple pages within one category or tag sharing the same title tag), as well as content discovery issues (crawlers unwilling to go deep into the site and thus never discovering product pages listed on page 10, 20, or deeper).
There’s one possible solution that might help crawlers better understand a site’s paging structure and treat it accordingly (for example, use the paginated pages for discovery without indexing them).
rel=”next” and rel=”prev” are used to describe the position of a document within a series of documents. These link definitions may be used by crawlers to better understand the web site navigation.
<HEAD>
...other head information...
<TITLE>Chapter 5</TITLE>
<LINK rel="prev" href="chapter4.html">
<LINK rel="next" href="chapter6.html">
</HEAD>
Even if they are not used for navigation, these links may be interpreted in interesting ways. For example, a user agent that prints a series of HTML documents as a single document may use this link information as the basis of forming a coherent linear document.
So far it is unclear how Google and other search engines treat these link attributes, but one thing is sure: they won’t hurt.
Here are a few more tips I offered earlier on avoiding duplicate content with pagination:
- Add a distinguishing portion of the title/description to the beginning of each page, e.g. <title>A-G Blue Widgets</title> (not always possible);
- Add a NoIndex meta tag to each page except the first one to keep search engines from indexing those pages while still allowing them to crawl and follow the links (note that PageRank still leaks to the noindexed pages in this case).
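The second tip could look like this in the head of a paginated page (the title here is a made-up example; the key part is the "noindex, follow" robots meta tag, which asks engines not to index the page but still follow its links):

```html
<head>
<title>Blue Widgets - Page 2</title>
<!-- Keep this paginated page out of the index,
     but let crawlers follow links to the product pages it lists -->
<meta name="robots" content="noindex, follow">
</head>
```

The first page of the series would simply omit the robots meta tag so it can be indexed normally.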