Jill Whalen has written an excellent column at Search Engine Land, looking at what she calls the ‘Duplicate Penalty Myth’.
She argues that penalties for duplicate content are not handed out by search engines, at least not in the way most people think.
She points out that, while search engines don’t appreciate duplicate content, they don’t punish sites that happen to contain some duplication.
When you input a search query, Google and the other search engines want to show you as much unique, relevant content as possible.
To achieve this, Google’s algorithm has to do battle with spammers using invisible content, copycat scraper sites, technically poor websites, and more.
Duplication of content is undoubtedly a problem for search engines when they are trying to return relevant results. To avoid presenting searchers with lots of results containing the same content, the engines filter their results so that only the best version of a page is shown.
If you have two pages on your site with duplicate or extremely similar content, the likelihood is that one version of that page will show up on search engine results, and the other will not.
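Search engines don’t publish how this filtering works, but near-duplicate detection is commonly illustrated with word shingling and Jaccard similarity. The sketch below is a minimal illustration of that general idea, not Google’s actual method; the 0.8 threshold is an arbitrary assumption for the example.

```python
def shingles(text, k=3):
    """Split text into overlapping k-word 'shingles' (word n-grams)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets: 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def filter_duplicates(pages, threshold=0.8):
    """Keep only the first-seen version of any near-duplicate page.

    `pages` is a list of (url, text) pairs. The threshold is an
    arbitrary cut-off for this illustration; a real engine would weigh
    many more signals when choosing which version of a page to show.
    """
    kept = []
    for url, text in pages:
        s = shingles(text)
        if all(jaccard(s, shingles(t)) < threshold for _, t in kept):
            kept.append((url, text))
    return [url for url, _ in kept]
```

Run against two identical pages and one distinct page, the filter returns only one of the duplicates alongside the distinct page, mirroring the behaviour described above: one version shows up, the other does not.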
Penalties are reserved for sites that deliberately try to fool the search engines in various ways, and sites penalised in this manner will probably know why they have been punished.
The main types of duplicate content which are filtered out by Google are:
One duplicate content issue that may concern some site owners is the republishing of online articles. In this case, one of the two versions may be filtered from search results, but this will not incur a penalty.
Essentially, Jill is arguing that Google is merely trying to provide a variety of content in its search results, and is not looking to penalise you for duplicate content unless you are obviously a spammer.