Duplicate Content - What It Is, and How to Avoid It
One term thrown around a lot in internet marketing circles is "duplicate content" or "duplicate content penalty". This is the idea that having content on your site that is significantly similar to existing content on the Internet will cause your site to perform badly in the search engines (most notably Google). In this lesson we look at Duplicate Content, and How to Avoid It.
Is there actually a penalty?
The word "penalty" is something of a misnomer - your site is not going to drop one hundred places in the search listings just because it displays duplicate content. However, your site will not be rewarded for posting duplicate content either - in other words, the duplicated pages will not do well in the search engines. Typically content that is deemed to be a duplicate will appear in the 'supplemental results' of a web search (the section that says "pages similar to the ones shown have been omitted") and will therefore not drive any traffic to your site. That's the "penalty".
Avoiding duplicate content penalties
If you're submitting an article to another website and using it on your own site, you obviously want to have your own site appearing in the search engines for that content. The best thing to do is reword the article (either the one on your site or the one you're submitting) so that it's significantly different. This completely removes the need for the search engines to send one of the articles to the supplemental results and gives both articles the chance to rank well in the search engines: win-win!
Your other "option" is to try to predict which of the identical versions the search engines will smile upon and give the listing to. Will it be the first one they index? Will it be the site with the highest PageRank? Will it be the most relevant site? Theories abound, and everyone has one.
How different is "different"?
In this situation it's a case of "better safe than 'supplementalled'". While rewriting an article to be 20% different to the original will keep the search engines happy now, we recommend rewriting it to be at least 50% different. This will keep you safe from future algorithm changes, because if there's one thing you can be sure of, it's that the search engines aren't going to relax their criteria - they'll only tighten them!
How can I tell if I'm different enough?
You can use a tool like the Affilorama Article Compare tool. It calculates percentage difference, and even shows you which sections of your article haven't been reworded yet.
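To get a feel for what a "percentage difference" between two versions of an article means, here is a minimal Python sketch. It is not how the Affilorama tool (or any search engine) calculates its figure; the function names and the word-level comparison using Python's standard difflib module are assumptions made purely for illustration.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def percent_different(original, rewrite):
    """Return a rough 0-100 word-level difference score between two texts.

    Illustrative only: real duplicate content filters are not public.
    """
    matcher = SequenceMatcher(None, words(original), words(rewrite), autojunk=False)
    return round((1 - matcher.ratio()) * 100, 1)


def unchanged_passages(original, rewrite, min_words=5):
    """List runs of at least min_words words that appear unchanged in both texts."""
    a, b = words(original), words(rewrite)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]
```

You could call percent_different(original_text, rewritten_text) and compare the result with the 20% and 50% figures mentioned above, then use unchanged_passages(original_text, rewritten_text) to find the sections that still need rewording. Treat the score as a rough guide only.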
DuplicateContent.net is another nice tool that compares two web pages. It offers two comparisons: "Text similarity" measures how similar the written text of the two pages is and estimates how likely the content is to be filtered out by the duplicate content filter; "Markup similarity" compares the HTML of the pages.
There are other ways of ensuring your web pages are seen as unique content by the search engines. In particular, make sure that the first 50 words are as unique as possible - for example, you could add your own introduction to each private label rights article that you use. You could also add images to break up the text. You can get free images to use on your website from a stock photo exchange website, such as http://www.sxc.hu (but be sure to check the terms and conditions!).
If you are running a blog, then comments from readers of your post may also help to increase the amount of unique content on your page.
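The advice above about making the first 50 words unique can be checked roughly with a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, and simple word overlap is a stand-in measure, since the search engines do not publish how they compare page openings.

```python
import re
from collections import Counter


def first_words(text, n=50):
    """Return the first n words of the text, lowercased, without punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())[:n]


def opening_overlap(original, rewrite, n=50):
    """Return the share (0.0-1.0) of the rewrite's first n words that also
    occur in the original's first n words. Lower means a more unique opening.
    """
    original_opening = Counter(first_words(original, n))
    rewrite_opening = first_words(rewrite, n)
    if not rewrite_opening:
        return 0.0
    # Multiset intersection: each shared word is counted as often as it
    # appears in both openings.
    shared = sum((original_opening & Counter(rewrite_opening)).values())
    return shared / len(rewrite_opening)
```

Adding your own introduction to an article pushes the original opening further down the page, so the overlap for the first 50 words should drop. Common words such as "the" and "and" will always produce some overlap, so a result of zero is not a realistic target.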
In this lesson you've learned:
- A "duplicate content penalty" means that your content may be filtered into the supplemental results, where it is unlikely to be seen
- It is recommended you rewrite your content by at least 20% (but preferably by at least 50%)
- You can make your content unique in multiple ways including:
  - Making the first 50 words unique, e.g. with an introduction
  - Adding images to break up text
  - Getting comments from readers on a blog
- You can check how unique your content is by using tools such as: