A BEGINNER'S GUIDE TO DUPLICATE CONTENT

What is duplicate content?

In plain words, duplicate content is identical or very similar content that is available at more than one URL. A reader can get the information from any of those pages, but a search engine has to pick which one to show, because there is no point in listing the same content twice. The serious problem is that the search engine may not know which version to rank higher. As a result, it might rank other websites above both of them.
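To make "exact or similar" a little more concrete, here is a minimal Python sketch that scores how much two pieces of text overlap. It compares sets of three-word sequences (often called shingles) using Jaccard similarity. The function names and the shingle size are assumptions made for illustration only; this is not a description of how any search engine actually detects duplicates.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into lowercase words and return the set of `size`-word sequences."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as one shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: 1.0 = same wording, 0.0 = no overlap."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)
```

A score close to 1.0 suggests the two pages say the same thing in the same words, even if punctuation or capitalization differs. Where to draw the line between "similar" and "different" is a judgment call.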

How is Plagiarism different from Duplicate Content?

Plagiarism is taking someone’s content or idea and claiming it as your own. Duplicate content, by contrast, is the same content appearing on two different webpages, regardless of who gets the credit. Simply put, even if you quote someone and credit them properly, search engines may still treat the quoted text as duplicate content. Crediting the source does not change the fact that the text appears in more than one place.

What causes Duplicate Content?

HTTP vs. HTTPS

Imagine that your site is accessible at both of the following URLs:

http://www.yoursite.com

https://www.yoursite.com

This is a very common scenario. When the same content is served at both URLs, search engines treat it as duplicate content.

www vs. non-www

This is one of the oldest causes of duplicate content. If your site loads both with and without the www prefix (for example, www.yoursite.com and yoursite.com), each version is a separate URL serving the same pages. It is important to let your visitors and search engines know which version you prefer. If you do not, the search engine is likely to conclude that your site is full of duplicate content, which may push it lower in search results and hurt your SEO.
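The protocol and www variants described above all point at the same pages, so one way to think about the fix is as a function that maps every variant to a single preferred URL. The Python sketch below does that, assuming the site owner has chosen https and www.yoursite.com as the preferred form. The constants and function names are hypothetical, and on a real site this logic usually lives in a server-side redirect rule rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: https + www is the preferred form for this site.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.yoursite.com"
# Host variants that should all collapse to PREFERRED_HOST (hypothetical list).
KNOWN_HOSTS = {"yoursite.com", "www.yoursite.com"}


def preferred_url(url: str) -> str:
    """Map any protocol/www variant of this site to the one preferred URL.

    URLs on other hosts are returned unchanged. Any explicit port is dropped.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return url
    path = parts.path or "/"  # treat a bare domain as the root page
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, parts.fragment))


def needs_redirect(url: str) -> bool:
    """True when the requested URL is not already in the preferred form."""
    return preferred_url(url) != url
```

For example, `preferred_url("http://yoursite.com/page")` gives `https://www.yoursite.com/page`, and a request for any URL where `needs_redirect` is true would typically be answered with a permanent redirect to that preferred form.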

Session IDs

Every person who visits your site is given a unique session ID to keep track of what they do there. That session ID has to be stored somewhere, and the best place is a cookie. Search engine crawlers, however, usually do not store cookies, so some sites fall back to putting the session ID in the URL and appending it to every internal link. Because each session ID is unique, this can create many different URLs for the same webpage, and hence duplicate content.
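To see why session IDs in URLs multiply one page into many, it helps to strip them out and watch the URLs collapse back into one. The Python sketch below removes session parameters from the query string. The parameter names in `SESSION_PARAMS` are illustrative assumptions; check which name your own platform actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; replace with whatever your platform uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url: str) -> str:
    """Return the URL with session-ID query parameters removed; other parameters are kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment))
```

Two visitors who land on `https://www.yoursite.com/page?sid=abc123` and `https://www.yoursite.com/page?sid=xyz789` are looking at the same page, and `strip_session_id` returns `https://www.yoursite.com/page` for both.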

Scrapers

Content scrapers have been around since the dawn of blogging. Scraping is the equivalent of identity theft. Many small websites copy content from popular websites without proper credit. In addition to costing you your competitive edge, this creates duplicates of your content. The most frustrating part is when scrapers trick the Google Panda algorithm into believing that they are the original creators of that content. When that happens, it is your website that gets pushed down the rankings, not the thieves’.

Print-friendly and Mobile-friendly URLs

Print-friendly and mobile-friendly versions of a page live at different URLs but carry the same content:

yoursite.com/page

yoursite.com/print/page

m.yoursite.com/page

To a search engine, these are three distinct URLs with the same content. This may explain why your site is suffering the consequences of duplicate content.
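One common way to tell search engines which of these URLs is the main one is a rel="canonical" link element in the head of each variant. The Python sketch below maps the three example URLs above to a single canonical URL and builds that tag. The `/print` prefix and the `m.` subdomain come from the examples above; the choice of https and www.yoursite.com as the canonical form, and the function names, are assumptions for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.yoursite.com"  # assumption: the preferred host
ALTERNATE_HOSTS = {"yoursite.com", "m.yoursite.com"}
PRINT_PREFIX = "/print"


def canonical_for(url: str) -> str:
    """Map print-friendly and mobile variants of a page to its main URL."""
    # Accept scheme-less input such as "yoursite.com/page".
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    if host in ALTERNATE_HOSTS:
        host = CANONICAL_HOST
    if path == PRINT_PREFIX or path.startswith(PRINT_PREFIX + "/"):
        path = path[len(PRINT_PREFIX):] or "/"  # drop the /print prefix
    return urlunsplit(("https", host, path, parts.query, ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element to place in the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_for(url), quote=True)}">'
```

All three example URLs resolve to `https://www.yoursite.com/page`, so each variant would carry the same tag and point search engines at one version.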

Why is Duplicate Content bad for your website?

It creates unwanted URLs for a single webpage. Your organic reach might take a hit because users are less likely to click on a URL that looks unfriendly.

It contributes to backlink dilution. Because inbound links are considered when ranking pages, splitting them across several duplicate URLs weakens each version and can harm your website’s digital marketing.

Sometimes, scraped content outranks the original content on your site, resulting in a loss of traffic.

Is Duplicate Content Penalty real?

Let’s put this to bed once and for all: Google does not have a penalty for duplicate content as such. There is an exception, though. Google confirms that a site’s ranking may suffer if it is caught deliberately creating duplicate content to mislead users. In such cases, Google may even remove the site from its index entirely. Google has also addressed the question of how it differentiates between intentional and unintentional duplicate content.

Long story short, duplicate content is likely to hurt your SEO, even without a real penalty.

Final Thoughts

Relax. This is not as big an issue as you might think. Most of the causes of duplicate content have a quick fix. The more serious cases are issues such as improperly implemented faceted navigation, which can wreak havoc on your crawl budget.

Author Bio

This is Sharon Winget, Staff Writer with GoodFirms, a review and rating platform of top IT companies & software. A tech geek at heart, I firmly believe technology can transform societies. I enjoy blogging about web design, email marketing, and content marketing.