Duplicate content is when a webpage or piece of content has the same (or substantially similar) wording as another page on your website. This is bad for your SEO because Google won’t be able to understand which version of that content is the original one or most relevant to a user searching for it.
Duplicate content also negatively impacts your business brand and your audience’s perception of how knowledgeable you and your employees are. If you have duplicate content on your website, you should fix it right away. Keep reading to learn what duplicate content is, how to find it, and what to do about it.
What Is Duplicate Content?
Duplicate content is when you have the same groupings of sentences or paragraphs, or a completely duplicated piece of content, on two or more different URLs on your website. If someone browses your site and sees some content on one page, and visits your site later only to find similar content associated with another URL, this is duplicate content.
Why Is Duplicate Content a Problem?
I’ll use an example that I’m sure we all can relate to. Let’s say you just finished a book and you want to find a new one to read. You send a quick email to 10 friends asking them for suggestions and then you anxiously await their replies.
In this example, the best possible outcome is to get 10 different book titles that you can review and decide which one to read. The worst possible outcome would be to get the same exact book title from all 10 of your friends and upon reviewing that title decide it’s not for you, and you have to ask again. In other words, you want to get a variety of different options in order to maximize the chances that you’ll find a book that fits your criteria.
This is similar to Google’s approach to their search results page. Google knows if they give their searchers 10 webpages that are nearly identical, then that severely hurts their chances of providing the searcher with the information that fits her criteria. How frustrating would it be to click on all 10 web pages in the search results only to find the same information?
To combat this problem, Google’s algorithm prevents duplicate pages from showing up for the same search phrases. That means, if your page is nearly identical to another webpage online (either on your site or on another website), then only one of those pages will rank in Google.
Duplicate Content Can Hurt Your SEO
Google openly frowns upon duplicate content; it’s one of the first things anyone learns within a few months of doing SEO. Years ago, black hat marketers used to publish tons of near-identical web pages on a single site, hoping to gain more traffic by way of sheer content volume.
Today, Google and other search engine algorithms are way too smart for that. You can’t publish the same content multiple times and hope to trick search engines or users into thinking your website is more authoritative and original than it actually is.
Google now points users to the most relevant results for their queries. It does this based on context from (literally) hundreds of ranking factors that determine how successful individual pieces of content become. Google looks at:
- Publication date
- URL length and content
- Page’s topical completeness
- Domain authority
- Keyword selection
- Keyword density and placement
These aren’t the only factors Google uses to determine how well a page ranks, but they are among the more important ones. The more original and authoritative your content is within these parameters, the less likely it is to ever be treated as duplicate content.
Duplicate content causes confusion for both users and web crawlers.
How to Tell If Your Site Has Duplicate Content
The free way to determine if your site has duplicate content is to go through it page by page and make sure there isn’t any identical text. You can also take snippets of text and search for them in Google in quotes to see if there are other matches. If you see a few words or short phrases here and there that are exactly the same, don’t worry about that.
Google is intelligent enough to know that a few words strung together across multiple pages isn’t real duplication. Once you see entire sentences or paragraphs being repeated, this is your clear sign. (If you’re seeing more problems than duplicate content, it’s wise to conduct a complete SEO content audit.)
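If you have the text of your pages on hand, the manual page-by-page check can be roughed out in a few lines of Python. This is only a minimal sketch: the page texts and URLs below are made up for illustration, and the 8-word cutoff is our own assumption for ignoring short phrases, not a threshold Google publishes.

```python
import re
from collections import defaultdict

def split_sentences(text):
    """Split text into normalized sentences (lowercase, collapsed whitespace)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [re.sub(r"\s+", " ", p).strip().lower() for p in parts if p.strip()]

def find_repeated_sentences(pages, min_words=8):
    """Map each sentence with at least min_words words to the URLs it appears on,
    keeping only sentences found on two or more URLs."""
    seen = defaultdict(set)
    for url, text in pages.items():
        for sentence in split_sentences(text):
            if len(sentence.split()) >= min_words:
                seen[sentence].add(url)
    return {s: sorted(urls) for s, urls in seen.items() if len(urls) > 1}

# Hypothetical page texts keyed by URL, for illustration only.
pages = {
    "/services": "We build fast websites. Our team has helped local businesses "
                 "grow their traffic for over ten years.",
    "/about": "Our team has helped local businesses grow their traffic for "
              "over ten years. Contact us today.",
    "/blog": "We build fast websites. Here is our latest news.",
}

for sentence, urls in find_repeated_sentences(pages).items():
    print(urls, "->", sentence)
```

Short repeats like "We build fast websites." are skipped on purpose, while the long sentence shared by /services and /about is flagged. A check like this only catches exact sentence matches, so it complements a tool like Copyscape rather than replacing it.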
Copyscape is our preferred tool for checking for duplicate content. Copyscape is paid but affordable, and its “batch” search lets you quickly see if your website has any duplicate content. Simply create an account, buy credits, and then put your important URLs through the batch analysis tool. Copyscape will show you a list of any websites and URLs that have duplicate content.
Take note of these sites so you can figure out why any duplicate content was created in the first place. Sometimes it’s a pure accident; other times, it’s proof of someone in your company trying to cut corners – and sometimes you’ll find that another website has copied your content.
Once you’ve identified any instances of duplicate content, you can address them immediately. If you don’t have any duplicate content on your website, celebrate! That means you and everyone at your company understand the value of top-shelf content marketing practices.
The Causes of Duplicate Content
If you have found duplicate content, it helps to understand how it occurred in the first place so you can prevent it moving forward. Here are the most common causes of duplicate content:
- Your content is not original. Before you create a single word of content on your website, you want to make sure you and your team have the time and space to do original work. Most people are ethical and know that all content should be original. Typically the only causes of duplicate work are people who feel burdened to complete lots of projects in a short amount of time. Give your team enough bandwidth to complete work, and you’ll likely never see duplicate content.
- Your content is not structured well. Oftentimes even great writers can miss crucial SEO elements, like making sure that your subheadings are used properly (and that they’re original, just like your content should be). If you use the same headers on all of your pages, this is a form of duplicate content. Google will see unique content and keywords but similar headings, and as a result, downrank otherwise valuable content that could be driving you fresh traffic.
- Your content has never been optimized. Though duplicate content and duplicate SEO optimizations aren’t the same, they often overlap. For example, if you use the same anchor texts for all of your internal links, Google eventually notices. As a result, your content won’t perform as well as it could. Make it a point to leverage all of the SEO ranking factors available to you.
How to Fix Duplicate Content
At this point, I know you’re hoping I’m going to give you the one, quick-and-easy, push-button solution. Unfortunately, it’s not that easy and the correct solution depends on your situation. Here are some possible solutions to help you make the best choice:
1. Revise the content. If you’re going to keep the pages on your website, but the content isn’t original enough, then you’ll want to revise the content to be more original.
2. 301 redirects to combine pages of similar content. A 301 redirect automatically forwards people from one duplicate page to the other. By 301 redirecting, you effectively eliminate the duplicate content because now there is only one page that can be accessed online. This can be a good solution if you have multiple duplicates or overly similar pages that should be combined into one page.
3. Rel=canonical tags to tell Google which page is the “main” page among similar pages, without redirecting. You can use the rel=canonical tag on your duplicate pages. This tag tells search engines which URL of your duplicate pages is the primary page that should be included in the search results. The code to add is <link rel="canonical" href="http://www.yourdomain.com/the-url-you-want-to-rank-in-Google.html" />.
4. “No Index” to hide duplicate pages from Google. You can add <meta name="robots" content="noindex" /> to one of the duplicate pages to tell all search engines you do not want the page included in their search results. In this case, you’re not redirecting the traffic, you’re simply removing one of the pages from being considered in the search results.
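To make option 2 a little more concrete, here is a minimal Python sketch of what a 301 redirect does behind the scenes. The URLs in the mapping and the names REDIRECTS, resolve, and serve are hypothetical, chosen for illustration. On a real site, redirects are typically set up in your web server or CMS rather than in a script like this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from duplicate URLs to the one page you want to keep.
REDIRECTS = {
    "/old-duplicate-page.html": "/the-url-you-want-to-rank-in-Google.html",
    "/another-duplicate.html": "/the-url-you-want-to-rank-in-Google.html",
}

def resolve(path):
    """Return (status, location); location is None when no redirect applies."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 200, None

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path)
        if status == 301:
            # Permanent redirect: visitors and crawlers are sent to the kept page.
            self.send_response(301)
            self.send_header("Location", location)
            self.end_headers()
        else:
            body = b"This page is served normally."
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

def serve(port=8000):
    """Start a local demo server; stop it with Ctrl+C."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling serve() and visiting /old-duplicate-page.html on the local server should forward you to the kept URL, which is the behavior described in option 2: only one page remains reachable.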
As you can see, some of these options are fairly technical so we recommend talking with your web developer to see which one makes the most sense for your website.
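If you or your developer want to confirm that the canonical and noindex tags actually made it onto a page, a short script can read them back out of the HTML. The sketch below uses Python’s built-in html.parser module. The class name HeadTagAudit, the audit function, and the sample pages are our own illustrations, not part of any SEO tool.

```python
from html.parser import HTMLParser

class HeadTagAudit(HTMLParser):
    """Collect the canonical URL and robots directives from a page's HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots = [d.strip().lower() for d in content.split(",") if d.strip()]

def audit(html):
    """Return the canonical URL (or None) and whether the page is set to noindex."""
    parser = HeadTagAudit()
    parser.feed(html)
    return {"canonical": parser.canonical, "noindex": "noindex" in parser.robots}

# Hypothetical sample pages, one per option.
canonical_page = """<html><head>
<link rel="canonical" href="http://www.yourdomain.com/the-url-you-want-to-rank-in-Google.html" />
</head><body></body></html>"""

noindex_page = """<html><head>
<meta name="robots" content="noindex" />
</head><body></body></html>"""

print(audit(canonical_page))
print(audit(noindex_page))
```

The first sample reports its canonical URL with noindex off, and the second reports no canonical URL with noindex on. Running a check like this across your duplicate pages is a quick way to spot a page where the tag was forgotten.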
How to Prevent Duplicate Content In the Future
Duplicate content must be removed from your site manually, but you’ll want to use a tool like Copyscape moving forward to spot it if it ever occurs again. That way, you can remove it quickly and avoid getting penalized.
The best way to prevent duplicate content in the future is to set standards for publishing new content on your website. Insist that all of your website content be original.
Another Related Challenge: Keyword Cannibalization
Even if your content is not a word-for-word duplicate, you may have multiple pages with very similar topics. This can also make it harder for Google to decide which page on your website should rank. This issue is known as keyword cannibalization.
For example, if you have multiple blog articles on similar topics, you should conduct a blog content audit to decide which articles you may want to combine. (As a specific example, when we published this article, we combined an older article about duplicate content with this new draft, so that we wouldn’t have two articles about very similar topics.)
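One rough way to start a blog content audit like this is to compare article titles and flag pairs that share most of their keywords. The Python sketch below does that with a simple word-overlap score (Jaccard similarity). The titles, URLs, stopword list, and 0.5 threshold are all assumptions for illustration, and a flagged pair is only a prompt to review the articles by hand.

```python
from itertools import combinations

# Small, hypothetical stopword list; extend it for your own titles.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "for", "in", "on",
             "how", "what", "is", "it", "your"}

def keywords(title):
    """Lowercase the title, strip punctuation, and drop stopwords."""
    words = (w.strip(".,:;!?()\"'").lower() for w in title.split())
    return {w for w in words if w and w not in STOPWORDS}

def overlapping_titles(titles, threshold=0.5):
    """Return (url_a, url_b, score) for title pairs whose shared-keyword
    ratio (Jaccard similarity) is at or above the threshold."""
    results = []
    for (url_a, title_a), (url_b, title_b) in combinations(sorted(titles.items()), 2):
        a, b = keywords(title_a), keywords(title_b)
        if not a or not b:
            continue
        score = len(a & b) / len(a | b)
        if score >= threshold:
            results.append((url_a, url_b, round(score, 2)))
    return results

# Hypothetical blog titles keyed by URL.
titles = {
    "/blog/duplicate-content-seo": "How to Fix Duplicate Content for SEO",
    "/blog/fix-duplicate-content": "Duplicate Content: How to Find and Fix It",
    "/blog/local-seo-tips": "Local SEO Tips for Small Businesses",
}

for url_a, url_b, score in overlapping_titles(titles):
    print(url_a, url_b, score)
```

Here the two duplicate content articles are flagged as a pair worth combining, while the local SEO article is left alone.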
The Bottom Line
Duplicate content is one of the biggest challenges for SEOs, digital marketers, and small business owners. Copyscape.com is our favorite tool for finding duplicate content. If you have duplicate content on your site, you should take action and remove it as soon as possible.
Need Help with SEO?
At SEO Windy City, we offer monthly SEO management services including ongoing auditing, technical fixes, content development, consulting, and reporting.