Duplication of content kills

by Jim December 12, 2014

I caught up with Darren Rowse (@problogger) last week for a recording of Beers Blokes & Business. I’ve always admired the way Darren goes about his work. Diligent, process driven but always very generous with his time and information. For a bloke that is generating 5 million plus page views a month, that is pretty cool. He mentioned casually after the recording that one of his flagship sites, digital-photography-school.com, was having issues with Google organic. New content didn’t seem to rank as quickly or deliver as much traffic as the old content did. I said I’d take a look. The only clues he could give me were that it seemed to be affecting new content and that the recent Google updates did not seem to have had any effect.

Darren Rowse on Beers Blokes & Business

It felt good, I must say, to be able to give back to a bloke that has taught me so much. I only got to discuss this with his developers yesterday, so the jury is still out on what effect my changes may have on his traffic from Google. What we do know, though, is that the problems I found with his site are the most common ones we see with large sites. Duplication.

Why Google Hates Duplication

As explained in this article by Google, duplication can cause multiple problems for search engines. Back in the day it was because other sites were simply ripping off original creators and ranking for their content, or they were spreading the same content across multiple domains and ranking in multiple positions for the same keywords, just on different sites. This turned the search engine results into noise. No one wants to find 10 results that are exactly the same.

Andie McDowell Is Google :)

Duplication within a domain is a problem too. What page should Google rank if you have 3 that are essentially exactly the same? It’s easier to keep all of them out of the search results. My guess is this is what is happening with digital-photography-school.com. When we fix duplication, rankings ALWAYS come back. A lot of people talk about a duplication penalty, but it’s more like Google simply excluding you because you are noisy.
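If you want a rough feel for how much of this you have on your own site, here is a minimal Python sketch that groups URLs whose page text is identical once whitespace and case are normalised. It is an illustration only: the `pages` dictionary, the function names and the sample URLs are made up for this example, it only catches exact copies, and it is not a claim about how Google itself detects duplicates.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text):
    """Hash the page text after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_groups(pages):
    """Return lists of URLs that share the same content fingerprint.

    `pages` is a hypothetical mapping of URL -> extracted page text.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Made-up sample data: the same post reachable at two URLs, plus one unique page.
sample_pages = {
    "/blog-post": "How to shoot  in low light",
    "/blog-post/": "how to shoot in low light",
    "/about": "About this site",
}

for group in duplicate_groups(sample_pages):
    print("Duplicate group:", group)
```

Any group with more than one URL in it is a candidate for the fixes described below.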

Digital Photography School Duplication

In the case of DPS I found at least 18,000 pages of the site duplicated at a staging domain. This happens a lot. As part of a production process a site is placed on a subdomain so it can be vetted and approved before being pushed live. Usually these are blocked by the developers, but it looked like in a recent change someone didn’t close the gate and Google got in. As I explained a few months back, you need to control Google’s access.

Fixing Staging Duplication

We’ve evolved how we do this over the years, but there are two ways to handle it depending on your situation. If your staging site has already been indexed by Google you DON’T want to redirect it. This is a common mistake. It’s a noob error because it seems logical, but in reality it will prolong your time in the search wilderness, especially on a larger site. The best way to handle it if the content has already been indexed is to block it in robots.txt AND add the NOINDEX tag to every page. This is what we call a belts and braces approach. If you wear both, your trousers will never fall down. If one fails, the other kicks in to protect you from the unwanted exposure. We did this less than 24 hours ago on DPS and Google has already ditched over 10,000 pages. That is a good thing 🙂

If only the home page of your staging site has been indexed it’s ok to redirect it.
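To make the belts and braces idea concrete, here is a small Python sketch that checks whether both signals are in place for a staging page: a robots.txt that blocks Googlebot and a robots meta tag containing noindex. The staging hostname, the sample robots.txt and the sample HTML are all assumptions made up for this example, not the actual DPS setup. It only confirms that both signals are present; it says nothing about how quickly Google acts on them.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """True if the page carries a robots meta tag with a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def is_blocked(robots_txt, url, agent="Googlebot"):
    """True if the given robots.txt text disallows `agent` from fetching `url`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(agent, url)


# Hypothetical staging setup, for illustration only.
STAGING_ROBOTS = "User-agent: *\nDisallow: /\n"
STAGING_PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
STAGING_URL = "https://staging.example.com/blog-post/"

print("robots.txt blocks Googlebot:", is_blocked(STAGING_ROBOTS, STAGING_URL))
print("page has noindex:", has_noindex(STAGING_PAGE))
```

In practice you would feed it the real robots.txt and a sample of real page HTML from your staging site rather than the hard-coded strings above.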

Other Duplication

In the case of DPS the other duplication was that all new content was being published both with and without a trailing slash. Basically this means /blog-post and /blog-post/. We see that a lot in WordPress sites. The DPS guys did the right thing: they were using the Yoast plugin for SEO and they had canonicalization enabled. As you will see from the video, though, Google for whatever reason was ignoring it. Rather than try to work out why Google was ignoring it, it is simply better to have only one of them published. The fix is simple for your developer. Permanently redirect one to the other.
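As a rough illustration of that redirect rule, here is a Python sketch that decides whether a requested URL should be permanently (301) redirected to its trailing-slash version. Picking the trailing-slash version as the one to keep is an assumption for this example, as are the function name and the example.com URLs; your developer would normally do this in the web server or WordPress configuration rather than in a script like this.

```python
from urllib.parse import urlsplit, urlunsplit


def trailing_slash_redirect(url):
    """Return (301, target) if `url` should redirect to its trailing-slash
    version, or None if it is already in the preferred form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Leave paths that already end in "/" and file-like paths (e.g. /feed.xml) alone.
    if path.endswith("/") or "." in last_segment:
        return None
    target = urlunsplit(
        (parts.scheme, parts.netloc, path + "/", parts.query, parts.fragment)
    )
    return 301, target


print(trailing_slash_redirect("https://example.com/blog-post"))
print(trailing_slash_redirect("https://example.com/blog-post/"))
```

The point is that only one of the two URLs ever answers with the page itself; the other always answers with a permanent redirect.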

The Results So Far

The question we always get asked in this situation is “How long till my rankings come back?” Of course Darren was no different. There are a lot of variables; however, in the case of DPS it’s happening really quickly, from what I can see anyway. In addition to the staging site getting de-indexed, the main site has gone from around 350k indexed pages to 100k overnight. That’s pretty cool. I’ve actually never seen it happen that quickly before. Typically, the return of rankings is directly related to the duplication disappearing. I’ll be very interested to hear what happens with DPS.

