Last week we noticed that one of our clients had a page deindexed by Google, and it happened to be a duplicate page. With the continual rollouts of Panda, we have seen that content duplication is a signal of a poor-quality website, and now it looks like Google is trying to clean up its index by removing those duplicates. The problem is, you have no control over what it chooses to delete. So now is the time to get rid of those duplicates before Google starts getting rid of chunks of your site.
In Webmaster Tools, check Search Appearance > HTML Improvements > Duplicate Title Tags. Quite often, pages with duplicate titles are an indication that the page itself has been duplicated. We see this a lot on ecommerce sites. For example, the image below shows that Google has found 4 pages with a duplicate page title. The pages in question have the same text content; the only difference is the images on the page. Google should not be crawling the URL parameter that is generating different versions of this page.
In the above example I would search Google for site:websiteaddress inurl:parameter, swapping in the site's domain and the offending parameter. This reveals that the problem is not limited to the 4 pages Google is reporting: there are at least another 52 indexed duplicates matching this pattern.
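If you want a fuller picture than the HTML Improvements report gives you, a short script can pull every URL from your sitemap and group the pages by title. The sketch below is a rough Python example rather than anything Google provides: it assumes a standard XML sitemap at a hypothetical example.com address and uses the requests and BeautifulSoup libraries, so adjust it to your own setup.

```python
"""Rough sketch: flag pages on your own site that share a <title>.

Assumes a standard sitemap at /sitemap.xml on a hypothetical domain --
swap in your own site and adjust for how your sitemap is structured.
"""
from collections import defaultdict
import xml.etree.ElementTree as ET

import requests
from bs4 import BeautifulSoup

SITEMAP_URL = "https://www.example.com/sitemap.xml"  # hypothetical address
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_url):
    """Return every <loc> listed in a standard XML sitemap."""
    root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS)]


def page_title(url):
    """Fetch a page and return its <title> text (empty string if missing)."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    return soup.title.get_text(strip=True) if soup.title else ""


if __name__ == "__main__":
    pages_by_title = defaultdict(list)
    for url in sitemap_urls(SITEMAP_URL):
        pages_by_title[page_title(url)].append(url)

    # Any title shared by two or more URLs is a candidate duplicate page.
    for title, urls in pages_by_title.items():
        if len(urls) > 1:
            print(f"{len(urls)} pages share the title '{title}':")
            for u in urls:
                print(f"  {u}")
```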
Grab a sentence from one of your pages, "put it inside quotation marks like this", and Google it. If it is duplicated on your own site, go and check those pages and work out why. If it is duplicated on other sites, you may want to rewrite your copy so that yours is unique.
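If you would rather check your own pages in bulk than paste sentences into Google one at a time, something along these lines will flag any longer sentence that appears word for word on more than one URL. Again, this is only a rough sketch: the URLs are hypothetical placeholders, the sentence splitting is crude, and it only catches duplication within your own site, not copy lifted by other sites.

```python
"""Rough sketch: flag sentences that appear word for word on more than
one of your own pages. The URLs below are hypothetical -- use your own
list, or feed it from your sitemap."""
from collections import defaultdict
import re

import requests
from bs4 import BeautifulSoup

PAGES = [
    "https://www.example.com/page-one",
    "https://www.example.com/page-two",
]


def sentences(url):
    """Fetch a page and split its visible text into rough sentences."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    text = soup.get_text(" ", strip=True)
    # Crude sentence split; good enough for spotting exact repeats.
    return [s.strip() for s in re.split(r"[.!?]\s+", text) if len(s.split()) > 6]


pages_by_sentence = defaultdict(set)
for url in PAGES:
    for sentence in sentences(url):
        pages_by_sentence[sentence].add(url)

for sentence, urls in pages_by_sentence.items():
    if len(urls) > 1:
        print(f'"{sentence}" appears on {len(urls)} pages: {sorted(urls)}')
```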
There are also paid services out there you can use, but Google itself is usually pretty good at finding duplication. What do you use to find duplicates? Are you seeing Google deindex pages?