Well, I’m back from Big Digital Adelaide. My presentation was hilarious because we had a Google Apps fail and 80% of my slides were devoid of images, which is a little challenging when you are a low-text bloke. Anyway, hopefully I was semi-coherent for all who attended. My presentation WITH images is at the bottom of the page 🙂 Check out #BigDigitalADL on Twitter, lots of awesome tips and links there.
Welcome back Rankers! I’m coming to you from Adelaide this week. I’m at the Big Digital conference and so far, it’s been pretty awesome. This morning we heard from some great speakers. We had people talking about A.I. and we had an excellent presentation from Haigh’s Chocolates. If you’re not familiar with Haigh’s Chocolates they’re yummy and probably Australia’s premium brand. It’s probably the only premium brand of chocolate in Australia that’s been around for over 50 years or something.
It was an interesting presentation, and they’re moving to a Magento 2.0 site in the not too distant future. The reason I’m here is to do a presentation on site migration, so I’ll be looking at three different types of site migration. The first is moving from an old CMS to a new CMS, or maybe upgrading the whole site with a whole new look and feel. So that’s one type of site migration.
The reason I put a CMS change and a redesign together in the same bucket is that you need a similar skillset to handle that type of migration. The other two types of migration are HTTPS, which is on everybody’s lips at the moment, and then the one I’ve just gone through, which is moving domain names. They all have different functions, and they all need slightly different skillsets, but essentially they are all site migrations.
One of the things I was saying to Fiona from Haigh’s, who gave the presentation, was that when she does go to Magento 2.0 (and there are awesome speed enhancements in 2.0), she should have a look at the index bloat the site’s got, because you have to fix those things up. When you do launch a new site, don’t launch with a bloated index.
For instance, when I do a site: search for Haigh’s Chocolates, I find 18,800 results. Now I’m thinking that they don’t have that many chocolates, as that’s a lot of chocolate. I drilled down a little bit further and what I found was that the search function on the site is auto-generating all these pages and Google is indexing them. Typically, what we would do in this situation is just block the bot from that search function altogether. I think, back in the day, it was almost a Terms of Service thing that you should block Google from the search function on a site.
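If you want to do that kind of drill-down yourself, here is a minimal sketch, assuming you already have a list of indexed URLs from a crawl or an export. The function name, the example.com URLs and the /search/ path are all hypothetical placeholders, not the real Haigh’s URLs. It simply counts URLs by the first segment of the path so the bloated section stands out.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_by_first_segment(urls):
    """Count URLs by the first segment of their path ('/' for the home page)."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url).path
        segments = [s for s in path.split("/") if s]
        key = "/" + segments[0] + "/" if segments else "/"
        counts[key] += 1
    return counts


# Hypothetical sample; a real list would come from a crawl or an export.
sample_urls = [
    "https://www.example.com/",
    "https://www.example.com/search/?q=dark",
    "https://www.example.com/search/?q=milk",
    "https://www.example.com/search/?q=gift+box",
    "https://www.example.com/products/dark-chocolate-block",
]

# Print the biggest sections first.
for prefix, total in count_by_first_segment(sample_urls).most_common():
    print(prefix, total)
```

On a real site the section at the top of that list is usually the first place to look for auto-generated pages.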
What I’ve done here is isolate the URL path that is generating all these empty URLs. You can see there are 13,500 of them. If you have a similar sort of issue, what you want to do is remove all of those before you launch. It’s a really simple thing to do. Firstly, you have to stop the bot getting into that part of the site and that function. Once you’ve done that, however you want to do it, whether that’s robots.txt, noindex tags or whatever it might be, then you can go and do a URL removal. The order matters: make sure it’s noindexed and that Google basically can’t re-crawl or re-index it before you remove anything. You could use nofollows, but that doesn’t guarantee Google won’t re-index it, because if Google finds the page through another link that isn’t nofollowed, it will re-index it. Similarly, you can’t just rely on canonicals here because, depending on how the site is structured, canonicals only work well on pages that you aren’t also giving extra authority to elsewhere in the site.
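As a rough illustration of the robots.txt option, here is a small sketch using Python’s standard urllib.robotparser to confirm that a rule really does keep Googlebot out of a search path. The /search/ path and the example.com URLs are assumptions for the example; swap in whatever path your own search function uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every bot from the on-site search.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_crawlable(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


search_url = "https://www.example.com/search/?q=dark"
product_url = "https://www.example.com/products/dark-chocolate-block"

print(is_crawlable(ROBOTS_TXT, "Googlebot", search_url))   # search page
print(is_crawlable(ROBOTS_TXT, "Googlebot", product_url))  # product page
```

This only tests crawl access against simple path-prefix rules; it says nothing about whether a page carries a noindex tag.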
For instance, we’ve had clients where they’ve applied the canonical tag to a whole set of pages that were duplicates of individual products. Basically, they were pages that were sorting results differently, or listing products in a different way. Essentially, it was the same content. And so they applied canonicals to all of these, which is what Google said to do, but unfortunately, because of the way the site was designed, Google ignored the canonicals. So just remember that canonicals don’t work all the time. They’re purely a suggestion to Google, if you like, not a directive like a noindex. The reason Google was indexing all these pages that had been canonicalised was that they were linked to from every page of the site. So Google was saying, “You’re linking to all these pages with canonicals, so they must have authority, they must be important pages on your site if you’re linking to them. Therefore we’re going to ignore that canonical tag and put them all in.”
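To see which of the two signals a page is actually sending, something like the following sketch could help. It uses Python’s standard html.parser to pull out the canonical link and check for a robots noindex meta tag. The class name, function name and sample HTML are made up for illustration, and it only reads the markup; it can’t tell you whether Google will honour the canonical.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the canonical URL and any robots noindex from a page's HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = (attributes.get("content") or "").lower()
            self.noindex = self.noindex or "noindex" in content


def read_signals(html):
    """Return (canonical_url_or_None, has_noindex) for an HTML document."""
    parser = IndexingSignals()
    parser.feed(html)
    return parser.canonical, parser.noindex


# Hypothetical pages: a re-sorted listing with a canonical (a suggestion),
# and a search results page with a noindex (a directive).
sorted_listing = """
<html><head>
<link rel="canonical" href="https://www.example.com/chocolates/">
</head><body></body></html>
"""
search_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(read_signals(sorted_listing))
print(read_signals(search_page))
```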
So that’s all about site migration. One other thing I learnt this week, too: if you are moving domains and you keep doing the site: check on the old domain to see whether the index has gone down, and you find the number just hovers, keep an eye on Search Console instead. Watch for the point where the impressions in Search Console drop to a level you’d call negligible. I’ll show you what I mean.
Go to Search Console and open your old domain, where you will see a lovely little graph that looks horrific, as it’s meant to. It should look a little like this one.
That’s what you want to see, as it’s the old domain that we’ve moved away from. Wait until you see the impressions get to a negligible level, which for me was about 13 impressions over the course of a period. That’s when I decided to do a manual removal of the entire domain. That was scary as hell! One of the concerns when you do that is whether it will remove the new domain as well, because they’re linked through the change of address. It doesn’t. It’s fine.
What you don’t want to be doing is that same thing in an HTTPS migration though. If you’re looking at how many HTTP results you’ve still got left in the search results after you’ve moved to HTTPS, don’t go and manually remove the HTTP results, because you will also remove your HTTPS results. Remember, it’s just a protocol, so you’d essentially be removing the same site. Whereas in a domain move it’s not, as it’s two separate domains, so you can actually remove the old domain completely when you’re happy that the impressions are down enough. It also stops you looking every day and wondering why certain pages are still indexed. It just gets that out of your head.
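The distinction comes down to whether the hostname changes or only the protocol does. Here is a tiny sketch of that check, using made-up example URLs and a hypothetical function name. It treats any difference in hostname as a domain change, which is a simplifying assumption (a www to non-www switch would also count).

```python
from urllib.parse import urlsplit


def migration_kind(old_url, new_url):
    """Classify a migration by comparing the old and new URLs."""
    old, new = urlsplit(old_url), urlsplit(new_url)
    if old.hostname != new.hostname:
        # Two separate hosts: the old one is its own entry in the index.
        return "domain change"
    if old.scheme != new.scheme:
        # Same host, new protocol: don't manually remove the old results.
        return "protocol change"
    return "same site"


print(migration_kind("http://www.example.com/", "https://www.example.com/"))
print(migration_kind("https://old-example.com/", "https://new-example.com/"))
```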
Hopefully that’s helpful. I’ll see you back in Melbourne. Bye for now.