For many of us working in technical SEO for an agency, the first stage of any new client win is to perform a site audit. Whilst many agencies will have their own procedures, most experienced professionals will start with crawl-related checks and research. If a site (and the pages therein) can’t be crawled, then it won’t appear in any search engine index, and if there’s nothing indexed, then there’s nothing to optimise. (So let’s all go home and play CoD.)
As part of the checks and diagnostic procedures we make sure to cover at theMediaFlow, we have a look at any reported crawl errors in Google or Bing Webmaster Tools. I want to share with you a recent example of some unusual errors found as part of such a process: what caused these errors and how to fix them.
URLs with Company Phone Number Appearing as 404
In this part of the process I was looking to identify how many types of 404 error were in play (rather than how many instances), and noticed that many of the thousands of reported 404 errors followed a particular format. To refresh our memories, Google Webmaster Tools reports the URL path, i.e. the part after the domain.
For the example site in question many of the errors appeared as follows:
——
0800%20111%201111
normal-directory/0800%20111%201111
nothing-to-see-here/normal-directory/0800%20111%201111
honestly-Im-okay/nothing-to-see-here/normal-directory/0800%20111%201111
——
The eagle-eyed amongst you will have spotted that there’s a common theme:
a) all these 404 errors have a number in the path
b) stripping out %20 (which means that a space has been encoded) leaves us with the phone number format #### ### #### (i.e. four digits, three digits, four digits)
c) numbers in this format starting 0800, 0843, 0845 and other non-geographical prefixes are often used as customer service phone numbers.
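When there are thousands of reported 404s, it can help to bucket them by type rather than eyeball them. The following is a minimal sketch only, assuming the error paths have been exported into a list; the sample data, the `classify` helper and the bucket names are hypothetical and not part of any Webmaster Tools API.

```python
import re
from collections import Counter
from urllib.parse import unquote

# Hypothetical sample of reported 404 paths, shaped like the examples above.
REPORTED_404S = [
    "0800%20111%201111",
    "normal-directory/0800%20111%201111",
    "nothing-to-see-here/normal-directory/0800%20111%201111",
    "honestly-Im-okay/nothing-to-see-here/normal-directory/0800%20111%201111",
    "old-page-that-was-deleted",
]

# Four digits, three digits, four digits, separated by single spaces.
PHONE_SEGMENT = re.compile(r"^\d{4} \d{3} \d{4}$")


def classify(path: str) -> str:
    """Label a 404 path by what its final segment looks like."""
    # Decode %20 back to a space before testing the last path segment.
    last_segment = unquote(path.rstrip("/").split("/")[-1])
    if PHONE_SEGMENT.match(last_segment):
        return "phone-number-suffix"
    return "other"


counts = Counter(classify(p) for p in REPORTED_404S)
print(counts)
```

A large "phone-number-suffix" bucket points to one underlying cause rather than thousands of separate broken links.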
So… you may know where this is going… For every URL on the site there appeared a second version with an extra directory appended, that directory being the company’s own phone number.
Given the symptom reported (i.e. a 404 error URL for every genuine URL), logic would suggest the error lay in the mark-up around the phone number in the site header area, as opposed to anywhere else it might occur, so this was my first port of call.
In Chrome, using Inspect Element to look at the isolated element (the mark-up of the phone number), everything looked hunky-dory. Schema.org Organisation mark-up was in place with the correct itemprop (itemprop="telephone"), so nothing of concern there; however, when I looked elsewhere in the code I found the following well-intentioned use of an a href pointing at the phone number (for click-to-call).
Now, referencing the phone number in this way was facilitating click-to-call functionality, so any front-end testing they may have done would have shown that a smart device user could click the phone number to call the company. However, because the tel: instruction was omitted, the href was also read as a relative file path, with the phone number appended to the URL of whichever page the link sat on. The result was thousands of annoying 404 errors that could easily have been avoided, making for a much more effective crawl process.
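The original snippet is not reproduced here, so as a rough illustration of the mechanism only: assuming the link target was a bare phone number with no scheme, standard relative URL resolution produces exactly the kind of path seen in the crawl report. The page URL and the `faulty_href` value below are assumptions for the sake of the example.

```python
from urllib.parse import quote, urljoin

# Hypothetical page URL; note the trailing slash on the directory.
PAGE_URL = "https://www.example.com/nothing-to-see-here/normal-directory/"

# Assumed shape of the faulty link target: a bare phone number, no scheme.
faulty_href = "0800 111 1111"

# Percent-encode the spaces (as seen in the reported paths), then resolve
# the href against the page it appears on, as a crawler would.
resolved = urljoin(PAGE_URL, quote(faulty_href))
print(resolved)
```

Because the href has no scheme and no leading slash, it is resolved relative to the current directory, so every page carrying the link yields its own non-existent URL.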
Correct Click to Call Mark-up
To correctly reference the phone number and effect click-to-call without generating 404 URLs due to relative file path annoyingness, do as follows:
The important part here is the addition of the tel: instruction, as it is the omission of this that creates the relative URL and thus generates our 404 errors. The addition of +44 (UK dialling code) was an optimisation so that the click-to-call would connect regardless of location. I found this piece on click-to-call links really useful background reading, particularly as there’s a list of additional native app URI schemes too!
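As a hedged sketch of the fix described above (not the exact mark-up used on the client site), the helper below builds a tel: href with the +44 prefix from a number written in national format. The `to_tel_href` function and the page URL are hypothetical, and the helper assumes a single leading trunk zero.

```python
from urllib.parse import urljoin

# Hypothetical page URL used to check how the corrected href resolves.
PAGE_URL = "https://www.example.com/normal-directory/"


def to_tel_href(national_number: str, country_code: str = "44") -> str:
    """Turn a UK-style national number into an international tel: href.

    Assumes the number is written with one leading trunk zero,
    e.g. "0800 111 1111"; other formats are not handled here.
    """
    digits = "".join(ch for ch in national_number if ch.isdigit())
    if not digits.startswith("0"):
        raise ValueError("expected a national number with a leading 0")
    # Drop the trunk zero and prefix the international dialling code.
    return f"tel:+{country_code}{digits[1:]}"


href = to_tel_href("0800 111 1111")
link = f'<a href="{href}">0800 111 1111</a>'
print(link)

# With its own scheme, the href is no longer resolved against the page URL.
resolved = urljoin(PAGE_URL, href)
print(resolved == href)
```

The visible link text can stay in the familiar national format; only the href needs the scheme and the international prefix.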
So, not exactly a ground-breaking error or a revolutionary fix that will rocket this site up the SERPs overnight; however, this was one of those weird, quirky consequences of a simple code omission that could have hindered crawlers and inhibited the business’s progress a little. I thought it might be worth sharing, as the cause of these 404s wasn’t immediately obvious.
With thanks to Joost for confirming my logic on the syntax omission and for sharing my belief in relative URLs as the root of evil