Thursday, October 01, 2009

Web Analytics For 404 Errors

404 errors are a fact of life on the internet. A 404 error is the message a server returns in response to a request for a page that does not exist on the server (HTTP status code 404).

Let me illustrate this with an example:

A visitor comes to a site, bookmarks a page and then leaves the site.
Next week the owner of the site launches a new site, completely replacing the old pages with new pages at different URLs.
Next month the visitor comes back to the site via the bookmark, but the page she bookmarked no longer exists because it was removed during the site redesign.
She will get a 404 error message from the server.


Reasons for 404 Errors

404 errors can occur for several reasons:

  1. Misspelled links – A misspelled URL in a hyperlink on the site can cause a broken link and hence a 404 error message.
  2. Bad Site Map – A site map is an XML file containing a list of all the pages on the site. It is usually meant to help search engines index the pages on a site. A misspelling in a site map can cause the search engines to look for pages that do not exist on the server. Generally these pages (broken links) won’t be visible to visitors, but they will show up in search engine indexing reports such as Google Webmaster Tools. Some sites also show the site map to visitors as a form of site navigation; in those cases the visitors will see the 404 errors.
  3. Site Redesign – Site redesigns are a leading cause of missing pages. Site owners redesign their sites, completely replacing the old pages without thinking through the pages that might have been linked all over the web, indexed by the search engines, bookmarked by visitors etc. Visitors who arrive on the site through old links, bookmarks etc. are greeted with 404 error messages.
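  4. Server Unavailable – 404 error messages can also occur when the server or part of the site is unavailable, although a properly configured server normally signals an outage with a different status code such as 503.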
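The bad site map case above can be checked before the search engines find it. Here is a minimal sketch in Python, assuming the site map follows the standard sitemaps.org XML format and that you already have a list of the URLs that really exist on the site. The function names `sitemap_urls` and `missing_from_site` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by site maps that follow the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL listed in a site map document (given as a string)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def missing_from_site(sitemap_xml, known_urls):
    """Return site map URLs that are not in the list of pages known to exist."""
    known = set(known_urls)
    return [url for url in sitemap_urls(sitemap_xml) if url not in known]
```

Any URL this returns is a candidate for a misspelling or a page that was removed without updating the site map.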

Below is an example of a standard 404 error message

If a standard 404 error page is the first page that a visitor sees when she arrives on a site, what will her reaction be? As shown in the picture above, you can’t even tell which site this page belongs to. It is a dead end. Visitors don’t know where to go. What would a visitor do in this situation? She will most likely leave the site and go back to where she came from. The site has just failed to engage her.

Let’s imagine a similar situation in the offline world. Think about how you would feel if you entered a local supermarket looking for a toothbrush and were immediately taken to the location in the store where the toothbrush aisle is supposed to be. When you arrive at that location, not only do you not find the aisle, because the supermarket recently rearranged the store and moved it, but the whole supermarket also goes dark and all you see is the exit door. You will, for sure, run towards the exit door. That’s what a standard 404 error page does: the site goes dark and the only thing a visitor sees is the back button or the close button on the browser.

Custom 404 Error Pages

Now imagine that instead of the store going dark, the customer sees a friendly associate who politely says “Sorry, we recently rearranged our store and the aisle you are looking for has been moved. May I show you the new location of the aisle?” (or some flavor of it). The friendly associate on the web is called a “custom 404 error page (message)”, which will say “Sorry, the page you are looking for does not exist anymore or has been moved; here are a few links that might help you” (or some flavor of it).
A custom 404 error page allows the site to provide a message other than a generic server error message (Figure 1). A custom 404 is an opportunity for sites to engage visitors whom they might otherwise have lost.

How do you create a custom 404 error page?

Create a page with the message that you want your visitors to see when they encounter a 404 error and save it as 404.html (you can use other names and page extensions as well). Web servers have a setting that lets you choose the page visitors see when they encounter a 404 error. In this case you would set it to 404.html. (Contact your IT department or hosting company for further details.)
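The exact setting depends on your web server, so the following is only a sketch of the idea and not a production setup. It uses Python's built-in `http.server` module and assumes a file named 404.html sits in the directory being served. The names `Custom404Handler`, `load_404_body` and `serve` are made up for illustration. The handler sends the custom page as the body while still returning the 404 status code, so browsers and search engines can tell the page is missing.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

CUSTOM_404 = Path("404.html")  # assumed file name, as suggested in the text
FALLBACK_BODY = b"<h1>Sorry, that page has moved or no longer exists.</h1>"


def load_404_body(page=CUSTOM_404):
    """Return the custom page's bytes, or a short fallback if the file is missing."""
    try:
        return page.read_bytes()
    except OSError:
        return FALLBACK_BODY


class Custom404Handler(SimpleHTTPRequestHandler):
    """Serve files as usual, but answer missing pages with the custom 404 page."""

    def send_error(self, code, message=None, explain=None):
        if code != 404:
            # Leave every other error to the default behaviour.
            return super().send_error(code, message, explain)
        body = load_404_body()
        self.send_response(404)  # keep the real status code
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(body)


def serve(port=8000):
    """Serve the current directory on localhost until interrupted."""
    ThreadingHTTPServer(("localhost", port), Custom404Handler).serve_forever()
```

Calling `serve()` from the folder that holds your pages and then requesting a URL that does not exist should show the custom page instead of the generic server message.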


Here is an example of a custom 404 error page

There are several ways to customize your 404 error page. Be creative when designing the 404 page; this is your last chance to reengage a visitor. (I will show you some more examples in a future post.)

Web Analytics and 404 error page

Another benefit of creating a custom 404 page is that you can put your web analytics tag on the page to report on and analyze the 404 pages. Web Analytics reports can show you the pages (links) that are causing 404 error messages on your site. You can also find out which pages contain the bad links and which keywords, external links etc. are driving users to those non-existent pages.

Tracking 404 pages in Web Analytics

Here is an example of Google Analytics Code to track the 404 pages

This code prepends “404:” to the name of the page that triggered the 404 error, so that it is easy for me to filter the Google Analytics reports for the 404 error pages.
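The naming convention itself does not depend on any one tool. As a rough sketch (this is not the Google Analytics code, and the function name `tracked_page_name` is made up for illustration), here is how the page name could be built in Python on the server before it is handed to whatever analytics tag you use:

```python
from urllib.parse import urlsplit

PREFIX_404 = "404:"  # the prefix described in the text


def tracked_page_name(requested_url):
    """Build the page name to report for a request that ended in a 404."""
    parts = urlsplit(requested_url)
    name = parts.path or "/"
    if parts.query:
        # Keep the query string so different broken URLs stay distinguishable.
        name += "?" + parts.query
    return PREFIX_404 + name


print(tracked_page_name("http://example.com/old/page.html?id=3"))
# prints: 404:/old/page.html?id=3
```

Only the custom 404 page should use this name; regular pages keep their normal page names.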

The same concept can be used in the other web analytics tools such as Omniture, Webtrends, Unica, Coremetrics etc.

There are two reports that I frequently use to analyze the 404 pages:
  1. Top Content

    Since I prefixed my 404 pages with “404:”, I can easily filter this report down to the 404 pages. The report gives me all the pages that are triggering 404 error messages. It also shows me how big the problem is and whether I am losing visitors on these pages.
    If your custom 404 page is unable to engage the visitors (high exit rate or bounce rate) then you should consider changing the content/design etc. of the page. (I am looking into how you can conduct A/B testing on a 404 page).

    You can also drill down into each page and do further navigational analysis to see the pages that the visitors saw before they got the 404 error page.


    This leads you to the pages that have old/misspelled links. To track down the external links and sources that have bad links to your site, you will need to look at the top landing pages report.
  2. Top Landing Pages


    A filter on “404:” in this report will show you the landing pages that result in 404 errors. Use this report to drill down to the external sources of errors, e.g. external links, keywords etc. Below is an example of a report showing that most of the 404 errors for a page on this site came from links in emails.


    Further analysis of the emails led me to the malfunctioning links.
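If you export a report instead of filtering it inside the tool, the same “404:” filter is easy to apply yourself. The sketch below is in Python and assumes each exported row has `page`, `source` and `entrances` fields. Those field names and the sample rows are made up for illustration and will differ from tool to tool.

```python
from collections import Counter


def is_404_page(page_name):
    """True for page names that carry the "404:" prefix."""
    return page_name.startswith("404:")


def errors_by_source(rows):
    """Total the 404 entrances per traffic source, largest first."""
    totals = Counter()
    for row in rows:
        if is_404_page(row["page"]):
            totals[row["source"]] += row["entrances"]
    return totals.most_common()


# Made-up rows, only to show the shape of the input.
sample_rows = [
    {"page": "404:/old/offer.html", "source": "email", "entrances": 40},
    {"page": "404:/old/offer.html", "source": "search", "entrances": 7},
    {"page": "/index.html", "source": "search", "entrances": 300},
    {"page": "404:/promo.html", "source": "email", "entrances": 2},
]

print(errors_by_source(sample_rows))
# prints: [('email', 42), ('search', 7)]
```

A source that dominates the list, like the emails in my example, is the first place to look for the bad links.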
Do you have 404 error stories or examples to share? Send them to me.

Questions? Comments?


6 comments:

  1. It's important to mention that from an SEO perspective your 404 pages need to actually return a 404 header code, and not a 200, or via a 301/2 to the pretty custom 404 page in question.

    - Chris

  2. Hi Anil

    Thanks for a good post. It prompted me to look at our own 404 Page. While it is custom, it's a long way from where it could be.

    Thanks for the nudge!

  3. Anil
    Nice post, but I'm wondering why you didn't mention that you should always enforce the basic SEO rule of using a 301 redirect if you change urls on your site so that Search Engines maintain the "seo juice" attached to the original?
    Of course, this also offers a better user experience and would remove the need for the custom 404 page in many instances as the user would be automatically redirected out of the old toothbrush aisle and into the new one.

  4. Great post on a topic I hadn't thought of for GA. Two questions--on the tracking code example, it looks like it would prepend 404: to all pages. Is there something there I'm missing that would only prepend to actual 404 errors? Also, is this a secondary GA profile used to manage this or would you do this in the main profile? Thanks.

  5. Anonymous, 1:15 PM

    I am wondering what Google likes websites to do with content that is no longer available. If a company can no longer supply a certain product, does Google hate it when the page is just deleted from the site? Do too many 404 errors count against the site?
    Thanks

  6. Good post Anil.

    To include HTTP 404 (or any other status code) in your Unica NetInsight error analysis simply include a name-value pair of:

    sc=404

    The 404 value can be replaced with whichever relevant error code has occurred.

    Thank you,

    Lee Isensee
    Unica/Coremetrics, IBM Companies


I would like to hear your comments and questions.