Google Data Studio report to monitor and solve 404 page not found errors

Google Data Studio can help you address many dashboarding, reporting, and data storytelling challenges. You can also use it for technical monitoring.

In this article, we will create a simple Google Data Studio report based on Google Analytics data to monitor and fix “page not found” loads served by the web application. To be operationally efficient, a simple flat list of 404 URLs will not be enough… definitely not…

As usual, the right combination of Data Studio features will help us build an interactive and responsive dashboard that lets us dig into the available and, above all, prepared data, so that we quickly get the first pieces of information needed to solve our technical problem.


The goal


Many things can cause “page not found” errors:

  • Unpublishing content without considering the side effects on the digital ecosystem that uses it
  • Launching a traffic acquisition campaign with corrupted landing URLs
  • A bad URL redirection plan for SEO purposes
  • Broken internal links
  • An outdated XML product feed used by partners

With Google Analytics, in most cases, these errors are easy to identify. We just need a report that supports the following reading steps:

  1. Focus on the peak of errors on a daily or hourly basis, compared with the usual, and therefore acceptable, number of errors
  2. Filter by the groups of causes explaining pages not found:
    • External links (traffic acquisition & off-site problem)
    • Internal links (content contribution & on-site problem)
    • Direct (marginal, typing mistake or major technical problem)
  3. Identify the requested URLs that do not correspond to an available page
  4. Get a first level of information to launch corrective actions with the right people (writers, SEO agency, traffic manager…)
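
To make the first reading step concrete, here is a minimal Python sketch of what “a peak compared with the usual number of errors” could mean outside Data Studio. It assumes the daily “page not found” counts have been exported as a dictionary; the function name, the sample values and the threshold (3 times the median) are illustrative assumptions, not something the report itself computes.

```python
from statistics import median


def find_error_peaks(daily_counts, factor=3.0):
    """Return (date, count) pairs whose count exceeds factor x the median.

    daily_counts: dict mapping an ISO date string to the number of
    "page not found" page views for that day (hypothetical export format).
    factor: arbitrary illustrative threshold, to be tuned per site.
    """
    if not daily_counts:
        return []
    baseline = median(daily_counts.values())  # the "usual" daily level
    return [
        (day, count)
        for day, count in sorted(daily_counts.items())
        if count > factor * baseline
    ]


# Made-up sample: one day clearly stands out from the usual level.
sample = {
    "2020-03-01": 12,
    "2020-03-02": 9,
    "2020-03-03": 11,
    "2020-03-04": 85,
    "2020-03-05": 10,
}
peaks = find_error_peaks(sample)  # only 2020-03-04 is flagged
```

In the report, the time series chart with its trend line plays this role visually; the sketch only shows the kind of comparison the eye makes.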

Let’s take a quick look at the result in action.


The result



Google Analytics pre-requisites


We will use a basic Google Analytics (GA) measurement plan that provides the required data. We just need a page view tag, and we must confirm that our web application behaves as follows, so that the related data is collected by Google Analytics:

  • The GA page view tag is triggered when a “page not found” error occurs
  • The requested URL is kept without redirection, and is therefore stored in the GA Page dimension
  • A dedicated page title is used for “page not found” errors (“Page not found”, “404 Page error”, “Page unavailable”…), and is therefore stored in the GA Page Title dimension

If your application doesn’t meet the points above, you can adapt the measurement plan (tracking) to inject the required data with the page view tag.
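
As an illustration of the data the page view must carry, the sketch below builds the fields of a Universal Analytics Measurement Protocol page view hit in Python. It is only a way to visualise the requirement, not the tagging method used in this article: the function name, the tracking ID placeholder and the sample values are assumptions, and the hit is built but never sent.

```python
from urllib.parse import parse_qs, urlencode


def build_not_found_pageview(requested_path, hostname, tracking_id, client_id,
                             title="Page not found"):
    """Build the URL-encoded body of a page view hit for a 404 page.

    The requested path is kept as-is (no redirection) and a dedicated
    title is set, matching the prerequisites listed above.
    """
    payload = {
        "v": "1",               # Measurement Protocol version
        "tid": tracking_id,     # GA property ID (placeholder in the example)
        "cid": client_id,       # anonymous client ID
        "t": "pageview",        # hit type
        "dh": hostname,         # document hostname -> Hostname dimension
        "dp": requested_path,   # document path -> Page dimension
        "dt": title,            # document title -> Page Title dimension
    }
    return urlencode(payload)


# Hypothetical values, for illustration only.
hit = build_not_found_pageview("/old-product", "shop.example.com",
                               "UA-XXXXX-Y", "555")
fields = parse_qs(hit)  # decode the body back to check what GA would receive
```

Whatever the tagging technique, the point is the same: the Page dimension must hold the URL that was actually requested, and the Page Title must be the dedicated error title.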


Configuration


Now, let’s create our report, composed of:

  • A time series chart with daily errors
  • A donut chart with the trigger groups of error loads
  • A table chart with the requested URL, drilled down by error details, with the number of related errors
  • A filter to isolate only the page views corresponding to “page not found” loads
  • 7 calculated fields to manage page not found details

Calculated fields


Let’s configure the calculated fields. We will combine the Previous Page Path and acquisition dimensions to identify the situation in which the error occurred and, from there, to provide the most relevant information.

I have used this “responsive” dimension principle in the past for a simple dimension switch driven by the selection of another dimension (an example is in this Twitter thread). Here, it is a little more complicated because we use intermediary dimensions that aggregate several pieces of data.

So, let’s create these calculated fields, using the formulas and names below. You can also adapt each formula depending on which data you consider important to help resolve each kind of cause of the “page not found” error.

  • “PNF – Trigger” field for the donut chart to create 3 groups of errors (PNF prefix for “page not found”)
CASE
WHEN Previous Page Path = "(entrance)" AND Source = "(direct)" THEN "Direct"
WHEN Previous Page Path = "(entrance)" AND Source != "(direct)" THEN "External link"
ELSE "Internal link"
END
  • “PNF – Detail for internal link” field
CONCAT("Internal: ",Previous Page Path)
  • “PNF – Detail for referral link” field
CONCAT("Referral: ",Source,Referral Path)
  • “PNF – Detail for social link” field
CONCAT("Social: ",Source)
  • “PNF – Detail for campaign link” field
CONCAT("Campaign: ",Default Channel Grouping," / ",Source, " / ",Medium, " / ",Campaign, " / ",Ad Content)
  • “PNF – Detail for organic search” field
CONCAT("Organic: ", Hostname, Page)
  • “PNF – Detail” field. It is the one finally used by the table chart, and it calls the 5 previous fields.
    NB: Functions such as CONCAT cannot be combined inside CASE WHEN statements. That’s why we use intermediary calculated fields.
CASE
WHEN Previous Page Path != "(entrance)" THEN PNF – Detail for internal link
WHEN Previous Page Path = "(entrance)" AND Source = "(direct)" THEN "Direct"
WHEN Previous Page Path = "(entrance)" AND Medium = "organic" THEN PNF – Detail for organic search
WHEN Previous Page Path = "(entrance)" AND Default Channel Grouping = "Social" THEN PNF – Detail for social link
WHEN Previous Page Path = "(entrance)" AND Medium = "referral" THEN PNF – Detail for referral link
ELSE PNF – Detail for campaign link
END
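
To check the branching logic of the “PNF – Trigger” and “PNF – Detail” fields before building the charts, it can be mirrored in a few lines of Python and run against some exported rows. This is only a sketch of the same decision order: the function names and dictionary keys are hypothetical, and the sample row is made up.

```python
ENTRANCE = "(entrance)"


def pnf_trigger(row):
    """Mirror of the "PNF – Trigger" CASE: three groups of errors."""
    if row["previous_page_path"] != ENTRANCE:
        return "Internal link"
    if row["source"] == "(direct)":
        return "Direct"
    return "External link"


def pnf_detail(row):
    """Mirror of the "PNF – Detail" CASE, evaluated in the same order."""
    previous = row["previous_page_path"]
    if previous != ENTRANCE:
        return "Internal: " + previous
    if row["source"] == "(direct)":
        return "Direct"
    if row["medium"] == "organic":
        return "Organic: " + row["hostname"] + row["page"]
    if row["channel_grouping"] == "Social":
        return "Social: " + row["source"]
    if row["medium"] == "referral":
        return "Referral: " + row["source"] + row["referral_path"]
    # Everything else is treated as a campaign, as in the ELSE branch.
    return " / ".join([
        "Campaign: " + row["channel_grouping"],
        row["source"], row["medium"], row["campaign"], row["ad_content"],
    ])


# Made-up row: a paid campaign landing on an unpublished page.
campaign_row = {
    "previous_page_path": "(entrance)", "source": "google", "medium": "cpc",
    "channel_grouping": "Paid Search", "campaign": "spring_sale",
    "ad_content": "banner_a", "hostname": "shop.example.com",
    "page": "/old-product", "referral_path": "(not set)",
}
internal_row = dict(campaign_row, previous_page_path="/home")
```

Because the conditions are evaluated from top to bottom, the order matters: an organic visit is caught before the Social and referral tests, and only what remains falls into the campaign detail.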

Filter


We need a filter based on the value of the “Page Title” dimension.

For this example, I use the GA demo account, which collects data from googlemerchandisestore.com. The page title used for errors is “Page Unavailable”.

So I use a simple chart filter:
Include / Page Title / Equal to / Page Unavailable
And name it:
“GA – Page Title – Page Unavailable”


Charts


Now that our filter and fields are ready, we can create our 3 charts using, at least, the configurations below:

  • Time series chart
    • Data
      • Your GA view as data source
      • Date as dimension
      • Pageviews as metric and rename it “Page not found loads” for a better understanding of the context
      • Use “GA – Page Title – Page Unavailable” as time series filter
      • Turn on interactions filter
    • Style
      • Show points if you wish to click on each date easily
      • Show trend line
      • Do not show the chart header; it is not necessary for this chart
  • Donut chart
    • Data
      • Your GA view as data source
      • PNF – Trigger as dimension
      • Pageviews as metric and rename it “Page not found loads”
      • Use “GA – Page Title – Page Unavailable” as pie chart filter
      • Turn on interactions filter
      • Turn off sorting
    • Style
      • Always show chart header
      • Adapt the donut chart following your report design
  • Table chart
    • Data
      • Your GA view as data source
      • Page as 1st dimension & rename it “Requested page path” for a better understanding of the context
      • PNF – Detail as 2nd dimension & rename it “Page not found – Detail”
      • Turn on “Drill down” option and check that “Requested page path” is selected as Default drill down level
      • Pageviews as metric & rename it “Page not found loads”
      • Show the summary
      • Sort by Pageviews, descending
      • Use “GA – Page Title – Page Unavailable” as table filter
      • Turn off interactions filter
    • Style
      • Show header
      • Table body: Wrap text
      • Show pagination
      • Display metric as bar with number
      • Always show chart header
      • Adapt the table chart following your report design
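
For readers who want to cross-check the table chart against an export, the sketch below reproduces what the filter and the table compute together: keep only the rows whose page title is the error title, sum the page views per requested page, and sort in descending order. The function name, the dictionary keys and the sample rows are assumptions made for the example.

```python
from collections import Counter

ERROR_TITLE = "Page Unavailable"  # error title of the GA demo account


def page_not_found_loads(rows, error_title=ERROR_TITLE):
    """Sum page views per requested page for error pages, sorted descending."""
    totals = Counter()
    for row in rows:
        # Same rule as the filter: Include / Page Title / Equal to
        if row["page_title"] == error_title:
            totals[row["page"]] += row["pageviews"]
    return totals.most_common()


# Made-up rows, for illustration only.
rows = [
    {"page": "/old-product", "page_title": "Page Unavailable", "pageviews": 7},
    {"page": "/home", "page_title": "Home", "pageviews": 120},
    {"page": "/typo-url", "page_title": "Page Unavailable", "pageviews": 2},
    {"page": "/old-product", "page_title": "Page Unavailable", "pageviews": 3},
]
table = page_not_found_loads(rows)
```

The drill-down level of the table chart would correspond to grouping the same filtered rows by the “PNF – Detail” value instead of the page.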

The report is ready to test!



Final thoughts and limitations


This solution has some limitations that you may need to take into account:

  • The Google Analytics attribution model (last non-direct click) can make a traffic source persist and so skew this report to some extent. But the approach is based on error peaks, so it should be fine for this usage.
  • This report is based on the Page dimension. For cardinality and content uniqueness purposes, some technical query parameters may be removed, or the string case may be changed, in your GA master view (through view settings and view filters). So in some cases, the URLs stored in the GA master view don’t reflect the exact technical URLs that cause the error. If you are in this situation, you should use a raw data view (backup view) for this kind of reporting… but for that, you need to have applied some well-known GA good practices.
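
The second limitation is easier to see with an example. The Python sketch below approximates what a master view could do to a requested URL when a query parameter is excluded and a lowercase filter is applied. It is a simplified model under stated assumptions: the function name, the excluded parameter “sessionid” and the sample URL are hypothetical, and the exact matching rules of GA are not reproduced.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def as_stored_in_master_view(page, excluded_params=("sessionid",),
                             lowercase=True):
    """Approximate the Page value after typical master view clean-up."""
    parts = urlsplit(page)
    # Drop the query parameters excluded in the (hypothetical) view settings.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in excluded_params
    ]
    cleaned = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    # A lowercase view filter would then change the string case.
    return cleaned.lower() if lowercase else cleaned


requested = "/Products/Item?SKU=AB12&sessionid=xyz"
stored = as_stored_in_master_view(requested)
```

With these assumptions, the stored value no longer matches the URL that actually failed, which is why a raw data view is the safer source when the exact technical URL matters.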
