Google Data Studio

Google Data Studio – Calculated field : Number of occurrences of specific character

Let’s combine available Google Data Studio functions to count the number of occurrences of a specific character and use it as a grouping dimension.


Google Data Studio - Calculated field : Occurrences count of specific character(s)

Use cases

This calculated formula can be used for dimension values with high cardinality. It is based on the presence of specific meaningful character for splitting, so count grouping purposes:

  • Search query, Search engine keyword…: Counting space character + 1 to get the number of words -> for long tail analysis (example explained below)
  • URLs and paths (all data sets): Counting slash character of path to get depth of page in the site map -> for internal linking strategy, SEO indexation
  • Product Category (Enhanced Ecommerce) from Google Analytics data source: Counting slash character to get depth of products categories tree -> for catalog and product marketing analysis

Explained example

In the example below, we count the number of words composing GA site search queries to check if there is a correlation with the related conversion rates.



Calculated field formula

CAST(
	LENGTH(
		REGEXP_REPLACE(My keywords dimension,"[^ ]","")
	)+1
AS TEXT)

Explanations

The formula is made with 3 cascading of Google Data Studio functions :

  • REGEXP_REPLACE: We remove all characters not corresponding to our character separator. “[^ ]” regular expression means “not equals to SPACE character”
  • LENGTH: We count the remaining characters, so our separator
  • +1: For this specific example, we need to add 1 for word count purpose (no space character means that there is only 1 word)
  • CAST: We cast the result of the previous function cascades to TEXT.
    It is a way to force the creation of a dimension field, and not a metric as result of integer outcome of length function. Why here? Simply, to go fast:
    • The edition of field type is not available on the formula editor panel (…🤔…in my opinion, it should be…), but only on data source fields list
    • So using CAST function, I reduce my field creation flow to one action instead of 2, in 2 separate panels
    • > So positive results of my laziness 😩 are :
      Productivity (1 action instead of 2), and after all, coding determinism in formula (field typing like variables types where coding) 😎

1 mouton, 2 moutons, 3 moutons…zzz


PS: I’ve published the short gif version of this post on my Twitter account the 25th of april 2019: @wissi_analytics