Google Data Studio – Calculated field : Number of occurrences of specific character
Let’s combine available Google Data Studio functions to count the number of occurrences of a specific character and use it as a grouping dimension.
Use cases
This calculated formula can be used for dimension values with high cardinality. It is based on the presence of specific meaningful character for splitting, so count grouping purposes:
- Search query, Search engine keyword…: Counting space character + 1 to get the number of words -> for long tail analysis (example explained below)
- URLs and paths (all data sets): Counting slash character of path to get depth of page in the site map -> for internal linking strategy, SEO indexation
- Product Category (Enhanced Ecommerce) from Google Analytics data source: Counting slash character to get depth of products categories tree -> for catalog and product marketing analysis
- …
Explained example
In the example below, we count the number of words composing GA site search queries to check if there is a correlation with the related conversion rates.
Calculated field formula
CAST(
LENGTH(
REGEXP_REPLACE(My keywords dimension,"[^ ]","")
)+1
AS TEXT)
Explanations
The formula is made with 3 cascading of Google Data Studio functions :
- REGEXP_REPLACE: We remove all characters not corresponding to our character separator. “[^ ]” regular expression means “not equals to SPACE character”
- LENGTH: We count the remaining characters, so our separator
- +1: For this specific example, we need to add 1 for word count purpose (no space character means that there is only 1 word)
- CAST: We cast the result of the previous function cascades to TEXT.
It is a way to force the creation of a dimension field, and not a metric as result of integer outcome of length function. Why here? Simply, to go fast:- The edition of field type is not available on the formula editor panel (…🤔…in my opinion, it should be…), but only on data source fields list
- So using CAST function, I reduce my field creation flow to one action instead of 2, in 2 separate panels
- > So positive results of my laziness 😩 are :
Productivity (1 action instead of 2), and after all, coding determinism in formula (field typing like variables types where coding) 😎
1 mouton, 2 moutons, 3 moutons…zzz
PS: I’ve published the short gif version of this post on my Twitter account the 25th of april 2019: @wissi_analytics