Google Data Studio

Google Data Studio – Dynamic cluster of dimension values for interactive statistics grouping

Of course, nothing is better than getting cleaned, organized and grouped data to build beautiful dashboards with Google Data Studio and other data viz tools.

But messy data is also our daily lives. At some point (the best reason), because we are in charge actually to clean it or to create an efficient measurement plan.

Data Studio can also help you – without coding skills – to explore and test any data, at any stage of data analysis. And data exploration can significantly be accelerated with home made tools.

As example, the capability to cluster dynamically existing data is possible with Data Studio. It will help you to identify trends and apply the better options for the future.

How many times you read a table chart, saw the high number of values (with the related long tail of descending metric values) and suspect that a cluster of values with a common denominator could be significant, driving your analysis or your way to organize your data? 🧐

Let’s build a simple & interactive report that help you to cluster your data and create soft grouping inside Data Studio!


Google Data Studio Dynamic cluster of dimension values for interactive statistics grouping

Ingredients

For this post, I handle data from demo account of Google Analytics, using Landing Page dimension with Sessions metric, but you can do exactly the same thing with any other data set. Just take care of the consistency of the applied aggregation on the metric.

So, to build this exploration report, we will combine the following elements:

  • Your main data source providing dimension values and related metrics you need to explore and cluster
  • Some parameters to fill grouping rules and labels (3 groups for the example)
  • A calculated field to take in account dynamic grouping inside our charts
  • A main chart to see the evolution of statistics after your dynamic grouping, with the related
  • A bar chart as a secondary chart to see the impact on the number of values and the long tail

The result

Like keywords, URLs is a good example of dimensions with a high cardinality of values, requiring to create groups to understand if some suspected clusters can get significant proportion of stats, changing the trends, and so the analysis.

Let’s see the result.



1. Parameters

Let’s start the configuration!

First, add the following 5 text parameters – permitting any value – to your main data source. Here their names and IDs:

  • Group #1 label / group_1_label
  • Group #1 rule / group_1_rule
  • Group #2 label / group_2_label
  • Group #2 rule / group_2_rule
  • Group #3 – auto extract / group_3

Data Studio parameters for dynamic cluster of dimension values for interactive statistics grouping

2. Calculated fields

Then, we create a calculated field on our data source to use it as the main dimension of your charts, overwriting the original value of the selected dimension (here Landing page):

  • For the Group #1 and #2, we check if the related parameters are not empty and if the rule corresponds to a regular expression, included in the dimension value, if it is the case, we overwrite the dimension value with the filled label
  • For the Group #3, we auto extract, inside the value, what is filled as a regular expression and create one group by different extraction.
    Warning! To keep simple the use of REGEXP_EXTRACT function, the related parameter should be, at least, filled with a string between parenthesis, or with a real regular expression with capturing group

Grouped landing page
CASE
 WHEN Group #1 rule != "" AND Group #1 label != "" AND REGEXP_CONTAINS(Landing Page,Group #1 rule) THEN CONCAT("# Group: ",Group #1 label)
 WHEN Group #2 rule != "" AND Group #2 label != "" AND REGEXP_CONTAINS(Landing Page,Group #2 rule) THEN CONCAT("# Group: ",Group #2 label)
 WHEN Group #3 - auto extract != "" AND REGEXP_CONTAINS(Landing Page, Group #3 - auto extract) THEN CONCAT("# Group: ", REGEXP_EXTRACT(Landing Page,Group #3 - auto extract))
 ELSE Landing Page
 END

3. Settings panel of the report

Now our fields are ready, let’s build the settings panel we will use to create our clusters dynamically regarding explored data.

We need to insert in the report page our 5 parameters through input box controls, decorated with some section labels and graphic elements representing the cascade of the 3 available rule slots.


Google Data Studio report - Settings panel with input box parameters

4. Charts

At last, we create 2 charts.

The first one is our main chart for exploration works, here a table chart with main metric displayed with bars:

  • Data tab
    • Data source: Your main data source (here GA demo account)
    • Dimension: Grouped landing page
    • Metric: Sessions
      (selecting in my example percent of total as comparison calculation)
    • Sort by Sessions descending
  • Style tab
    • Show row numbers (Table Body section)
    • Show pagination (Table Footer section)
    • Display bar and show number for the first metric
    • (optional) To highlight the groups, in my case, I also applied a conditional formatting for rows with clusters with yellow background

Then, we create a small vertical bar chart to follow the trend of our initial long tail, applying progressively some clusters.

  • Data tab
    • Data source: Your main data source (here GA demo account)
    • Dimension: Grouped landing page
    • Metric: Sessions
    • Sort by Sessions descending
  • Style tab
    • Vertical bars
    • High number of bars
    • A flashy color
    • Axes: Do not show axes
    • Legend: Do not show legend

Configuration review

Not sure about a step described above, don’t worry, it’s time to make a full review of the configuration with the video below.



That’s all. Now, it is your turn to apply this report for your specific context. Custom adaptations are endless. Make this report yours.


“Toi, plus moi, plus eux, plus tous ceux qui le veulent, Plus lui, plus elle…”