Google Data Studio

How to prepare data for overlap analysis with Venn diagram inside Google Data Studio

Overlap analysis with Google Data Studio preparing data and displaying them with Venn diagram

I always prefer preparing data outside Data Studio and I always prefer using builtin chart types when it is possible.

But sometimes, we need something fast, more or less disposable, addressing temporarily visualization needs, to get a first level of insights.

This article describes how to create quickly an overlap report, loading raw data from several data sets, preparing them to respect the required format of Venn diagram input and display them.


1. Concepts and requirements about overlap & Venn diagram

Overlap and Venn diagram considerations

Overlap report cannot be made with aggregated metrics. It requires an identifier, common to each data set, to allow to scan if every value is present (or not) in each data set.

On the other hand, to display disks and related intersections, the viz engine expects a metric value of each data set (disks), and in addition, metric value of each intersection. In other wording, preparing data, we must create also redundancy inside output data set for intersections purposes.


Queries overlap of organic results & Google Ads impressions

For this tutorial, I decide to use Google Search Console (GSC – site data) & Google Ads data of a website to quantify the number terms with impressions:

  • Specific to organic results
  • Specific to Ads
  • Common to both elements, present on the search engine result page (SERP) of Google…so the intersection of Ads disk and GSC disk

I keep it simple using just 2 data sources, but the way it is designed allows to include more sets of data (see the 2nd example at the end of the article).

So regarding what we said about Venn diagram considerations, in our example:

  • Our identifier is Search Term (And query from GSC)
  • Our metric is the count of Search Terms and queries

Of courses, this is just an example and the recipe can be applied to a lot of use cases requiring overlap visualization for clusters, audiences, segments…


2. The result


Overlap analysis through Venn diagram with Google Data Studio

3. Ingredients for the overlap report

First, let’s discover which elements we will build through this Google Data Studio tutorial:

  • Your 2 data sets from 2 segments or data sources. For the example :
    • Impressions from Google Ads account broken down by Search term
    • Impressions from Google Search Console, for site data, broken down by Query
  • Blended data tables with mentioned data to join common identifiers and also identify the ones specific to each data set
  • A custom visualization chart to draw Venn diagram table chart
  • Chart calculated fields, to calculate required dimension and metric, specific to Venn diagram data need


4. Blended data

Them, we blend data and name them “BD – Ads ans organic impressions by query” as described below for our example:

  • Left table – GSC site data
    • Table name: “GSC”
    • Dimension: “Query”
    • Metric: “Impressions”
  • Right table – Google Ads data
    • Table name: “Ads”
    • Dimension: “Search term”
    • Metric: “Impressions”
  • Join configuration
    • Operator: Full outer join
    • join keys: Query for GSC table and Search term for Ads table

Configuration of data blending for overlap analysis needs and related data model

5. Custom visualization for Venn diagram

Venn diagram is an efficient visualization to depict overlap. For our tutorial, I propose to test a free community visualization generously cooked by First but you can build your own custom visualization & maintain it, taking advantage of visualization libraries like D3 or Chart.js.

The more important is the graphic rendering of Venn diagram, easy & fast to understand.

Then, you must to allow the execution of custom visualization ressources for the active report and turn on community visualization access on your data sources.

But first, If you add a partner visualization for the first time, here, how to do it.




6. Chart configuration

Let’s configure the Venn diagram:

  • Char type: Our custom viz Venn diagram, freshly added on report page thanks to the previous step
  • Setup tab
    • Data source:
      • Your blended data “BD – Ads ans organic impressions by query”
    • Dimension:
      • A chart calculated field named “Zone” corresponding to formula below
    • Metric:
      • A chart calculated field named “Queries” corresponding to formula below
      • Select Count (CT) aggregation instead of Count distinct
        (theoretically, we don’t need count distinct for these data sets & it is more resource consuming)
    • Sort:
      • Queries (metric) descending
        (I’ve noticed bugs or difficulty to draw the viz if the sorting was different)
  • Style tab
    • Select the wished color for each disk
    • Hide text and put logos, or just show dimensions and / or metric values

Zone
REGEXP_EXTRACT(
 CONCAT(
     IF(Query IS NOT NULL,",GSC","")
     ,
     IF(Search term IS NOT NULL,",Google Ads","")
 )
 ,"^.(.*)$")

Queries
COALESCE(Query,Search term)

Setup of Venn diagram for overlap analysis inside Data Studio

7. Configuration review

As usual, just to be sure for having configured correctly the chart, let’s make a quick review.



8. Other example with Google Analytics

As already said, this tutorial can be applied for various use cases, including more data sets, the limit is just related to the number of tables & about data blending to join tables.

Here, another example with Google Analytics data, It is made thanks to a website measurement plan, collecting account identifiers for each interaction when the user is logged (user id). For this case, overlap chart with Venn diagram provides a picture of cross device usages with 3 disks, and data blending is a little bit different (same data source filtered with a star data model for table joins purposes).


Venn diagram with cross device users inside Google Data Studio

Tu en as de la Venn, j’ai la solution qu’il te faut !


Share