
rbozz
New Member

GA4 User Aggregation in BigQuery Without Over-Counting in Power BI

Hi all,

I’m building a Power BI dashboard using GA4 data exported to BigQuery, and I’m struggling to accurately calculate the number of unique users over different time periods.

Since I’m working with a large dataset (over 200 million rows if I don’t aggregate user IDs), my approach was to aggregate users at a daily granularity to significantly reduce the dataset size. However, this leads to a major issue:

  • A single user can visit the website multiple times over different days.
  • When summing daily unique users over a larger period (e.g., weekly or monthly), the total ends up much higher than GA4’s reported user count due to duplicated users across days.
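To make the over-counting concrete, here is a minimal Python sketch. The `visits` data and both helper functions are hypothetical and exist only for illustration; they are not part of the GA4 export or of Power BI.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (visit_date, user_id) pairs standing in for GA4 events.
visits = [
    (date(2024, 1, 1), "u1"),
    (date(2024, 1, 1), "u2"),
    (date(2024, 1, 2), "u1"),
    (date(2024, 1, 3), "u1"),
    (date(2024, 1, 3), "u3"),
]


def daily_unique_users(rows):
    """Return {day: number of distinct users seen on that day}."""
    users_by_day = defaultdict(set)
    for day, user in rows:
        users_by_day[day].add(user)
    return {day: len(users) for day, users in users_by_day.items()}


def period_unique_users(rows):
    """Return the number of distinct users across the whole period."""
    return len({user for _, user in rows})


daily = daily_unique_users(visits)
sum_of_daily = sum(daily.values())         # counts a user once per active day
true_unique = period_unique_users(visits)  # counts each user once overall

print(sum_of_daily, true_unique)
```

With this sample data the daily counts sum to 5, while only 3 distinct users visited during the period, because `u1` is counted again on each of the three days.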

The only solution I’ve found so far is to keep the user ID in the final table, but this results in performance issues, slow queries, and frequent dataset refresh failures in Power BI.

Question:

Is there a way to aggregate users without carrying user IDs in the final table, while still maintaining accurate unique user counts over different time periods?

Any suggestions would be greatly appreciated! Thanks!

3 REPLIES
v-junyant-msft
Community Support

Hi @rbozz ,

Thanks to Sahir_Maharaj for the reply!
@rbozz, you can also try partitioning and clustering your BigQuery tables to speed up queries.
For example, you can partition the table by date and cluster it by user ID:

CREATE TABLE `your_project.your_dataset.your_table`
PARTITION BY date
CLUSTER BY user_id AS
SELECT
  date,
  user_id,
  -- other columns
FROM
  `your_project.your_dataset.your_source_table`


And if your dataset is too large to refresh entirely each time, consider using incremental refresh in Power BI. This way, only new data is loaded, reducing the refresh time and resource usage.
https://learn.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview 

Best Regards,
Dino Tao

Hi @v-junyant-msft,

Thanks for the input, but I have already clustered and partitioned the table in BigQuery, and query execution in BQ is not a problem. The issue is on the Power BI side: I tried to set up incremental refresh on my semantic model using the Google BigQuery connector, but it doesn't seem to work, because refresh times are very high and the refresh fails from time to time.
I also tried using a dataflow with incremental refresh. The incremental refresh on the dataflow side works great, but refreshing the semantic model connected to the dataflow has the same issue as before: a very long incremental refresh time.

 

I hope you can give me some other suggestions.

Best,

Riccardo 

Sahir_Maharaj
Super User

Hello @rbozz,

 

Can you please try this approach using pre-aggregation in BigQuery:

-- The GA4 export stores event_date as a 'YYYYMMDD' string, so parse it once.
-- APPROX_COUNT_DISTINCT is approximate; use COUNT(DISTINCT ...) for exact counts.
WITH Events AS (
  SELECT
    PARSE_DATE('%Y%m%d', event_date) AS event_day,
    user_pseudo_id
  FROM `your_project.analytics_XXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20241231'
),
DailyUsers AS (
  SELECT
    'day' AS granularity,
    FORMAT_DATE('%Y-%m-%d', event_day) AS period,
    APPROX_COUNT_DISTINCT(user_pseudo_id) AS unique_users
  FROM Events
  GROUP BY period
),
WeeklyUsers AS (
  SELECT
    'week' AS granularity,
    FORMAT_DATE('%Y-%W', event_day) AS period,
    APPROX_COUNT_DISTINCT(user_pseudo_id) AS unique_users
  FROM Events
  GROUP BY period
),
MonthlyUsers AS (
  SELECT
    'month' AS granularity,
    FORMAT_DATE('%Y-%m', event_day) AS period,
    APPROX_COUNT_DISTINCT(user_pseudo_id) AS unique_users
  FROM Events
  GROUP BY period
)
-- Each row is tagged with its granularity so the stacked result stays unambiguous.
SELECT * FROM DailyUsers
UNION ALL
SELECT * FROM WeeklyUsers
UNION ALL
SELECT * FROM MonthlyUsers;


Kind Regards,
Sahir Maharaj
