Hello Community,
I have a massive amount of data flowing from Google Analytics into BigQuery, and I want to ingest it into Power BI.
The table is at about 180 million rows for roughly 4 months of data, and the disk size is roughly 30 GB.
Some of the options I have explored are:
- Data partitioning - I do need all the data points in the table, and I can't partition by date either.
- Dataflow - I had it set up from BigQuery, but it took hours to load and timed out. I now have it connected to a PBI file, but that results in a PBIX of more than 10 GB. P2 capacity is available, but the concern is that the data is going to grow, and I'm not entirely sure of the limitations after that.
- DirectQuery - May end up quite high on the billing scale (BigQuery), given the enormous amount of data being queried each time by the whole audience.
- Rolling up data - As of now the lowest grain of the table is one row per session, and grouping or aggregating it by other dimensions is not an option, as it would lose value in terms of reporting.
With that said, what recommendations can you offer in this case?
Thank you in advance for all your help.
@parry2k Please refer this to anyone else you know who may be able to help, if this isn't specifically your domain.
Hi @Sam_Jain ,
Based on your requirements and the limitations you've mentioned, I would recommend considering the following options:
1. Incremental Refresh: Configure incremental refresh in Power BI to load only new or changed data, reducing the amount of data loaded during each refresh. This can help manage the growing data size and improve refresh performance. (Only part of the data is refreshed each cycle, which should also reduce BigQuery query costs.)
Incremental refresh in #PowerBI with data from BigQuery
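As a minimal sketch of that pattern in Power Query (M) — the project, dataset, table, and column names below are placeholders I'm assuming for illustration, not your actual schema — the essential pieces are the RangeStart/RangeEnd datetime parameters that incremental refresh requires, and a filter that folds back to BigQuery so each refresh only scans the new slice:

    let
        // Connect through the built-in BigQuery connector and navigate to the table
        // ("my-ga-project", "analytics_dataset", "sessions" are placeholder names)
        Source = GoogleBigQuery.Database(),
        Project = Source{[Name = "my-ga-project"]}[Data],
        Dataset = Project{[Name = "analytics_dataset"]}[Data],
        Sessions = Dataset{[Name = "sessions"]}[Data],
        // RangeStart and RangeEnd must be DateTime parameters defined in the PBIX.
        // Use >= on one boundary and < on the other so no row is loaded twice.
        Filtered = Table.SelectRows(
            Sessions,
            each [session_start] >= RangeStart and [session_start] < RangeEnd
        )
    in
        Filtered

Once the query filters on those parameters, you define the incremental refresh policy on the table (for example, archive 4 months and refresh only the last few days), and each scheduled refresh re-queries just the refresh window instead of all 180 million rows.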
2. Power BI Premium: If you haven't already, consider upgrading to Power BI Premium, which allows datasets to grow beyond 10 GB once the large dataset storage format is enabled. This can help accommodate the growing data size and provides additional features and performance improvements. (You already have P2 capacity.)
3. Optimize Data Model: Review your data model, remove any unnecessary columns or tables, and consider normalizing or splitting high-cardinality columns to reduce the overall size of the dataset, as in the sketch below.
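To make point 3 concrete, here is another hedged M sketch (again with placeholder column names I'm assuming for illustration): dropping unused columns and splitting one high-cardinality timestamp into separate date and time columns, which compress far better in the VertiPaq engine than a single datetime:

    let
        // Start from the session-grain query from the sketch above (placeholder name)
        Source = Sessions,
        // Drop wide text columns that no visual actually uses (example names)
        Trimmed = Table.RemoveColumns(Source, {"page_referrer", "user_agent"}),
        // Replace one high-cardinality datetime with a date column and a time column
        WithDate = Table.AddColumn(Trimmed, "session_date", each Date.From([session_start]), type date),
        WithTime = Table.AddColumn(WithDate, "session_time", each Time.From([session_start]), type time),
        Result = Table.RemoveColumns(WithTime, {"session_start"})
    in
        Result

Even keeping the session grain intact, trimming columns like this can shrink the in-memory model substantially, because import-mode size is driven more by column count and cardinality than by row count alone.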
Please note that these recommendations are based on general best practices and may not fully address your specific situation. It's essential to test and validate these approaches in your environment to ensure they meet your requirements.
Best Regards,
Gao
Community Support Team
If this post helps, then please consider accepting it as the solution to help other members find it more quickly. If I misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!
How to get your questions answered quickly -- How to provide sample data in the Power BI Forum