The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hi Team,
I experience huge performance issues while getting the data from Google Analytics. It seems that it live-reads it everytime I make a change in power query editor and indeed it reads more than 200mb....it is a killer.
I wonder how can I make it work faster or think of a place to store the data?
Please advise.
Thanks.
Bart
Solved! Go to Solution.
The reason for the performance issues when pulling data from Google Analytics is that Power BI performs a live read every time you make a change in the Power Query Editor. This means each update or modification triggers a full data retrieval, which can be quite heavy, especially when dealing with over 200 MB of data.
Here are some alternatives to improve performance or consider different ways to store the data:
1. **Reduce the Amount of Data Retrieved**: It’s best to filter and define which fields you actually need and what the relevant date range is. This way, you can reduce the amount of data being downloaded each time, significantly cutting down refresh time.
2. **Incremental Refresh**: Set up incremental refresh to fetch only new or updated data instead of reloading everything from scratch on each refresh. After the initial load, Power BI will only pull in new data, which can greatly reduce refresh times.
3. **External ETL Process**: Consider using an ETL process with external tools (like Azure Data Factory, Google Cloud Dataflow, or other ETL services) to move the data from Google Analytics to a more efficient storage option:
- **Database (Azure SQL / SQL Server)**: Store the data in a database for faster querying and better control over refresh schedules.
- **Cloud Storage (Azure Blob / Google Cloud Storage)**: Save the data as files (e.g., CSV, Parquet) and connect Power BI to those files. This can improve performance when working with a data lake approach.
- **Big Data Processing Services (Azure Databricks / Google BigQuery)**: Manage and process large datasets more efficiently before feeding them into Power BI.
4. **Dataflows in Power BI**: Create a Dataflow that handles the extraction and transformation of the data, so it’s cached and the pull from Google Analytics is lighter, improving performance when making changes.
5. **Scheduled Refresh**: Avoid live connections to Google Analytics and schedule refreshes at more convenient times (e.g., off-peak hours) so data loads are less frequent, and performance during development is smoother.
These are some ways to help improve performance and better manage large datasets in Power BI.
If this post helps, then please consider Accepting it as the solution to help the other members find it more quickly
Were you able to find a solution? As a workaround, maybe you can connect your GA data using an external connector which pulls data from the API. I've tried windsor.ai, supemetrics and funnel.io. I stayed with windsor because it is much cheaper so just to let you know other options. In case you wonder, to make the connection first search for the GA4 connector in the data sources list:
After that, just grant access to your GA4 account using your credentials, then on preview and destination page you will see a preview of your GA4 fields:
There just select the fields you need. Finally, just select PBI as your data destination and finally just copy and paste the url on PBI --> Get Data --> Web --> Paste the url.
Hi @bartek_pepper ,
Try unchecking the "Allow data previews to download in the background" option.
Best Regards,
Gao
Community Support Team
If there is any post helps, then please consider Accept it as the solution to help the other members find it more quickly.
If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!
How to get your questions answered quickly -- How to provide sample data in the Power BI Forum
If this helped, a Kudos 👍 or Solution mark would be great! 🎉
Cheers,
Kedar
Connect on LinkedIn
The reason for the performance issues when pulling data from Google Analytics is that Power BI performs a live read every time you make a change in the Power Query Editor. This means each update or modification triggers a full data retrieval, which can be quite heavy, especially when dealing with over 200 MB of data.
Here are some alternatives to improve performance or consider different ways to store the data:
1. **Reduce the Amount of Data Retrieved**: It’s best to filter and define which fields you actually need and what the relevant date range is. This way, you can reduce the amount of data being downloaded each time, significantly cutting down refresh time.
2. **Incremental Refresh**: Set up incremental refresh to fetch only new or updated data instead of reloading everything from scratch on each refresh. After the initial load, Power BI will only pull in new data, which can greatly reduce refresh times.
3. **External ETL Process**: Consider using an ETL process with external tools (like Azure Data Factory, Google Cloud Dataflow, or other ETL services) to move the data from Google Analytics to a more efficient storage option:
- **Database (Azure SQL / SQL Server)**: Store the data in a database for faster querying and better control over refresh schedules.
- **Cloud Storage (Azure Blob / Google Cloud Storage)**: Save the data as files (e.g., CSV, Parquet) and connect Power BI to those files. This can improve performance when working with a data lake approach.
- **Big Data Processing Services (Azure Databricks / Google BigQuery)**: Manage and process large datasets more efficiently before feeding them into Power BI.
4. **Dataflows in Power BI**: Create a Dataflow that handles the extraction and transformation of the data, so it’s cached and the pull from Google Analytics is lighter, improving performance when making changes.
5. **Scheduled Refresh**: Avoid live connections to Google Analytics and schedule refreshes at more convenient times (e.g., off-peak hours) so data loads are less frequent, and performance during development is smoother.
These are some ways to help improve performance and better manage large datasets in Power BI.
If this post helps, then please consider Accepting it as the solution to help the other members find it more quickly
Thanks a lot!
I tried incremental refresh but I did not have an option of the native query there.
I will discuss those suggestions with the business.
Highly appreciated!