Hi,
I have to collect a large amount of data from a database.
I know there is the option to connect the database directly to Power BI or to work in the cloud, but I don't have permission to connect to the database 😞
So, trying to figure out a workaround, I'm working in Power BI Desktop.
I'm exporting tables in csv.gz format and then importing them into my model.
I need all the data from the last 5 years, so some tables, such as the transaction ones, have 10+ million rows.
Downloading and importing data this way is quite time-consuming.
I have seen some videos on incremental refresh. I have a Pro license, but what I am trying to figure out is how to make it work with csv.gz files.
So how do I tell Power BI that the data in the CSV has to be appended to the existing table and that historical data must not be overwritten? Furthermore, if transaction 1234561 is already present in the historical data and also appears in the CSV extraction, I want Power BI to NOT import it again.
Thank you very much
To effectively manage datasets with 10M+ rows you will need some kind of database solution; incremental refresh needs to use query folding, and if you stick with CSV files it's always going to be slow.
If you go with the local database approach, you would just need a process to insert the data into your DB table.
I would add an extra column to the table for the insert date and use that for the incremental refresh process (see the sketch below).
If you need a free tool to help with the ETL, KNIME might be worth looking at: https://www.knime.com/
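A minimal sketch of that load step, assuming a local SQLite database purely for brevity (any local database you can reach through a gateway follows the same pattern); the table, column, and file names are all hypothetical:

```python
import csv
import gzip
import sqlite3
from datetime import date

# Hypothetical names: adjust the table, columns, and path to your export.
DB_PATH = "local_warehouse.db"
HISTORY_CSV_GZ = "transactions_history.csv.gz"

conn = sqlite3.connect(DB_PATH)
conn.execute("""
    CREATE TABLE IF NOT EXISTS transactions (
        transaction_id   TEXT PRIMARY KEY,  -- natural key from the source system
        amount           REAL,
        transaction_date TEXT,
        insert_date      TEXT               -- load date, drives incremental refresh
    )
""")

# Read the gzipped CSV export and stamp every row with today's load date.
with gzip.open(HISTORY_CSV_GZ, mode="rt", newline="", encoding="utf-8") as f:
    rows = [
        (r["transaction_id"], float(r["amount"]), r["transaction_date"],
         date.today().isoformat())
        for r in csv.DictReader(f)
    ]

conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
conn.commit()
conn.close()
```

In Power BI you would then point the incremental refresh policy's RangeStart/RangeEnd filter at the insert_date column.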
You could possibly set up a database and personal gateway on your local machine (if you have permission to do that).
Then use that DB as a source for the incremental refresh.
You would just need a method to append your new data to the local DB every time there is an update (a sketch follows below).
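Continuing the hypothetical SQLite setup sketched above, the daily append could lean on the PRIMARY KEY on transaction_id: INSERT OR IGNORE silently skips any transaction that is already in the historical table, so re-running an extract never imports a row twice.

```python
import csv
import gzip
import sqlite3
from datetime import date

# Hypothetical daily extract; same schema as the historical table above.
DB_PATH = "local_warehouse.db"
DAILY_CSV_GZ = "transactions_yesterday.csv.gz"

conn = sqlite3.connect(DB_PATH)
with gzip.open(DAILY_CSV_GZ, mode="rt", newline="", encoding="utf-8") as f:
    rows = [
        (r["transaction_id"], float(r["amount"]), r["transaction_date"],
         date.today().isoformat())
        for r in csv.DictReader(f)
    ]

# INSERT OR IGNORE relies on the PRIMARY KEY on transaction_id: a row whose
# ID already exists is skipped, so historical data is never duplicated.
conn.executemany(
    "INSERT OR IGNORE INTO transactions VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()
conn.close()
```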
Thanks for your reply.
Another question: if I can create a new file each day containing just the previous day's data, is there no way to append it to the historical table in Power BI, checking for duplicates, the way incremental refresh would?
Because currently, when I use a simple append, Power BI recalculates the appended table on every refresh, and with millions and millions of rows that takes a lot of time. I would just like to append each day's new data.