Hi Everyone,
I am looking for options to address the following scenario:
We connect to a table in SQL Server and pull the data into Power BI Desktop in 'Import' mode. The table has nearly 100 million records. Given the volume, I am trying to see whether there is any option in Power BI to filter and import only a subset (i.e., the last 4 years of data).
Also, as the data will continue to grow, I suspect some steps are needed to ensure the model does not run out of memory in the future.
It would be great if you could suggest some options for handling such a high data volume in 'Import' mode.
Thanks,
PK
Hi @Anonymous
If you want to publish your report to the Power BI service, you can try incremental refresh.
Here is the official documentation; I hope it helps.
For reference: Incremental refresh for datasets
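As a rough sketch of how the filtering side of incremental refresh looks in Power Query (assuming a hypothetical dbo.FactSales table with an OrderDate datetime column; RangeStart and RangeEnd must first be created as DateTime parameters in Power Query):

```
let
    // Hypothetical server and database names
    Source = Sql.Database("MyServer", "MyDatabase"),
    Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    // Filter on the incremental refresh window; Power BI substitutes the
    // parameter values at refresh time, and this step folds back to SQL Server.
    // Using >= and < avoids duplicate rows at the partition boundaries.
    Filtered = Table.SelectRows(Sales, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
in
    Filtered
```

Once the query filters on RangeStart/RangeEnd, you can define the incremental refresh policy on the table in Power BI Desktop and publish to the service.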
Best Regards,
Rico Zhou
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
Thanks @Greg_Deckler for your prompt response.
I have never had to import 100 million records into Power BI before, so I wanted to make sure I consider all the factors before diving in. I will try it and see how it goes.
The 'Import' option is our implementation approach owing to the high volumes. Aggregated source tables are not an option, and hence DirectQuery is off the table.
Thanks again,
PK
@Anonymous I would also recommend the native query option, which ensures the data is filtered before it even reaches your Power Query window. This way you can reduce the number of records, and the amount of data, you will be importing into your Power BI model. As advised by @Greg_Deckler, try the native query option; I have used it personally on a few of my projects. I wrote the query in a query browser against the database to test the output, and then used those queries in the native query window in Power BI. There is a sketch after the links below.
Try going through the links below to learn more about the native query option:
https://docs.microsoft.com/en-us/power-query/native-database-query
https://blog.pragmaticworks.com/power-bi-checking-query-folding-with-view-native-query
https://www.mssqltips.com/sqlservertip/4563/power-bi-native-query-and-query-folding/
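As a minimal sketch of the idea (server, database, table, and column names here are placeholders, not your actual schema):

```
let
    Source = Sql.Database("MyServer", "MyDatabase"),
    // The SELECT runs on the server, so only the filtered rows ever reach Power BI.
    // EnableFolding = true lets subsequent Power Query steps keep folding on top of it.
    Result = Value.NativeQuery(
        Source,
        "SELECT OrderDate, CustomerKey, Qty, Amount
         FROM dbo.FactSales
         WHERE OrderDate >= DATEADD(year, -4, CAST(GETDATE() AS date))",
        null,
        [EnableFolding = true]
    )
in
    Result
```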
Proud to be a Super User!
@Anonymous I've done Import models at that scale before. It is feasible, but there are many factors that go into it. For something of that scale, I would definitely recommend doing all of your data manipulation upstream of Power Query, because you really want query folding to take place. Right-click the last step in your query and make sure View Native Query is active. Also, you will likely want to avoid or minimize calculated DAX columns, because those will also slow down load times.
So, for example, if you have Qty Sold and Unit Price and all you really care about is the Sale dollar amount, do the calculation in your view ([Qty Sold] * [Unit Price]) rather than in Power Query or DAX, and include only the Sale column, not the other two. A sketch follows.
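If creating a view is not an option, a similar effect can be achieved by pushing the calculation into a native query so the two source columns never enter the model (a sketch with placeholder names, not your actual schema):

```
let
    Source = Sql.Database("MyServer", "MyDatabase"),
    // The multiplication happens in SQL Server; only the Sale column is imported
    Result = Value.NativeQuery(
        Source,
        "SELECT OrderDate, [Qty Sold] * [Unit Price] AS Sale FROM dbo.FactSales",
        null,
        [EnableFolding = true]
    )
in
    Result
```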
@Anonymous This is a broad subject. Importing 100 million records into Power BI is not necessarily a problem unless you run into size limitations, such as the 1 GB dataset limit in Pro. In Premium, you are limited only by the amount of memory in your capacity.
Ruling out DirectQuery since you specify Import, I would recommend creating a view in your SQL database that holds the last four years. You should have a clustered index on that view.
Beyond that, you need to look at the columns that comprise your data and identify high-cardinality columns that are unlikely to compress well. If you don't need those, get rid of them in the view. I also definitely recommend turning off Auto date/time, as it will bloat your model.
Now, you don't have to use a SQL view; you could use Power Query instead, but you are then likely to break "Native Query" (query folding) behavior, and that would not be good for speed.
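If you do go the Power Query route, simple filter and column-selection steps like the following can still fold back to SQL Server, though you should verify that via View Native Query (table and column names here are placeholders):

```
let
    Source = Sql.Database("MyServer", "MyDatabase"),
    Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    // Keep only the columns the model needs; this step folds
    Pruned = Table.SelectColumns(Sales, {"OrderDate", "CustomerKey", "Sale"}),
    // Filter to the last four years; confirm folding via View Native Query
    LastFourYears = Table.SelectRows(Pruned, each [OrderDate] >= Date.AddYears(DateTime.LocalNow(), -4))
in
    LastFourYears
```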
Is there a reason DirectQuery is off the table? DirectQuery models with aggregation tables can be quite fast.