I did some basic research and found there are multiple ways to load data into Power BI. My challenge is to load millions to billions of records into Power BI in the fastest and least intrusive way. Loading the data from a file isn't an option, because it's simply too slow. On top of that, the data is dynamic: new data is added every day. What would be the best way to do this?
Possible solutions I came up with:
DirectQuery: This is a pretty neat solution, as it's fast and can handle massive amounts of data.
API: We are working with an API that returns the latest JSON. In the past we simply imported the full response and clicked Load, which takes about 80 minutes, and we had to do it every day because of the new data. We can't keep doing this, because the import will only take longer as the data grows.
Incremental API: The data changes every day, but it changes incrementally. Could this be used to refresh only the new data in Power BI and leave the old data in place?
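The incremental idea above can be sketched in a few lines: keep a watermark of the most recent record already loaded, and on each refresh fetch or filter only records newer than it. This is a minimal illustration, not Power BI's built-in incremental refresh; the record shape and the "created" field are hypothetical.

```python
from datetime import date

def new_records_since(records, watermark):
    """Return only the records added after the watermark date."""
    return [r for r in records if r["created"] > watermark]

# Watermark saved from the previous refresh (illustrative value).
last_refresh = date(2024, 1, 31)

records = [
    {"id": 1, "created": date(2024, 1, 30)},  # already loaded last time
    {"id": 2, "created": date(2024, 2, 1)},   # new since the last refresh
]

# Only the delta needs to be appended to the existing data.
delta = new_records_since(records, last_refresh)
```

In practice the filtering would ideally happen on the API side (e.g. a "modified since" query parameter), so only the delta ever travels over the wire.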
Real-time streaming: From what I have seen, this way of loading data only allows a very small amount of data per request (reportedly 15 MB max), and on top of that it seems suited to viewing live statistics rather than more in-depth data analysis. To my understanding, it's used for visuals like the number of users registered in the last hour, for example. I could be wrong here, though.
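If streaming were used despite the per-request cap, rows would have to be pushed in batches that each stay under the size limit. The sketch below splits rows into JSON-payload-sized batches; the limit value is taken from the figure quoted above and should be checked against the current documentation, and the row fields are illustrative.

```python
import json

# Assumed per-request payload cap (hedged: verify the real limit).
MAX_BYTES = 15 * 1024 * 1024

def batch_rows(rows, max_bytes=MAX_BYTES):
    """Split rows into batches whose JSON array stays under max_bytes."""
    batches, current, size = [], [], 2  # 2 bytes for the enclosing "[]"
    for row in rows:
        row_bytes = len(json.dumps(row)) + 1  # +1 for the separator comma
        if current and size + row_bytes > max_bytes:
            batches.append(current)
            current, size = [], 2
        current.append(row)
        size += row_bytes
    if current:
        batches.append(current)
    return batches

rows = [{"user_id": i, "ts": "2024-02-01T00:00:00"} for i in range(1000)]
# Small limit here just to show the batching behaviour.
batches = batch_rows(rows, max_bytes=4096)
```

Note that a single row larger than the limit would still form its own (oversized) batch; real code would need to reject such rows.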
Data refresh: "If you save your files on OneDrive for Business or SharePoint - Team Sites, you can then connect to or import them into Power BI. Then, your dataset, reports, and dashboard will always be up-to-date. Because both OneDrive and Power BI are in the cloud, Power BI can connect directly to your saved file. It connects about once every hour and checks for updates. The dataset and any visualizations refresh automatically if there are any updates." This could work, but there are a few limitations. However, I could possibly split my entire database across multiple OneDrive files, say five files, for example?
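The "split across several files" idea amounts to partitioning the rows into N buckets, one per file. A minimal sketch, assuming a simple round-robin partitioning and illustrative field names (the 5-way split is just the example number from above):

```python
def split_into_buckets(rows, n_buckets=5):
    """Partition rows round-robin into n_buckets lists (one per file)."""
    buckets = [[] for _ in range(n_buckets)]
    for i, row in enumerate(rows):
        buckets[i % n_buckets].append(row)
    return buckets

rows = [{"id": i, "value": i * 10} for i in range(12)]
# Each bucket would then be written out as its own OneDrive file.
buckets = split_into_buckets(rows, 5)
```

In practice a partition key (e.g. date range) would be preferable to round-robin, so each daily update only touches one file.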
These are my possible solutions to tackle this problem. I am looking for advice from people who work with this kind of data every day: how is this done within your organisation? Thanks for taking the time to read my question; looking forward to your replies.
Solved! Go to Solution.
Hi @Anonymous
When you have millions or billions of records, it is not recommended to bring all of them into Power BI. Whether you use Import or DirectQuery, loading all the data would be very slow. In addition, DirectQuery has a 1-million-rows-per-query limit by default; see the following description (reference doc):
Limit of 1 million rows returned on any query: There's a fixed limit of 1 million rows placed on the number of rows that can be returned in any single query to the underlying source. This limit generally has no practical implications, and visuals themselves aren't going to display that many points. However, the limit can occur in cases where Power BI isn't fully optimizing the queries sent, and there's some intermediate result being requested that exceeds the limit. It can also occur while building a visual, on the path to a more reasonable final state. For example, including Customer and TotalSalesQuantity would hit this limit if there were more than 1 million customers, until some filter were applied.
The 1 million row limit can be exceeded in Premium capacities. For more information, see max intermediate row set count.
You may try the suggestions from the following blogs regarding large datasets in Power BI.
Change your approach with large datasets in Power BI | by Salvatore Cagliari | Towards Data Science
Power BI — How to Fit 200 Million Rows in Less than 1GB | by Nikola Ilic | Towards Data Science
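The common thread in that advice is to aggregate at the source before import, so Power BI only receives one row per reporting grain rather than every detail row. A hedged sketch of the idea, with illustrative field names (in a real setup this aggregation would run in the source database or ETL layer, not in Python):

```python
from collections import defaultdict

def aggregate_sales(detail_rows):
    """Collapse detail rows to one row per (customer, month)."""
    totals = defaultdict(float)
    for row in detail_rows:
        totals[(row["customer"], row["month"])] += row["quantity"]
    return [
        {"customer": c, "month": m, "total_quantity": q}
        for (c, m), q in sorted(totals.items())
    ]

detail = [
    {"customer": "A", "month": "2024-01", "quantity": 2},
    {"customer": "A", "month": "2024-01", "quantity": 3},
    {"customer": "B", "month": "2024-01", "quantity": 1},
]
# Three detail rows become two summary rows for import.
summary = aggregate_sales(detail)
```

Hundreds of millions of transactions typically collapse to a few hundred thousand summary rows at a monthly grain, which Import mode handles comfortably.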
Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.
Power BI applies a limit of 1 million rows per query sent to the database in DirectQuery mode. This means any visual can retrieve at most 1 million rows at a time from the data source. How do we analyse hundreds of millions of rows?