Anonymous
Not applicable

What would be the best way to load a massive amount of data into Power BI?

I did some basic research; there are multiple ways to load data into Power BI. My challenge is to load millions, or even billions, of records into Power BI in the fastest and least intrusive way. Loading data by file isn't an option because it's simply too slow. On top of that, the data is also dynamic: new data is added every day. What would be the best way to do this?

Possible solutions I came up with:

 

DirectQuery: This is a pretty neat solution, as it's fast and can handle massive amounts of data.

 

API: We are working with an API that returns the latest data as JSON. In the past we just used this API, imported the result, and clicked Load; it takes about 80 minutes to import, and that's every day because of the new data being added. We can't keep doing this, because the import time will only keep growing as the data does.

 

Incremental API: The data changes every day, but it changes incrementally. Could this be used to refresh only the new data in Power BI and leave the old data untouched?
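The incremental idea can be sketched in a few lines: keep a watermark (the highest timestamp already loaded), ask the source only for rows newer than it, and upsert the response into the cached set. This is a minimal illustration, not any specific API; the field names (`id`, `updated`) and the merge-by-key behaviour are assumptions.

```python
def latest_watermark(records, field="updated"):
    """Highest timestamp seen so far; pass it to the source on the next
    run so only newer rows come back."""
    return max(r[field] for r in records) if records else None

def incremental_merge(cached, fresh, key="id"):
    """Upsert newly fetched rows into the cached set by key, so a
    re-delivered row replaces its old version instead of duplicating it."""
    by_key = {r[key]: r for r in cached}
    for r in fresh:
        by_key[r[key]] = r
    return list(by_key.values())

# Example: two cached rows; the API returns one updated row and one new row.
cached = [{"id": 1, "updated": "2024-01-01", "amount": 10},
          {"id": 2, "updated": "2024-01-02", "amount": 20}]
fresh  = [{"id": 2, "updated": "2024-01-03", "amount": 25},
          {"id": 3, "updated": "2024-01-03", "amount": 30}]

merged = incremental_merge(cached, fresh)
# merged has 3 rows; id 2 now carries amount 25
```

Power BI's built-in incremental refresh does the equivalent server-side: the `RangeStart`/`RangeEnd` parameters filter the source query so only the newest partition is re-imported on each refresh.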

 

Real-time streaming: From what I have seen, this way of loading data only allows a very small amount of data per request (max 15 MB), and on top of that it seems suited only for viewing live statistics rather than more in-depth data analysis. To my understanding it is used for visuals like "users registered in the last hour". I could be wrong here, though.
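If streaming were used anyway, the per-request cap means rows would have to be pushed in size-bounded batches. Here is a minimal sketch of that batching, assuming rows are JSON-serialisable dicts; the limit constant mirrors the cap mentioned above, and everything else is illustrative.

```python
import json

MAX_BYTES = 15 * 1024 * 1024  # per-request payload cap discussed above

def batch_rows(rows, max_bytes=MAX_BYTES):
    """Split rows into batches whose JSON-array payload stays under
    max_bytes. A single oversized row still gets its own batch."""
    batches, current, size = [], [], 2          # 2 bytes for the "[]" brackets
    for row in rows:
        row_bytes = len(json.dumps(row).encode("utf-8")) + 1  # +1 for the comma
        if current and size + row_bytes > max_bytes:
            batches.append(current)
            current, size = [], 2
        current.append(row)
        size += row_bytes
    if current:
        batches.append(current)
    return batches

# Toy limit so the splitting is visible: each {"v": i} row serialises to
# ~9 bytes, so at most 3 rows fit under a 30-byte budget.
batches = batch_rows([{"v": i} for i in range(10)], max_bytes=30)
# → batch sizes [3, 3, 3, 1]
```

Each batch would then be POSTed to the dataset's push URL; the batching logic itself is independent of the transport.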

 

Data refresh: "If you save your files on OneDrive for Business or SharePoint - Team Sites, you can then connect to or import them into Power BI. Then, your dataset, reports, and dashboard will always be up-to-date. Because both OneDrive and Power BI are in the cloud, Power BI can connect directly to your saved file. It connects about once every hour and checks for updates. The dataset and any visualizations refresh automatically if there are any updates." This could work, but there are a few limitations. However, I could possibly use multiple OneDrive files and split my entire database into, say, 5 files?

 

These are my possible solutions to tackle this problem. I am looking for advice from people who work with this kind of data every day: how is this done within your organisation? Thanks for taking the time to read my question; looking forward to your replies.

1 ACCEPTED SOLUTION
v-jingzhang
Community Support

Hi @Anonymous 

 

When you have millions or billions of records, it is not recommended to bring all of them into Power BI. Whether you use Import or DirectQuery, loading all the data would make it very slow. In addition, DirectQuery has a 1 million rows per query limit by default; see the following description (reference doc):

 

Limit of 1 million rows returned on any query: There's a fixed limit of 1 million rows placed on the number of rows that can be returned in any single query to the underlying source. This limit generally has no practical implications, and visuals themselves aren't going to display that many points. However, the limit can occur in cases where Power BI isn't fully optimizing the queries sent, and there's some intermediate result being requested that exceeds the limit. It can also occur while building a visual, on the path to a more reasonable final state. For example, including Customer and TotalSalesQuantity would hit this limit if there were more than 1 million customers, until some filter were applied.

 

The 1 million row limit can be exceeded in Premium capacities. For more information, see max intermediate row set count.

 

You may try the suggestions from the following blogs regarding large datasets in Power BI.

Change your approach with large datasets in Power BI | by Salvatore Cagliari | Towards Data Science

Power BI — How to Fit 200 Million Rows in Less than 1GB | by Nikola Ilic | Towards Data Science

 

Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.


3 REPLIES
mufy_cw
New Member

The link shared does not solve my problem. Every month I receive a data set with 5 years of data. There are 4 id columns (Zone, Region, City, Metro), then 60 columns of sale value, 60 columns of number of units sold, and 60 columns of quantity sold: about 185 columns in total. Each group of 60 columns has the first date of every month as its header. There are approximately 19 lakh (1.9 million) rows, split across 4 Excel files by zone, which I merge using Power Query.

The real challenge is that this is wide-format data. When I unpivot (I need the date in rows instead of in headers, and I also need to create a calendar table), the number of rows explodes to approximately 300+ million. I also have another table with product division and product code. Each product can be sold by multiple divisions, so I don't create a relationship, otherwise the amount would get divided among the three divisions. Since I want to see the same sale amount for each division, I left join my wide table with this product table, so now the total is approximately 378 million rows.

Power BI imports the data, but the refresh takes over 2.5 hours. Is there a faster way of doing it? I cannot partition or filter the data: I need Zone, Metro, City, and Region together for use in slicers, and I cannot filter the month columns because I create 5 MAT, 5 YTD, 13 quarter, and 25 month aggregations in DAX. Can anyone suggest a more efficient way of handling such data? Thank you.
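The unpivot step described above can be sketched in plain Python to show why the row count multiplies: each wide row becomes one long row per month. This assumes measure columns named `<measure>_<month>` (e.g. `sales_2024-01`); the naming scheme and field names are illustrative, not from the original files.

```python
def unpivot(records, id_cols, measure_prefixes):
    """Turn wide rows (one column per month per measure) into long rows
    (one row per month, one column per measure)."""
    long_rows = []
    for rec in records:
        # Collect the distinct month suffixes present on this record.
        months = sorted({col.split("_", 1)[1] for col in rec
                         if any(col.startswith(p + "_") for p in measure_prefixes)})
        for m in months:
            row = {c: rec[c] for c in id_cols}
            row["month"] = m
            for p in measure_prefixes:
                row[p] = rec.get(f"{p}_{m}")
            long_rows.append(row)
    return long_rows

# One wide row with two measures over two months becomes two long rows.
wide = [{"zone": "N", "sales_2024-01": 10, "sales_2024-02": 12,
         "units_2024-01": 1, "units_2024-02": 2}]
long_rows = unpivot(wide, ["zone"], ["sales", "units"])
```

This is the same transformation Power Query's Unpivot Columns performs; with 60 month columns per measure, every source row becomes 60 long rows, which is why 1.9 million wide rows grow to hundreds of millions.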

Ashe_-123
Frequent Visitor

Power BI applies a limit of 1 million rows per query sent to the database in DirectQuery mode. This means that any visual can only retrieve up to 1 million rows at a time from the data source. How do we analyse hundreds of millions of rows?

