Deepak_
Helper I

Best way to handle large dataset in Power BI

Hi Community Members,

 

We are currently handling a single semantic model of close to 5 GB that contains only D365 CRM data. Once the ERP data is integrated, this dataset may grow to 10 GB or more, since it will hold 10 years of historical data.

We are going to build reports on this dataset. Is there any way we can handle it more efficiently in Power BI?

Thanks in advance!

1 ACCEPTED SOLUTION
Martin_D
Super User

Hi @Deepak_ ,

 

With large semantic models, this size is not a problem per se: https://learn.microsoft.com/en-us/power-bi/enterprise/service-premium-large-models

 

Nevertheless, you can apply best practices to keep your dataset small and fast. What is possible and reasonable for your dataset depends on what you have already optimized and what your requirements are.
Besides the official recommendations (https://learn.microsoft.com/en-us/power-bi/guidance/import-modeling-data-reduction), I'd like to add:

  • Talk with your users about the actual requirements. I frequently see excessively large data models full of information that was kept only because the developers never really clarified what users need, or because the users didn't know themselves. In that case, I'd recommend starting small with what is absolutely needed now, keeping the data models as clean, extendable star schemas, and adding more as needed. Don't expect to build the one large final dataset; a BI project is never finished anyway.
  • Consider splitting the large dataset into smaller datasets, or provide datamarts with smaller datasets derived from the large dataset at the data warehouse level.
  • Don't use text columns such as the Dataverse GUID columns for relationships. Replace GUIDs with integer surrogate keys for relationships (see the sketch after this list). If you need to display a GUID somewhere, you can keep it in the table on the "one" side of the relationship, but don't use it for the relationship itself. If it isn't needed, remove GUID columns completely.
  • Here are two real-life examples of what was achieved by reasonably balancing business requirements and costs:
  • Make it a competition for yourself to see how small you can get the datasets, not how large.
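
To make the GUID-replacement idea above concrete, here is a minimal sketch in Python/pandas with entirely hypothetical table and column names (in practice you would do this in the data warehouse or in Power Query, but the steps are the same): add an integer surrogate key to the dimension, map it onto the fact table, and drop the GUID from the fact table.

```python
import pandas as pd

# Hypothetical dimension ("one" side) and fact ("many" side) tables.
dim_account = pd.DataFrame({
    "accountid": ["0f8fad5b-d9cb-469f-a165-70867728950e",
                  "7c9e6679-7425-40de-944b-e07fc1f90ae7"],
    "accountname": ["Contoso", "Fabrikam"],
})
fact_sales = pd.DataFrame({
    "accountid": ["0f8fad5b-d9cb-469f-a165-70867728950e",
                  "7c9e6679-7425-40de-944b-e07fc1f90ae7",
                  "0f8fad5b-d9cb-469f-a165-70867728950e"],
    "amount": [100.0, 250.0, 75.0],
})

# 1. Create an integer surrogate key on the dimension.
dim_account["account_key"] = range(1, len(dim_account) + 1)

# 2. Map the fact table's GUID to the new key, then drop the GUID there.
fact_sales = (
    fact_sales
    .merge(dim_account[["accountid", "account_key"]], on="accountid", how="left")
    .drop(columns=["accountid"])
)

# 3. Relate the tables on account_key in the model. Keep the GUID on the
#    dimension only if it must be displayed; otherwise drop it there as well.
print(dim_account)
print(fact_sales)
```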

BR

Martin


Thanks @Martin_D! This helps.

My client wants to keep all of their data in one dataset, so the data size keeps growing. The connection we are using for the reports is a live connection, so if I do some data cleanup, for example removing columns or rows that are not used in a given report, other reports may be affected. That's why I'm looking for any possible workaround.

You can use this tool to analyse which columns all reports in the tenant actually use: https://en.brunner.bi/measurekiller

 

Local .pbix files that users have not published are not captured. But even with existing reports in place, you can ask your users about their requirements and which columns they actually use. Of course, capturing all columns in use after you have already made everything available is harder than growing the dataset more restrictively from the beginning.

 

Some of the concepts above can also be applied to existing datasets without affecting existing reports, like using integer-typed relationship columns.
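
As a rough, back-of-the-envelope illustration (Python/pandas, synthetic data) of why integer keys beat GUID text as relationship columns: a GUID is a wide, unique string on every row, while a surrogate key is a small integer. The VertiPaq engine compresses data very differently from pandas, so the absolute numbers below are only an analogy, but the width-and-cardinality argument carries over.

```python
import uuid

import numpy as np
import pandas as pd

n = 1_000_000  # one million fact rows (synthetic)

# The same relationship column stored as GUID text vs. as an integer surrogate key.
guid_col = pd.Series([str(uuid.uuid4()) for _ in range(n)], name="accountid")
int_col = pd.Series(np.arange(n, dtype="int64"), name="account_key")

print("GUID text column  :", guid_col.memory_usage(deep=True) // 1_000_000, "MB")
print("Integer key column:", int_col.memory_usage(deep=True) // 1_000_000, "MB")
```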
