Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Maerkus
Frequent Visitor

Best practices for optimizing performance when working with large fact and dimension tables

One of my customers has a data platform where they have a large number of fact and dimension tables with huge amounts of data. They are struggling with developing reports in Power BI desktop, as performance is dramatic and it often freezes. 

Question is, what kind of best practices are there when it comes to developing reports and what are they doing right and what are they doing wrong:

  • They load the fact and dimension tables they need as is, without any transformation in M query.
  • They already switched from import to directquery mode hoping this would enhance performance. 
  • The data comes from Azure Databricks.
  • They only load the fact tables and dimension tables they need and try to use relationships that represent an inner or left join.
  • They use relatively simple DAX queries.

At other companies, do they perhaps create per use case a specific view or dataset that is limited for that purpose only, so that Power BI only gets what is needed for that use case? Meaning all aggregation and such is done prior to doing visualization in Power BI? or are there other best practices to consider? 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @Maerkus 

What They Are Doing Right

  • Switching to DirectQuery Mode: This can be a good practice when dealing with very large datasets, as it prevents Power BI from loading the entire dataset into memory. Instead, queries are sent to the source database, which can be more efficient if the database is optimized for such queries.
  • Loading Only Necessary Tables: This is also a best practice. Loading only the fact and dimension tables needed for analysis reduces the complexity and size of the model, potentially improving performance.
  • Using Relationships: Properly defining relationships between tables is crucial for accurate and efficient data modeling in Power BI. It ensures that DAX calculations and visuals are based on the correct data context.

Areas for Improvement

  • No Transformation in M Query: While it's important to keep transformations efficient, not leveraging Power Query (M) to preprocess and clean data can lead to inefficient data models and slower reports. Preprocessing can remove unnecessary columns, filter rows, or perform early aggregations, which can significantly improve performance.
  • Simple DAX Queries: While simple DAX is generally good for performance, it's also important to ensure that DAX queries are optimized. Even simple DAX queries can be inefficient if not written properly. It’s vital to review and optimize DAX calculations for performance.

Best Practices and Suggestions

  1. Optimize Data Model: Streamline the data model by removing unnecessary columns, combining tables where possible, and ensuring that the model is as simple and flat as possible. This can improve both the performance and the ease of use in report development.

  2. Use Aggregated Tables: For large datasets, consider creating aggregated tables that summarize the data at a higher level. Power BI can automatically use these tables for visuals that don't require the detail level, improving performance.

  3. Incremental Data Loading: If using Import mode in some scenarios, consider implementing incremental data refreshes to reduce the volume of data loaded and processed during each refresh.

  4. Optimize DirectQuery: When using DirectQuery, it's crucial to optimize the source database for the queries Power BI will execute. This might involve creating indexes, optimizing query performance in the database, and minimizing the complexity of DAX queries that translate into complex SQL queries.

  5. Avoid Bi-directional Relationships: Unless absolutely necessary, avoid bi-directional relationships as they can cause performance issues and model complexity. Stick to single-direction relationships.

 

 

 

 

 

 

Best Regards,

Jayleny

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

 

View solution in original post

4 REPLIES 4
Anonymous
Not applicable

Hi @Maerkus 

What They Are Doing Right

  • Switching to DirectQuery Mode: This can be a good practice when dealing with very large datasets, as it prevents Power BI from loading the entire dataset into memory. Instead, queries are sent to the source database, which can be more efficient if the database is optimized for such queries.
  • Loading Only Necessary Tables: This is also a best practice. Loading only the fact and dimension tables needed for analysis reduces the complexity and size of the model, potentially improving performance.
  • Using Relationships: Properly defining relationships between tables is crucial for accurate and efficient data modeling in Power BI. It ensures that DAX calculations and visuals are based on the correct data context.

Areas for Improvement

  • No Transformation in M Query: While it's important to keep transformations efficient, not leveraging Power Query (M) to preprocess and clean data can lead to inefficient data models and slower reports. Preprocessing can remove unnecessary columns, filter rows, or perform early aggregations, which can significantly improve performance.
  • Simple DAX Queries: While simple DAX is generally good for performance, it's also important to ensure that DAX queries are optimized. Even simple DAX queries can be inefficient if not written properly. It’s vital to review and optimize DAX calculations for performance.

Best Practices and Suggestions

  1. Optimize Data Model: Streamline the data model by removing unnecessary columns, combining tables where possible, and ensuring that the model is as simple and flat as possible. This can improve both the performance and the ease of use in report development.

  2. Use Aggregated Tables: For large datasets, consider creating aggregated tables that summarize the data at a higher level. Power BI can automatically use these tables for visuals that don't require the detail level, improving performance.

  3. Incremental Data Loading: If using Import mode in some scenarios, consider implementing incremental data refreshes to reduce the volume of data loaded and processed during each refresh.

  4. Optimize DirectQuery: When using DirectQuery, it's crucial to optimize the source database for the queries Power BI will execute. This might involve creating indexes, optimizing query performance in the database, and minimizing the complexity of DAX queries that translate into complex SQL queries.

  5. Avoid Bi-directional Relationships: Unless absolutely necessary, avoid bi-directional relationships as they can cause performance issues and model complexity. Stick to single-direction relationships.

 

 

 

 

 

 

Best Regards,

Jayleny

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

 

Thank you for all your suggestions everyone, I will look into it further, but this answer confirms my thoughts and gives me some new ideas.

Daniel29195
Super User
Super User

@Maerkus 

 

how many rows does the fact contains ? 

 

and the dimensions  ? 

 

Syk
Super User
Super User

For performance, they shouldn't be brining in tables (or columns) that are not used in the report. It seems counter intuitive because "what if I want to see it this way?" always comes up. But, you need to design it for performance as well! For instance if you are designing a shoe and want it to be fast, you wouldn't include built-in snow shoes in case it might snow... If the use case calls for a fast snow shoe, just redesign (be sure to emphesize this isn't always easy!).

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.