I'm looking to produce a set of reports based on a dataset that could grow relatively large (40M+ rows). We were planning to use Import mode with a nightly scheduled refresh from Databricks.
SMEs often need to drill down to row-level detail, either in the reports or via queries against the semantic model connection.
What are the main things we need to consider to ensure the model and reports are performant for users? What measures can be employed to keep it quick to use? I was planning on using Import mode; would it be beneficial to use DirectQuery in conjunction with database indexes/partitions in some cases?
I've been considering producing a full semantic model for all the data, and then a second model that is exactly the same but contains only a subset of the data, e.g. the last 12 months, so that it is smaller and can be used for row-level drill-down and analysis.
Is there a way to easily produce a secondary model which is aligned to the first (all the same underlying objects, relationships, etc) but is just filtered down to a subset of the data? How can I easily keep the models otherwise aligned (other than the quantity of data involved)?
First, I do not see any need here for a secondary model with a subset of data.
Import mode is the default for performance, especially with scheduled refreshes and DAX optimizations, and is a great starting choice. But it has memory limits (especially on Pro vs. Premium capacity) and can slow down with high-cardinality columns or frequent drill-downs.
In general, it's less about the size of the model and more about how the model is constructed.
Make sure you are doing these things:
Please mark this post as solution if it helps you. Appreciate Kudos.
Hi @henry_gonsalves ,
Thank you @andrewsommer for the helpful insights!
In addition to andrewsommer's suggestion, since you are working with 40M+ rows and SME users need frequent drill-downs, it is worth considering a few additional strategies. Partitioning your fact tables by date and enabling incremental refresh can help reduce memory usage and refresh duration.
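For reference, here is a minimal sketch of the Power Query filter step that incremental refresh relies on. It assumes a hypothetical DateTime column named OrderDate in the Databricks fact table; RangeStart and RangeEnd are the reserved DateTime parameters Power BI fills in at refresh time, defined inline here only so the sketch evaluates on its own.

```
let
    // Placeholders so the sketch runs standalone; in a real model these must be
    // DateTime parameters named exactly RangeStart and RangeEnd (Manage Parameters)
    RangeStart = #datetime(2025, 1, 1, 0, 0, 0),
    RangeEnd = #datetime(2025, 2, 1, 0, 0, 0),

    // Stand-in for your existing Databricks source query
    Source = #table(
        type table [OrderId = number, OrderDate = datetime, Amount = number],
        {
            {1, #datetime(2024, 12, 15, 0, 0, 0), 100.0},
            {2, #datetime(2025, 1, 10, 0, 0, 0), 250.0}
        }
    ),

    // Keep only rows inside the refresh window; against the real source this filter
    // should fold to Databricks so each partition loads only its own slice of data
    FilteredRows = Table.SelectRows(
        Source,
        each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd
    )
in
    FilteredRows
```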
Your idea of a secondary model filtered to recent data is a viable approach when you need faster performance for detail-level analysis. You can keep it aligned with your main model by reusing the same dataflows or Lakehouse views and cloning the original model as a .pbit file.
If real-time access becomes important, DirectQuery could work, but with some performance trade-offs, so backend optimization is key. Tools like the Monitoring Hub and DAX Studio can also help you fine-tune refresh and query performance.
Refer to the documents here for more information:
https://learn.microsoft.com/en-us/power-bi/create-reports/desktop-templates
https://learn.microsoft.com/en-us/power-bi/create-reports/desktop-performance-analyzer
https://learn.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview
Hope this resolves your query. If so, consider accepting it as a solution.
Regards,
Pallavi.
Hi @henry_gonsalves ,
Could you please confirm if the issue has been resolved on your end? If a solution has been found, it would be greatly appreciated if you could share your insights with the community. This would be helpful for other members who may encounter similar issues.
Thank you for your understanding and assistance.
Hi @henry_gonsalves ,
Following up to check whether you got a chance to review the suggestion given. I hope it helps. If so, consider accepting it as a solution; that will help other members of the community with similar problems find an answer faster. Glad to help.
Thank you.
Hi @henry_gonsalves ,
Has the issue been resolved on your end? If so, please share your solution and mark it as "Accept as Solution." This will assist others in the community who are dealing with similar problems and help them find a solution more quickly.
Thank you.
Thanks Pallavi. Is there a better way to filter the first dataset down to produce the second? Would using a template as above generate a second dataset?
Hi @henry_gonsalves ,
Thank you for the follow-up. I would be happy to assist you!
A .pbit template will create a new dataset when published to the Power BI service, retaining the model structure without data. When you apply a filter (such as "last 12 months") in Power Query, only that subset is loaded into the new dataset. To keep the models aligned, reuse the same dataflows or Lakehouse views, apply the filter in Power Query (e.g., = Table.SelectRows(Source, each [Date] >= Date.AddMonths(DateTime.LocalNow(), -12))), and consider using deployment pipelines for synchronization. Learn more here: Create and use Power BI template files (.pbit).
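To make that concrete, here is a minimal sketch of the filtered query in the 12-month copy, assuming the shared source exposes a DateTime column named Date (adjust to your actual schema). The only difference from the full model is the final filter step, so the tables and relationships stay identical.

```
let
    // Stand-in for the query both models share (same dataflow / Lakehouse view)
    Source = #table(
        type table [Date = datetime, Amount = number],
        {
            {#datetime(2023, 1, 1, 0, 0, 0), 10.0},
            {#datetime(2025, 6, 1, 0, 0, 0), 20.0}
        }
    ),

    // Extra step that exists only in the subset model: keep the trailing 12 months
    Last12Months = Table.SelectRows(
        Source,
        each [Date] >= Date.AddMonths(DateTime.LocalNow(), -12)
    )
in
    Last12Months
```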
I hope this helps. If so, consider accepting it as a solution.
Regards,
Pallavi.
Thanks Andrew. We were planning on doing those things anyway, so that's good to know.
Interested by point 4. Can you speak to the performance difference of explicit vs implicit measures?