alisha3495
Regular Visitor

Azure databricks with direct lake semantic model

Hi,

 

We are currently evaluating two approaches: mirroring our Databricks Unity Catalog to Fabric, or materialising our gold/platinum-level tables into a Lakehouse within Fabric.

 

Is there a difference in query performance and in the CUs consumed when creating a Direct Lake semantic model with either approach?

 

We are finding that when we perform a few interactive operations at the same time in reports connected to the model, performance is slow and eventually stops completely because we hit 100% of capacity, even though we are on an F8. I have scaled up to an F16 but am not sure whether this will solve the problem.

 

Using the Capacity Metrics app, it is really difficult to analyse what exactly is causing the spikes, other than knowing that it is an interaction with the reports.

 

What we don't know is whether the model would perform better if the tables were materialised in a Lakehouse rather than mirrored.

2 ACCEPTED SOLUTIONS
tayloramy
Community Champion

Hi @alisha3495

 

You can drill through to the timepoint details page in Capacity Metrics to get item-level detail on exactly which artifacts are using capacity: https://learn.microsoft.com/en-us/fabric/enterprise/metrics-app-timepoint-page

 

You can also use the estimator to get a rough idea of what size capacity you may need: https://www.microsoft.com/en-us/microsoft-fabric/capacity-estimator

 

At my org we're doing something similar on a fairly small Databricks environment (though small/large is a little subjective; I have 50 TB databases in my org, so I may be biased in thinking it is small). We are using an F128, which is keeping up well, but I couldn't imagine trying to do it on an F8 or F16.

 

If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.


v-kpoloju-msft
Community Support

Hi @alisha3495,

Thank you for reaching out to the Microsoft Fabric Community Forum, and thanks to @tayloramy and @lbendlin for their input on this thread.

When using Direct Lake with mirrored Databricks Unity Catalog tables, queries still read from the Databricks source, which can increase capacity unit (CU) consumption under heavy interaction.

If you materialize your gold/platinum tables into a Fabric Lakehouse, Direct Lake can access them natively within OneLake, improving performance and reducing CU spikes, especially when tables are compacted, partitioned, and optimized (Z-order).

In the Fabric Capacity Metrics app, check the Compute and Operations pages to confirm whether the spikes come from semantic model queries.
https://learn.microsoft.com/en-us/fabric/enterprise/metrics-app 
https://learn.microsoft.com/en-us/fabric/enterprise/metrics-app-install?tabs=1st 

Import or pre-aggregate the most-used summary tables (Direct Lake + Import composite models). Reduce the number of high-cardinality visuals on the same page, or use slicers with "Apply" button mode.
https://learn.microsoft.com/en-gb/fabric/fundamentals/direct-lake-overview 
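To make the pre-aggregation point concrete, here is a purely illustrative plain-Python sketch (the sample rows and the `pre_aggregate` helper are made up for this example; in Fabric you would do this in a notebook or pipeline against real tables). A summary table collapses many detail rows into one row per grouping key, so report visuals scan far fewer rows:

```python
from collections import defaultdict

# Hypothetical detail rows: (region, product, sales_amount).
detail_rows = [
    ("EU", "A", 10.0), ("EU", "A", 5.0), ("EU", "B", 7.5),
    ("US", "A", 3.0), ("US", "B", 12.0), ("US", "B", 8.0),
]

def pre_aggregate(rows):
    """Collapse detail rows to one row per (region, product) key."""
    totals = defaultdict(float)
    for region, product, amount in rows:
        totals[(region, product)] += amount
    return totals

summary = pre_aggregate(detail_rows)
# Visuals now scan 4 summary rows instead of 6 detail rows; at
# warehouse scale the reduction is typically orders of magnitude.
print(summary[("EU", "A")])  # 15.0
```

The same idea applies at any scale: the CU cost of an interactive query tracks the amount of data it has to scan and aggregate on the fly, so shifting that aggregation to load time trims the interactive spikes.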

Copy your key gold/platinum tables into a Fabric Lakehouse and optimize their file structure (compaction, partitioning, Z-order). Then compare performance using the same report interactions.
https://learn.microsoft.com/en-us/fabric/mirroring/azure-databricks 
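Why partitioning helps can be sketched in plain Python (the file listing and `files_to_scan` function are illustrative stand-ins, not a Fabric or Delta API): when a table is partitioned on a commonly filtered column, the engine can skip whole files instead of reading everything.

```python
# Hypothetical file listing for a Delta-style table partitioned by month.
# Each entry: (partition_value, file_path). Names are illustrative only.
files = [
    ("2025-01", "part=2025-01/f1.parquet"),
    ("2025-01", "part=2025-01/f2.parquet"),
    ("2025-02", "part=2025-02/f1.parquet"),
    ("2025-03", "part=2025-03/f1.parquet"),
]

def files_to_scan(files, partition_filter=None):
    """Return the files a query must read; a partition filter prunes the rest."""
    if partition_filter is None:
        return [path for _, path in files]  # full scan
    return [path for part, path in files if part == partition_filter]

print(len(files_to_scan(files)))             # 4 files without a filter
print(len(files_to_scan(files, "2025-02")))  # 1 file with partition pruning
```

Compaction works on the other axis: fewer, larger files mean less per-file overhead for the same data, which is why both are recommended before comparing the two approaches.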

Scaling from F8 → F16 gives more headroom, but it’s best done after confirming your tables and queries are tuned efficiently.
https://learn.microsoft.com/en-us/fabric/enterprise/scale-capacity 
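The headroom gained from scaling is linear: a Fabric F SKU supplies capacity units equal to its number (an F8 supplies 8 CUs, an F16 supplies 16). A quick sketch of the arithmetic (the 30-second window is illustrative, not the exact smoothing behaviour):

```python
# A Fabric F SKU supplies CUs equal to the SKU number: F8 -> 8, F16 -> 16.
def cu_seconds_available(sku_cus: int, window_seconds: int) -> int:
    """Total CU-seconds a capacity can absorb over a time window."""
    return sku_cus * window_seconds

# Over an illustrative 30-second window:
f8 = cu_seconds_available(8, 30)    # 240 CU-seconds
f16 = cu_seconds_available(16, 30)  # 480 CU-seconds

# Doubling the SKU doubles headroom: a workload that saturates 100%
# of an F8 would sit near 50% on an F16 - linear, not magic.
print(f8, f16, f16 / f8)
```

This is why tuning the tables and queries first matters: scaling only buys a constant factor, while pruning and pre-aggregation can cut the work itself.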

Hope this clears it up. Let us know if you have any doubts regarding this. We will be happy to help.

Thank you for using the Microsoft Fabric Community Forum.

 


5 REPLIES 5
v-kpoloju-msft
Community Support
Community Support


Hi @alisha3495,

Just checking in to see if the issue has been resolved on your end. If the earlier suggestions helped, that's great to hear! And if you're still facing challenges, feel free to share more details; happy to assist further.

Thank you.

Hi @alisha3495,

Just wanted to follow up. If the shared guidance worked for you, that's wonderful; hopefully it also helps others looking for similar answers. If there's anything else you'd like to explore or clarify, don't hesitate to reach out.

Thank you.


lbendlin
Super User

Unless you have a very, very small ADBx (Azure Databricks) instance, you need to plan bigger, much bigger. Even an F64 is way too small for a regular-sized Databricks instance and its linkage.

 

You want to be clear about the enormous additional cost of mirroring ADBx into Fabric. In effect, you are duplicating one data warehouse into another.
