Hi!
I have a hybrid DirectQuery/Import model with a large fact table in DirectQuery; the data source is a Databricks SQL Warehouse.
I've created aggregation tables and configured the aggregations to cover the most common query patterns, and that's working well. But even when a report doesn't need to go to DirectQuery, I still see queries in Databricks like:
Listing cross references
Listing primary keys
Listing columns 'catalog: ...
These are taking time and slowing down the report.
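For context, these show up in the Databricks query history as metadata lookups rather than report queries. As an illustration only (not the actual statements the Power BI connector issues, and using made-up catalog/schema/table names), they behave roughly like Unity Catalog information_schema lookups such as:

-- Illustrative sketch only: hypothetical names, not the connector's real statements.
SELECT column_name, data_type
FROM my_catalog.information_schema.columns
WHERE table_schema = 'sales' AND table_name = 'fact_orders';

SELECT constraint_name, column_name
FROM my_catalog.information_schema.key_column_usage
WHERE table_schema = 'sales' AND table_name = 'fact_orders';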
Another major issue is that queries which do go to DirectQuery, even ones that run in ~20 seconds or less and return only a few rows, are driving up the CU% and sometimes causing throttling.
Why are DirectQuery queries consuming so many capacity units, and what are those "Listing..." queries? Is there a way to improve this behaviour?
As it stands, the report can't be used by multiple users.
Thanks in advance.
A lot of the extra metadata queries generated in DirectQuery mode are unavoidable. However, there is a well-known trick for Snowflake DirectQuery that I have heard also works for Databricks and might help: make sure the identity you use to connect to Databricks from Power BI only has permissions on the tables used in your semantic model. This should improve the performance of these metadata queries.
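As a rough sketch of that permissions trick in Databricks Unity Catalog SQL (the principal, catalog, schema and table names below are placeholders, so adapt them to your environment):

-- Grant the identity Power BI connects with access only to the objects in the semantic model,
-- so the connector's "Listing ..." metadata calls have far less to enumerate.
GRANT USE CATALOG ON CATALOG my_catalog TO `powerbi_principal`;
GRANT USE SCHEMA ON SCHEMA my_catalog.sales TO `powerbi_principal`;
GRANT SELECT ON TABLE my_catalog.sales.fact_orders TO `powerbi_principal`;
GRANT SELECT ON TABLE my_catalog.sales.dim_date TO `powerbi_principal`;
-- Avoid broad grants such as SELECT on the whole catalog for this identity.

The idea is simply that the fewer objects the identity can see, the less metadata there is to list.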
There's no way to reduce the CUs used in DirectQuery mode. You can, however, try to reduce the number of DirectQuery queries generated by cutting down the number of visuals on each report page, for example by using small multiples instead of separate visuals.
This video might also give you some ideas:
https://blog.crossjoin.co.uk/2024/03/24/best-practices-for-power-bi-on-databricks-webinar/
"The Italians" have articles and videos on the topic of debugging query plans where aggregations are ignored. Read up on that at sqlbi.com and modify your queries and/or aggregations.