Don't miss your chance to take the Fabric Data Engineer (DP-700) exam on us!
Learn moreWe've captured the moments from FabCon & SQLCon that everyone is talking about, and we are bringing them to the community, live and on-demand. Starts on April 14th. Register now
I am connecting to Databricks through Native Query that uses left joins and has where clause.
Based on the Microsodt Document - Optimize by Expanding Columns it seems that merge queries are better than using left joins.
Also based on the article on Query folding in Azure Databricks it is suggested that Dtabricks.Query be used instead of the Databricks.Catalog connector that does not allow query folding. Additionaly, it is suggested that teh WHERE clause be separated out.
Based on the above:
1. Are merge queries better than left joins used in native SQL queries? Why so? When the technique mentioned in the article was used, I could see that "View Native Query" option was enabled i.e not greyed out meaning query folding was happening. Wouldn't the entire query including the joins and where clauses get executed at the source -Databricks?
2. According to the article, its better to split the query into 2 parts - WHERE clause separately and also parametarise the query if possible. What advantage does this bring (if any)? My query still folds with just one long query including the WHERE clause.
3. What about the Databricks.Catalogs is not allowing query folding to happen ? Even when Enable query folding was set to "TRUE", the option to view native query was disabled as shown below:
Value.NativeQuery(Databricks.Catalogs(host_path, warehouse_path)[Catalog="hive_metastore", Database=null, EnableAutomaticProxyDiscovery=null]) {[Name="hive_metastore",Kind="Database"]}[Data], "select * from schema.table A left join another_table B on A.key_col = B.key_col where B.key_col = "some_value"", null, [EnableFolding=true])
Which is a preferable way to optimise for efficiency?
1. Import separate tables and the merge them into one ?
2. Use Databricks.Query and use a SQL with left join and seperate out the where clause ?
@Hoping did you ever fix this? running into the same issues now with query folding and databricks. Whatever I try, it doesn't work even though databricks itself explicitly mention that query folding should be possible now:
https://www.databricks.com/blog/databricks-power-bi-connector-now-supports-native-query
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.
Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.
| User | Count |
|---|---|
| 5 | |
| 4 | |
| 3 | |
| 3 | |
| 2 |
| User | Count |
|---|---|
| 8 | |
| 7 | |
| 6 | |
| 6 | |
| 5 |