Hi all,
I'm currently working on combining several datasets into one query, and I'm wondering whether doing multiple merges in Power Query will substantially slow down my refresh time. I need all of the fields to be in one query, so I guess my real question is: is it better to do this with merges in Power Query and turn off "Enable load" for the source tables, or to load them all and use the RELATED function to create custom columns? I hope this makes sense! Just looking for best practices.
Hi @SHOOKANSON ,
Power Query merges are very resource-expensive, so yes, multiple merges will slow your refresh down considerably. That said, this work is borne by your gateway(s) at refresh time, so it's not necessarily the end of the world, depending on how frequently you plan to refresh.
Your proposed alternative of using RELATED to create calculated columns is similarly resource-expensive. The difference is that the cost moves into the data model itself: calculated columns are evaluated when the model refreshes and are stored in memory, which inflates the model and can cause poor report performance and/or resource failures for the end user. Calculated columns are generally advised against for this reason.
In terms of best practice, you should pass each of the individual tables to the data model, then use relationships to create 'virtual merges' between them, and measures to calculate across them.
Aim for a star or snowflake schema, as this best fits the way Power BI's data model works.
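As a rough illustration of the 'virtual merge' idea, here is a minimal Python sketch. It is only an analogy, not how Power BI's engine is implemented: the `products` dimension table, the `sales` fact table, and the `total_sales_by` helper are all hypothetical names made up for this example. The point is that the fact table keeps only a key and a value, and descriptive fields are looked up through that key when a total is calculated, rather than being copied onto every row by a merge.

```python
from collections import defaultdict

# Hypothetical dimension table: one row per product, keyed by product_id.
products = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}

# Hypothetical fact table: one row per sale, holding only the key and the value.
sales = [
    {"product_id": 1, "amount": 10.0},
    {"product_id": 2, "amount": 5.0},
    {"product_id": 3, "amount": 7.5},
    {"product_id": 1, "amount": 2.5},
]


def total_sales_by(attribute):
    """Sum sales amounts, grouped by a dimension attribute found via the key."""
    totals = defaultdict(float)
    for row in sales:
        # Follow the "relationship" from the fact row to its dimension row.
        dim_row = products[row["product_id"]]
        totals[dim_row[attribute]] += row["amount"]
    return dict(totals)


by_category = total_sales_by("category")
by_name = total_sales_by("name")
print(by_category)
print(by_name)
```

The same fact table answers both questions (by category and by name) without ever being widened, which is roughly what relationships plus measures give you in the model.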
Pete
Proud to be a Datanaut!