Hi all,
Let's say I need to do data transformations in Power Query for my new report.
Let's say I am focusing on one specific table, which requires some transformations in Power Query before I can load it into my semantic model.
There may be different alternative routes I can take (in terms of which Power Query functions to use, the sequence of transformation steps, etc.) to arrive at the same data output to be loaded into the semantic model.
I want to compare my different alternative routes (i.e. different M code scripts) in terms of how they will impact the capacity unit (CU) usage on my premium capacity.
I want to choose the M query (M code) which minimizes CU usage.
How can I compare the CU usage of two (or more) alternative Power Query (M) queries which all give me the same required output table but use different transformation logic?
I am willing to create two or more alternative versions of the .pbix file (and similarly, two or more alternative versions of a dataflow) in order to compare the CU usage of the alternative M scripts while I am developing the solution, if necessary.
What would be useful ways of comparing the CU usage?
Should I do the comparison in Power BI Desktop, or should I publish the alternative .pbix files to the Premium service and compare the CU usage there (to get real-world numbers)?
Thank you 😀
Thank you, @lbendlin!
I am curious, what will be the benefit of placing the pbix files in different workspaces?
My plan is to continue to explore this topic.
I have also seen Log Analytics mentioned in some blog posts; however, Log Analytics is not included in the Power BI (Fabric) license (so it would be an additional cost), and from what I read it only seems to cover semantic models (not dataflows).
So I guess I will use the Fabric Capacity Metrics App, or create my own report connected to the Fabric Capacity Metrics App semantic model (although that will require additional effort), if I want to use the CU-based approach.
It also makes total sense to me to use the CU-based approach, because my primary aim in this optimization is to ensure I am not wasting the precious CUs in my capacity 😉
I assume that adhering to best practices (like taking advantage of query folding, removing unnecessary data before complex transformations, using incremental refresh, etc.) will help me a lot to avoid excessive CU usage.
But beyond that, there are still cases where I am not sure which of two alternative M scripts is the better choice with regard to CU usage. So it will be interesting to test the alternative M scripts.
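To make it concrete, here is the kind of comparison I have in mind: two hypothetical M scripts that return the same output table but order the steps differently (the server, database, table and column names below are made up):

```
// Version A: sort the full table, then filter
let
    Source = Sql.Database("myserver", "salesdb"),   // hypothetical source
    Sales  = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    Sorted = Table.Sort(Sales, {{"Amount", Order.Descending}}),
    Result = Table.SelectRows(Sorted, each [OrderDate] >= #date(2024, 1, 1))
in
    Result

// Version B: filter first (a step that can fold to the source), then sort the smaller table
let
    Source   = Sql.Database("myserver", "salesdb"),
    Sales    = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    Filtered = Table.SelectRows(Sales, each [OrderDate] >= #date(2024, 1, 1)),
    Result   = Table.Sort(Filtered, {{"Amount", Order.Descending}})
in
    Result
```

Both versions load the same rows; the question is which one costs fewer CUs during refresh.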
Thank you for your insights!
Additional suggestions are also very welcome 😀
what will be the benefit of placing the pbix files in different workspaces?
To reduce measurement bias. Same reason why you shouldn't place the metrics app into a premium/fabric workspace.
I'm not sure I understand how that will reduce measurement bias in this case. My first thought was that I should put the pbix files in the same workspace, to give them a level playing field (equal test conditions). I may be missing something here. Could you please elaborate on why placing the pbix files in different workspaces will reduce the measurement bias?
I am also trying to understand why placing the Fabric Capacity Metrics App in a Premium/Fabric workspace introduces measurement bias.
Per my understanding, the data source of the Fabric Capacity Metrics App is a Kusto database which is placed somewhere else outside of the workspace.
So I don't understand why placing the Fabric Capacity Metrics App in a pro or premium/fabric workspace will make a difference. I may be missing something here. Could you please elaborate on this as well?
Thank you 😃
Best regards, Frithjof
You want to run the refreshes in parallel. You can choose to do so in the same workspace if you think that they will not impact each other.
Putting the watcher into the watched environment violates a basic tenet of monitoring. You just don't do that.
Thank you @lbendlin !
I don't think I have read or heard anywhere about items in the same workspace impacting each other when it comes to "competing for compute resources".
After all, the items will still be on the same capacity even if I put them into separate workspaces.
My current understanding is that all items on the same capacity are "competing for the compute resources", regardless of whether they are in the same workspace or in separate workspaces.
Maybe I'm missing something here. Please do let me know if you have any further information/insights on this.
Best regards, Frithjof
As I said - you can choose to do that in the same workspace. Your call.
The answer is a definitive maybe.
- Ideally place your comparative pbix into different workspaces
- Ideally run all of the tests at the same time
- Repeat your tests at least ten times, throw away the outliers and average the rest (see the sketch below)
(yes, assuming you have Premium/Fabric capacity and are using the metrics app)
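For illustration, a minimal M sketch of the "repeat, throw away the outliers and average the rest" step, assuming you have copied the measured CU totals from the metrics app into a list (the numbers below are made up):

```
let
    // Hypothetical CU totals (in CU seconds) from ten repeated refreshes of the same query
    Measurements = {112, 108, 131, 110, 109, 254, 111, 107, 113, 110},
    Sorted = List.Sort(Measurements),
    // Drop the lowest and highest values as outliers, then average what remains
    Trimmed = List.RemoveLastN(List.RemoveFirstN(Sorted, 1), 1),
    AverageCU = List.Average(Trimmed)
in
    AverageCU
```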
Or: examine your Power Query code and optimize it manually. There are lots of things you can do, like rearranging the order of the steps and liberally applying Table.Buffer and its siblings.
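For example, a minimal sketch of the Table.Buffer idea, assuming a hypothetical SQL source where a small lookup table is referenced by a join (server, database, table and column names are made up; keep in mind that buffering prevents query folding for the buffered table, so it is a trade-off worth testing):

```
let
    Source = Sql.Database("myserver", "salesdb"),    // hypothetical source
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
    // Buffer the small lookup table once so later steps do not re-evaluate its source query
    Rates  = Table.Buffer(Source{[Schema = "dbo", Item = "ExchangeRates"]}[Data]),
    Merged = Table.NestedJoin(Orders, {"CurrencyCode"}, Rates, {"CurrencyCode"}, "Rate", JoinKind.LeftOuter),
    Result = Table.ExpandTableColumn(Merged, "Rate", {"RateToEUR"})
in
    Result
```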
I still like the CU based approach more as it gives you a true cost number. It will be a combination of time spent and computational complexity.