
Vegard1985
Helper I

Advice on Power BI Premium capacity performance, RLS and group-wide "golden dataset" vs localized

Hi there,

 

I'm coming here to seek some advice on performance tuning and best practices, as I've noticed our P1 capacity has been hitting 100 percent quite often lately, and I'm trying to find ways to avoid potential performance degradation.

 

We're a group with around 50 legal companies operating in 10 countries, and around 300-400 Power BI users currently, but this will grow. We consist of two groups that merged fairly recently, and the Power BI roll-out and strategy are still in their build phase, I would say.

 

Our main strategy has been to have a centralized workspace with group-wide datasets that only those in charge of model development have edit access to (governed by IT). On top of those we create either "thin" reports that use the datasets as-is, or composite models that combine multiple datasets or add something from the side - it all depends a bit on the need and scenario.

This is a quick drawing I did some time ago showcasing parts of our centralized workflow. I've been under the impression that having a "central / golden dataset" is something to strive for, so that has been somewhat of a cornerstone idea.

 

[Image: Power BI Service.png - sketch of the centralized workflow]

 

Everything is on a need-to-know basis, so we have to use RLS quite extensively, and it's also fairly complex, with multiple dimensions involved. Previously I worked with quite simple RLS rules, like "you have access to this list of company codes", and therefore used dynamic RLS with Azure AD security groups, which was really neat. But now the RLS rules are much more complex: some people have access at company level, some based on customer-related dimensions, some based on material-related dimensions, some on a combination, and some are also time-sensitive. So I haven't been able to use dynamic RLS, at least not yet, unfortunately.
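
Just to be concrete about what I mean by "dynamic RLS", here is a rough sketch with made-up table and column names: a UserAccess table holding user principal name / company code pairs, related to the Company dimension, and a single role whose filter on UserAccess is something like

-- RLS role filter on the UserAccess table (illustrative names)
UserAccess[UserEmail] = USERPRINCIPALNAME()

The catch is that this filter then has to travel from UserAccess through Company and onwards to the facts, which is where the "filter both ways" point in Question 1 comes from.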

 

Question 1. 

Of course dynamic RLS is a desirable approach when it comes to granting access, as we can use Azure AD and a normal access-request workflow. But in terms of performance, is dynamic RLS better than "manual" RLS? For dynamic RLS to work, you have to enforce some sort of "filter both directions" setup towards the facts, and I've read somewhere that bi-directional filtering degrades performance a bit. Is that true, and if so - purely from a performance perspective - would manual RLS be better?

I'm only questioning this because, from an access-management perspective, I would prefer dynamic RLS - but if it affects performance in a bad way, I'm not so sure anymore.
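
For context, the "manual RLS" I have in mind is a set of static roles, one per access profile, each with a plain filter directly on the dimension (again, made-up names), for example

-- Static role for one access profile (illustrative names)
Company[CompanyCode] IN { "1000", "1010", "1020" }

with the matching Azure AD security group assigned to each role in the Service. No security table and no bi-directional relationship, but potentially a lot of roles to maintain.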

 

Question 2. 

About the alert itself, "reaching 100 percent" of the capacity: I've tried looking into the Microsoft Fabric Capacity Metrics app, but I'm not certain what I'm looking at. Could someone help me understand what I should focus on, and whether it's easy to identify the main culprits?

The first item on this list is the dataset I'm using as an example further down, as it's used a lot and is fairly large.

Which of these items should I be most concerned about - the red and orange ones? Even though they aren't used as much, is their performance so bad that they strangle the capacity? How should I read this? When I click on these items, I can't see any more detailed information.

 

The last item visible on this list is a report that uses a central dataset (the dataset is the first item on the list). I see there are 19 overloaded minutes on the report and 6 overloaded minutes on the dataset. How should I read that? Does it mean the report DAX is too complex or resource-heavy? And what is "Performance delta"?

 

[Image: metric2.png - screenshot from the Microsoft Fabric Capacity Metrics app]

 


Now, regarding performance at the dataset/report level, let me use this as an example.

I suppose many of you are familiar with ERP systems like SAP. One of our most used datasets/reports is based on COPA data. This can be a pretty hefty table: we're loading the key dimensions (8) and pretty much all the value fields in our system (40). The dataset is at a daily, company, customer, product grain, with around 30 million rows in Power BI after load.

We have 4 different business currencies, so we're also adding FX conversion so that people can select one of the four currencies when looking at the data. This is done in Power BI using DAX, as persisting 4 different currencies across 40 value fields doesn't sound like a good plan to me.
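
Just to illustrate the shape of that FX logic - heavily simplified, with made-up names, assuming a disconnected 'Reporting Currency' slicer table and an FxRate table related to the date and company dimensions:

Revenue (selected currency) =
VAR TargetCurrency =
    SELECTEDVALUE ( 'Reporting Currency'[Currency] )
RETURN
    SUMX (
        SUMMARIZE ( COPA, 'Date'[Date], Company[CompanyCode] ),
        CALCULATE ( SUM ( COPA[Revenue] ) )
            * CALCULATE (
                SELECTEDVALUE ( FxRate[Rate], 1 ),
                FxRate[ToCurrency] = TargetCurrency
            )
    )

Multiply that pattern across 40 value fields and you can see why I suspect it's heavier than a plain SUM ( COPA[Revenue] ) for users who only ever look at their own local currency.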

 

However, the report users range from top management to local sales reps. Since I've been under the impression that the "golden dataset" approach is advisable, it's only recently (up until now!) that I've been silently questioning whether it's a good deal or not in this setup.

For instance, a fair number of the people using this report only have access to a few companies, and those companies operate in the same country with the same currency. Let's say the RLS affecting one local team effectively filters the dataset down to 500,000 rows out of these 30 million. They would only be interested in their local currency, which is the base currency for the COPA value fields. So the somewhat complex formulas used in the current report could have been just a SUM(), and it feels very counterintuitive and heavy to force the data through that DAX.

 

Question 3.

"Golden dataset", IS it the best performance approach? Or should we carve out the datasets and reports for "local" usage, so for instance in my example over, the whole dataset is 500,000 rows instead of 30 million for this local team. And if that team had it's own local report version, using just SUM() local currency values? This should probably ease up the calculation and capacity usage, at least in my head. But with how much? And is it worthwhile to split it up and lose the convenience of developing "one central dataset" and "one central report"?

 

Top management needs to use this "one central report" because they need to see the whole picture in a single currency. But it's the local report users I'm concerned about - I feel their usage is clogging up the capacity.

 

Also, when it comes to report consumption, we currently have one central app distributed to everyone. Since everything is RLS-based, this works brilliantly, and together with the Audience function in the app, people get the reports they need. But if we start splitting up reports, we'd need to split up the app as well, no?

 

Please, if you have some input and experience on this topic, I would be very happy to hear your thoughts.

 

Best regards

 

1 REPLY
Brunner_BI
Super User

Question 1:
Why do you need the bi-directional filter?
Can't you have the RLS table act as a normal dimension table on your fact table?
The bi-directional filter could for sure have performance implications, especially if you have large RLS and fact tables.
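
Something like this, with illustrative names - put the rule on the dimension itself so the filter only flows dimension -> fact, and the user mapping table needs no bi-directional relationship at all:

-- RLS filter on the Company dimension (illustrative names)
Company[CompanyCode]
    IN CALCULATETABLE (
        VALUES ( UserAccess[CompanyCode] ),
        UserAccess[UserEmail] = USERPRINCIPALNAME()
    )

This is just a sketch for the simple company-level case; your more complex rules would need more columns, but the idea is the same.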

 

Question 2:
I'm not that familiar with the new metrics app, but the overloaded minutes should be looked at.
Generally, CUs (capacity units) are the main unit of measurement now. If it is, for example, a dataflow refreshing in the middle of the night, I would not care so much about it.

 

Question 3: If 90% of the users really do not need the full dataset, think about creating two versions.

You can still keep the app - if it just has more reports in it, it does not matter.

Generally, the golden dataset is a good approach, but of course many users plus a complex and large dataset will take its toll.

 

Something else you can analyze is whether everything in your data model is actually used in the reports.
For example, take the #1 dataset in terms of CUs (the first one in your metrics screenshot) and see which reports were built on it.
Then get all the reports.
Download and install Measure Killer (the free version should be enough).
Run it against the dataset and all the reports.
See how much of the data model is actually unused (could be 30%) and then think about kicking it out for now if nobody really needs it.

------------------------------------
Brunner BI focusing on Microsoft Power BI development and consulting
Developers of external tool "Measure Killer"
My blog
