kemalfaisal
Frequent Visitor

Issue with Power BI Capacity Unit (CU) Overuse and Dataset Optimization

Hi Experts,

I'm encountering an issue with our Capacity Unit (CU) usage frequently exceeding 100% of the allocated capacity. After reviewing the Fabric Capacity Metrics app, I discovered that background processes consistently consume at least 50% of the capacity. I understand that this is due to throttling and smoothing mechanisms, which allow the system to "pay back" capacity that has already been borrowed. Additionally, the high volume of user interactions is contributing to CU utilization, as there's significant demand on Power BI from our users.

The Fabric Capacity Metrics app does provide insight into which datasets are consuming the most CUs, but it doesn't offer detailed process-level information to help fine-tune those datasets.

With this in mind, I'm trying to explore ways to optimize our Power BI datasets to reduce CU usage. Specifically, I’d like to know how we can monitor and decrease CU usage, as well as establish best practices (Do’s and Don’ts) for creating new datasets to prevent future overuse.

Any advice would be greatly appreciated.

Regards,
Kemal

 

lbendlin
Super User

I understand that this is due to throttling and smoothing mechanisms, which allow the system to "pay back" capacity that has already been borrowed. 

Only the 24 hr smoothing contributes to that "pay back" behavior. Throttling is a separate, unrelated mechanism.

 

Think of it from the perspective of the end users. Let's say your background processes consume 50% and your interactive processes consume 10%. Bluntly speaking, 80% of your refreshes are wasted: users consume only a fifth (10/50) of what you refresh. You have semantic models and dataflows that are refreshed, and then nobody looks at them. Refresh them less frequently, and only when they will actually be used.

 

This is massively oversimplifying the situation, but I hope you get my point. At a minimum, background and interactive should balance out; ideally, interactive should be much bigger than background. (Most interactive usage is smoothed within 5 minutes rather than the 24 hrs for background.)

 

 

Hi lbendlin,

Thank you for your response.

I have multiple datasets that refresh frequently, every 5-10 minutes. This might be why those datasets are using most of the CU capacity according to the Fabric Capacity Metrics app. In cases like this, would it be better to use a direct connection to the database (DirectQuery) instead of frequent dataset refreshes?

Are there best practices regarding the ideal percentage allocation between background and interactive usage?

Additionally, how can I identify if a dataset needs tuning to reduce CU usage with each refresh?

Regards,
Faisal

Are there best practices regarding the ideal percentage allocation between background and interactive usage?

The ideal percentage allocation is 0% background.  🙂

 

Only refresh when the data source has been updated, and only if the users actually need to see the latest data right away. (Oftentimes weekly refreshes are just fine.)
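
If the source only changes a few times a day, one option is to replace the fixed schedule with a programmatic trigger. Here is a minimal sketch in Python against the Power BI REST API ("Refresh Dataset In Group" endpoint); the IDs, token, and the change-detection check are placeholders you'd replace with your own:

```python
# Minimal sketch: refresh on demand instead of on a fixed 5-10 minute
# schedule. Assumes an Azure AD access token with the
# Dataset.ReadWrite.All scope; the IDs below are placeholders.
import requests

GROUP_ID = "<workspace-id>"
DATASET_ID = "<dataset-id>"
TOKEN = "<azure-ad-access-token>"

def source_has_new_data() -> bool:
    # Placeholder: compare a watermark from the source (max timestamp,
    # row count, change-tracking version) with the value recorded at
    # the last refresh.
    return False

def trigger_refresh() -> None:
    # Power BI REST API: POST .../groups/{id}/datasets/{id}/refreshes
    url = (f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}"
           f"/datasets/{DATASET_ID}/refreshes")
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"notifyOption": "NoNotification"},
    )
    resp.raise_for_status()  # the service returns 202 Accepted

if source_has_new_data():
    trigger_refresh()
```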

 

how can I identify if a dataset needs tuning to reduce CU usage with each refresh?

This is not about datasets. This is about developers. Whoever uses the most CUs needs to be locked in a windowless room and given a full day of training on Power BI best practices. Rinse and repeat.

Hi lbendlin,

I couldn't agree more with the importance of enhancing our developers' skills and knowledge of Power BI best practices. We're still in the process of analyzing the best development approach case by case, in line with those best practices.

As for reducing CU usage, I followed your advice and reviewed the datasets with frequent refresh schedules that consume the most CUs. In addition, I moved the workspace to another capacity to validate whether the frequent refreshes were the cause of the high CU consumption. The results confirmed this: background CU usage on the new capacity spiked from 20% to 50%.

Given this, if the dataset itself isn't the main cause of high CU consumption, is there a way to understand how CUs are utilized during and after a dataset refresh? Additionally, is there any explanation of how CUs are distributed or calculated between background and interactive processes?

Based on your information and my understanding, there are three key ways to manage CU usage:

 

  1. Optimize dataset refreshes by reviewing current schedules, moving refreshes to off-peak hours, and reducing the frequency of refreshes to avoid competition with interactive processes.
  2. Optimize reports by simplifying visuals, reducing the number of queries, and improving model design.
  3. Monitor and scale CU usage through the Power BI Admin Portal to ensure efficient resource allocation.

I’m trying to better understand how CUs are calculated to determine if we need to purchase additional capacity based on current users accessing Power BI, as well as the reports and datasets already developed.

Additionally, I’m working on creating user guidance for those with the ability to develop their own reports, ensuring we have standardized report development aligned with specific use cases.

Regards,
Kemal

 

Lots of things to unpack here

 

 is there a way to understand how CUs are utilized during and after a dataset refresh? 

CUs are incurred by a combination of duration and computational complexity. This includes both Power Query transforms and calculated columns and tables. You can have a long-running refresh that doesn't cost much, and a short refresh with a ginormous computational cost.
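
A rough way to think about it (illustrative numbers only; the real figures come from the Capacity Metrics app):

```python
# Illustrative only: an operation's cost is roughly the CU rate it draws
# multiplied by how long it runs, in CU-seconds. The rates are made up.
long_but_cheap  = 2 * 3600   # 2 CU for an hour    ->  7,200 CU-seconds
short_but_heavy = 64 * 300   # 64 CU for 5 minutes -> 19,200 CU-seconds
print(long_but_cheap, short_but_heavy)  # the short refresh costs ~2.7x more
```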

 

how CUs are distributed or calculated between background and interactive processes

The main difference is the smoothing period - 24 hrs for background, and (most of the time) 5 minutes for interactive.
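
For example (hypothetical numbers), the same total spend lands very differently on the utilization chart:

```python
# Illustrative only: identical total cost, different smoothing windows.
total_cu_seconds = 86_400  # hypothetical cost of one operation

background_draw  = total_cu_seconds / (24 * 3600)  # 1 CU/s, spread over 24 hrs
interactive_draw = total_cu_seconds / (5 * 60)     # 288 CU/s, spread over 5 min

print(background_draw, interactive_draw)
```

That is why a heavy interactive query can push the chart over 100% while the same cost incurred as a background refresh barely moves it.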

 

moving refreshes to off-peak hours

That has very limited benefits, as the smoothing period is the same 24 hrs no matter when you run the refresh. It is also not meaningful in an enterprise environment operating 24x7.

 

To emphasize again: reducing the refresh schedule is important, but not the most important thing. What matters most is reducing both the runtime and the complexity of each refresh. Your primary KPI should be the cumulative CU consumption per developer per day. Optimization of interactive queries would be a secondary KPI.
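
If you export operation-level data from the Capacity Metrics app, that KPI is a simple aggregation. A sketch with pandas (the CSV name and the User, Date, and CUs columns are hypothetical; match them to your actual export):

```python
# Sketch of the primary KPI: cumulative CU consumption per developer per day.
# Column names are hypothetical - align them with your Capacity Metrics export.
import pandas as pd

ops = pd.read_csv("capacity_operations.csv", parse_dates=["Date"])

kpi = (
    ops.groupby([ops["Date"].dt.date, "User"])["CUs"]
       .sum()
       .sort_values(ascending=False)
)
print(kpi.head(10))  # the top of this list is who gets the training first
```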

Hi lbendlin,

This is very insightful!
I’ll go back and review each report developed by our developers and users.
Thank you for sharing your perspective; I really appreciate it.

Regards,
Faisal
