Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply

How to improve Performance of Cumulative Frequency DAX Calculation

Hi everyone!


In my data I have some response times for a certain amount of incidents and I would like to calculate the frequency distribution and the cumulative frequency for different bin sizes.
For example if my bin size is 5, I will have bins starting from 00:00:00 going up to 10:00:00 with an increase of 5 seconds, so 00:00:00, 00:00:05, 00:00:10, 00:00,15 and so on.

The frequency calculation works fine, but when I add the cumulative frequency calculation to a matrix, it exceeds the maximum computing time and it returns: Visual has exceeded the available resources.

If I filter my data to only one year I can get it work but it takes more than 13 seconds to load.

Here is the calculation:

Cumulative_Frequency =
VAR _BinSize_End = SELECTEDVALUE(Parameter[Parameter])
VAR _BinSize_Start = 0
VAR _ResponsesCount = COUNT(Data[Response Seconds])
VAR _ResponsesNr = CALCULATE(COUNT(Data[Response Seconds]), FILTER(ALLSELECTED(Data[Response Seconds]),Data[Response Seconds] >= _BinSize_Start && Data[Response Seconds] <= _BinSize_End))
VAR _ResponsesNrAll = 
 IF(ISINSCOPE(Parameter[Bins]),
  _ResponsesNr,
  _ResponsesCount
 )
RETURN _ResponsesNrAll

Basically I set the staring bin to the minimum to 0 and the bin end changes for each row.
So if I have the 00:00:00 bin I will have the frequency of all the responses that are 0, then I have the 00:00:05 bin and I will have the responses that are 0 and up to 5 seconds and so on.

 

Is there a way I could improve it?

Thanks,
Carlotta.

1 ACCEPTED SOLUTION

lbendlin_0-1712183290360.png

 

That won't work.  A parameter can only have up to 1000 distinct values.

 

Here's a simplified version of your buckets

 

 

Buckets = GENERATESERIES(0,1,5/86400)

 

 

 

You can then display these as time values

 

lbendlin_1-1712184271780.png

and then you can choose if you want to materialize the bucket assignments, or use measures.

 

Your Response column (for example) carries a massive hidden cost

It looks benign in time format

lbendlin_2-1712185951308.png

 

but really ugly as a true datetime

lbendlin_3-1712185998129.png

 

If you correct that you get a pretty good performance with minute buckets

 

lbendlin_4-1712186222425.png

 

With 5 second buckets it's back in the 10 seconds range, mostly because there are way too many buckets at the long tail.

You should consider only including data points where the cumulative value actually changes.

lbendlin_0-1712186800743.png

 

lbendlin_5-1712186393802.png

 

May also want to consider logarithmic bucket sizes.

 

 

 

 

View solution in original post

12 REPLIES 12
lbendlin
Super User
Super User

When you compute the next bin there's no need to compute the first bin again. Take the result of the first bin and add the values from the extension.  (That's what bins normally are - non-overlapping).

 

Your DAX can use some refactoring.  Use DAX Studio to examine the query plan, and to get rid of some of the extra stuff like the FILTER statement  that is not ncessary.

I could put the result of the first bin in a variabile and add it to the next bin that is being calculated, but then what about the third and other bins? If I only add the value of the first bin it won't be cumulative anymore.

 

Why do you say that FILTER is not necessary?

Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).

Do not include sensitive information or anything not related to the issue or question.

If you are unsure how to upload data please refer to https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...

Please show the expected outcome based on the sample data you provided.

Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...

Hi @lbendlin 

Here is a PBIX with some sample data and the measures needed.

https://drive.google.com/file/d/1KB5TAezAiCGjBp21kiUUeC2VY_m5uJLs/view?usp=drive_link 

Let me know if this is ok and you can download it.

The expected outcome is in the Frequency Distribution table. I would like to have the cumulative frequency calcualted in less time. Now it takes almost 8 seconds to finish if we filter only 1 year.

Thank you,
Carlotta. 

link asks for access. Please check.

Hi!
I have updated the link, could you try again?
https://drive.google.com/file/d/1KB5TAezAiCGjBp21kiUUeC2VY_m5uJLs/view?usp=sharing

 

Thanks.

lbendlin_0-1712183290360.png

 

That won't work.  A parameter can only have up to 1000 distinct values.

 

Here's a simplified version of your buckets

 

 

Buckets = GENERATESERIES(0,1,5/86400)

 

 

 

You can then display these as time values

 

lbendlin_1-1712184271780.png

and then you can choose if you want to materialize the bucket assignments, or use measures.

 

Your Response column (for example) carries a massive hidden cost

It looks benign in time format

lbendlin_2-1712185951308.png

 

but really ugly as a true datetime

lbendlin_3-1712185998129.png

 

If you correct that you get a pretty good performance with minute buckets

 

lbendlin_4-1712186222425.png

 

With 5 second buckets it's back in the 10 seconds range, mostly because there are way too many buckets at the long tail.

You should consider only including data points where the cumulative value actually changes.

lbendlin_0-1712186800743.png

 

lbendlin_5-1712186393802.png

 

May also want to consider logarithmic bucket sizes.

 

 

 

 

kriscoupe
Solution Supplier
Solution Supplier

Hey @Carlotta_Fabris ,

 

Unfortunately I don't come with a solution but an observation.. By my calculations 5 second bins for 10 hours is 7,200 bins. Now, because you are using CALCULATE to do a cumulative calculation each calculation will be done independently (ie the DAX engine will work out 0-5, 0-10, 0-15 etc.). Not sure if you use DAX Studio but if you run the server timings on the calculation you'd see this.

 

The engine can only do so many of these in parallel so the volume of storage engine queries may be the slowdown here. I've come across this in the past with cumulative calculations. A simple one to start would be to increase the bin size, if that is possible?

 

Hope it helps,

Kris

Hi @kriscoupe,

The bin size is dynamic and it goes from 5 seconds to 60. Increasing it improve the performances but not that much.

Hey @Carlotta_Fabris ,

 

Just had another thought, this could be a good candidate for the preview feature Visual Calculations. There is a function in there called RUNNINGSUM which you could use on your non cumulative frequency calculation. Since the calculation happens on the visual values it may be faster.. without seeing your model, visual, setup etc. it will be hard to help further, you may have to play around to see if it works. Here's a link explaining how to implement https://learn.microsoft.com/en-us/power-bi/transform-model/desktop-visual-calculations-overview 

 

Just an FYI the feature is relatively new and still in preview so exercise caution if this report is going to production as preview features can be pulled, changed etc. Be sure to check the considerations and limitations as well.

 

Hope you get some success 😊!

Thank you @kriscoupe I have tried and it improved the running times but the peoblem is that I want to publish the report to the web and visual calculations don't work when the report is published to web.

Hopefully this changes in the future. There's a post on this thread that seems to state that this is driven due to technical issue and not licensing.. here's hoping!

 

https://community.fabric.microsoft.com/t5/Desktop/Share-your-thoughts-on-visual-calculations-preview... (message 111 and 112)

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.