Hi everyone!
In my data I have response times for a number of incidents, and I would like to calculate the frequency distribution and the cumulative frequency for different bin sizes.
For example, if my bin size is 5, I will have bins starting from 00:00:00 and going up to 10:00:00 in increments of 5 seconds, so 00:00:00, 00:00:05, 00:00:10, 00:00:15 and so on.
The frequency calculation works fine, but when I add the cumulative frequency calculation to a matrix, it exceeds the maximum computing time and returns "Visual has exceeded the available resources".
If I filter my data to only one year I can get it to work, but it takes more than 13 seconds to load.
Here is the calculation:
Cumulative_Frequency =
VAR _BinSize_End = SELECTEDVALUE ( Parameter[Parameter] ) -- upper bound of the current bin
VAR _BinSize_Start = 0 -- every bin starts from 0, so the bins overlap
VAR _ResponsesCount = COUNT ( Data[Response Seconds] ) -- grand total, used outside bin scope
VAR _ResponsesNr =
    CALCULATE (
        COUNT ( Data[Response Seconds] ),
        FILTER (
            ALLSELECTED ( Data[Response Seconds] ),
            Data[Response Seconds] >= _BinSize_Start
                && Data[Response Seconds] <= _BinSize_End
        )
    )
VAR _ResponsesNrAll =
    IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
RETURN
    _ResponsesNrAll
Basically I set the starting bin to 0, and the bin end changes for each row.
So for the 00:00:00 bin I have the frequency of all the responses that are 0; then for the 00:00:05 bin I have the responses from 0 up to 5 seconds, and so on.
Is there a way I could improve it?
Thanks,
Carlotta.
When you compute the next bin there's no need to compute the first bin again. Take the result of the first bin and add the values from the extension (that's what bins normally are: non-overlapping).
Your DAX could use some refactoring. Use DAX Studio to examine the query plan and to get rid of some of the extra stuff, like the FILTER statement that is not necessary.
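A minimal sketch of that refactor, assuming the same table and parameter names as the original measure; a boolean filter argument replaces the explicit FILTER, and the >= 0 test is dropped because response times cannot be negative:

Cumulative_Frequency (refactored) =
VAR _BinSize_End = SELECTEDVALUE ( Parameter[Parameter] )
VAR _ResponsesNr =
    CALCULATE (
        COUNT ( Data[Response Seconds] ),
        ALLSELECTED ( Data[Response Seconds] ),
        Data[Response Seconds] <= _BinSize_End -- boolean predicate instead of FILTER
    )
RETURN
    IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, COUNT ( Data[Response Seconds] ) )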
I could put the result of the first bin in a variable and add it to the next bin being calculated, but then what about the third and subsequent bins? If I only add the value of the first bin it won't be cumulative anymore.
Why do you say that FILTER is not necessary?
Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).
Do not include sensitive information or anything not related to the issue or question.
If you are unsure how to upload data please refer to https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...
Please show the expected outcome based on the sample data you provided.
Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...
Hi @lbendlin
Here is a PBIX with some sample data and the measures needed.
https://drive.google.com/file/d/1KB5TAezAiCGjBp21kiUUeC2VY_m5uJLs/view?usp=drive_link
Let me know if this is OK and whether you can download it.
The expected outcome is in the Frequency Distribution table. I would like to have the cumulative frequency calculated in less time; right now it takes almost 8 seconds to finish even if we filter to only one year.
Thank you,
Carlotta.
The link asks for access. Please check.
Hi!
I have updated the link, could you try again?
https://drive.google.com/file/d/1KB5TAezAiCGjBp21kiUUeC2VY_m5uJLs/view?usp=sharing
Thanks.
That won't work. A parameter can only have up to 1,000 distinct values.
Here's a simplified version of your buckets:
Buckets = GENERATESERIES ( 0, 1, 5 / 86400 )
You can then display these as time values, and then choose whether to materialize the bucket assignments or use measures.
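A minimal sketch of the materialized option, assuming Data[Response Seconds] holds whole seconds; the hypothetical column floors each response to its 5-second bucket and converts it to a day fraction so it lines up with the generated Buckets values:

Bucket =
-- hypothetical calculated column on Data: floor each response to its
-- 5-second bucket, expressed as a fraction of a day like Buckets[Value]
FLOOR ( Data[Response Seconds], 5 ) / 86400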
Your Response column (for example) carries a massive hidden cost: it looks benign in time format, but it is actually stored as a true datetime. If you correct that, you get pretty good performance with minute buckets.
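A minimal sketch of that correction, assuming the datetime column is Data[Response]; keeping only the time of day as whole seconds gives the engine a small integer column to compress instead of a datetime (doing the same conversion upstream in Power Query would be even better):

Response Seconds =
-- hypothetical calculated column: strip the date part and store the
-- duration as an integer number of seconds
HOUR ( Data[Response] ) * 3600
    + MINUTE ( Data[Response] ) * 60
    + SECOND ( Data[Response] )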
With 5-second buckets it's back in the 10-second range, mostly because there are far too many buckets in the long tail.
You should consider only including data points where the cumulative value actually changes.
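A minimal sketch of that idea, assuming fixed 5-second bins driven by the same Parameter table; the cumulative value only changes in bins that contain at least one response, so a hypothetical measure like this could be used as a visual-level filter (is 1):

Bin Has Data =
-- hypothetical visual-filter measure: keep only bins whose own
-- frequency is non-zero, i.e. where the cumulative actually moves
VAR _BinEnd = SELECTEDVALUE ( Parameter[Parameter] )
VAR _BinStart = _BinEnd - 5 -- assumes a fixed 5-second bin size
RETURN
    IF (
        CALCULATE (
            COUNT ( Data[Response Seconds] ),
            ALLSELECTED ( Data[Response Seconds] ),
            Data[Response Seconds] > _BinStart,
            Data[Response Seconds] <= _BinEnd
        ) > 0,
        1
    )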
You may also want to consider logarithmic bucket sizes.
Hey @Carlotta_Fabris ,
Unfortunately I don't come with a solution, but an observation. By my calculations, 5-second bins over 10 hours means 7,200 bins. Because you are using CALCULATE to do a cumulative calculation, each calculation will be done independently (i.e. the DAX engine will work out 0-5, 0-10, 0-15, etc.). Not sure if you use DAX Studio, but if you run the server timings on the calculation you'd see this.
The engine can only do so many of these in parallel, so the volume of storage engine queries may be the slowdown here. I've come across this in the past with cumulative calculations. A simple thing to start with would be to increase the bin size, if that is possible?
Hope it helps,
Kris
Hi @kriscoupe,
The bin size is dynamic and goes from 5 to 60 seconds. Increasing it improves performance, but not by much.
Hey @Carlotta_Fabris ,
Just had another thought: this could be a good candidate for the preview feature Visual Calculations. There is a function in there called RUNNINGSUM which you could use on your non-cumulative frequency calculation, as sketched below. Since the calculation happens on the values in the visual, it may be faster. Without seeing your model, visual, setup, etc. it's hard to help further; you may have to play around to see if it works. Here's a link explaining how to implement it: https://learn.microsoft.com/en-us/power-bi/transform-model/desktop-visual-calculations-overview
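A minimal sketch, assuming the visual already contains a non-cumulative field called [Frequency] on the bins axis:

Cumulative Frequency =
-- hypothetical visual calculation: running total of the frequency
-- field down the rows of the visual
RUNNINGSUM ( [Frequency] )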
Just an FYI, the feature is relatively new and still in preview, so exercise caution if this report is going to production, as preview features can be pulled, changed, etc. Be sure to check the considerations and limitations as well.
Hope you get some success 😊!
Thank you @kriscoupe, I have tried it and it improved the running times, but the problem is that I want to publish the report to the web, and visual calculations don't work when the report is published to the web.
Hopefully this changes in the future. There's a post on this thread that seems to state that this is driven by a technical issue and not licensing. Here's hoping!
https://community.fabric.microsoft.com/t5/Desktop/Share-your-thoughts-on-visual-calculations-preview... (messages 111 and 112)