Hi everyone!
I have a dataset of responses to different incidents, categorized by a variety of variables such as incident type, unit type, etc.
The dataset contains different types of response times, such as at-scene time, travel time, turnout time, etc.
I would like to build a histogram and a matrix with different bins and the frequency of those bins for the different responses. For example, if I want to measure travel time, I will have all the possible bins and then the number of travel times that fall into those bins.
The big problem I am facing is that the bin size should also be dynamic.
I was able to build a report with different parameters and this is the result:
The problem is that when I change the bin size, it takes more than 10 seconds to update the histogram and the matrix.
Below I will share the model and the formula I am using.
The model is only one big table. I know I could try to transform it into a star schema to improve performance, but I wanted to ask whether you have other ideas on how I could improve this.
Responses_Frequency =
SWITCH (
    SELECTEDVALUE ( Select_Measure[Measures Fields] ),
    "'Data'[Response Seconds]",
        VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
        VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
        VAR _ResponsesCount = COUNT ( Data[Response Seconds] )
        VAR _ResponsesNr =
            CALCULATE (
                COUNT ( Data[Response Seconds] ),
                FILTER (
                    ALL ( Data[Response Seconds] ),
                    Data[Response Seconds] >= _BinSize_Start
                        && Data[Response Seconds] <= _BinSize_End
                )
            )
        VAR _ResponsesNrAll =
            IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
        RETURN
            _ResponsesNrAll,
    "'Data'[Call Processing Seconds]",
        VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
        VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
        VAR _ResponsesCount = COUNT ( Data[Call Processing Seconds] )
        VAR _ResponsesNr =
            CALCULATE (
                COUNT ( Data[Call Processing Seconds] ),
                FILTER (
                    ALL ( Data[Call Processing Seconds] ),
                    Data[Call Processing Seconds] >= _BinSize_Start
                        && Data[Call Processing Seconds] <= _BinSize_End
                )
            )
        VAR _ResponsesNrAll =
            IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
        RETURN
            _ResponsesNrAll,
    "'Data'[Turnout Seconds]",
        VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
        VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
        VAR _ResponsesCount = COUNT ( Data[Turnout Seconds] )
        VAR _ResponsesNr =
            CALCULATE (
                COUNT ( Data[Turnout Seconds] ),
                FILTER (
                    ALL ( Data[Turnout Seconds] ),
                    Data[Turnout Seconds] >= _BinSize_Start
                        && Data[Turnout Seconds] <= _BinSize_End
                )
            )
        VAR _ResponsesNrAll =
            IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
        RETURN
            _ResponsesNrAll,
    "'Data'[Travel Seconds]",
        VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
        VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
        VAR _ResponsesCount = COUNT ( Data[Travel Seconds] )
        VAR _ResponsesNr =
            CALCULATE (
                COUNT ( Data[Travel Seconds] ),
                FILTER (
                    ALL ( Data[Travel Seconds] ),
                    Data[Travel Seconds] >= _BinSize_Start
                        && Data[Travel Seconds] <= _BinSize_End
                )
            )
        VAR _ResponsesNrAll =
            IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
        RETURN
            _ResponsesNrAll,
    "'Data'[Total Reflex Seconds]",
        VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
        VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
        VAR _ResponsesCount = COUNT ( Data[Total Reflex Seconds] )
        VAR _ResponsesNr =
            CALCULATE (
                COUNT ( Data[Total Reflex Seconds] ),
                FILTER (
                    ALL ( Data[Total Reflex Seconds] ),
                    Data[Total Reflex Seconds] >= _BinSize_Start
                        && Data[Total Reflex Seconds] <= _BinSize_End
                )
            )
        VAR _ResponsesNrAll =
            IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
        RETURN
            _ResponsesNrAll,
    "'Data'[At scene Seconds]",
        VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
        VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
        VAR _ResponsesCount = COUNT ( Data[At scene Seconds] )
        VAR _ResponsesNr =
            CALCULATE (
                COUNT ( Data[At scene Seconds] ),
                FILTER (
                    ALL ( Data[At scene Seconds] ),
                    Data[At scene Seconds] >= _BinSize_Start
                        && Data[At scene Seconds] <= _BinSize_End
                )
            )
        VAR _ResponsesNrAll =
            IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
        RETURN
            _ResponsesNrAll,
    "'Data'[Transport Travel Seconds]",
        VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
        VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
        VAR _ResponsesCount = COUNT ( Data[Transport Travel Seconds] )
        VAR _ResponsesNr =
            CALCULATE (
                COUNT ( Data[Transport Travel Seconds] ),
                FILTER (
                    ALL ( Data[Transport Travel Seconds] ),
                    Data[Transport Travel Seconds] >= _BinSize_Start
                        && Data[Transport Travel Seconds] <= _BinSize_End
                )
            )
        VAR _ResponsesNrAll =
            IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
        RETURN
            _ResponsesNrAll
)
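As a side note, the seven SWITCH branches differ only in the column being counted, so the shared bin-boundary variables can be hoisted out of the SWITCH and each branch reduced to one IF. A sketch of this pattern (same table and parameter names as above, shown for two of the columns; the boolean filter inside CALCULATE is shorthand for the same FILTER ( ALL ( ... ) ) used above, so the results are unchanged):

```dax
Responses_Frequency =
-- shared bin boundaries, computed once instead of once per branch
VAR _BinSize_Start = SELECTEDVALUE ( Parameter[Parameter] )
VAR _BinSize_End = _BinSize_Start + 'Bin Size'[Bin Size Value] - 1
VAR _InBinScope = ISINSCOPE ( Parameter[Bins] )
RETURN
    SWITCH (
        SELECTEDVALUE ( Select_Measure[Measures Fields] ),
        "'Data'[Response Seconds]",
            IF (
                _InBinScope,
                CALCULATE (
                    COUNT ( Data[Response Seconds] ),
                    Data[Response Seconds] >= _BinSize_Start
                        && Data[Response Seconds] <= _BinSize_End
                ),
                COUNT ( Data[Response Seconds] )
            ),
        "'Data'[Travel Seconds]",
            IF (
                _InBinScope,
                CALCULATE (
                    COUNT ( Data[Travel Seconds] ),
                    Data[Travel Seconds] >= _BinSize_Start
                        && Data[Travel Seconds] <= _BinSize_End
                ),
                COUNT ( Data[Travel Seconds] )
            )
        -- the remaining five columns follow the same pattern
    )
```

This mainly improves readability and maintainability; the filtering work per bin is essentially the same.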
A possible solution I was thinking of is to ceil the values of the different responses and then use those columns in the histogram and matrix.
Doing this, values greater than 00:00:00 will be ceiled, for example, to 00:00:05, which is my smallest bin; but I would like them to be contained inside the bin that goes from 0 to 5 seconds, not the one that goes from 5 to 10.
I also tried to create a table with all the bins and create a relationship with the responses. In this case I have two problems: the total is not correct, and the performance did not improve much.
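One way to get that bin assignment without the ceiling problem is to floor each value to its bin's lower boundary instead. A sketch of a calculated column, assuming a fixed 5-second base bin (note that a calculated column cannot react to a what-if parameter, so the bin size is fixed at refresh time):

```dax
Bin Start =
-- FLOOR rounds down to the nearest multiple of 5, so 0-4 s maps to 0,
-- 5-9 s maps to 5, and so on: values land in the "0 to 5" bin as desired
FLOOR ( Data[Response Seconds], 5 )
```

Larger dynamic bins could then be derived from this low-cardinality column rather than from the raw seconds.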
Do you have ideas on how to improve the performance?
If you need more details, I will be happy to provide them.
Thank you,
Carlotta.
Hi @Carlotta_Fabris ,
Creating histograms and matrices with dynamic bin sizes can indeed be challenging, especially when dealing with large datasets. Here are my suggestions.
Ensure that your dataset is properly indexed to speed up query performance. Consider pre-aggregating data where possible. For example, if you have a large number of records, aggregate them into broader time intervals (e.g., minutes instead of seconds) before creating histograms.
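For instance, a calculated column that coarsens the grain from seconds to minutes (a sketch, assuming the column names used in your measure) could look like:

```dax
Response Minutes =
-- integer division collapses 60 distinct second values into one minute value,
-- shrinking the column's cardinality before any binning is applied
INT ( DIVIDE ( Data[Response Seconds], 60 ) )
```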
Your idea of ceiling values is a good one. However, ensure that the ceiling values align with the desired bin boundaries. For instance, if your smallest bin is 00:00:05, make sure that values greater than 00:00:00 are rounded up to 00:00:05, not 00:00:10.
Transforming your model into a star schema might improve performance by reducing the complexity of queries. Consider creating separate fact tables for different response types (e.g., scene time, travel time) and dimension tables for incident type, unit type, etc. Use relationships between fact and dimension tables to create efficient queries.
There may be other aspects that could be optimized, but these are the directions I can think of.
Best Regards
Yilong Zhou
If this post helps, then please consider Accepting it as the solution to help the other members find it more quickly.
Thank you for your suggestions.
After a lot of trial and error, I discovered that the problem was caused by the measure that calculates the cumulative frequency.
Here it is:
VAR _BinSize_Start = 0
VAR _BinSize_End =
    _BinSize_Start + SELECTEDVALUE ( Parameter[Parameter] ) + 'Bin Size'[Bin Size Value] - 1
VAR _ResponsesCount = COUNT ( Data[Response Seconds] )
VAR _ResponsesNr =
    CALCULATE (
        COUNT ( Data[Response Seconds] ),
        FILTER (
            ALL ( Data[Response Seconds] ),
            Data[Response Seconds] >= _BinSize_Start
                && Data[Response Seconds] <= _BinSize_End
        )
    )
VAR _ResponsesNrAll =
    IF ( ISINSCOPE ( Parameter[Bins] ), _ResponsesNr, _ResponsesCount )
RETURN
    _ResponsesNrAll
It basically takes the 0 bin as the start and then counts all the responses that fall between 0 and the end of the bin.
The end of the bin is always updated by taking the current bin plus the bin size minus 1. This way we always go from 0 to the end of the next bin.
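In other words, each bin's value is the count of all responses from 0 up to that bin's end. Since _BinSize_Start is always 0 and the seconds are non-negative, the lower-bound comparison is redundant, and the same measure could be sketched with a single upper-bound predicate (the boolean filter expands to the same FILTER ( ALL ( ... ) ) form, so results are unchanged):

```dax
Cumulative_Responses_Frequency =
VAR _BinSize_End =
    SELECTEDVALUE ( Parameter[Parameter] ) + 'Bin Size'[Bin Size Value] - 1
VAR _ResponsesNr =
    CALCULATE (
        COUNT ( Data[Response Seconds] ),
        -- single upper bound: the >= 0 check is implied for non-negative seconds
        Data[Response Seconds] <= _BinSize_End
    )
RETURN
    IF (
        ISINSCOPE ( Parameter[Bins] ),
        _ResponsesNr,
        COUNT ( Data[Response Seconds] )
    )
```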
Do you have any suggestion on how to make this faster?
Thanks,
Carlotta.