The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
In my recent quest to create or catalog as many DAX equivalents for Excel functions, this was an interesting one because, well, frankly, I can't make heads or tails of how Excel is calculating the 1st and 3rd quartiles. Whatever method it is using does not seem to agree with any other method out there that I can find such as here or here or literally anywhere I have looked. No methods I have tried are able to replicate the numbers coming back from Excel's QUARTILE function. But, all other methods seem to agree on the 1st and 3rd quartile numbers for the set 1, 2, 4, 7, 8, 9, 10, 12. They all say that the 1st quartile is 3 and the 3rd quartile is 9.5. Excel gives 3.5 and 9.25 respectively. Note that QUARTILE.EXC gives 2.5 and 9.75 so that doesn't agree with anything either. So I guess, to *bleep* with Excel's QUARTILE function??
Anyway, here is an DAX implementation of QUARTILE that agrees with every other method out there other than Excel...
QUARTILE =
VAR __Values = SELECTCOLUMNS('Table',"Values",[Column1])
VAR __Quart = MAX('Quartiles'[Quart])
VAR __Median = MEDIANX(__Values,[Values])
VAR __Quartile =
SWITCH(__Quart,
0,MINX(__Values,[Values]),
2,__Median,
4,MAXX(__Values,[Values]),
1,MEDIANX(FILTER(__Values,[Values] < __Median),[Values]),
3,MEDIANX(FILTER(__Values,[Values] > __Median),[Values])
)
RETURN
__Quartile
Perhaps someone from Microsoft's Excel team can mosey along and explain the formula behind Excel's quartile function? Has nobody ever noticed that it gives different results than literally every other quartile calculator/calculation?? Maybe all is a strong word. All of the ones I looked at, over a dozen.
Something else that bugs me, all of the documentation on QUARTILE.INC, QUARTILE.EXC, PERCENTILE.INC, PERCENTILE.EXC all focus on the "inclusive/exclusive" part about the kth values from 0..1. Except that seems like the least important part to me because there are clearly different methods going on here in terms of how these functions compute the quartiles/percentiles because you can get very different answers, especially when dealing with even numbers of items. The fact that you can't use 0 and 1 in one of them seems like the last thing that you would want to explain but rather explain why the calculated values are different?
And another thing with regard to the "interpolation", apparently that is why the numbers generated for the 1st and 3rd quartiles in Excel varies from the way everybody else does it so how exactly is this interpolation happening and why is it better or worse than the way everyone else seems to do it?
eyJrIjoiNGY1MzNjNTYtNWQxMy00NmM0LThmMmUtZTIwNjg2YzhiZDcxIiwidCI6IjRhMDQyNzQzLTM3M2EtNDNkMi04MjdiLTAwM2Y0YzdiYTFlNSIsImMiOjN9
WOW!!!
I was so confused with why finding Quartiles with PERCENTILEX.INC dont match with Box Plot visuals Quartiles
But your solution fixed it..Thanks!!
The only small fix required in your code is you will need to include the median if there are odd sets of data points and exclude them if there are even sets..
QUARTILE =
VAR __Values = SELECTCOLUMNS('Table',"Values",[Column1])
VAR _Count = COUNTROWS(_Values)
VAR _reminder = MOD(_Count,2)
VAR __Quart = MAX('Quartiles'[Quart])
VAR __Median = MEDIANX(__Values,[Values])
VAR __Quartile_Even =
SWITCH(__Quart,
0,MINX(__Values,[Values]),
2,__Median,
4,MAXX(__Values,[Values]),
1,MEDIANX(FILTER(__Values,[Values] < __Median),[Values]),
3,MEDIANX(FILTER(__Values,[Values] > __Median),[Values])
)
VAR __Quartile_Odd =
SWITCH(__Quart,
0,MINX(__Values,[Values]),
2,__Median,
4,MAXX(__Values,[Values]),
1,MEDIANX(FILTER(__Values,[Values] <= __Median),[Values]),
3,MEDIANX(FILTER(__Values,[Values] >= __Median),[Values])
)
RETURN
IF( _reminder = 0,__Quartile_Even, __Quartile_Odd)
I have a database with a lot of agents per day per skill, and for everydate everyone answer calls, one day one of them could receive different anount of calls in comparison to other dates, so I can create a table with Average of all calls, but when I try to find out Quartiles, for all agents in a determined period, the formula is not working
someone can hell phere?