The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi
I am trying to get the value at particular indexes in a dataset. This is as part of calculating the confidence interval for the median. This data will change based on slicers.
We are currently using this DAX formula to get the value:
The issue we are facing is that using the RANKX approach to get the 5th element fails if there are duplicates in the data. For example, if the data is like this:
1
2
2
3
3
3
4
5
5
6
7
8
We need the 5th value, that is 3. But instead, we get the 5th ranked value, which is 5.
Is there a way to get the nth value instead of using RANKX? Creating an index column wouldn't work as we would be running this on filtered data. What do we do?
Solved! Go to Solution.
Ok, I see what you mean...
How about:
EDIT: Instead of creating the Rank for car in step 1 below, it will probably work using a much simpler:
1) Index for Rank =SUM(Table[car]) * 10000 + SUM(Table[Index])
which avoids needing two RANKX- you will need to take into account the number of rows for the *10000 = or make it really huge like * 1000000000
1) Create a measure combining car rank and index as follows:
Index for Rank =
VAR _t =
RANKX ( ALLSELECTED ( 'Table' ), [Sum car],, ASC, DENSE )
VAR _V =
_t * 1000
+ SUM ( 'Table'[index] )
RETURN
_V
Now the rank measure to use to choose the nth value:
Rank =
RANKX(ALLSELECTED('Table'), [Index for Rank], ,ASC,Dense)
New file attached
Proud to be a Super User!
Paul on Linkedin.
Correct!
but use the method
Index for Rank =SUM(Table[car]) * 100000000000 + SUM(Table[Index])
as the first step (to avoid using the costly RANKX), and then use RANKX over this [Index for Rank].
In theory, using the car value * 10000000000 "provides" the correct order; adding the index makes each value unique.
Proud to be a Super User!
Paul on Linkedin.
You could use an index column (unique values) and then use RANK on the index value. Even if it is filtered, you should still get the correct nth value:
Attached is the sample PBIX
Proud to be a Super User!
Paul on Linkedin.
@PaulDBrown Thanks, but there's still a slight issue. Just to explain the reason for doing this. I need the nth (and mth) value in an ordered list of values to determine confidence intervals for the median.
In this case, since the index isn't ordered based on any value (e.g. car), it doesn't provide the values I want.
I used the .pbix file you had uploaded and have created another page to show the issue. For device=A and the given date range, the values of A are: 1, 1, 1, 2 in ascending order. But the index values don't match this order. So, if I wanted the 4th value (e.g.) I would need car=2 as the result. But index=4 points to car=1.
I was wondering if it would be possible to create the index based on order column?
Ok, I see what you mean...
How about:
EDIT: Instead of creating the Rank for car in step 1 below, it will probably work using a much simpler:
1) Index for Rank =SUM(Table[car]) * 10000 + SUM(Table[Index])
which avoids needing two RANKX- you will need to take into account the number of rows for the *10000 = or make it really huge like * 1000000000
1) Create a measure combining car rank and index as follows:
Index for Rank =
VAR _t =
RANKX ( ALLSELECTED ( 'Table' ), [Sum car],, ASC, DENSE )
VAR _V =
_t * 1000
+ SUM ( 'Table'[index] )
RETURN
_V
Now the rank measure to use to choose the nth value:
Rank =
RANKX(ALLSELECTED('Table'), [Index for Rank], ,ASC,Dense)
New file attached
Proud to be a Super User!
Paul on Linkedin.
Thanks, that helps a lot! I think it works.
If I've understood correctly, the theory is that you are creating a new measure that combines the index and the value such that it's unique. Am I correct?
Correct!
but use the method
Index for Rank =SUM(Table[car]) * 100000000000 + SUM(Table[Index])
as the first step (to avoid using the costly RANKX), and then use RANKX over this [Index for Rank].
In theory, using the car value * 10000000000 "provides" the correct order; adding the index makes each value unique.
Proud to be a Super User!
Paul on Linkedin.
Can you share a link to your new file?
Proud to be a Super User!
Paul on Linkedin.
It's the same file as the one you shared. I just added a slicer for device and date. And a table to show the data. I have attached the file here: https://we.tl/t-GCleyqVOhd
Hi @Anonymous ,
According to your description, here's my solution.
1. Add an index column in csv_data table in Power Query.
2.Create a measure:
Lower Bound =
MAXX (
FILTER (
ALLSELECTED ( 'csv_data' ),
RANKX (
ALLSELECTED ( 'csv_data' ),
SUMX (
FILTER (
'csv_data',
'csv_data'[Car] <= EARLIER ( 'csv_data'[Car] )
&& 'csv_data'[Index] <= EARLIER ( 'csv_data'[Index] )
),
'csv_data'[Car]
),
,
ASC
) = 5
),
'csv_data'[Car]
)
Get the correct result:
I attach my sample below for your reference.
Best Regards,
Community Support Team _ kalyj
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
@v-yanjiang-msft the issue is that if I create the index column in the beginning, the nth value in a filtered table doesn't work. It gives me wrong values.
For example, if this is a sample table:
date | device | car | index |
01/01/2022 | A | 1 | 1 |
01/02/2022 | A | 2 | 2 |
01/03/2022 | A | 1 | 3 |
01/04/2022 | A | 1 | 4 |
01/05/2022 | A | 3 | 5 |
01/01/2022 | B | 5 | 6 |
01/02/2022 | B | 1 | 7 |
01/03/2022 | B | 5 | 8 |
01/04/2022 | B | 23 | 9 |
01/05/2022 | B | 2 | 10 |
01/01/2022 | C | 7 | 11 |
01/02/2022 | C | 2 | 12 |
01/03/2022 | C | 3 | 13 |
01/04/2022 | C | 5 | 14 |
01/05/2022 | C | 54 | 15 |
The index is for all dates and devices. But the nth value I would be trying to get for would be between a range of dates and only for some devices. The formula you have shared gives me incorrect results.
But your answer gave me an idea for an approach. Would it be possible to create index columns that are based on a sort order (e.g. Car value). Then I could use the RANKX on the CarIndex instead of Car. This way, the issue of duplicates would be taken care of. And the nth value of the index would match the nth value of Car as well. Would this be possible?
@Anonymous , Try if Rank tie breaker can help
Rank Tie breaker
https://community.powerbi.com/t5/Community-Blog/Breaking-Ties-in-Rankings-with-RANKX-Using-Multiple-Columns/ba-p/918655
https://databear.com/how-to-use-the-dax-rankx-function-in-power-bi/
@amitchandak unfortunately, it doesn't. For example, in the scenario, I've mentioned, these would be the ranks for Skip and Dense options.
value | rank.Skip | rank.Dense |
1 | 1 | 1 |
2 | 2 | 2 |
2 | 2 | 2 |
3 | 4 | 3 |
3 | 4 | 3 |
3 | 4 | 3 |
4 | 7 | 4 |
5 | 8 | 5 |
5 | 8 | 5 |
6 | 10 | 6 |
7 | 11 | 7 |
8 | 12 | 8 |