Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
I am trying to find the earliest occurances of the highest value in a given column.
Here is the initial state of the data:
Item | Cost | Date |
a1 | 1 | 01/01/2022 |
a1 | 3 | 01/05/2022 |
a1 | 1 | 01/06/2022 |
a1 | 2 | 01/03/2023 |
a1 | 1 | 01/07/2023 |
b2 | 2 | 01/03/2021 |
b2 | 3 | 01/02/2022 |
b2 | 1 | 01/04/2022 |
b2 | 1 | 01/07/2022 |
b2 | 3 | 01/09/2022 |
b2 | 1 | 01/11/2022 |
As we can see Item a1 had it's higest cost on 01/05/2022.
Item b2 had two instances of highest cost, one in 01/02/2022, and the other on 01/09/2022. I want to capture only the first one.
So from this data I want to get the following result:
Item | Cost | Date |
a1 | 3 | 01/05/2022 |
b2 | 3 | 01/02/2022 |
I have so far used a query level action, first sorting Date Ascending, then sorting Cost Descending before deduping the whole thing, as that's a logic that works in Excel. I also used buffer.table when sorting to ensure all data is captured in both sorts.
But I also want to verify this data, so I am looking at making a calcualted column against the original version of this table.
Can anyone help with how that should look like?
As an update, I tried deduping based on the columns that were causing me issues, but it didn't help.
Still getting the above error. Any ideas?
create a temporary table like
Tmp table =
ADDCOLUMNS (
SUMMARIZE ( 'Table', 'Table'[Item], 'Table'[Date] ),
"@num rows", CALCULATE ( COUNTROWS ( 'Table' ) )
)
that should help you identify any remaining duplicates.
You can create a calculated table like
My Table =
INDEX(
1,
'Table',
ORDERBY( 'Table'[Cost], DESC, 'Table'[Date], ASC),
PARTITIONBY( 'Table'[Item]),
MATCHBY( 'Table'[Item], 'Table'[Date])
)
Just for reference, it needed a Blank argument as well, I chose "Default".
I am getting an error though:
Despite of MatchBy columns being specified, duplicated rows are encountered in INDEX's Relation parameter. This is not allowed. The current operation was cancelled because another operation in the transaction failed.
I think this might be because some Items might have the same date and rating twice, will try to dedupe that part on query level and then run this again.
User | Count |
---|---|
93 | |
83 | |
77 | |
72 | |
65 |
User | Count |
---|---|
114 | |
101 | |
96 | |
65 | |
60 |