Removing duplicates with same ID based on another ...

CesarOr · ‎09-22-2025

Hi Everyone,

I'm a bit of a newbie and would like help with a problem that I cannot find a solution to. I would like to create a table that extracts the duplicates but only returns one ID. If there was a FAILURE as a categorical text result in another column for one of the rows, it would return one row for the one ID as a FAILURE. If not, it would return as a PASS.

Current table:

After/Result:

Thank you!

v-saisrao-msft

Hi @CesarOr,

Checking in to see if your issue has been resolved. let us know if you still need any assistance.

Thank you.

v-saisrao-msft

Hi @CesarOr,

Have you had a chance to review the solution we shared earlier? If the issue persists, feel free to reply so we can help further.

Thank you.

JamesLee94 · ‎09-23-2025

You can use a custom Power Query step: group by ID, then pick the record with the highest (or lowest) value of your criteria-e.g., max timestamp. That way duplicates get removed based on your chosen criteria.

v-saisrao-msft · ‎09-22-2025

Hi @CesarOr,

Thank you @DataNinja777 @ivana_tomekova, for your insights.

I have reproduced the issue and received the output below. I’ve attached the PBIX file for your reference.

Hope this helps

Thank you.

DataNinja777 · ‎09-22-2025

Hi @CesarOr ,

You're looking to deduplicate your table based on the ID column, with the rule that a "FAIL" status for any entry under an ID will make the final result for that ID a "FAIL". This is an excellent use case for data shaping, and the recommended approach is using Power Query, which is available in both Excel and Power BI.

To accomplish this in Power Query, you would first load your data into the editor. The main action is to group the data using the "Group By" function on the ID column. You'll configure it to create a new column, for instance named AllData, which will hold all the original rows for each ID in a nested table. After grouping, you add a custom column named FinalCategory to determine the correct status. The formula for this column checks if the list of categories within the AllData nested table contains the text "FAIL".

if List.Contains([AllData][Category], "FAIL") then "FAIL" else "PASS"

With the final status determined, you add another custom column, perhaps called FinalRow, to extract the specific row you want to keep. This is achieved with a formula that filters the nested table to find the first row whose category matches the FinalCategory you just created.

Table.First(Table.SelectRows([AllData], each ([Category] = [FinalCategory])))

Finally, you can clean up your query. Remove the temporary AllData column, then use the expand button on the FinalRow column to bring out the original Date and Test fields. After reordering the columns to your liking, you can load the result back to your worksheet or data model.

Alternatively, if you need to create this table dynamically within a Power BI report, you can use a DAX expression to create a new calculated table. You would use the "New Table" feature and input the following code, replacing 'YourTable' with your actual table name.

ResultTable =
SUMMARIZECOLUMNS (
    'YourTable'[ID],
    "FinalCategory",
    IF (
        CALCULATE ( COUNTROWS ( 'YourTable' ), 'YourTable'[Category] = "FAIL" ) > 0,
        "FAIL",
        "PASS"
    ),
    "Date",
    VAR vFinalCategory =
        IF (
            CALCULATE ( COUNTROWS ( 'YourTable' ), 'YourTable'[Category] = "FAIL" ) > 0,
            "FAIL",
            "PASS"
        )
    RETURN
        CALCULATE ( MIN ( 'YourTable'[Date] ), 'YourTable'[Category] = vFinalCategory ),
    "Test",
    VAR vFinalCategory =
        IF (
            CALCULATE ( COUNTROWS ( 'YourTable' ), 'YourTable'[Category] = "FAIL" ) > 0,
            "FAIL",
            "PASS"
        )
    VAR vFinalDate =
        CALCULATE ( MIN ( 'YourTable'[Date] ), 'YourTable'[Category] = vFinalCategory )
    RETURN
        CALCULATE (
            MIN ( 'YourTable'[Test] ),
            'YourTable'[Category] = vFinalCategory,
            'YourTable'[Date] = vFinalDate
        )
)

This DAX formula groups the data by ID and then, for each group, calculates the final category, finds the earliest date associated with that category, and retrieves the corresponding test value, constructing the exact table you need.

Best regards,

ivana_tomekova · ‎09-22-2025

1. Order your data in a way, that row you want to keep will be on TOP - it this case, I guess you sort table first by ID ascending and then Category ascending: - in this case, if within multiple same IDs was one FAIL, it will be on the top

2. Now you need to apply Table.Buffer function (as power query step) so sorting from point 1 is applied and kept

3. You just Remove duplicated rows and that's should be it 😉

... some info can also be found here: https://community.fabric.microsoft.com/t5/Desktop/Remove-duplicates-based-on-sort-in-Power-Query/m-p...