Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

To celebrate FabCon Vienna, we are offering 50% off select exams. Ends October 3rd. Request your discount now.

Reply
CesarOr
New Member

Removing duplicates with same ID based on another criteria in another column

Hi Everyone,

 

I'm a bit of a newbie and would like help with a problem that I cannot find a solution to.  I would like to create a table that extracts the duplicates but only returns one ID. If there was a FAILURE as a categorical text result in another column for one of the rows, it would return one row for the one ID as a FAILURE. If not, it would return as a PASS. 

 

Current table:

 

CesarOr_1-1758552241631.png

 

After/Result:

CesarOr_2-1758552259936.png

 

Thank you!

 

 

6 REPLIES 6
v-saisrao-msft
Community Support
Community Support

Hi @CesarOr,

Checking in to see if your issue has been resolved. let us know if you still need any assistance.

 

Thank you.

v-saisrao-msft
Community Support
Community Support

Hi @CesarOr,

Have you had a chance to review the solution we shared earlier? If the issue persists, feel free to reply so we can help further.

 

Thank you.

JamesLee94
Helper I
Helper I

You can use a custom Power Query step: group by ID, then pick the record with the highest (or lowest) value of your criteria-e.g., max timestamp. That way duplicates get removed based on your chosen criteria.

v-saisrao-msft
Community Support
Community Support

Hi @CesarOr,

Thank you @DataNinja777 @ivana_tomekova, for your insights.

I have reproduced the issue and received the output below. I’ve attached the PBIX file for your reference.

 

vsaisraomsft_0-1758610370250.png

Hope this helps 

 

Thank you.

DataNinja777
Super User
Super User

Hi @CesarOr ,

 

You're looking to deduplicate your table based on the ID column, with the rule that a "FAIL" status for any entry under an ID will make the final result for that ID a "FAIL". This is an excellent use case for data shaping, and the recommended approach is using Power Query, which is available in both Excel and Power BI.

To accomplish this in Power Query, you would first load your data into the editor. The main action is to group the data using the "Group By" function on the ID column. You'll configure it to create a new column, for instance named AllData, which will hold all the original rows for each ID in a nested table. After grouping, you add a custom column named FinalCategory to determine the correct status. The formula for this column checks if the list of categories within the AllData nested table contains the text "FAIL".

if List.Contains([AllData][Category], "FAIL") then "FAIL" else "PASS"

With the final status determined, you add another custom column, perhaps called FinalRow, to extract the specific row you want to keep. This is achieved with a formula that filters the nested table to find the first row whose category matches the FinalCategory you just created.

Table.First(Table.SelectRows([AllData], each ([Category] = [FinalCategory])))

Finally, you can clean up your query. Remove the temporary AllData column, then use the expand button on the FinalRow column to bring out the original Date and Test fields. After reordering the columns to your liking, you can load the result back to your worksheet or data model.

Alternatively, if you need to create this table dynamically within a Power BI report, you can use a DAX expression to create a new calculated table. You would use the "New Table" feature and input the following code, replacing 'YourTable' with your actual table name.

ResultTable =
SUMMARIZECOLUMNS (
    'YourTable'[ID],
    "FinalCategory",
    IF (
        CALCULATE ( COUNTROWS ( 'YourTable' ), 'YourTable'[Category] = "FAIL" ) > 0,
        "FAIL",
        "PASS"
    ),
    "Date",
    VAR vFinalCategory =
        IF (
            CALCULATE ( COUNTROWS ( 'YourTable' ), 'YourTable'[Category] = "FAIL" ) > 0,
            "FAIL",
            "PASS"
        )
    RETURN
        CALCULATE ( MIN ( 'YourTable'[Date] ), 'YourTable'[Category] = vFinalCategory ),
    "Test",
    VAR vFinalCategory =
        IF (
            CALCULATE ( COUNTROWS ( 'YourTable' ), 'YourTable'[Category] = "FAIL" ) > 0,
            "FAIL",
            "PASS"
        )
    VAR vFinalDate =
        CALCULATE ( MIN ( 'YourTable'[Date] ), 'YourTable'[Category] = vFinalCategory )
    RETURN
        CALCULATE (
            MIN ( 'YourTable'[Test] ),
            'YourTable'[Category] = vFinalCategory,
            'YourTable'[Date] = vFinalDate
        )
)

This DAX formula groups the data by ID and then, for each group, calculates the final category, finds the earliest date associated with that category, and retrieves the corresponding test value, constructing the exact table you need.

 

Best regards,

 

ivana_tomekova
Advocate I
Advocate I

1. Order your data in a way, that row you want to keep will be on TOP - it this case, I guess you sort table first by ID ascending and then Category ascending:  - in this case, if within multiple same IDs was one FAIL, it will be on the top

2. Now you need to apply Table.Buffer function (as power query step) so sorting from point 1 is applied and kept

 

3. You just Remove duplicated rows and that's should be it 😉

... some info can also be found here: https://community.fabric.microsoft.com/t5/Desktop/Remove-duplicates-based-on-sort-in-Power-Query/m-p... 

Helpful resources

Announcements
September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors