Solved: Remove duplicates in column E, based on condition ...

Anonymous · ‎02-23-2024

I have columns A to E.

When I remove duplicates on the last column (E), my desired outcome is to keep only the one entry that has "NA" in Column B and Column C, but Power Query keeps the row with empty cells in Column B and C instead. When I searched for solutions, some said that Power Query removes duplicates based on row order, but that does not work in my case: I tried sorting Columns B and C in both descending and ascending order, and neither works.

Anonymous · ‎02-26-2024

Hi @Anonymous

According to the following description from the Table.Distinct documentation, you can add a step that buffers the table after sorting by B and C in descending order, and then remove the duplicates by E. This will be more reliable.

Because Power Query sometimes offloads certain operations to backend data sources (known as folding), and also sometimes optimizes queries by skipping operations that aren't strictly necessary, in general there's no guarantee which specific duplicate will be preserved. For example, you can't assume that the first row with a unique set of column values will remain, and rows further down in the table will be removed. If you want the duplicate removal to behave predictably, first buffer the table using Table.Buffer.

    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column B", Order.Descending}, {"Column C", Order.Descending}}),
    Custom1 = Table.Buffer(#"Sorted Rows"),
    #"Removed Duplicates" = Table.Distinct(Custom1, {"Column E"})

Best Regards,
Jing

lbendlin · ‎02-23-2024

It will work, sort of, but it will not be 100% reliable.

1. Sort by B and C descending

2. Select E and Remove Duplicates.

Better to use grouping and filters.
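One possible reading of "grouping and filters" is sketched below. It is not necessarily what was meant here. The sample table, its values, and the step names (`Grouped`, `Picked`, `Result`) are hypothetical, and the rule "prefer a row where both B and C are "NA", otherwise keep the group's first row" is an assumption based on the question. Because the row is chosen by an explicit filter inside each group, the result does not depend on sort order.

```powerquery
let
    // Hypothetical sample data: column names follow the thread, values are invented.
    Source = #table(
        {"Column A", "Column B", "Column C", "Column D", "Column E"},
        {
            {1, null, null, "x", "K1"},
            {2, "NA", "NA", "x", "K1"},
            {3, null, null, "y", "K2"},
            {4, "NA", "NA", "y", "K2"},
            {5, null, null, "z", "K3"}
        }
    ),
    // Group by Column E and keep each group's rows as a nested table.
    Grouped = Table.Group(Source, {"Column E"}, {{"Rows", each _, type table}}),
    // Within each group, filter for rows where B and C are both "NA";
    // if none exist (assumed fallback), keep the group's first row.
    Picked = Table.TransformColumns(
        Grouped,
        {{"Rows", (t) =>
            let
                Preferred = Table.SelectRows(t, each [Column B] = "NA" and [Column C] = "NA")
            in
                if Table.IsEmpty(Preferred) then Table.FirstN(t, 1) else Table.FirstN(Preferred, 1)
        }}
    ),
    // Stitch the one-row tables back into a single table with the original columns.
    Result = Table.Combine(Picked[Rows])
in
    Result
```

To try it against real data, replace `Source` with the preceding step of your query (for example `#"Changed Type"`) and adjust the column names and the "NA" test to match how the empty cells are actually stored (null versus an empty string).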
