Anonymous

Remove duplicates in column E, based on condition of column B and C

[Screenshot: bwang2_0-1708693391012.png]

I have columns A to E.

When I remove duplicates on the last column (E), my desired outcome is to keep, for each value in E, the one row that has "NA" in columns B and C. Instead, Power Query keeps the row with empty cells in columns B and C. Some of the solutions I found say that Power Query removes duplicates based on row order, but that does not work in my case: I tried sorting columns B and C in both descending and ascending order, and neither works.

1 ACCEPTED SOLUTION
Anonymous

Hi @Anonymous

According to the description below from the Table.Distinct documentation, you can add a step that buffers the table after sorting by B and C in descending order, and then remove the duplicates on E. This will be more reliable.

Because Power Query sometimes offloads certain operations to backend data sources (known as folding), and also sometimes optimizes queries by skipping operations that aren't strictly necessary, in general there's no guarantee which specific duplicate will be preserved. For example, you can't assume that the first row with a unique set of column values will remain, and rows further down in the table will be removed. If you want the duplicate removal to behave predictably, first buffer the table using Table.Buffer.

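    // Sort descending so rows with "NA" in B and C come before rows with empty cells
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column B", Order.Descending}, {"Column C", Order.Descending}}),
    // Buffer the sorted table so the row order is fixed in memory before duplicates are removed
    Custom1 = Table.Buffer(#"Sorted Rows"),
    // Remove duplicates on Column E only; the other columns do not take part in the comparison
    #"Removed Duplicates" = Table.Distinct(Custom1, {"Column E"})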
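
For anyone who wants to try these steps outside the original workbook, here is a minimal self-contained sketch. The sample rows and the step name Buffered are assumptions made up for illustration (the real data is only visible in the screenshot), so treat it as a way to test the idea rather than as the exact query.

```powerquery
let
    // Hypothetical sample data: each value in Column E appears twice,
    // once with empty B/C and once with "NA" in B/C
    Source = #table(
        type table [#"Column A" = number, #"Column B" = text, #"Column C" = text, #"Column D" = text, #"Column E" = text],
        {
            {1, "",   "",   "x", "K1"},
            {2, "NA", "NA", "x", "K1"},
            {3, "",   "",   "y", "K2"},
            {4, "NA", "NA", "y", "K2"}
        }
    ),
    // Descending text order places "NA" ahead of empty strings
    #"Sorted Rows" = Table.Sort(Source, {{"Column B", Order.Descending}, {"Column C", Order.Descending}}),
    // Materialize the sorted rows so the following step works on a fixed order
    Buffered = Table.Buffer(#"Sorted Rows"),
    // Deduplicate on Column E only
    #"Removed Duplicates" = Table.Distinct(Buffered, {"Column E"})
in
    #"Removed Duplicates"
```

With this sample, the expectation is that one row per Column E value remains and that it is the "NA" row. Note that the descending sort only helps when "NA" sorts above every value you want to discard.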

Best Regards,
Jing
2 REPLIES 2
lbendlin
Super User

Sorting and then removing duplicates will work, sort of, but it will not be 100% reliable:

1. Sort by B and C descending.

2. Select E and Remove Duplicates.

It is better to use grouping and filters.
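
The reply does not spell out the grouping approach, so the following is only one possible reading of it: group by Column E, then within each group keep a row that has "NA" in both B and C, falling back to the first row of the group if no such row exists. The sample data, the helper name pickRow, and the fallback rule are all assumptions for illustration.

```powerquery
let
    // Hypothetical sample data (same shape as columns A to E in the question)
    Source = #table(
        type table [#"Column A" = number, #"Column B" = text, #"Column C" = text, #"Column D" = text, #"Column E" = text],
        {
            {1, "",   "",   "x", "K1"},
            {2, "NA", "NA", "x", "K1"},
            {3, "",   "",   "y", "K2"},
            {4, "NA", "NA", "y", "K2"},
            {5, "",   "",   "z", "K3"}
        }
    ),
    // Hypothetical helper: choose the row to keep from one group
    pickRow = (grp as table) as table =>
        let
            // Filter the group down to rows with "NA" in both columns
            preferred = Table.SelectRows(grp, each [Column B] = "NA" and [Column C] = "NA")
        in
            // Assumed fallback: if no "NA" row exists, keep the group's first row
            if Table.IsEmpty(preferred) then Table.FirstN(grp, 1) else Table.FirstN(preferred, 1),
    // One nested one-row table per distinct value in Column E
    Grouped = Table.Group(Source, {"Column E"}, {{"Rows", pickRow, type table}}),
    // Stack the kept rows back into a single table
    Result = Table.Combine(Grouped[Rows])
in
    Result
```

Because the row to keep is selected by an explicit condition instead of by position, this version does not depend on sort order or on Table.Buffer.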
