Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

View all the Fabric Data Days sessions on demand. View schedule

Reply
Cado_one
Resolver III
Resolver III

Conditionnaly remove duplicates

Hello !

 

I face an issue with one table, I will try to reproduce this issue as a fictive exercise below.

Here is a table :

IDNAMESOURCE
1PLANT AMYSQL
2PLANT B1MYSQL
2PLANT B2

MYSQL

3PLANT CMYSQL
4PLANT DWEB API
2PLANT B3WEB API
5PLANT EWEB API

As you can see there are duplicate IDs for the plants with ID = 2. The exercise consists in removing the ID duplicates from the second source WEB API only.

In this case, the PLANT B3 would be removed but the PLANT B1 and PLANT B2 would remain in the table.

How would you solve this case ?

One rule : I don't want to filter on column NAME to remove directly the PLANT B3, I need it to be automatic.

 

Thanks in advance.

Best regards,

CADO

1 ACCEPTED SOLUTION
AlexisOlson
Super User
Super User

Split the table into WEB API and not WEB API rows. In the former, remove rows with IDs that exist in the latter and then append them together.

 

Try pasting this into the Advanced Editor of a new blank query to see the steps:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUQrwcfQLUXAEsnwjgwN9lGJ1opWM4OJOhrgkjFAkjOESzijiJnBxFyAr3NVJwTHAE90oYzQpU7iUK7JMLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, NAME = _t, SOURCE = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"NAME", type text}, {"SOURCE", type text}}),
    #"Filtered NonWEB" = Table.SelectRows(#"Changed Type", each ([SOURCE] <> "WEB API")),
    #"Filtered WEBAPI" = Table.SelectRows(#"Changed Type", each ([SOURCE] = "WEB API")),
    #"Removed Duplicates" = Table.SelectRows(#"Filtered WEBAPI", each not List.Contains(#"Filtered NonWEB"[ID], [ID])),
    #"Appended Query" = Table.Combine({#"Filtered NonWEB", #"Removed Duplicates"})
in
    #"Appended Query"

 

For large datasets, it's probably more efficient to use an anti-join rather than List.Contains but the idea is the same.

 

View solution in original post

2 REPLIES 2
Cado_one
Resolver III
Resolver III

Hi @AlexisOlson 

 

Thank you for the trick it's working !

A little bit slow with 200 records so I will try the anti-join option.

 

Have a good day !

Cado

AlexisOlson
Super User
Super User

Split the table into WEB API and not WEB API rows. In the former, remove rows with IDs that exist in the latter and then append them together.

 

Try pasting this into the Advanced Editor of a new blank query to see the steps:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUQrwcfQLUXAEsnwjgwN9lGJ1opWM4OJOhrgkjFAkjOESzijiJnBxFyAr3NVJwTHAE90oYzQpU7iUK7JMLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, NAME = _t, SOURCE = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"NAME", type text}, {"SOURCE", type text}}),
    #"Filtered NonWEB" = Table.SelectRows(#"Changed Type", each ([SOURCE] <> "WEB API")),
    #"Filtered WEBAPI" = Table.SelectRows(#"Changed Type", each ([SOURCE] = "WEB API")),
    #"Removed Duplicates" = Table.SelectRows(#"Filtered WEBAPI", each not List.Contains(#"Filtered NonWEB"[ID], [ID])),
    #"Appended Query" = Table.Combine({#"Filtered NonWEB", #"Removed Duplicates"})
in
    #"Appended Query"

 

For large datasets, it's probably more efficient to use an anti-join rather than List.Contains but the idea is the same.

 

Helpful resources

Announcements
November Power BI Update Carousel

Power BI Monthly Update - November 2025

Check out the November 2025 Power BI update to learn about new features.

Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.