Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
markmess77
Resolver I
Resolver I

How does remove duplicates work?

I'm looking to remove duplicates in a table, but I want to make sure that the rows that I am keeping are the ones that I actually want.

For example, I have something like the following where I have an acccount ID and two employees assigned to the account

IDEmployee 1Employee 2
123JoeEddie
456FrankFelicia
789MonicaBetty
789MonicaBetty
789HannahBrett
123Joe

Hank

 

I'm under the impression that if I remove duplicates within Power Query, it will always keep the last instance. I am also under the assumption that whenever I append new data to Power BI, the most recent data is added to the bottom of each table. From the example above, I would want to keep the last instance of each of the ID columns as they contain the most up-to-date employee assignments. Essentially, I want to be certain that once I drop the duplicates of ID, I always return the below

 

IDEmployee 1Employee 2
456FrankFelicia
789HannahBrett

123

JoeHank

 

I want to be certain that everything is acting as I expect and I am presenting the most relevant and up-to-date data.

Is someone able to confirm that the drop duplicates function will always act this way?

2 ACCEPTED SOLUTIONS
Anonymous
Not applicable

Hi @markmess77,

You can refer to the following sample code to achieve your requirement:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJR8spPBZKuKSmZqUqxOtFKJqZmQL5bUWJeNohOzclMzkwEy5hbWAJFfPPzMpMTgQyn1JKSSmIkPBLz8hIzQBJFQBmwBLLVHiCbYmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, #"Employee 1" = _t, #"Employee 2" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Employee 1", type text}, {"Employee 2", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.LastN(_,1), type table [ID= number, Employee 1=text, Employee 2=text]}}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"Employee 1", "Employee 2"}, {"Employee 1", "Employee 2"})
in
    #"Expanded Count"

Logic:

1. Group table rows based on ID fields.
2. Keep the last row of each grouped tables.
3. Expand filtered group table records.

Regards,

Xiaoxin Sheng

View solution in original post

Ashish_Mathur
Super User
Super User

Hi,

This M code works

let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Employee 1", type text}, {"Employee 2", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1),
    #"Grouped Rows" = Table.Group(#"Added Index", {"ID"}, {{"Max of index", each List.Max([Index]), type number}}),
    Joined = Table.Join(#"Added Index", "ID", #"Grouped Rows", "ID"),
    #"Added Custom" = Table.AddColumn(Joined, "Test", each if [Index]=[Max of index] then "Keep" else "Discard"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Test] = "Keep")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index", "Max of index", "Test"})
in
    #"Removed Columns"

Hope this helps.

Untitled.png


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

View solution in original post

4 REPLIES 4
Ashish_Mathur
Super User
Super User

Hi,

This M code works

let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Employee 1", type text}, {"Employee 2", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1),
    #"Grouped Rows" = Table.Group(#"Added Index", {"ID"}, {{"Max of index", each List.Max([Index]), type number}}),
    Joined = Table.Join(#"Added Index", "ID", #"Grouped Rows", "ID"),
    #"Added Custom" = Table.AddColumn(Joined, "Test", each if [Index]=[Max of index] then "Keep" else "Discard"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Test] = "Keep")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index", "Max of index", "Test"})
in
    #"Removed Columns"

Hope this helps.

Untitled.png


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
Anonymous
Not applicable

Hi @markmess77,

You can refer to the following sample code to achieve your requirement:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJR8spPBZKuKSmZqUqxOtFKJqZmQL5bUWJeNohOzclMzkwEy5hbWAJFfPPzMpMTgQyn1JKSSmIkPBLz8hIzQBJFQBmwBLLVHiCbYmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, #"Employee 1" = _t, #"Employee 2" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Employee 1", type text}, {"Employee 2", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.LastN(_,1), type table [ID= number, Employee 1=text, Employee 2=text]}}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"Employee 1", "Employee 2"}, {"Employee 1", "Employee 2"})
in
    #"Expanded Count"

Logic:

1. Group table rows based on ID fields.
2. Keep the last row of each grouped tables.
3. Expand filtered group table records.

Regards,

Xiaoxin Sheng

camargos88
Community Champion
Community Champion

Hi @markmess77 ,

 

Try this code:

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJR8spPBZKuKSmZqUqxOtFKJqZmQL5bUWJeNohOzclMzkwEy5hbWAJFfPPzMpMTgQyn1JKSSmIkPBLz8hIzQBJFQBmwBLLVHiCbYmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, #"Employee 1" = _t, #"Employee 2" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Employee 1", type text}, {"Employee 2", type text}}),
Group = Table.Group(#"Changed Type", {"ID"}, {{"Rows", each Table.AddIndexColumn(_, "Index", 1), type table}}),
#"Expanded Rows" = Table.ExpandTableColumn(Group, "Rows", {"Employee 1", "Employee 2", "Index"}, {"Employee 1", "Employee 2", "Index"}),
#"Added Custom" = Table.AddColumn(#"Expanded Rows", "Custom", each Table.Group(#"Expanded Rows", {"ID"}, {{"Max", each List.Max([Index]), type number}})),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"ID", "Max"}, {"ID.1", "Max"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Custom", each if [ID] = [ID.1] and [Index] = [Max] then 1 else 0),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Custom] = 1)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index", "ID.1", "Max"})
in
#"Removed Columns"

 

It basically creates an index for each ID and filter to the max value for each ID.

 

I hope it helps you,

 

Ricardo



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



Greg_Deckler
Community Champion
Community Champion

Add an Index and then you can be certain which one. I thought it kept the first row it found.


Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors