Solved: Re: trouble "thinking in M" WRT find duplicates, c...

ryan1919 · ‎03-01-2024

PBIC,

Below link is to an .xlsx file with two tables. Table one is "sample" data and table two is "expected results".

https://1drv.ms/x/s!AiY07PwRMPgKi1lVuJXs3aW8K81I?e=u7MJgH

In "sample" data table, there are multiple duplicate Address entries. To the right are four columns, each a "cateogry" column. Some may have data, others may not. It's completely random.

I would like to search for the Address column for duplicates, and then, scan the four category cells to the right, and which ever duplicate row has the highest number of cells that contain data, keep that 1 row, and delete the remaining duplicate Address rows.

I tried to color code a different example, and a note to the right hand side of what that color code represents.

Thank you in advance.

Ryan

dufoq3 · ‎03-02-2024

Hi @ryan1919, try this.

Before

After:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZNNS8NAEIb/ytBTA5tikzQft0ZFEdpLKyiUHrbJNl3dZCG7odRf70xIldqDKXrwIOxOklmeyXy8u1oNXrQAU0q7G7DB2PNhzmUFS8vgWhurKwbzFNAfTPAc136/HyHSEqNMl+iiE3RNi5JL1frW7OK43SLSNG9SEAVPXFWNbambncx4oRk8zGAS+N74HOt+5l2G1boQNX74PTAqnud5LYzpaj9GETm+Bd+FGGK9DqBxTwunCKrJDtTMftnHR8zqsiQu7MclRw7HQ1gwCWFJjWu5ueSlZHA3gySOwvPJbKRSkDfZK/XLG8OCegeLnMEtV4obBo/Pv4EOURgOYG5uFCdXfyQHGj7BtAutjegkEEdocr7dHqY7bU/uwA9ybWUJNd9spCVZhNHnDUqt4pXlDO5TQB2FlMAQHw6gcXHHXbq2qa0SX6T6H/kj8vod", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Address = _t, Phone = _t, Website = _t, Reviews = _t, Email = _t]),
    GroupedRows = Table.Group(Source, {"Name", "Address"}, {{"All", each 
          Table.RemoveColumns(Table.FromRecords({ Table.Max(Table.AddColumn(_,"Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <> "" and y <> null))), "Count") }), {"Count"}), type table}}),
    CombinedTables = Table.Combine(GroupedRows[All])
in
    CombinedTables

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

dufoq3 · ‎03-02-2024

Hi @ryan1919, try this.

Before

After:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZNNS8NAEIb/ytBTA5tikzQft0ZFEdpLKyiUHrbJNl3dZCG7odRf70xIldqDKXrwIOxOklmeyXy8u1oNXrQAU0q7G7DB2PNhzmUFS8vgWhurKwbzFNAfTPAc136/HyHSEqNMl+iiE3RNi5JL1frW7OK43SLSNG9SEAVPXFWNbambncx4oRk8zGAS+N74HOt+5l2G1boQNX74PTAqnud5LYzpaj9GETm+Bd+FGGK9DqBxTwunCKrJDtTMftnHR8zqsiQu7MclRw7HQ1gwCWFJjWu5ueSlZHA3gySOwvPJbKRSkDfZK/XLG8OCegeLnMEtV4obBo/Pv4EOURgOYG5uFCdXfyQHGj7BtAutjegkEEdocr7dHqY7bU/uwA9ybWUJNd9spCVZhNHnDUqt4pXlDO5TQB2FlMAQHw6gcXHHXbq2qa0SX6T6H/kj8vod", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Address = _t, Phone = _t, Website = _t, Reviews = _t, Email = _t]),
    GroupedRows = Table.Group(Source, {"Name", "Address"}, {{"All", each 
          Table.RemoveColumns(Table.FromRecords({ Table.Max(Table.AddColumn(_,"Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <> "" and y <> null))), "Count") }), {"Count"}), type table}}),
    CombinedTables = Table.Combine(GroupedRows[All])
in
    CombinedTables

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Vijay_A_Verma · ‎03-02-2024

Insert this as your next step

= Table.FromRecords(Table.Group(Source, {"Name", "Address"}, {{"All", each Record.RemoveFields(Table.Max(Table.AddColumn(_, "Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <>""))), "Count"), "Count")}})[All])

trouble "thinking in M" WRT find duplicates, count data, keep highest number

Helpful resources

FabCon Global Hackathon

Power BI Monthly Update - October 2025

FabCon Atlanta 2026

FabCon is coming to Atlanta

trouble "thinking in M" WRT find duplicates, count data, keep highest number

Helpful resources

FabCon Global Hackathon

Power BI Monthly Update - October 2025

FabCon Atlanta 2026