Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
ryan1919
Frequent Visitor

trouble "thinking in M" WRT find duplicates, count data, keep highest number

PBIC,

 

Below link is to an .xlsx file with two tables.  Table one is "sample" data and table two is "expected results".

 

https://1drv.ms/x/s!AiY07PwRMPgKi1lVuJXs3aW8K81I?e=u7MJgH 

 

In "sample" data table, there are multiple duplicate Address entries.  To the right are four columns, each a "cateogry" column.  Some may have data, others may not.  It's completely random.

 

I would like to search for the Address column for duplicates, and then, scan the four category cells to the right, and which ever duplicate row has the highest number of cells that contain data, keep that 1 row, and delete the remaining duplicate Address rows.

 

I tried to color code a different example, and a note to the right hand side of what that color code represents.

 

Thank you in advance.

Ryan

1 ACCEPTED SOLUTION
dufoq3
Super User
Super User

Hi @ryan1919, try this.

 

Before

dufoq3_0-1709373122869.png

 

After:

dufoq3_1-1709373141973.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZNNS8NAEIb/ytBTA5tikzQft0ZFEdpLKyiUHrbJNl3dZCG7odRf70xIldqDKXrwIOxOklmeyXy8u1oNXrQAU0q7G7DB2PNhzmUFS8vgWhurKwbzFNAfTPAc136/HyHSEqNMl+iiE3RNi5JL1frW7OK43SLSNG9SEAVPXFWNbambncx4oRk8zGAS+N74HOt+5l2G1boQNX74PTAqnud5LYzpaj9GETm+Bd+FGGK9DqBxTwunCKrJDtTMftnHR8zqsiQu7MclRw7HQ1gwCWFJjWu5ueSlZHA3gySOwvPJbKRSkDfZK/XLG8OCegeLnMEtV4obBo/Pv4EOURgOYG5uFCdXfyQHGj7BtAutjegkEEdocr7dHqY7bU/uwA9ybWUJNd9spCVZhNHnDUqt4pXlDO5TQB2FlMAQHw6gcXHHXbq2qa0SX6T6H/kj8vod", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Address = _t, Phone = _t, Website = _t, Reviews = _t, Email = _t]),
    GroupedRows = Table.Group(Source, {"Name", "Address"}, {{"All", each 
          Table.RemoveColumns(Table.FromRecords({ Table.Max(Table.AddColumn(_,"Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <> "" and y <> null))), "Count") }), {"Count"}), type table}}),
    CombinedTables = Table.Combine(GroupedRows[All])
in
    CombinedTables

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

2 REPLIES 2
dufoq3
Super User
Super User

Hi @ryan1919, try this.

 

Before

dufoq3_0-1709373122869.png

 

After:

dufoq3_1-1709373141973.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZNNS8NAEIb/ytBTA5tikzQft0ZFEdpLKyiUHrbJNl3dZCG7odRf70xIldqDKXrwIOxOklmeyXy8u1oNXrQAU0q7G7DB2PNhzmUFS8vgWhurKwbzFNAfTPAc136/HyHSEqNMl+iiE3RNi5JL1frW7OK43SLSNG9SEAVPXFWNbambncx4oRk8zGAS+N74HOt+5l2G1boQNX74PTAqnud5LYzpaj9GETm+Bd+FGGK9DqBxTwunCKrJDtTMftnHR8zqsiQu7MclRw7HQ1gwCWFJjWu5ueSlZHA3gySOwvPJbKRSkDfZK/XLG8OCegeLnMEtV4obBo/Pv4EOURgOYG5uFCdXfyQHGj7BtAutjegkEEdocr7dHqY7bU/uwA9ybWUJNd9spCVZhNHnDUqt4pXlDO5TQB2FlMAQHw6gcXHHXbq2qa0SX6T6H/kj8vod", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Address = _t, Phone = _t, Website = _t, Reviews = _t, Email = _t]),
    GroupedRows = Table.Group(Source, {"Name", "Address"}, {{"All", each 
          Table.RemoveColumns(Table.FromRecords({ Table.Max(Table.AddColumn(_,"Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <> "" and y <> null))), "Count") }), {"Count"}), type table}}),
    CombinedTables = Table.Combine(GroupedRows[All])
in
    CombinedTables

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Vijay_A_Verma
Super User
Super User

Insert this as your next step

= Table.FromRecords(Table.Group(Source, {"Name", "Address"}, {{"All", each Record.RemoveFields(Table.Max(Table.AddColumn(_, "Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <>""))), "Count"), "Count")}})[All])

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.