Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreShape the future of the Fabric Community! Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions. Take survey.
PBIC,
Below link is to an .xlsx file with two tables. Table one is "sample" data and table two is "expected results".
https://1drv.ms/x/s!AiY07PwRMPgKi1lVuJXs3aW8K81I?e=u7MJgH
In "sample" data table, there are multiple duplicate Address entries. To the right are four columns, each a "cateogry" column. Some may have data, others may not. It's completely random.
I would like to search for the Address column for duplicates, and then, scan the four category cells to the right, and which ever duplicate row has the highest number of cells that contain data, keep that 1 row, and delete the remaining duplicate Address rows.
I tried to color code a different example, and a note to the right hand side of what that color code represents.
Thank you in advance.
Ryan
Solved! Go to Solution.
Hi @ryan1919, try this.
Before
After:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZNNS8NAEIb/ytBTA5tikzQft0ZFEdpLKyiUHrbJNl3dZCG7odRf70xIldqDKXrwIOxOklmeyXy8u1oNXrQAU0q7G7DB2PNhzmUFS8vgWhurKwbzFNAfTPAc136/HyHSEqNMl+iiE3RNi5JL1frW7OK43SLSNG9SEAVPXFWNbambncx4oRk8zGAS+N74HOt+5l2G1boQNX74PTAqnud5LYzpaj9GETm+Bd+FGGK9DqBxTwunCKrJDtTMftnHR8zqsiQu7MclRw7HQ1gwCWFJjWu5ueSlZHA3gySOwvPJbKRSkDfZK/XLG8OCegeLnMEtV4obBo/Pv4EOURgOYG5uFCdXfyQHGj7BtAutjegkEEdocr7dHqY7bU/uwA9ybWUJNd9spCVZhNHnDUqt4pXlDO5TQB2FlMAQHw6gcXHHXbq2qa0SX6T6H/kj8vod", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Address = _t, Phone = _t, Website = _t, Reviews = _t, Email = _t]),
GroupedRows = Table.Group(Source, {"Name", "Address"}, {{"All", each
Table.RemoveColumns(Table.FromRecords({ Table.Max(Table.AddColumn(_,"Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <> "" and y <> null))), "Count") }), {"Count"}), type table}}),
CombinedTables = Table.Combine(GroupedRows[All])
in
CombinedTables
Hi @ryan1919, try this.
Before
After:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZNNS8NAEIb/ytBTA5tikzQft0ZFEdpLKyiUHrbJNl3dZCG7odRf70xIldqDKXrwIOxOklmeyXy8u1oNXrQAU0q7G7DB2PNhzmUFS8vgWhurKwbzFNAfTPAc136/HyHSEqNMl+iiE3RNi5JL1frW7OK43SLSNG9SEAVPXFWNbambncx4oRk8zGAS+N74HOt+5l2G1boQNX74PTAqnud5LYzpaj9GETm+Bd+FGGK9DqBxTwunCKrJDtTMftnHR8zqsiQu7MclRw7HQ1gwCWFJjWu5ueSlZHA3gySOwvPJbKRSkDfZK/XLG8OCegeLnMEtV4obBo/Pv4EOURgOYG5uFCdXfyQHGj7BtAutjegkEEdocr7dHqY7bU/uwA9ybWUJNd9spCVZhNHnDUqt4pXlDO5TQB2FlMAQHw6gcXHHXbq2qa0SX6T6H/kj8vod", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Address = _t, Phone = _t, Website = _t, Reviews = _t, Email = _t]),
GroupedRows = Table.Group(Source, {"Name", "Address"}, {{"All", each
Table.RemoveColumns(Table.FromRecords({ Table.Max(Table.AddColumn(_,"Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <> "" and y <> null))), "Count") }), {"Count"}), type table}}),
CombinedTables = Table.Combine(GroupedRows[All])
in
CombinedTables
Insert this as your next step
= Table.FromRecords(Table.Group(Source, {"Name", "Address"}, {{"All", each Record.RemoveFields(Table.Max(Table.AddColumn(_, "Count", (x)=> List.Count(List.Select(Record.ToList(x), (y)=> y <>""))), "Count"), "Count")}})[All])
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
User | Count |
---|---|
25 | |
13 | |
12 | |
11 | |
8 |
User | Count |
---|---|
44 | |
27 | |
19 | |
16 | |
12 |