The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi,
I have a table with 3 columns:
[ID]: Whole Number
[Informations]: Text
[Info quality]: Whole Number, from 1 to 4
I want to keep one row for each distinct [ID], but the row I keep must be the one with the lowest number in the [Info quality] column.
So for example if I have:
ID | Informations | Info Quality |
235 | Blue | 3 |
425 | Blue | 3 |
425 | Red | 2 |
838 | White | 4 |
235 | Red | 2 |
... I want to remove row #1 (ID = 235; Info Quality = 3) instead of row #5 (ID = 235; Info Quality = 2) and remove row #2 (ID = 425; Info Quality = 3) instead of row #3 (ID = 425; Info Quality = 2), ending up with the following:
ID | Informations | Info Quality |
425 | Red | 2 |
838 | White | 4 |
235 | Red | 2 |
Thanks!
Solved! Go to Solution.
rowstobedeleted= Table.Sort(Yoursource,{{Quality, Order.Ascending}})
and then you could use Table.Distinct to remove the duplicates.
Hi,
This M code works
let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Informations", type text}, {"Info Quality", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.Min(_,"Info Quality")}}),
#"Expanded Count" = Table.ExpandRecordColumn(#"Grouped Rows", "Count", {"Informations", "Info Quality"}, {"Informations", "Info Quality"})
in
#"Expanded Count"
Hope this helps.
Correct. It can't be relied on.
Hi,
This M code works
let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Informations", type text}, {"Info Quality", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.Min(_,"Info Quality")}}),
#"Expanded Count" = Table.ExpandRecordColumn(#"Grouped Rows", "Count", {"Informations", "Info Quality"}, {"Informations", "Info Quality"})
in
#"Expanded Count"
Hope this helps.
Hi,
Your code is based off my example, which was a simplified version of my real situation because I wanted to not overburden my question.
My real table has 16 columns, and I see in your code:
#"Expanded Count" = Table.ExpandRecordColumn(#"Grouped Rows", "Count", {"Informations", "Info Quality"}, {"Informations", "Info Quality"})
... that you refer to the columns other than "ID". Do I need to insert all 15 column names twice? Or is there a short hand for doing this?
Thank!
Just throwing in a warning about relying on a sort order and using Table.Distinct. The documentation has some warnings against this.
Can you elaborate a bit? You think in my scenario, there's a risk the lowest value in the sorted column won't be the one preserved post Table.Disctinct?
rowstobedeleted= Table.Sort(Yoursource,{{Quality, Order.Ascending}})
and then you could use Table.Distinct to remove the duplicates.
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
114 | |
79 | |
78 | |
47 | |
39 |
User | Count |
---|---|
147 | |
115 | |
65 | |
64 | |
53 |