I am working on a dataset that has an Enrollment column (either Y or N), and I would like to remove duplicate customers from the enrolled rows only (Enrollment = Y). Is there a way to remove duplicates from just the enrolled customers without splitting up the list?
Filter the Enrollment column twice: once on N and once on Y. On the filtered Y table, remove duplicates. Then append the result to the filtered N table. An index column added before filtering lets you restore the original row order afterwards.
To see this working, open a blank query, go to Home > Advanced Editor, remove everything there, and paste the code below to test. (Later, when you use the query on your own dataset, you will have to change the Source step appropriately. If you have columns other than these, delete the Changed Type step and apply Changed Type to the complete table from the UI again.)
let
    // Sample data with a Name column and an Enrollment column
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUYpUitWJVnKCs5yBLD8wC1UWIuYCF3OHsxBinnB1XiiyEDEfHHZAWP5wlhMOF0BYwRBWLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Enrollment = _t]),
    // Index each row so the original order can be restored after the append
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
    // Rows with Enrollment = N are kept as-is, duplicates included
    #"Filtered Rows" = Table.SelectRows(#"Added Index", each ([Enrollment] = "N")),
    // Rows with Enrollment = Y, deduplicated on Name
    Custom1 = Table.SelectRows(#"Added Index", each ([Enrollment] = "Y")),
    #"Removed Duplicates" = Table.Distinct(Custom1, {"Name"}),
    // Recombine both sets, restore the original order, and drop the helper index
    #"Appended Query" = Table.Combine({#"Filtered Rows", #"Removed Duplicates"}),
    #"Sorted Rows" = Table.Sort(#"Appended Query",{{"Index", Order.Ascending}}),
    #"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"Index"})
in
    #"Removed Columns"
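Because the compressed sample data above is hard to read, here is a minimal sketch of the same filter, deduplicate, and append steps wrapped in a function and applied to a small inline table. The function name RemoveEnrolledDuplicates and the sample rows are made up for illustration; the column names Name and Enrollment are taken from the query above.

```powerquery
let
    // Hypothetical helper: same steps as the query above, applied to any table
    // that has Name and Enrollment columns
    RemoveEnrolledDuplicates = (tbl as table) as table =>
        let
            Indexed = Table.AddIndexColumn(tbl, "Index", 0, 1, Int64.Type),
            NotEnrolled = Table.SelectRows(Indexed, each [Enrollment] = "N"),
            Enrolled = Table.SelectRows(Indexed, each [Enrollment] = "Y"),
            EnrolledDistinct = Table.Distinct(Enrolled, {"Name"}),
            Combined = Table.Combine({NotEnrolled, EnrolledDistinct}),
            Sorted = Table.Sort(Combined, {{"Index", Order.Ascending}}),
            Result = Table.RemoveColumns(Sorted, {"Index"})
        in
            Result,
    // Made-up sample rows: customer A is enrolled twice, customer B is not enrolled twice
    Sample = Table.FromRecords({
        [Name = "A", Enrollment = "Y"],
        [Name = "A", Enrollment = "Y"],
        [Name = "B", Enrollment = "N"],
        [Name = "B", Enrollment = "N"]
    }),
    Output = RemoveEnrolledDuplicates(Sample)
in
    Output
```

With this sample, the result should contain one row for A and both rows for B. To try it on your own data, replace Sample with a reference to your query. Note that rows whose Enrollment is neither Y nor N (for example, null) are dropped by the two filters.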
let
    // Sample data with three text columns (Colonna1..Colonna3 = Column1..Column3)
    Origine = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjZU0lGqBGJDA6VYHSDfCMo3gvJh8sZo8iZQvjGQnQfEpkB+LAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Colonna1 = _t, Colonna2 = _t, Colonna3 = _t]),
    // "Modificato tipo" = Changed Type
    #"Modificato tipo" = Table.TransformColumnTypes(Origine,{{"Colonna1", type text}, {"Colonna2", type text}}),
    // "Rimossi duplicati" = Removed Duplicates: keep one row per distinct Colonna1/Colonna2 pair
    #"Rimossi duplicati" = Table.Distinct(#"Modificato tipo",{"Colonna1","Colonna2"})
in
    #"Rimossi duplicati"
Use Table.Distinct or List.Distinct.
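As a rough illustration of the difference between the two functions, the sketch below applies each to a small made-up table (the rows are assumptions, not data from this thread). Table.Distinct with a column list removes rows that repeat on those columns, while List.Distinct removes repeated values from a single list, such as one column.

```powerquery
let
    // Made-up sample rows for illustration
    Sample = Table.FromRecords({
        [Name = "A", Enrollment = "Y"],
        [Name = "A", Enrollment = "Y"],
        [Name = "B", Enrollment = "N"]
    }),
    // Compares rows on the Name column only and keeps one row per distinct Name
    DistinctRows = Table.Distinct(Sample, {"Name"}),
    // Works on a list of values, here the Name column extracted as a list
    DistinctNames = List.Distinct(Sample[Name])
in
    [Rows = DistinctRows, Names = DistinctNames]
```

On their own, both functions deduplicate the whole table or list, so to limit the effect to rows with Enrollment = Y they still need to be combined with a filter.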