Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
I'm trying to clean up some data in my Data Set and remove rows that are no longer needed based on prior steps, but I'm having trouble coming up with the specific syntax
In my data, I have multiple rows for an ID. In prior steps I have created null values in one column, while leaving other values in that column. I am looking to remove the rows with the null values, but only when there are other rows for that ID that do not have a null value. I want to maintain the rows where this is only 1 row for an ID, that has a null value in that column. There will never be multiple rows for an ID that contain null in the specfic column, as I have already removed duplicate rows based on ID and that column.
To do this, I feel that I need to use the Group By on the ID, and create an index. Then, if there is only a single index value for the ID, leave the row as is. However, if there are multiple rows for that ID, remove the row that corresponds to the index that has the null value. This last part is where I am getting stuck and any help is appreciated.
Example data
ID | Column |
1 | |
2 | |
2 | Value1 |
3 | Value2 |
4 | |
5 | |
5 | Value3 |
5 | Value4 |
Desired Result
ID | Column |
1 | |
2 | Value1 |
3 | Value2 |
4 | |
5 | Value3 |
5 | Value4 |
Solved! Go to Solution.
hi @DerhakRon ,
create a blank query. Copy and paste the below code into the advanced editor.
let
// set up input
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nRR0lFyzs8pzc1TitWJVjIEchXALCMUVlhiTmmqIZhrDOMagbkmcHWmKCywEmNUrolSbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Example data" = _t, #"(blank)" = _t]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Column", type text}}),
// sort "column" ascending
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column", Order.Ascending}}),
// group rows by "ID", get all other data. Count distinct rows.
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"ID"}, {{"Data", each _, type table [ID=nullable number, Column=nullable text]}, {"DistinctRowCount", each Table.RowCount(Table.Distinct(_)), Int64.Type}}),
// Add an index column
Transformed_Data = Table.AddColumn(#"Grouped Rows", "Transformed_Data", each Table.AddIndexColumn([Data], "Index",0,1 )),
//remove and unwanted columns and expand table
#"Removed Other Columns" = Table.SelectColumns(Transformed_Data,{"Transformed_Data", "DistinctRowCount"}),
#"Expanded Transformed_Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Transformed_Data", {"ID", "Column", "Index"}, {"ID", "Column", "Index"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Transformed_Data",{{"Index", Int64.Type}, {"Column", type text}, {"ID", Int64.Type}}),
// create the filter criteria and filter
#"Add FilterCritera" = Table.AddColumn(#"Changed Type1", "FilterCritera", each if [DistinctRowCount] <> 1 and [Index] =0 then "Y" else "N"),
#"FilterCritera = N" = Table.SelectRows(#"Add FilterCritera", each ([FilterCritera] = "N")),
#"Removed Other Columns1" = Table.SelectColumns(#"FilterCritera = N",{"Column", "ID"}),
#"Sorted Rows1" = Table.Sort(#"Removed Other Columns1",{{"ID", Order.Ascending}})
in
#"Sorted Rows1"
let me know if this meets the requirement.
hi @DerhakRon ,
create a blank query. Copy and paste the below code into the advanced editor.
let
// set up input
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nRR0lFyzs8pzc1TitWJVjIEchXALCMUVlhiTmmqIZhrDOMagbkmcHWmKCywEmNUrolSbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Example data" = _t, #"(blank)" = _t]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Column", type text}}),
// sort "column" ascending
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column", Order.Ascending}}),
// group rows by "ID", get all other data. Count distinct rows.
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"ID"}, {{"Data", each _, type table [ID=nullable number, Column=nullable text]}, {"DistinctRowCount", each Table.RowCount(Table.Distinct(_)), Int64.Type}}),
// Add an index column
Transformed_Data = Table.AddColumn(#"Grouped Rows", "Transformed_Data", each Table.AddIndexColumn([Data], "Index",0,1 )),
//remove and unwanted columns and expand table
#"Removed Other Columns" = Table.SelectColumns(Transformed_Data,{"Transformed_Data", "DistinctRowCount"}),
#"Expanded Transformed_Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Transformed_Data", {"ID", "Column", "Index"}, {"ID", "Column", "Index"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Transformed_Data",{{"Index", Int64.Type}, {"Column", type text}, {"ID", Int64.Type}}),
// create the filter criteria and filter
#"Add FilterCritera" = Table.AddColumn(#"Changed Type1", "FilterCritera", each if [DistinctRowCount] <> 1 and [Index] =0 then "Y" else "N"),
#"FilterCritera = N" = Table.SelectRows(#"Add FilterCritera", each ([FilterCritera] = "N")),
#"Removed Other Columns1" = Table.SelectColumns(#"FilterCritera = N",{"Column", "ID"}),
#"Sorted Rows1" = Table.Sort(#"Removed Other Columns1",{{"ID", Order.Ascending}})
in
#"Sorted Rows1"
let me know if this meets the requirement.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.
Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.
User | Count |
---|---|
14 | |
13 | |
12 | |
12 | |
12 |