Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
DerhakRon
Frequent Visitor

Remove Rows Where a Null Value Exists When Using Group By if Other Rows Exist for the Group By Key

I'm trying to clean up some data in my Data Set and remove rows that are no longer needed based on prior steps, but I'm having trouble coming up with the specific syntax

 

In my data, I have multiple rows for an ID.  In prior steps I have created null values in one column, while leaving other values in that column.  I am looking to remove the rows with the null values, but only when there are other rows for that ID that do not have a null value. I want to maintain the rows where this is only 1 row for an ID, that has a null value in that column.  There will never be multiple rows for an ID that contain null in the specfic column, as I have already removed duplicate rows based on ID and that column.

 

To do this, I feel that I need to use the Group By on the ID, and create an index.  Then, if there is only a single index value for the ID, leave the row as is.  However, if there are multiple rows for that ID, remove the row that corresponds to the index that has the null value.  This last part is where I am getting stuck and any help is appreciated.

 

Example data

IDColumn
1 
2 
2Value1
3Value2
4 
5 
5Value3
5Value4

Desired Result

IDColumn
1 
2Value1
3Value2
4 
5Value3
5Value4
1 ACCEPTED SOLUTION
adudani
Super User
Super User

hi @DerhakRon ,

 

create a blank query. Copy and paste the below code into the advanced editor.

 

let
   // set up input
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nRR0lFyzs8pzc1TitWJVjIEchXALCMUVlhiTmmqIZhrDOMagbkmcHWmKCywEmNUrolSbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Example data" = _t, #"(blank)" = _t]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Column", type text}}),
   
   // sort "column" ascending
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column", Order.Ascending}}),
   // group rows by "ID", get all other data. Count distinct rows. 
    #"Grouped Rows" = Table.Group(#"Sorted Rows", {"ID"}, {{"Data", each _, type table [ID=nullable number, Column=nullable text]}, {"DistinctRowCount", each Table.RowCount(Table.Distinct(_)), Int64.Type}}),
   
   // Add an index column
    Transformed_Data = Table.AddColumn(#"Grouped Rows", "Transformed_Data", each  Table.AddIndexColumn([Data], "Index",0,1 )),
    
    //remove and unwanted columns and  expand table
    #"Removed Other Columns" = Table.SelectColumns(Transformed_Data,{"Transformed_Data", "DistinctRowCount"}),
    #"Expanded Transformed_Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Transformed_Data", {"ID", "Column", "Index"}, {"ID", "Column", "Index"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded Transformed_Data",{{"Index", Int64.Type}, {"Column", type text}, {"ID", Int64.Type}}),
    // create the filter criteria and filter 
    #"Add FilterCritera" = Table.AddColumn(#"Changed Type1", "FilterCritera", each if [DistinctRowCount] <> 1 and [Index] =0 then "Y" else "N"),
    #"FilterCritera = N" = Table.SelectRows(#"Add FilterCritera", each ([FilterCritera] = "N")),
    #"Removed Other Columns1" = Table.SelectColumns(#"FilterCritera = N",{"Column", "ID"}),
    #"Sorted Rows1" = Table.Sort(#"Removed Other Columns1",{{"ID", Order.Ascending}})
in
    #"Sorted Rows1"

adudani_0-1722542029168.png

 

 

let me know if this meets the requirement.

 

 

Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a Kudos 🙂
Kind Regards,
Avinash

View solution in original post

1 REPLY 1
adudani
Super User
Super User

hi @DerhakRon ,

 

create a blank query. Copy and paste the below code into the advanced editor.

 

let
   // set up input
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nRR0lFyzs8pzc1TitWJVjIEchXALCMUVlhiTmmqIZhrDOMagbkmcHWmKCywEmNUrolSbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Example data" = _t, #"(blank)" = _t]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Column", type text}}),
   
   // sort "column" ascending
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column", Order.Ascending}}),
   // group rows by "ID", get all other data. Count distinct rows. 
    #"Grouped Rows" = Table.Group(#"Sorted Rows", {"ID"}, {{"Data", each _, type table [ID=nullable number, Column=nullable text]}, {"DistinctRowCount", each Table.RowCount(Table.Distinct(_)), Int64.Type}}),
   
   // Add an index column
    Transformed_Data = Table.AddColumn(#"Grouped Rows", "Transformed_Data", each  Table.AddIndexColumn([Data], "Index",0,1 )),
    
    //remove and unwanted columns and  expand table
    #"Removed Other Columns" = Table.SelectColumns(Transformed_Data,{"Transformed_Data", "DistinctRowCount"}),
    #"Expanded Transformed_Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Transformed_Data", {"ID", "Column", "Index"}, {"ID", "Column", "Index"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded Transformed_Data",{{"Index", Int64.Type}, {"Column", type text}, {"ID", Int64.Type}}),
    // create the filter criteria and filter 
    #"Add FilterCritera" = Table.AddColumn(#"Changed Type1", "FilterCritera", each if [DistinctRowCount] <> 1 and [Index] =0 then "Y" else "N"),
    #"FilterCritera = N" = Table.SelectRows(#"Add FilterCritera", each ([FilterCritera] = "N")),
    #"Removed Other Columns1" = Table.SelectColumns(#"FilterCritera = N",{"Column", "ID"}),
    #"Sorted Rows1" = Table.Sort(#"Removed Other Columns1",{{"ID", Order.Ascending}})
in
    #"Sorted Rows1"

adudani_0-1722542029168.png

 

 

let me know if this meets the requirement.

 

 

Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a Kudos 🙂
Kind Regards,
Avinash

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.