Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Analyst_in_prog
Frequent Visitor

Remove duplicates that contains null in one particular column ,but both contains null keep one

I am trying to remove duplicates on the basis of "report id ", but some records "priority level" is not null i want to keep such records in which priority level is not null , but if both duplicate records contains null then  keep any one of them.
for example the report id 894 is having 2 duplicated but only one with priority level is required .
how can i do this ..please help me out 

image.png

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @Analyst_in_prog ,

Here is my sample data:

vjunyantmsft_0-1722307193947.png


Here is the whole M code in the Advanced Editor:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUfLJLweSFhYGQLJAKVYHIoom5ISk0BBFFE3IGUmhEYoompALTMgYRQimFyHqClNoAiRL8Yt5ZKZnoIu7IZlqhiqMLuYOFbM0ISwGswpZ3AOm1hS/mCdMzAxVDG4mmjjU+XDhWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [comments = _t, #"priority level" = _t, reportID = _t, eventType = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"comments", type text}, {"priority level", type text}, {"reportID", Int64.Type}, {"eventType", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"priority level"}),
    #"Grouped Rows" = Table.Group(#"Replaced Value", {"reportID"}, {{"AllNulls", each List.NonNullCount([priority level]) = 0, type logical}}),
    #"Merged" = Table.NestedJoin(#"Replaced Value", {"reportID"}, #"Grouped Rows", {"reportID"}, "GroupedData", JoinKind.LeftOuter),
    #"Expanded GroupedData" = Table.ExpandTableColumn(Merged, "GroupedData", {"AllNulls"}, {"AllNulls"}),
    #"Removed Duplicates" = Table.Distinct(#"Expanded GroupedData", {"reportID", "priority level"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each not ([priority level] = null and [AllNulls] = false))
in
    #"Filtered Rows"

And the final output is as below:

vjunyantmsft_1-1722307462963.png

My overall idea is to first add a column that marks the reportIDs where the priority level are both null as TRUE, then delete duplicate rows for both the reportID and priority level columns, and finally delete the rows where the priority level=null and marked as FALSE.

Best Regards,
Dino Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

2 REPLIES 2
Analyst_in_prog
Frequent Visitor

hii @Anonymous ,

thank you so much for replying , to be honest  i didn't worked with Advance editor that much, and some part of your M code i didn't understand very well but i tried it and it worked .!!! 😄🤝
I Request you , if you have done this using inbuilt functions can you please tell me step by step ? or if you directly done it using M code can you please guide me where i can learn how to work with M code ?


One more thing i would be happy to connect with you on linkedin. please share your ID , otherwise sent a request to mine .
https://www.linkedin.com/in/lalit--soni/  

Thanks and Regards,

Lalit soni

Anonymous
Not applicable

Hi @Analyst_in_prog ,

Here is my sample data:

vjunyantmsft_0-1722307193947.png


Here is the whole M code in the Advanced Editor:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUfLJLweSFhYGQLJAKVYHIoom5ISk0BBFFE3IGUmhEYoompALTMgYRQimFyHqClNoAiRL8Yt5ZKZnoIu7IZlqhiqMLuYOFbM0ISwGswpZ3AOm1hS/mCdMzAxVDG4mmjjU+XDhWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [comments = _t, #"priority level" = _t, reportID = _t, eventType = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"comments", type text}, {"priority level", type text}, {"reportID", Int64.Type}, {"eventType", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"priority level"}),
    #"Grouped Rows" = Table.Group(#"Replaced Value", {"reportID"}, {{"AllNulls", each List.NonNullCount([priority level]) = 0, type logical}}),
    #"Merged" = Table.NestedJoin(#"Replaced Value", {"reportID"}, #"Grouped Rows", {"reportID"}, "GroupedData", JoinKind.LeftOuter),
    #"Expanded GroupedData" = Table.ExpandTableColumn(Merged, "GroupedData", {"AllNulls"}, {"AllNulls"}),
    #"Removed Duplicates" = Table.Distinct(#"Expanded GroupedData", {"reportID", "priority level"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each not ([priority level] = null and [AllNulls] = false))
in
    #"Filtered Rows"

And the final output is as below:

vjunyantmsft_1-1722307462963.png

My overall idea is to first add a column that marks the reportIDs where the priority level are both null as TRUE, then delete duplicate rows for both the reportID and priority level columns, and finally delete the rows where the priority level=null and marked as FALSE.

Best Regards,
Dino Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.