Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
kzielinska
Helper I
Helper I

Complex duplication removal process

Hi,

 

I want to remove the row if name is duplicated and the surname is blank only if for this name there also is some data in surname column.

 

input:

kzielinska_0-1727894730064.png

 

desired output:

kzielinska_1-1727894756029.png

How to do that?

 

3 ACCEPTED SOLUTIONS
lbendlin
Super User
Super User

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJRMlSK1YGxoUxTrEyQAiMo2wTCjgUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [name = _t, surname = _t]),
    #"Grouped Rows" = Table.Group(Source, {"name"}, {{"surname", each let l=List.Distinct(_[surname]) in if List.Count(l)=1 then l else List.RemoveItems(l,{""}), type list}}),
    #"Expanded Rows" = Table.ExpandListColumn(#"Grouped Rows", "surname")
in
    #"Expanded Rows"

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

View solution in original post

Ashish_Mathur
Super User
Super User

Hi,

This M code works

let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"name", Int64.Type}, {"surname", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"name"}, {{"Count", each _, type table [name=nullable number, surname=nullable number]}, {"Count1", each List.NonNullCount(_[surname]), Int64.Type}}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"surname"}, {"surname"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Count", "Custom", each if [Count1]<>0 and [surname]=null then "Remove" else "Keep"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = "Keep")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Count1", "Custom"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns")
in
    #"Removed Duplicates"

Hope this helps.

Ashish_Mathur_0-1727914984643.png

 


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

View solution in original post

Anonymous
Not applicable

Hi @kzielinska 

 

Thanks for the reply from lbendlin and Ashish_Mathur , please allow me to provide another insight:

 

Method 1:

1. Create a measure as follows:

KeepRow = 
VAR _NonBlankSurnamesExist = 
CALCULATE(
    COUNTROWS(
        FILTER(
            'Table',
            NOT ISBLANK('Table'[surname])
        )
    ) + 0,
    ALLEXCEPT('Table', 'Table'[name])
)
RETURN
IF(
    _NonBlankSurnamesExist > 0 && MAX([surname]) <> BLANK() || _NonBlankSurnamesExist = 0,
    1,
    0
)

 

2. Put the measure into the visual-level filters, set up show items when the value is 1.

vxuxinyimsft_0-1727934340060.png

 

 

Method 2:

 

Create a calculated table as follows:

Table 2 = 
VAR _NonBlankSurnamesExist = 
CALCULATE(
    COUNTROWS(
        FILTER(
            'Table',
            NOT ISBLANK('Table'[surname])
        )
    ) + 0,
    ALLEXCEPT('Table', 'Table'[name])
)
VAR _KeepRow =
IF(
    _NonBlankSurnamesExist > 0 &&  MAX('Table'[surname]) <> BLANK() || _NonBlankSurnamesExist = 0,
    1,
    0
)
RETURN
SUMMARIZE(FILTER('Table', [KeepRow] = 1), 'Table'[name], 'Table'[surname])

 

Output:

vxuxinyimsft_1-1727934526124.png

 

Best Regards,
Yulia Xu

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

3 REPLIES 3
Anonymous
Not applicable

Hi @kzielinska 

 

Thanks for the reply from lbendlin and Ashish_Mathur , please allow me to provide another insight:

 

Method 1:

1. Create a measure as follows:

KeepRow = 
VAR _NonBlankSurnamesExist = 
CALCULATE(
    COUNTROWS(
        FILTER(
            'Table',
            NOT ISBLANK('Table'[surname])
        )
    ) + 0,
    ALLEXCEPT('Table', 'Table'[name])
)
RETURN
IF(
    _NonBlankSurnamesExist > 0 && MAX([surname]) <> BLANK() || _NonBlankSurnamesExist = 0,
    1,
    0
)

 

2. Put the measure into the visual-level filters, set up show items when the value is 1.

vxuxinyimsft_0-1727934340060.png

 

 

Method 2:

 

Create a calculated table as follows:

Table 2 = 
VAR _NonBlankSurnamesExist = 
CALCULATE(
    COUNTROWS(
        FILTER(
            'Table',
            NOT ISBLANK('Table'[surname])
        )
    ) + 0,
    ALLEXCEPT('Table', 'Table'[name])
)
VAR _KeepRow =
IF(
    _NonBlankSurnamesExist > 0 &&  MAX('Table'[surname]) <> BLANK() || _NonBlankSurnamesExist = 0,
    1,
    0
)
RETURN
SUMMARIZE(FILTER('Table', [KeepRow] = 1), 'Table'[name], 'Table'[surname])

 

Output:

vxuxinyimsft_1-1727934526124.png

 

Best Regards,
Yulia Xu

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Ashish_Mathur
Super User
Super User

Hi,

This M code works

let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"name", Int64.Type}, {"surname", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"name"}, {{"Count", each _, type table [name=nullable number, surname=nullable number]}, {"Count1", each List.NonNullCount(_[surname]), Int64.Type}}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"surname"}, {"surname"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Count", "Custom", each if [Count1]<>0 and [surname]=null then "Remove" else "Keep"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = "Keep")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Count1", "Custom"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns")
in
    #"Removed Duplicates"

Hope this helps.

Ashish_Mathur_0-1727914984643.png

 


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
lbendlin
Super User
Super User

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJRMlSK1YGxoUxTrEyQAiMo2wTCjgUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [name = _t, surname = _t]),
    #"Grouped Rows" = Table.Group(Source, {"name"}, {{"surname", each let l=List.Distinct(_[surname]) in if List.Count(l)=1 then l else List.RemoveItems(l,{""}), type list}}),
    #"Expanded Rows" = Table.ExpandListColumn(#"Grouped Rows", "surname")
in
    #"Expanded Rows"

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.