Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Mdobbels
Frequent Visitor

keep rows containing duplicates in column 1 OR column 2

Hi all,

 

I would need to keep the rows that contain a duplicate in column 1 OR column 2. 

In the easy example below, I should keep rows 1,2,4 and 5 as 1 and 2 contain a duplicate in column 1 and row 4 and 5 have a duplicate in column2. 

 

Thank you in advance.

 

Column1            Column2
123                    23
123                    45
231                    66
432                    77
789                    77

1 ACCEPTED SOLUTION
CNENFRNL
Community Champion
Community Champion

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TYzJDQAhDAN7yZsHkAtqQem/DbCllfbhKBof58iYKk14xpO6VPuoOaynMNKpiETAwrNITRHJRJsrgLn2H+5OGsoe8pj2kKoL", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    Cols = Table.ToColumns(Source),
    Total = List.Count(Cols),
    #"Added Custom" = Table.AddColumn(
        Source, 
        "Duplicate", 
        (r) => List.LastN(
                List.Generate(
                () => [cnt=0, sn=0],
                each [sn]<Total and [cnt]<=1,
                each [cnt=List.Count(List.PositionOf(Cols{[sn]}, Record.ToList(r){[sn]}, Occurrence.All)), sn=[sn]+1],
                each List.Count(List.PositionOf(Cols{[sn]}, Record.ToList(r){[sn]}, Occurrence.All))
            ),1
        ){0}>1
    )
in
    #"Added Custom"

 

Screenshot 2021-05-19 155852.png


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension!

DAX is simple, but NOT EASY!

View solution in original post

2 REPLIES 2
CNENFRNL
Community Champion
Community Champion

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TYzJDQAhDAN7yZsHkAtqQem/DbCllfbhKBof58iYKk14xpO6VPuoOaynMNKpiETAwrNITRHJRJsrgLn2H+5OGsoe8pj2kKoL", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    Cols = Table.ToColumns(Source),
    Total = List.Count(Cols),
    #"Added Custom" = Table.AddColumn(
        Source, 
        "Duplicate", 
        (r) => List.LastN(
                List.Generate(
                () => [cnt=0, sn=0],
                each [sn]<Total and [cnt]<=1,
                each [cnt=List.Count(List.PositionOf(Cols{[sn]}, Record.ToList(r){[sn]}, Occurrence.All)), sn=[sn]+1],
                each List.Count(List.PositionOf(Cols{[sn]}, Record.ToList(r){[sn]}, Occurrence.All))
            ),1
        ){0}>1
    )
in
    #"Added Custom"

 

Screenshot 2021-05-19 155852.png


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension!

DAX is simple, but NOT EASY!

mahoneypat
Employee
Employee

Here's one way to do it in the query editor.  To see how it works, just create a blank query, open the Advanced Editor and replace the text there with the M code below.

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyPrRAAYyUdJQgTCNjpVgd7FImpmApI2NDDCkzM7CUibERhpS5OVjK3MISq1QsAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Column1 " = _t, Column2 = _t]),
#"Renamed Columns" = Table.RenameColumns(Source,{{"Column1 ", "Column1"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Column1", Int64.Type}, {"Column2", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each let thiscol1 = [Column1] in List.Count(List.Select(#"Changed Type"[Column1], each _ = thiscol1))),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each let thiscol2 = [Column2] in List.Count(List.Select(#"Changed Type"[Column2], each _ = thiscol2))),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Keep", each if [Custom]>1 or [Custom.1]>1 then "Y" else "N"),
#"Filtered Rows" = Table.SelectRows(#"Added Custom2", each ([Keep] = "Y")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom", "Custom.1", "Keep"})
in
#"Removed Columns"

 

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors