Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
RokuCap
Helper I
Helper I

Comparing Multiple Columns for Duplicate Rows to Identify Mismatching Data

I have appended tables from 2 databases only kept the rows with duplicate values (A & B). The goal is to compare the values (Field1, Field2 and Field3) for the duplicate values in the "ID" column and identify discrepancies. Then have a conditional formula to describe which columns do not match. In the case where there are multiple mismatching columns, then this will be concatenated in the "Comment": column text e.g. "Field1 Mismatch; Field2 Mismatch".

 

My initial attempt involved keeping the tables separate, merging them based on the ID value and then adding conditional check columns for each field. However, due to how large these tables are, I'm trying to minimize my use of the "Merge" function.

 
SourceIDField1Field2Field3Comment
Database1A1JapanYesField2 Mismatch
Database2A1CanadaYesField2 Mismatch
Database1B3AustraliaNoField1 Mismatch; Field2 Mismatch
Database2B1New ZealandNoField1 Mismatch; Field2 Mismatch
 

Thanks in advance!

 

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

1 REPLY 1
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

Top Solution Authors
Top Kudoed Authors