cancel
Showing results for 
Search instead for 
Did you mean: 

Fabric is Generally Available. Browse Fabric Presentations. Work towards your Fabric certification with the Cloud Skills Challenge.

Reply
RokuCap
Helper I
Helper I

Comparing Multiple Columns for Duplicate Rows to Identify Mismatching Data

I have appended tables from 2 databases only kept the rows with duplicate values (A & B). The goal is to compare the values (Field1, Field2 and Field3) for the duplicate values in the "ID" column and identify discrepancies. Then have a conditional formula to describe which columns do not match. In the case where there are multiple mismatching columns, then this will be concatenated in the "Comment": column text e.g. "Field1 Mismatch; Field2 Mismatch".

 

My initial attempt involved keeping the tables separate, merging them based on the ID value and then adding conditional check columns for each field. However, due to how large these tables are, I'm trying to minimize my use of the "Merge" function.

 
SourceIDField1Field2Field3Comment
Database1A1JapanYesField2 Mismatch
Database2A1CanadaYesField2 Mismatch
Database1B3AustraliaNoField1 Mismatch; Field2 Mismatch
Database2B1New ZealandNoField1 Mismatch; Field2 Mismatch
 

Thanks in advance!

 

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

1 REPLY 1
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
PBI November 2023 Update Carousel

Power BI Monthly Update - November 2023

Check out the November 2023 Power BI update to learn about new features.

Community News

Fabric Community News unified experience

Read the latest Fabric Community announcements, including updates on Power BI, Synapse, Data Factory and Data Activator.

Power BI Fabric Summit Carousel

The largest Power BI and Fabric virtual conference

130+ sessions, 130+ speakers, Product managers, MVPs, and experts. All about Power BI and Fabric. Attend online or watch the recordings.

Top Solution Authors