Solved: Comparing data based on certain column

Anonymous · ‎06-01-2022

Hi there!

I have a huge table with employee data from two data sources (SharePoint, SAP).

My data looks like this:

ID

Name

CC

Entry date

Leaving date

Div

BU

Quitting reason

Quitting type

Work hours

FTE

Location

Position

DataSource

563475

Paul Smith

597

2019.03.04

2021.04.18

Fin.

MCCK

resignation

Voluntary

40

1

Toronto

Application Administrator

SharePoint

563475

Paul Smith

597

2019.03.04

2021.04.18

Fin.

MCCK

resignation

Voluntary

40

1

Toronto

Application Administrator

SAP

89567

John Cook

5896

2020.03.07

2021.04.18

IT

Cloud

resignation

Voluntary

40

1

Toronto

Developer

Sharepoint

89567

John Cook

5896

2020.03.07

2021.04.19

IT

Cloud

resignation

Voluntary

40

1

Toronto

Developer

SAP

I need a function that compares the two rows for each employee and checks if there's difference in any column apart from the DataSource. The output should be in a new column that indicates if there's an inconsistency in the data from the 2 different data sources ("Yes"/"No"). It would be useful to have an additional column, which is empty by default, but in case of difference it would show the name of the column that differs (for this example, it would be empty for the first two rows, and contain "Leaving date" for the third and fourth)

Any idea how to achive this with Power Query?

Thanks a lot.

AlexisOlson · ‎06-03-2022

I think I found a cleaner method.

The key is reshaping the data into this format.

Then filtering for the mismatching rows concatenate the column names for the mismatching column names.

Here's the full code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("zZHNasMwEIRfRfgcjGxLtnU0LoH+UUNCLyEH0YhaRNEaed3St+9K7am0hUAoPWh2Bgn2Y7TbZbKuRCOzVTboxbHNyeJIQaqGtOSFynmVc5FCWZDLi5bC2vqcxn3f39IIZrbPXqMFT+kR3OJRhzfygpMUdLYQwCOQ66bJ2af0mHWHk/V2xqARAt1tRh3MANZjtl/9N7ZuSFCtknUEuIHRsx7gGJFaVX9g8MTUfGW63pL0DpbDuUhX5sU4mMw39ZxNoi5H8llGWUnR1qn0MCO70zNGiEq0F63jAVG/6rjXeAuB/VLKj0TlX3xQrGX/Dg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, CC = _t, #"Entry date" = _t, #"Leaving date" = _t, Div = _t, BU = _t, #"Quitting reason" = _t, #"Quitting type" = _t, #"Work hours" = _t, FTE = _t, Location = _t, Position = _t, DataSource = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"CC", Int64.Type}, {"Entry date", type date}, {"Leaving date", type date}, {"Div", type text}, {"BU", type text}, {"Quitting reason", type text}, {"Quitting type", type text}, {"Work hours", Int64.Type}, {"FTE", Int64.Type}, {"Location", type text}, {"Position", type text}, {"DataSource", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ID", "DataSource"}, "Column", "Value"),
    #"Pivoted Column" = Table.Pivot(#"Unpivoted Columns", List.Distinct(#"Unpivoted Columns"[DataSource]), "DataSource", "Value"),
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each [SAP] <> [SharePoint]),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ID"}, {{"MismatchColumns", each Text.Combine([Column], ", "), type text}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"Grouped Rows", {"ID"}, "Grouped Rows", JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"MismatchColumns"}, {"MismatchColumns"})
in
    #"Expanded Grouped Rows"

View solution in original post

AlexisOlson · ‎06-03-2022

I think I found a cleaner method.

The key is reshaping the data into this format.

Then filtering for the mismatching rows concatenate the column names for the mismatching column names.

Here's the full code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("zZHNasMwEIRfRfgcjGxLtnU0LoH+UUNCLyEH0YhaRNEaed3St+9K7am0hUAoPWh2Bgn2Y7TbZbKuRCOzVTboxbHNyeJIQaqGtOSFynmVc5FCWZDLi5bC2vqcxn3f39IIZrbPXqMFT+kR3OJRhzfygpMUdLYQwCOQ66bJ2af0mHWHk/V2xqARAt1tRh3MANZjtl/9N7ZuSFCtknUEuIHRsx7gGJFaVX9g8MTUfGW63pL0DpbDuUhX5sU4mMw39ZxNoi5H8llGWUnR1qn0MCO70zNGiEq0F63jAVG/6rjXeAuB/VLKj0TlX3xQrGX/Dg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, CC = _t, #"Entry date" = _t, #"Leaving date" = _t, Div = _t, BU = _t, #"Quitting reason" = _t, #"Quitting type" = _t, #"Work hours" = _t, FTE = _t, Location = _t, Position = _t, DataSource = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"CC", Int64.Type}, {"Entry date", type date}, {"Leaving date", type date}, {"Div", type text}, {"BU", type text}, {"Quitting reason", type text}, {"Quitting type", type text}, {"Work hours", Int64.Type}, {"FTE", Int64.Type}, {"Location", type text}, {"Position", type text}, {"DataSource", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ID", "DataSource"}, "Column", "Value"),
    #"Pivoted Column" = Table.Pivot(#"Unpivoted Columns", List.Distinct(#"Unpivoted Columns"[DataSource]), "DataSource", "Value"),
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each [SAP] <> [SharePoint]),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ID"}, {{"MismatchColumns", each Text.Combine([Column], ", "), type text}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"Grouped Rows", {"ID"}, "Grouped Rows", JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"MismatchColumns"}, {"MismatchColumns"})
in
    #"Expanded Grouped Rows"

Vijay_A_Verma · ‎06-02-2022

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("zZJRC4IwFIX/ivgcsqlT9yhGUBEIRi/Rw6iRo7krcwb9+24WURFB0EMPO/cctoePs7te+yyJ4pT5I78UvfaqRrkaA+MpakgoD0gUkHgIIUUX0AzDRJkAx6Io5jis7NTeCKfAYFqB7o0T9oQ+JigUzxIsGAfo8rbVajs89vJdo4zqnBUOLN5VtbCyBGWcvxn9G1teDlAZZ8kFYAa18QqAwwUp48kVgwxM6SvTdIlSaOh33yKN5VFqaOW9nvZez9ck/HcktzJoGP3L9zyvznsw+kjGPpDRn2/O5gw=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, CC = _t, #"Entry date" = _t, #"Leaving date" = _t, Div = _t, BU = _t, #"Quitting reason" = _t, #"Quitting type" = _t, #"Work hours" = _t, FTE = _t, Location = _t, Position = _t, DataSource = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"CC", Int64.Type}, {"Entry date", type date}, {"Leaving date", type date}, {"Div", type text}, {"BU", type text}, {"Quitting reason", type text}, {"Quitting type", type text}, {"Work hours", Int64.Type}, {"FTE", Int64.Type}, {"Location", type text}, {"Position", type text}, {"DataSource", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"DataSource"}),
    ListOfColumns = Table.ColumnNames(#"Removed Columns"),
    #"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1, Int64.Type),
    #"Calculated Modulo" = Table.TransformColumns(#"Added Index", {{"Index", each Number.Mod(_, 2), type number}}),
    #"Filtered Rows" = Table.RemoveColumns(Table.SelectRows(#"Calculated Modulo", each ([Index] = 0)),{"Index"}),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each Record.ToList(_)),
    Tbl1 = Table.ExpandListColumn(#"Added Custom", "Custom"),
    #"Filtered Rows1" = Table.RemoveColumns(Table.SelectRows(#"Calculated Modulo", each ([Index] = 1)),{"Index"}),
    #"Added Custom1" = Table.AddColumn(#"Filtered Rows1", "Custom", each Record.ToList(_)),
    Tbl2 = Table.ExpandListColumn(#"Added Custom1", "Custom"),
    Custom1 = #"Filtered Rows",
    #"Added Custom2" = Table.AddColumn(Custom1, "Custom", each ListOfColumns),
    Tbl3 = Table.ExpandListColumn(#"Added Custom2", "Custom"),
    Custom2 = Table.FromColumns(Table.ToColumns(Tbl1) & {Tbl2[Custom]} & {Tbl3[Custom]},Table.ColumnNames(Tbl1)&{"Custom1", "Custom2"}),
    #"Added Custom3" = Table.AddColumn(Custom2, "Custom.1", each [Custom]=[Custom1]),
    #"Filtered Rows2" = Table.SelectRows(#"Added Custom3", each ([Custom.1] = false)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows2", {"ID"}, {{"Not matching columns", each Text.Combine([Custom2],","), type text}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"Grouped Rows", {"ID"}, "Grouped Rows", JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Not matching columns"}, {"Not matching columns"})
in
    #"Expanded Grouped Rows"

Comparing data based on certain column

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - February 2026

FabCon Atlanta 2026

FabCon is coming to Atlanta

Comparing data based on certain column

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - February 2026

FabCon Atlanta 2026