Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
Anonymous
Not applicable

Comparing data based on certain column

Hi there!

 

I have a huge table with employee data from two data sources (SharePoint, SAP).

My data looks like this:

IDNameCCEntry dateLeaving dateDivBUQuitting reasonQuitting typeWork hoursFTELocationPositionDataSource
563475Paul Smith5972019.03.042021.04.18Fin.MCCKresignationVoluntary401TorontoApplication AdministratorSharePoint
563475Paul Smith5972019.03.042021.04.18Fin.MCCKresignationVoluntary401TorontoApplication AdministratorSAP
89567John Cook58962020.03.072021.04.18ITCloudresignationVoluntary401TorontoDeveloperSharepoint
89567John Cook58962020.03.072021.04.19ITCloudresignationVoluntary401TorontoDeveloperSAP


I need a function that compares the two rows for each employee and checks if there's difference in any column apart from the DataSource. The output should be in a new column that indicates if there's an inconsistency in the data from the 2 different data sources ("Yes"/"No"). It would be useful to have an additional column, which is empty by default, but in case of difference it would show the name of the column that differs (for this example, it would be empty for the first two rows, and contain "Leaving date" for the third and fourth)

Any idea how to achive this with Power Query?

Thanks a lot.

1 ACCEPTED SOLUTION
AlexisOlson
Super User
Super User

I think I found a cleaner method.

 

The key is reshaping the data into this format.

AlexisOlson_1-1654273519283.png

Then filtering for the mismatching rows concatenate the column names for the mismatching column names.

 

Here's the full code:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("zZHNasMwEIRfRfgcjGxLtnU0LoH+UUNCLyEH0YhaRNEaed3St+9K7am0hUAoPWh2Bgn2Y7TbZbKuRCOzVTboxbHNyeJIQaqGtOSFynmVc5FCWZDLi5bC2vqcxn3f39IIZrbPXqMFT+kR3OJRhzfygpMUdLYQwCOQ66bJ2af0mHWHk/V2xqARAt1tRh3MANZjtl/9N7ZuSFCtknUEuIHRsx7gGJFaVX9g8MTUfGW63pL0DpbDuUhX5sU4mMw39ZxNoi5H8llGWUnR1qn0MCO70zNGiEq0F63jAVG/6rjXeAuB/VLKj0TlX3xQrGX/Dg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, CC = _t, #"Entry date" = _t, #"Leaving date" = _t, Div = _t, BU = _t, #"Quitting reason" = _t, #"Quitting type" = _t, #"Work hours" = _t, FTE = _t, Location = _t, Position = _t, DataSource = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"CC", Int64.Type}, {"Entry date", type date}, {"Leaving date", type date}, {"Div", type text}, {"BU", type text}, {"Quitting reason", type text}, {"Quitting type", type text}, {"Work hours", Int64.Type}, {"FTE", Int64.Type}, {"Location", type text}, {"Position", type text}, {"DataSource", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ID", "DataSource"}, "Column", "Value"),
    #"Pivoted Column" = Table.Pivot(#"Unpivoted Columns", List.Distinct(#"Unpivoted Columns"[DataSource]), "DataSource", "Value"),
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each [SAP] <> [SharePoint]),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ID"}, {{"MismatchColumns", each Text.Combine([Column], ", "), type text}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"Grouped Rows", {"ID"}, "Grouped Rows", JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"MismatchColumns"}, {"MismatchColumns"})
in
    #"Expanded Grouped Rows"

 

 

View solution in original post

2 REPLIES 2
AlexisOlson
Super User
Super User

I think I found a cleaner method.

 

The key is reshaping the data into this format.

AlexisOlson_1-1654273519283.png

Then filtering for the mismatching rows concatenate the column names for the mismatching column names.

 

Here's the full code:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("zZHNasMwEIRfRfgcjGxLtnU0LoH+UUNCLyEH0YhaRNEaed3St+9K7am0hUAoPWh2Bgn2Y7TbZbKuRCOzVTboxbHNyeJIQaqGtOSFynmVc5FCWZDLi5bC2vqcxn3f39IIZrbPXqMFT+kR3OJRhzfygpMUdLYQwCOQ66bJ2af0mHWHk/V2xqARAt1tRh3MANZjtl/9N7ZuSFCtknUEuIHRsx7gGJFaVX9g8MTUfGW63pL0DpbDuUhX5sU4mMw39ZxNoi5H8llGWUnR1qn0MCO70zNGiEq0F63jAVG/6rjXeAuB/VLKj0TlX3xQrGX/Dg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, CC = _t, #"Entry date" = _t, #"Leaving date" = _t, Div = _t, BU = _t, #"Quitting reason" = _t, #"Quitting type" = _t, #"Work hours" = _t, FTE = _t, Location = _t, Position = _t, DataSource = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"CC", Int64.Type}, {"Entry date", type date}, {"Leaving date", type date}, {"Div", type text}, {"BU", type text}, {"Quitting reason", type text}, {"Quitting type", type text}, {"Work hours", Int64.Type}, {"FTE", Int64.Type}, {"Location", type text}, {"Position", type text}, {"DataSource", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ID", "DataSource"}, "Column", "Value"),
    #"Pivoted Column" = Table.Pivot(#"Unpivoted Columns", List.Distinct(#"Unpivoted Columns"[DataSource]), "DataSource", "Value"),
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each [SAP] <> [SharePoint]),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ID"}, {{"MismatchColumns", each Text.Combine([Column], ", "), type text}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"Grouped Rows", {"ID"}, "Grouped Rows", JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"MismatchColumns"}, {"MismatchColumns"})
in
    #"Expanded Grouped Rows"

 

 

Vijay_A_Verma
Most Valuable Professional
Most Valuable Professional

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("zZJRC4IwFIX/ivgcsqlT9yhGUBEIRi/Rw6iRo7krcwb9+24WURFB0EMPO/cctoePs7te+yyJ4pT5I78UvfaqRrkaA+MpakgoD0gUkHgIIUUX0AzDRJkAx6Io5jis7NTeCKfAYFqB7o0T9oQ+JigUzxIsGAfo8rbVajs89vJdo4zqnBUOLN5VtbCyBGWcvxn9G1teDlAZZ8kFYAa18QqAwwUp48kVgwxM6SvTdIlSaOh33yKN5VFqaOW9nvZez9ck/HcktzJoGP3L9zyvznsw+kjGPpDRn2/O5gw=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, CC = _t, #"Entry date" = _t, #"Leaving date" = _t, Div = _t, BU = _t, #"Quitting reason" = _t, #"Quitting type" = _t, #"Work hours" = _t, FTE = _t, Location = _t, Position = _t, DataSource = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"CC", Int64.Type}, {"Entry date", type date}, {"Leaving date", type date}, {"Div", type text}, {"BU", type text}, {"Quitting reason", type text}, {"Quitting type", type text}, {"Work hours", Int64.Type}, {"FTE", Int64.Type}, {"Location", type text}, {"Position", type text}, {"DataSource", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"DataSource"}),
    ListOfColumns = Table.ColumnNames(#"Removed Columns"),
    #"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1, Int64.Type),
    #"Calculated Modulo" = Table.TransformColumns(#"Added Index", {{"Index", each Number.Mod(_, 2), type number}}),
    #"Filtered Rows" = Table.RemoveColumns(Table.SelectRows(#"Calculated Modulo", each ([Index] = 0)),{"Index"}),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each Record.ToList(_)),
    Tbl1 = Table.ExpandListColumn(#"Added Custom", "Custom"),
    #"Filtered Rows1" = Table.RemoveColumns(Table.SelectRows(#"Calculated Modulo", each ([Index] = 1)),{"Index"}),
    #"Added Custom1" = Table.AddColumn(#"Filtered Rows1", "Custom", each Record.ToList(_)),
    Tbl2 = Table.ExpandListColumn(#"Added Custom1", "Custom"),
    Custom1 = #"Filtered Rows",
    #"Added Custom2" = Table.AddColumn(Custom1, "Custom", each ListOfColumns),
    Tbl3 = Table.ExpandListColumn(#"Added Custom2", "Custom"),
    Custom2 = Table.FromColumns(Table.ToColumns(Tbl1) & {Tbl2[Custom]} & {Tbl3[Custom]},Table.ColumnNames(Tbl1)&{"Custom1", "Custom2"}),
    #"Added Custom3" = Table.AddColumn(Custom2, "Custom.1", each [Custom]=[Custom1]),
    #"Filtered Rows2" = Table.SelectRows(#"Added Custom3", each ([Custom.1] = false)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows2", {"ID"}, {{"Not matching columns", each Text.Combine([Custom2],","), type text}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"Grouped Rows", {"ID"}, "Grouped Rows", JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Not matching columns"}, {"Not matching columns"})
in
    #"Expanded Grouped Rows"

 

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.