Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join the FabCon + SQLCon recap series. Up next: Power BI, Real-Time Intelligence, IQ and AI, and Data Factory take center stage. All sessions are available on-demand after the live show. Register now

Reply
Martin0011
Frequent Visitor

Master Data potential duplicate analysis

I have a table "Vendor_TAX" where there is below list : 

 

Supplier IDTAX1TAX2TAX3
1A  
2 A 
3  A
4B  
5B  
6 B 
7  C
8  D
9 E 
10 C 

 

I need to identify the potential duplicate crossing all taxes - for each line, it should chetk : 

- If the TAX1 = TAX1 or TAX 2 or TAX3 for another supplier

- If the TAX2 = TAX1 or TAX 2 or TAX3 for another supplier

- If the TAX3 = TAX1 or TAX 2 or TAX3 for another supplier

 

Output would like like : 

Supplier IDTAX1TAX2TAX3Duplicate GroupDuplicate List
1A  11,2,3
2 A 11,2,3
3  A11,2,3
4B  24,5,6
5B  24,5,6
6 B 24,5,6
7  C37,1
8  D  
9 E   
10 C 37,1

It would group all the supplier that match the same taxes as well as listing them. 

 

I can't do this into Power Querry or in the table because I need to formula to follow the fitler input by the end user. 

 

Any idea how to acheive this ? 

2 REPLIES 2
Anonymous
Not applicable

Hi @Martin0011 ,

Please try like:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXIEYhCK1YlWMoKwHWF8Y6gcSAjENwGynJDUm6LxzSBsJxjfHK7fGcy3gPNdwHxLCMcVpt7QAKYaJBALAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Supplier ID" = _t, TAX1 = _t, TAX2 = _t, TAX3 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Supplier ID", Int64.Type}, {"TAX1", type text}, {"TAX2", type text}, {"TAX3", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Supplier ID"}, "Attribute", "Value"),
    #"Filtered Rows" = Table.SelectRows(#"Unpivoted Columns", each ([Value] <> "")),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Value"}, {{"Data", each _, type table [Supplier ID=nullable number, Attribute=text, Value=text]}}),
    Custom1 = Table.SelectRows(#"Grouped Rows", each List.Count(List.Distinct([Data][Attribute]))>1),
    #"Added Index" = Table.AddIndexColumn(Custom1, "Duplicate Group", 1, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Duplicate List", each Text.Combine(List.Transform([Data][Supplier ID],each Text.From(_)),",")),
    #"Expanded Data" = Table.ExpandTableColumn(#"Added Custom", "Data", {"Supplier ID"}, {"Supplier ID"}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Supplier ID"}, #"Expanded Data", {"Supplier ID"}, "Expanded Data", JoinKind.LeftOuter),
    #"Expanded Expanded Data" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Data", {"Duplicate Group", "Duplicate List"}, {"Duplicate Group", "Duplicate List"})
in
    #"Expanded Expanded Data"

Output:

vcgaomsft_0-1705988587091.png

Best Regards,
Gao

Community Support Team

 

If there is any post helps, then please consider Accept it as the solution  to help the other members find it more quickly.
If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!

How to get your questions answered quickly --  How to provide sample data in the Power BI Forum -- China Power BI User Group

It works super great and give me idea. But it would 100% fit the requirement as this is stored in the query. 

It must be dynamic based on the filter the end user will define. 

 

Let's assume I have a Active/Not Active dimension : 

 

Supplier IDStatus

1Active
2Not Active
3Active
4Not Active
5Not Active
6Not Active
7Active
8Active
9Active
10Active

 

Based on the above, the result should remove all the Not Active prior to attirbute a group and define the duplicate list. 

Reason why I was looking to a measure. 

Helpful resources

Announcements
April Power BI Update Carousel

Power BI Monthly Update - April 2026

Check out the April 2026 Power BI update to learn about new features.

New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

FabCon and SQLCon Highlights Carousel

FabCon &SQLCon Highlights

Experience the highlights from FabCon & SQLCon, available live and on-demand starting April 14th.