tonijj
Helper IV

Identify Duplicates - multiple parameters

Hi,

 

Been searching the forum but haven’t really found a solution to my problem. Some threads are close, but maybe not all the way.

 

What I want to achieve:

Find and list duplicates based on 3 different columns

 

The columns that should be analyzed for duplicates:

  • Supplier 
  • Category
  • Purchasing Unit

 

 

If any 2 or all 3 of these parameters are the same, the rows should be listed as duplicates.

 

I have created a unique ID per row in the Query.

 

This is for a company with a lot of different divisions, and the Purchasing Units can all buy from the same supplier. At times, however, each Division creates its own supplier record for the same Supplier, which creates a duplicate.

 

And/or: Division A categorizes the Supplier as “Phone retailer” and Division B categorizes the same Supplier as “Computer manufacturer”. Same thing there: two records, same Supplier.

 

[Image: Screenshot 2022-06-08 at 14.05.56.png]

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @tonijj ,

  • I am very sorry, the formula I wrote earlier was wrong. Please use this corrected version.
Measure =
-- Rows that share this row's supplier number and Supplier
VAR _countsupplier =
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER (
            ALL ( 'Table' ),
            'Table'[supplier number] = SELECTEDVALUE ( 'Table'[supplier number] )
                && 'Table'[Supplier] = SELECTEDVALUE ( 'Table'[Supplier] )
        )
    )
-- Rows that share this row's supplier number and Category
VAR _countcategory =
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER (
            ALL ( 'Table' ),
            'Table'[supplier number] = SELECTEDVALUE ( 'Table'[supplier number] )
                && 'Table'[Category] = SELECTEDVALUE ( 'Table'[Category] )
        )
    )
-- Rows that share this row's supplier number and purchasing unit
-- (the column is spelled 'Purchasing Ubit' here; use the column name from your own model)
VAR _purchasingubit =
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER (
            ALL ( 'Table' ),
            'Table'[supplier number] = SELECTEDVALUE ( 'Table'[supplier number] )
                && 'Table'[Purchasing Ubit] = SELECTEDVALUE ( 'Table'[Purchasing Ubit] )
        )
    )
RETURN
    -- Flag the row when at least two of the three counts show a repeat
    IF (
        ( _countsupplier >= 2
            && _countcategory >= 2 )
            || ( _countsupplier >= 2
            && _purchasingubit >= 2 )
            || ( _countcategory >= 2
            && _purchasingubit >= 2 ),
        "Duplicate",
        "No"
    )

Because you want rows flagged when 2 or 3 parameters are the same, it is enough to check each pair of parameters: if all 3 match, every pair matches as well.

  • Yes, you can extend the formula in the same way to include more columns. Keep in mind that too many parameters can slow the measure down.
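
To show how the pairwise checks might scale to more columns, here is a minimal sketch that counts matching columns instead of listing every pair. It is not the measure above: it compares rows directly rather than through 'Table'[supplier number]. The names 'Table'[ID] (standing in for the unique row ID mentioned in the question), 'Table'[Purchasing Unit], and the measure and variable names are assumptions, so adjust them to your model.

```dax
Duplicate Flag =
-- Values of the row currently shown in the visual
VAR _id = SELECTEDVALUE ( 'Table'[ID] )               -- assumed unique row ID column
VAR _supplier = SELECTEDVALUE ( 'Table'[Supplier] )
VAR _category = SELECTEDVALUE ( 'Table'[Category] )
VAR _unit = SELECTEDVALUE ( 'Table'[Purchasing Unit] ) -- assumed column name
-- How many of the compared columns must agree before two rows count as duplicates
VAR _minmatches = 2
-- Count the OTHER rows that agree with this row on at least _minmatches columns
VAR _matchingrows =
    COUNTROWS (
        FILTER (
            ALL ( 'Table' ),
            'Table'[ID] <> _id
                && (
                    (
                        IF ( 'Table'[Supplier] = _supplier, 1, 0 )
                            + IF ( 'Table'[Category] = _category, 1, 0 )
                            + IF ( 'Table'[Purchasing Unit] = _unit, 1, 0 )
                    ) >= _minmatches
                )
        )
    )
RETURN
    IF ( _matchingrows >= 1, "Duplicate", "No" )
```

To include another column in this sketch, add one more VAR for its selected value and one more IF term to the sum. The measure is meant for a table visual that shows one ID per row; where several rows are in context (for example a total row), SELECTEDVALUE returns blank and the result is not meaningful. Because every visible row scans the whole table, this can also become slow on large tables.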

 

Best Regards

Community Support Team _ Polly

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

3 REPLIES
Anonymous
Not applicable

Hi @tonijj ,

Please try creating the following measure.

Measure =
VAR _countsupplier =
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER (
            ALL ( 'Table' ),
            'Table'[supplier number] = SELECTEDVALUE ( 'Table'[supplier number] )
                && 'Table'[Supplier] = SELECTEDVALUE ( 'Table'[Supplier] )
        )
    )
VAR _countcategory =
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER (
            ALL ( 'Table' ),
            'Table'[supplier number] = SELECTEDVALUE ( 'Table'[supplier number] )
                && 'Table'[Category] = SELECTEDVALUE ( 'Table'[Category] )
        )
    )
VAR _purchasingubit =
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER (
            ALL ( 'Table' ),
            'Table'[supplier number] = SELECTEDVALUE ( 'Table'[supplier number] )
                && 'Table'[Purchasing Ubit] = SELECTEDVALUE ( 'Table'[Purchasing Ubit] )
        )
    )
RETURN
    IF (
        ( _countsupplier >= 2
            && _countcategory >= 2 )
            || ( _countsupplier >= 2
            && _purchasingubit >= 2 )
            || ( _countsupplier >= 2
            && _purchasingubit >= 2 ),
        "Duplicate",
        "No"
    )

[Image: vpollymsft_1-1655102001690.png]

If I have misunderstood your meaning, please describe your desired output in more detail and share your sample pbix file with any private information removed.

 

Best Regards

Community Support Team _ Polly

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

 

Hi @Anonymous 

First of all, a big thanks for this! 

I have just a few quick follow-up questions:

1. Looking at the bottom part of the formula, isn't the last condition redundant? It repeats the one before it:

( _countsupplier >= 2
            && _purchasingubit >= 2 )
            || ( _countsupplier >= 2
            && _purchasingubit >= 2 ),

2. Can I use more parameters to identify duplicates? That is, can I include more columns simply by following the logic in the code you provided?

 

Sincerely

