Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.

Reply
jeannier2022
Regular Visitor

dax query to locate duplicates

I need to write a DAX query to locate duplicate growers I want to create a column in a table called duplicate growers.

I have a column GrowerID, Grower. The issue I am having is that the users enter duplicate grower id for grower names, the names are a little different the user puts comma between the words or a - between words but the same growerid which is causing a duplicate error in my dashboard. I would like to put in a dax query to locate duplicates and based on the values yes for dulicates no for no duplicates use these values in Power Automate to trigger an action to send an email to myself if there are duplicate errors so I can fix those errors before the board members view the dashboard.

the code that I have so far:

Grower_Duplicates = 
VAR _count =
    CALCULATE (
        DISTINCTCOUNT ( 'MAIN/PRIMARY LUT'[GROWER] ),
        FILTER (
            'MAIN/PRIMARY LUT',
              'MAIN/PRIMARY LUT'[GRADE ID] = EARLIER ( 'MAIN/PRIMARY LUT'[GROWER ID] )
                && 'MAIN/PRIMARY LUT'[GROWER] IN { "Monitored", "Unmonitored" }

        )
    )
RETURN
    IF ( _count > 1, "yes", "no" )

 

1 ACCEPTED SOLUTION
v-nuoc-msft
Community Support
Community Support

Hi @jeannier2022

 

@Jihwan_Kim Good share!

 

For your question, here is the method I provided:

 

Here's some dummy data

 

"MAIN/PRIMARY LUT"

vnuocmsft_0-1707102091459.png

 

Create a measure. This formula creates a local variable to store the of the current row and another variable to normalize the name by removing commas and hyphens. It then uses these variables within a function that counts the number of rows with the same and normalized name. If more than one row is found, it flags as "yes" for duplicates, otherwise "no" for unique entries.

 

Grower_Duplicates = 
    VAR CurrentGrowerID = 
        SELECTEDVALUE('MAIN/PRIMARY LUT'[Grade ID])
    VAR _Grower = 
        SUBSTITUTE(
            SUBSTITUTE(
                SELECTEDVALUE('MAIN/PRIMARY LUT'[Grower])
                , 
                ",", 
                ""
            ), 
                "-", 
                ""
        )
RETURN IF (
        CALCULATE (
            COUNTROWS ( 'MAIN/PRIMARY LUT' ),
            FILTER (
                ALL ( 'MAIN/PRIMARY LUT' ),
                'MAIN/PRIMARY LUT'[Grade ID] = CurrentGrowerID &&
                SUBSTITUTE(SUBSTITUTE('MAIN/PRIMARY LUT'[Grower], ",", ""), "-", "") = _Grower
            )
        ) > 1,
        "yes",
        "no"
    )

 

 

Here is the result.

vnuocmsft_1-1707102251679.png

If you're still having problems, provide some dummy data and the desired outcome. It is best presented in the form of a table.

 

Regards,

Nono Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

2 REPLIES 2
v-nuoc-msft
Community Support
Community Support

Hi @jeannier2022

 

@Jihwan_Kim Good share!

 

For your question, here is the method I provided:

 

Here's some dummy data

 

"MAIN/PRIMARY LUT"

vnuocmsft_0-1707102091459.png

 

Create a measure. This formula creates a local variable to store the of the current row and another variable to normalize the name by removing commas and hyphens. It then uses these variables within a function that counts the number of rows with the same and normalized name. If more than one row is found, it flags as "yes" for duplicates, otherwise "no" for unique entries.

 

Grower_Duplicates = 
    VAR CurrentGrowerID = 
        SELECTEDVALUE('MAIN/PRIMARY LUT'[Grade ID])
    VAR _Grower = 
        SUBSTITUTE(
            SUBSTITUTE(
                SELECTEDVALUE('MAIN/PRIMARY LUT'[Grower])
                , 
                ",", 
                ""
            ), 
                "-", 
                ""
        )
RETURN IF (
        CALCULATE (
            COUNTROWS ( 'MAIN/PRIMARY LUT' ),
            FILTER (
                ALL ( 'MAIN/PRIMARY LUT' ),
                'MAIN/PRIMARY LUT'[Grade ID] = CurrentGrowerID &&
                SUBSTITUTE(SUBSTITUTE('MAIN/PRIMARY LUT'[Grower], ",", ""), "-", "") = _Grower
            )
        ) > 1,
        "yes",
        "no"
    )

 

 

Here is the result.

vnuocmsft_1-1707102251679.png

If you're still having problems, provide some dummy data and the desired outcome. It is best presented in the form of a table.

 

Regards,

Nono Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Jihwan_Kim
Super User
Super User

Hi,

Please try below whether it suits your requirement.

 

Grower_Duplicates =
VAR _countcondition =
    COUNTROWS (
        SUMMARIZE (
            FILTER (
                'MAIN/PRIMARY LUT',
                'MAIN/PRIMARY LUT'[GRADE ID] = EARLIER ( 'MAIN/PRIMARY LUT'[GROWER ID] )
                    && 'MAIN/PRIMARY LUT'[GROWER] IN { "Monitored", "Unmonitored" }
            ),
            'MAIN/PRIMARY LUT'[GROWER]
        )
    ) > 1
RETURN
    IF ( _countcondition, "yes", "no" )

If this post helps, then please consider accepting it as the solution to help other members find it faster, and give a big thumbs up.


Visit my LinkedIn page by clicking here.


Schedule a meeting with me to discuss further by clicking here.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

FebPBI_Carousel

Power BI Monthly Update - February 2025

Check out the February 2025 Power BI update to learn about new features.

Feb2025 NL Carousel

Fabric Community Update - February 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors
Top Kudoed Authors