Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
cwollett
Advocate II
Advocate II

Finding duplicate rows using DAX

My scenario is a much larger dataset with longer strings for the two columns, but this should be generalizable so for simplicity, let's say I have a base table with the following data:

Group NameMember Name
Group AAlex
Group ABetty
Group ACharles
Group BAlex
Group BBetty
Group CAlex
Group CBetty
Group CCharles

 

I know how to make a measure that concatenates the members into a single string:

 

Member List = 
CONCATENATEX(
    VALUES(Example[Member Name]),
    Example[Member Name],
    ", ",
    Example[Member Name],
    ASC
)

 

Group NameMember List
Group AAlex, Betty, Charles
Group BAlex, Betty
Group CAlex, Betty, Charles

 

What I would like to be able to do is have a way to do a distinct on that Member List column so that I can find duplicates. I'd also like to do it using a temp table and not one in storage since I actually have something that's thousands of rows and the Member List string is going to be thousands of characters, hence the DAX preference.

So I'm looking for something like:

Group NameisDuplicate
Group A1
Group B0
Group C1

 

Any suggestions? I've tried to use ADDCOLUMNS(SUMMARIZE( ... where I add the concatenatex as an additional column, but when I try to do something like DISTINCTCOUNT on that new column, I get errors.

5 REPLIES 5
tamerj1
Super User
Super User

Hi @cwollett 

I hope this works. However, I followed the same methodology of using CONCATENATEX but didn't have the chance to test it. My guess is that even if it works it won't be so efficient. I think other approaches are possible but I need to be on my PC to test. 
Duplicate =
VAR MemberList =
CONCATENATEX (
VALUES ( Example[Member Name] ),
Example[Member Name],
", ",
Example[Member Name], ASC
)
RETURN
SUMX (
ALL ( Example[Group Name] ),
INT (
CONCATENATEX (
CALCULATETABLE (
VALUES ( Example[Member Name] ),
ALLEXCEPT ( Example, Example[Group Name] )
),
Example[Member Name],
", ",
Example[Member Name], ASC
) = MemberList
)
) - 1

Hello! So this does work perfectly on the smaller example scenario. But, as you mentioned, using the larger real data, performance is bad. I might have to make a related table with the list in a column and see what that does to the model size.

 

Thanks for trying!

@cwollett 

This is probably faster 

Duplicate =
VAR MemberList =
VALUES ( Example[Member Name] )
RETURN
SUMX (
ALL ( Example[Group Name] ),
INT (
ISEMPTY (
EXCEPT (
CALCULATETABLE (
VALUES ( Example[Member Name] ),
ALLEXCEPT ( Example, Example[Group Name] )
),
MemberList
)
)
)
) - 1

Fowmy
Super User
Super User

@cwollett 

Can you explain you did you caculate 1 for Group A and 0 for Group B?


Did I answer your question? Mark my post as a solution! and hit thumbs up


Subscribe and learn Power BI from these videos

Website LinkedIn PBI User Group

I would like to be able to return a way to say the Member List column from the table above that one is a duplicate. So since both A & C are "Alex, Betty, Charles" they are duplicates of one another. Since Group B is unique, it should not be shown as a duplicate.

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.