Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Identify duplicates and show up records

I've got contact table with columns: names, emails. 

 

I want to identify duplicates in emails and show up their names. 

for example, show me the table like this.   any ideas, thanks. CRM.PNG

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

You could create a calculated column with the following formula:

Count Emails = 
Var Emails = [Emails]
RETURN

CALCULATE(
    COUNTROWS('Your Table'),
    all('Your Table'),
    'Your Table'[Emails] = Emails
)

From here, you could simply look for any rows where the Count Emails is greater than 1.

View solution in original post

30 REPLIES 30
Anonymous
Not applicable

Good question! The documentation doesn't seem to specify, however I always assume that it do the worst case scenario of whatever i'm attempting.  So i write statements as best as possible, then test if I get the desired result.



@Anonymous

 

Hi Ross. This code works for me as well but I am a bit confused:

 

Are you able to just use:

 

Count Emails = 

CALCULATE(
    COUNTROWS('Your Table'),
    all('Your Table'),
    'Your Table'[Emails]
)

 (ie. exclude the VAR function)? Correct me if I am wrong, but doesn't that just define the variable 'Invoice'? Shouldn't the code work the same without it (albeit less efficiently). I tried excluding that part of the coding and it didn't work (comes up with an error saying "Cannot convert value ............. of type Text to type True/False".

 

Thanks so much in advance!

 

edday.

 

 

Anonymous
Not applicable

Better late than never, but the 'cannot convert value...' is because the argument for calculate() must evaluate to true or false. It could work on a boolean column, but not on a text column.

Anonymous
Not applicable

I can't think if i've ever attempted to write this query in the method you have described, please try and let me know.  The reasons for my structure (based on my attempts and other similar languages i've used):

Firstly, Calculate throws a hissy fit over certain peices of information being used in the constraits.  The variable gets around this problem. I often do this as a goto as i know it will always work.

Secondly, when you need to do something like 'Your Table'[Emails] = 'Your Table'[Emails].  Where what you are trying to do is say 'Only return lines where the Email address field matches the current email', this doesn't work in DAX as i've written it.  In your example, if DAX takes your line 'Your Table'[Emails] to mean that, i'd love to know that it does it that way.  I'm on vacation so i can't really just load it up and test it right now.  I still prefer my method for  the code readability as i'm explicit in exactly what i'm doing.  This is often important when i come back to old projects later down the line, however if your method actually works it would be a neat DAX trick to know about.

Anonymous
Not applicable

Hi Guys,

I couldn't get this to work for me. When I created my expressions based on my data

Count [#"HAWB/HBL"] =
Var HAWB/HBL = [#"HAWB/HBL"]
RETURN

CALCULATE(COUNTROWS('Your Table'),
    all('Your Table'),
    'Your Table'[#"HAWB/HBL"] = 'HAWB/HBL')

The syntax error seems to pick on "var "hawb/hbl". I have tried without parethatsise as well and get the same issue.

 

Please can you help

@Anonymous

 

Thanks for the clarification Ross. I have been unsuccessful in using your code without the Var function. Specifically Power BI comes up with the error "Cannot convert value …………. of type Text to type True/False". I tried directly converting the column of data into the True/False data type through the query editor, but was still unsuccessful. 

 

At this point I am not sure if your code without the Var would work or not, but your explanation makes sense. I'd say Power BI is throwing a hissy fit indeed.

 

Thanks again. 

Why it doesn't work for me? 

Count LAK_NUM =
VAR LAK_NUM = [LAK_NUM] RETURN
CALCULATE(
COUNTROWS('Customers'),
all('Customers'),
'Customers'[LAK_NUM]= LAK_NUM
)

 

I get an error: The value for 'LAK_NUM' cannot be determined. Either 'LAK_NUM' doesn't exist, or there is no current row for a column named 'LAK_NUM'.

I was not able to insert an image to the post.

Anonymous
Not applicable

Do you have any nulls (or blanks) on any rows for the column LAK_NUM?

If you do, you might want to add an IF statement into your Variable to check for blanks and use a 0 value in its place.  Something like

 

LAK_NUM = IF (ISBLANK([LAK_NUM]), 0, [LAK_NUM]

There are no blanks in the column. 

Why it says there in no current raw?

Anonymous
Not applicable

Without seeing your dataset that would be difficult to say.  Generally i'd check things like:

 

  • Did i spell the column name correctly?
  • Is the column in this specific table, and not another table
  • Is there a single blank row in my table
  • Is there a null or blank value in any of my rows for that column

 

From the data area of Power BI, you can copy the table and paste into Excel.  I find that helps with the trouble shooting.  Follow the data and it should start to come clear what is out of the ordinary.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.