Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
Anonymous
Not applicable

Identify duplicate entries

Good Afternoon,

 

I hope someone can help. It is a bit confusing but I will try to explane to the best my ability.

 

I have unique activities which were raised and completed between certain times for certain post codes. I need to identify which of those activities may be duplicate. However, it has to match the below criteria:

1) Activities has to have the same post code and street name;

2) Have to be raised before the completion date and after the previous completion date;

3) first activity is not duplicated and only the activities which were raised after first activity for the same date period(it may be raised at different dates but completed the same date)

 

As an example i am attaching the table. The column (duplicated?) shows the correct answer. I need to create formula(s) which would allow me to see those answers. I know this is a bit confusing but i hope someone can assist. 

p.s. in other terms it would look like: if (activity raised date)=< (activity completed date) and if (activity raised date)> then previous (activity completed date) , (activity completed date) > (activity raised date), and activity Post code is equal to each activity post code and street.  

 

ActivityRaised DateCompleted DateStreetPost codeDuplicated?
101/05/201705/05/2017Albert roadww57N
202/05/201705/05/2017Albert roadww57Y
325/08/201725/08/2018Albert roadww57N
401/01/201806/01/2018Albert roadww57N
512/03/201814/03/2018Albert roadww57N
612/03/201814/03/2018Albert roadww57Y
714/03/201814/03/2018Albert roadww57Y
817/03/201820/03/2018Sun roadZE47N
918/03/201820/03/2018Sun roadZE47Y
1010/10/201811/10/2018Sun roadZE47N
3 ACCEPTED SOLUTIONS

Hi @Anonymous,

 

Try this formula, please.

Column =
VAR dupCount =
    CALCULATE (
        COUNTROWS ( 'Table1' ),
        FILTER (
            'Table1',
            Table1[Post code] = EARLIER ( Table1[Post code] )
                && Table1[Street] = EARLIER ( Table1[Street] )
                && Table1[Completed Date] = EARLIER ( Table1[Completed Date] )
                && Table1[Activity] <= EARLIER ( Table1[Activity] )
        )
    )
VAR lastRaisedDate =
    CALCULATE (
        MIN ( Table1[Raised Date] ),
        FILTER (
            'Table1',
            Table1[Post code] = EARLIER ( Table1[Post code] )
                && Table1[Street] = EARLIER ( Table1[Street] )
                && 'Table1'[Activity]
                    = EARLIER ( Table1[Activity] ) - 1
        )
    )
RETURN
    IF (
        [Raised Date] <= [Completed Date]
            && [Raised Date] >= lastRaisedDate
            && dupCount > 1,
        "Y",
        "N"
    )

Identify-duplicate-entries

 

Best Regards,
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

Anonymous
Not applicable

Thank you so much for this. However, when i enter the formula i get this error message. I cannot work it out why tho..

 

Thank you

 

error.PNG

 

 

View solution in original post

Hi @Anonymous,

 

What's the Data type of [Activity]? It should be a numeric type to work with the formula. Can you change it or share a more accurate sample, please?

 

 

Best Regards,
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

11 REPLIES 11
v-jiascu-msft
Microsoft Employee
Microsoft Employee

Hi @Anonymous,

 

I'm afraid some parts aren't consistent here. 

For example, the Raised Date of Activity 2 isn't after the previous completion date. So the [Duplicated?] of Activity 2 should be N, right?

What if the Completed Date of Activity 7 is "15/03/2018"? What should the [Duplicated?] be?

 

 

Best Regards,
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Good Morning,

 

activity 2 is duplicated becasue it is raised for the same post code and street. However, both activities were closed at the same date.

If activity 7 was closed on 15th then it wouldn't be a duplicate becasue activity 7 was raised on 14th also, activity 5 and 6 were closed on 14.

 

Basically, I am trying to identify how many duplicated activities raised there are since last completed activity, for the same post code and street. However, first activity after previous activity was completed wouldn't count as duplicate as it is classed as a new entry, but any other activity after would be duplicate.

 

Does it make sense?

 

If it would help i can explain how that works in excel? But not sure if there is a better way with DAX.

 

Thank you

Kind REgards,

 

Hi @Anonymous,

 

Try this formula, please.

Column =
VAR dupCount =
    CALCULATE (
        COUNTROWS ( 'Table1' ),
        FILTER (
            'Table1',
            Table1[Post code] = EARLIER ( Table1[Post code] )
                && Table1[Street] = EARLIER ( Table1[Street] )
                && Table1[Completed Date] = EARLIER ( Table1[Completed Date] )
                && Table1[Activity] <= EARLIER ( Table1[Activity] )
        )
    )
VAR lastRaisedDate =
    CALCULATE (
        MIN ( Table1[Raised Date] ),
        FILTER (
            'Table1',
            Table1[Post code] = EARLIER ( Table1[Post code] )
                && Table1[Street] = EARLIER ( Table1[Street] )
                && 'Table1'[Activity]
                    = EARLIER ( Table1[Activity] ) - 1
        )
    )
RETURN
    IF (
        [Raised Date] <= [Completed Date]
            && [Raised Date] >= lastRaisedDate
            && dupCount > 1,
        "Y",
        "N"
    )

Identify-duplicate-entries

 

Best Regards,
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Thank you so much for this. However, when i enter the formula i get this error message. I cannot work it out why tho..

 

Thank you

 

error.PNG

 

 

Hi @Anonymous,

 

What's the Data type of [Activity]? It should be a numeric type to work with the formula. Can you change it or share a more accurate sample, please?

 

 

Best Regards,
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hi,

 

I am sorry, but it is me again. Your formula was great, but it brought up another few issue,s after i started using big data set.

 

As i mentioned i included index but it didn't seem to resolve the issue for THE bigger data set.

 

So i will use another example to explain:

 

 I need to identify how many duplicated activities there are raised for the same post code and street within the last 24 hours since completion date.  However, first activity after previous activity was completed wouldn't count as duplicate as it is classed as a new entry, but any other activity after would be duplicate.

 

The formula which i use don't seem to like that completion date is the same hours and seconds. But also, includes some of activities, whihc are completed at the same time but the first activity was raised later then second one.

 

Worth to mention that i sorted the data in advance editor :  #"Sorting" = Table.Sort(#"Changed Type",{{"Postcode", Order.Ascending}, {"CompletedDate", Order.Ascending},{"Street", Order.Ascending}})

 

Also,I included Index based on post code and used the below formula in Report tab and not Data  . Example data attached.

ActivityRaisedDateCompletedDateStreetPostcodeOpsRegion_Join
0123 456ABCD01/01/1980 15:1902/01/1980 12:00Little streetAB1 2CDEastern South
0123 456BCDE01/01/1980 15:2602/01/1980 12:00Little streetAB1 2CDEastern South
0123 456CABFE01/01/1980 15:5202/01/1980 12:00Little streetAB1 2CDEastern South

 

Duplicated(24hrs) =

var dupCount = CALCULATE(COUNTROWS(Table1),FILTER(Table1,Table1[Postcode] = EARLIER(Table1[Postcode])&& Table1[Street]=EARLIER(Table1[Street])&& Table1[CompletedDate]>=EARLIEr(Table1[CompletedDate])-1&&Table1[CompletedDate]=Table1[CompletedDate] && Table1[CompletedDate]<=EARLIER(Table1[CompletedDate])&& Table1[Index PostCode]=Table1[Index PostCode] &&Table1[Index PostCode]<=EARLIER(Table1[Index PostCode])))

var lastraiseddate = CALCULATE(MIN(Table1[RaisedDate]),FILTER(Table1,Table1[Postcode]=EARLIER(Table1[Postcode])&& Table1[Street]=EARLIER(Table1[Street])&& Table1[Index PostCode]=EARLIER(Table1[Index PostCode])-1))

RETURN IF(Table1[RaisedDate]<=Table1[CompletedDate]&& Table1[RaisedDate]>=lastraiseddate&& dupCount>1,"Y","N")

 

Thank you so much for your help in advance.

Hi @Anonymous ,

 

I would suggest you create a new thread in this forum. 

 

Best Regards,

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Thank you. I have created a new discussion.

 

Kind Regards

Anonymous
Not applicable

I really appreciate your help! Although the activity was a combination of numbers and letters i incuded Index under the query. So it all works now.

 

If not too much to ask i have another question/task based on this excersise. In the example i used only a date but in the practical excersise i also need to calculate completion time since previous completed acitvity for the same post code and address. I am attaching an example and how it would work in excel.

 

 

Anonymous
Not applicable

Example.PNG

Hi @Anonymous,

 

My pleasure. Please try this formula.

Column =
VAR lastCompletedDate =
    CALCULATE (
        MIN ( Table2[Completed Date] ),
        FILTER (
            ALLEXCEPT ( Table2, Table2[Street], Table2[Post code] ),
            Table2[Index]
                = EARLIER ( [Index] ) - 1
        )
    )
RETURN
    IF (
        ISBLANK ( lastCompletedDate ),
        99999,
        [Completed Date] - lastCompletedDate
    )

Identify-duplicate-entries2

 

Best Regards,
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.