Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
MichelleRA_24
Regular Visitor

Remove duplicates from several varibles

Hi team!

I hope you are doing great! I thanks for your support.

 

I need your help to figure out how to create a logic that results in the "Result" column, considering the number_of_people,

sequence_number,

payment_date,

modified_date,

creation_date,

Filter Address=1, 

status=ERROR_TIMBRE.

If the sequence number is the same for an employee, keep a line that meets the Address Filter = 1 and status = ERROR_TIMBRE, and use the most recent date from any of the 3 date columns. If the employee has different sequence numbers and some are repeated, do the same, making sure to keep the line with the most recent date compared to the others

number_of_peoplesequence_numbermodified_datepayment_datecreation_dateFilter AddressstatusResult
32000127112/28/20236/3/20246/3/20240ERROR_TIMBRERemove
3200012711/4/20246/3/20246/3/20241ERROR_TIMBRERemove
3200012715/31/20246/3/20246/3/20241ERROR_TIMBREKeep
320001851112/26/20235/15/20246/7/20240ERROR_TIMBRERemove
320001851112/26/20235/15/20246/12/20240ERROR_TIMBRERemove
320001851112/26/20235/15/20246/13/20240EmitidoRemove
32000185116/12/20245/15/20246/7/20241ERROR_TIMBRERemove
32000185116/12/20245/15/20246/12/20241ERROR_TIMBREKeep
32000185116/12/20245/15/20246/13/20241EmitidoRemove
320001851212/26/20235/30/20246/7/20240ERROR_TIMBRERemove
320001851212/26/20235/30/20246/12/20240ERROR_TIMBRERemove
320001851212/26/20235/30/20246/13/20240EmitidoRemove
32000185126/12/20245/30/20246/7/20241ERROR_TIMBRERemove
32000185126/12/20245/30/20246/12/20241ERROR_TIMBREKeep
32000185126/12/20245/30/20246/13/20241EmitidoRemove
320001851312/26/20236/14/20246/13/20240EmitidoRemove
32000185136/12/20246/14/20246/13/20241CanceladoRemove
320001851412/26/20236/28/20246/27/20240EmitidoRemove
32000185146/12/20246/28/20246/27/20241EmitidoRemove
        
        
Table name: Secuencias       
        
        
Expected results       
number_of_peoplesequence_numbermodified_datepayment_datecreation_dateFilter AddressstatusResult
3200012715/31/20246/3/20246/3/20241ERROR_TIMBREKeep
32000185116/12/20245/15/20246/12/20241ERROR_TIMBREKeep
32000185126/12/20245/30/20246/12/20241ERROR_TIMBREKeep

 

MichelleRA_24_0-1724952834268.png

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Thanks for the reply from lbendlin , please allow me to provide another insight: 
Hi  @MichelleRA_24 ,

 

Here are the steps you can follow:

1. Create calculated column.

Result_test =
var _modifieddate=
MAXX(
    FILTER(ALL('Test'),
    'Test'[number_of_people]=EARLIER('Test'[number_of_people])&&'Test'[Filter Address]=1&&'Test'[status]="ERROR_TIMBRE"&&
    'Test'[modified_date]<=TODAY()),[modified_date])
var _count=
COUNTX(
    FILTER(ALL('Test'),[number_of_people]=EARLIER('Test'[number_of_people])&&'Test'[sequence_number]=EARLIER('Test'[sequence_number])&&[modified_date]=_modifieddate),[number_of_people])

var _creationdate=

MAXX(
    FILTER(ALL('Test'),
    'Test'[number_of_people]=EARLIER('Test'[number_of_people])&&'Test'[Filter Address]=1&&'Test'[status]="ERROR_TIMBRE"&&'Test'[sequence_number]=EARLIER('Test'[sequence_number])&&
    'Test'[creation_date]<=TODAY()),[creation_date])
return
IF(
    _count=1&&'Test'[modified_date]=_modifieddate,"Keep",
IF( _count>1&&'Test'[modified_date]=_modifieddate&&'Test'[creation_date]=_creationdate,"Keep",BLANK()))

2. Create calculated table.

Table =
FILTER(
    'Test',[Result_test]="Keep")

3. Result:

2024.9.2.png

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

5 REPLIES 5
Anonymous
Not applicable

Thanks for the reply from lbendlin , please allow me to provide another insight: 
Hi  @MichelleRA_24 ,

 

Here are the steps you can follow:

1. Create calculated column.

Result_test =
var _modifieddate=
MAXX(
    FILTER(ALL('Test'),
    'Test'[number_of_people]=EARLIER('Test'[number_of_people])&&'Test'[Filter Address]=1&&'Test'[status]="ERROR_TIMBRE"&&
    'Test'[modified_date]<=TODAY()),[modified_date])
var _count=
COUNTX(
    FILTER(ALL('Test'),[number_of_people]=EARLIER('Test'[number_of_people])&&'Test'[sequence_number]=EARLIER('Test'[sequence_number])&&[modified_date]=_modifieddate),[number_of_people])

var _creationdate=

MAXX(
    FILTER(ALL('Test'),
    'Test'[number_of_people]=EARLIER('Test'[number_of_people])&&'Test'[Filter Address]=1&&'Test'[status]="ERROR_TIMBRE"&&'Test'[sequence_number]=EARLIER('Test'[sequence_number])&&
    'Test'[creation_date]<=TODAY()),[creation_date])
return
IF(
    _count=1&&'Test'[modified_date]=_modifieddate,"Keep",
IF( _count>1&&'Test'[modified_date]=_modifieddate&&'Test'[creation_date]=_creationdate,"Keep",BLANK()))

2. Create calculated table.

Table =
FILTER(
    'Test',[Result_test]="Keep")

3. Result:

2024.9.2.png

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

lbendlin
Super User
Super User

You posted this in the DAX section. Is it ok to do this via Power Query instead?

 

making sure to keep the line with the most recent date compared to the others

 

That is a very ambiguous requirement. Your sample data only has two rows per group.  Please provide better sample data.

 

lbendlin_0-1724979268618.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZTLCoMwEEX/xbWQZJKo6xYXpZRC6E6kSJuFUGsXpd/fBBEf0TxsF4GJymHuPWBRRBQwxgTSKI6IPoAgQ4CBqkuCqB7ZdMTq5EKcxfVyOO1Erq5CNu1HRmVs8hBbQIxGEkLjiJJg3FHK1wiWcf1ZnzXps3JE+MBLQ8J6EfWrfyOnSpr6Xd9bO228x0piHyE+wOF5iBMLb+rYlRbM7ij+SbCDuEWwCxkmGIz+FhP7C7YDwwU7eGGC6bw7RWCbu6Oz1dZgerV99bzJR2XDMXO57rfa8SANWm7e9xpsubfyCw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [number_of_people = _t, sequence_number = _t, modified_date = _t, payment_date = _t, creation_date = _t, #"Filter Address" = _t, status = _t, Result = _t]),
    #"Filtered Rows" = Table.SelectRows(Source, each ([status] = "ERROR_TIMBRE") and ([Filter Address] = "1")),
    #"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows",{{"modified_date", type date}, {"payment_date", type date}, {"creation_date", type date}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Max Date", (k)=> let f = Table.SelectRows(#"Changed Type", each [number_of_people]=k[number_of_people] and [sequence_number]=k[sequence_number])
in
List.Max(List.Combine({f[modified_date],f[payment_date],f[creation_date]})))
in
    #"Added Custom"

 

Hi Ibendlin, thanks for helping me, I don´t have problem using Power query

This is all I have; there are more than 100 employees, but they have the same data.

In this case, the only difference is that one column may be missing dates, which can happen.

So, the logic is the same: each employee must have a unique sequence and be the most recent among the three dates. The rows should also maintain Filter Address=1 and Status=ERROR_TIMBRE. For the rows that meet these criteria, indicate 'Keep', and for those that do not, indicate 'Remove'.

 

 

 

 

number_of_peoplesequence_numbermodified_datepayment_datecreation_dateFilter AddressstatusResult320001201512/26/20236/14/20246/13/20240ERROR_TIMBRE 32000120151/4/20246/14/20246/13/20241ERROR_TIMBREKeep320001201612/26/20236/28/20246/27/20241ERROR_TIMBRE 32000120161/4/20246/28/20246/27/20241ERROR_TIMBREKeep32001297232/22/2024 6/7/20240ERROR_TIMBRE 32001297232/22/2024 6/12/20241ERROR_TIMBRE 32001297232/25/2024 6/7/20241ERROR_TIMBRE 32001297232/25/2024 6/12/20241ERROR_TIMBREKeep32001297292/22/20246/14/20246/14/20240Emitido 32001297292/25/20246/14/20246/14/20241ERROR_TIMBREKeep32001297302/22/20246/21/20246/21/20241ERROR_TIMBRE 32001297302/25/20246/21/20246/21/20241ERROR_TIMBREKeep32001297312/22/20246/28/20246/28/20240Emitido 32001297312/25/20246/28/20246/28/20241ERROR_TIMBREKeep        Expected results       number_of_peoplesequence_numbermodified_datepayment_datecreation_dateFilter AddressstatusResult32000120151/4/20246/14/20246/13/20241ERROR_TIMBREKeep32000120161/4/20246/28/20246/27/20241ERROR_TIMBREKeep32001297232/25/2024 6/12/20241ERROR_TIMBREKeep32001297292/25/20246/14/20246/14/20241ERROR_TIMBREKeep32001297302/25/20246/21/20246/21/20241ERROR_TIMBREKeep32001297312/25/20246/28/20246/28/20241ERROR_TIMBREKeep

MichelleRA_24_4-1725036223502.png

number_of_peoplesequence_numbermodified_datepayment_datecreation_dateFilter AddressstatusResult
320001201512/26/20236/14/20246/13/20240ERROR_TIMBRE 
32000120151/4/20246/14/20246/13/20241ERROR_TIMBREKeep
320001201612/26/20236/28/20246/27/20241ERROR_TIMBRE 
32000120161/4/20246/28/20246/27/20241ERROR_TIMBREKeep
32001297232/22/2024 6/7/20240ERROR_TIMBRE 
32001297232/22/2024 6/12/20241ERROR_TIMBRE 
32001297232/25/2024 6/7/20241ERROR_TIMBRE 
32001297232/25/2024 6/12/20241ERROR_TIMBREKeep
32001297292/22/20246/14/20246/14/20240Emitido 
32001297292/25/20246/14/20246/14/20241ERROR_TIMBREKeep
32001297302/22/20246/21/20246/21/20241ERROR_TIMBRE 
32001297302/25/20246/21/20246/21/20241ERROR_TIMBREKeep
32001297312/22/20246/28/20246/28/20240Emitido 
32001297312/25/20246/28/20246/28/20241ERROR_TIMBREKeep
        
Expected results       
number_of_peoplesequence_numbermodified_datepayment_datecreation_dateFilter AddressstatusResult
32000120151/4/20246/14/20246/13/20241ERROR_TIMBREKeep
32000120161/4/20246/28/20246/27/20241ERROR_TIMBREKeep
32001297232/25/2024 6/12/20241ERROR_TIMBREKeep
32001297292/25/20246/14/20246/14/20241ERROR_TIMBREKeep
32001297302/25/20246/21/20246/21/20241ERROR_TIMBREKeep
32001297312/25/20246/28/20246/28/20241ERROR_TIMBREKeep

and be the most recent among the three dates

That's the ambiguous statement - I think what you really want is "the most recent date in one of the columns if the other column have matching dates".  Or something like that - sequence number 23 seems to be even more complex.

 

lbendlin_0-1725038412049.png

 

Helpful resources

Announcements
July 2025 community update carousel

Fabric Community Update - July 2025

Find out what's new and trending in the Fabric community.

July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.