Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Next up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now

Reply
Anonymous
Not applicable

Remove Duplicates Per Group (keep first) measure help

Hi All

 

Is it possible to create a measure in PBI that remove duplicate values per group, and keeps the first occurrence of the value? Perhaps not a measure but maybe a new table, I'm unsure of how to work around this. For context, the database I'm working with has a funky treatment of some data that has a function back-end but doesn't make sense when trying to visualise user behaviour.

 

Currently, the data looks like:

USERENTRY IDENTRYSTATUSTIME SUBMITTED (HH:mm:ss)
userA1unicornfail10:23:10
userA2forestpass10:30:49
userA1unicornfail10:30:49
userB1unicornfail13:40:22
userB1fairypass13:43:59

 

I want to clean it so it looks like:

USERENTRY IDENTRYSTATUSTIME SUBMITTED (HH:mm:ss)
userA1unicornfail10:23:10
userA2forestpass10:30:49
userB1unicornfail13:40:22
userB1fairypass13:43:59
...............

 

Note the row I want to remove has

  1. duplicated ENTRY from the first instance for userA and
  2. duplicated TIME SUBMITTED from the first instance for userA

Also, the ENTRY ID cannot be used.

Any pointers would be greatly appreciated 🙂 

2 ACCEPTED SOLUTIONS
Pragati11
Super User
Super User

Hi @Anonymous ,

 

Check if this existing thread helps:

https://community.powerbi.com/t5/Power-Query/Remove-duplicates-keeping-the-most-recent-row/m-p/757837

 

Thanks,

Pragati

Best Regards,

Pragati Jain


MVP logo


LinkedIn | Twitter | Blog YouTube 

Did I answer your question? Mark my post as a solution! This will help others on the forum!

Appreciate your Kudos!!

Proud to be a Super User!!

View solution in original post

Anonymous
Not applicable

Here's the M code that does what you want:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKi1OLXJU0lEyBOLSvMzk/KI8ICstMTMHJGhgZWRsZWigFKuDUGkEks8vSi0uATIKEouLIQqNDaxMLFEU4jISVaUTTpXGViZA+40wVALliyqR7AaqM7YyBZoYCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [User = _t, EntryID = _t, Entry = _t, Status = _t, Time = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"User", type text}, {"EntryID", Int64.Type}, {"Entry", type text}, {"Status", type text}, {"Time", type time}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "FirstTimeEntry", 
        each List.Min( 
            Table.SelectRows(
                #"Changed Type",
                (r) => r[User] = [User] and r[Entry] = [Entry]
            )[Time] 
        ) = [Time]),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([FirstTimeEntry] = true)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"FirstTimeEntry"})
in
    #"Removed Columns"

 

Best

D

View solution in original post

5 REPLIES 5
Anonymous
Not applicable

Here's the M code that does what you want:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKi1OLXJU0lEyBOLSvMzk/KI8ICstMTMHJGhgZWRsZWigFKuDUGkEks8vSi0uATIKEouLIQqNDaxMLFEU4jISVaUTTpXGViZA+40wVALliyqR7AaqM7YyBZoYCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [User = _t, EntryID = _t, Entry = _t, Status = _t, Time = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"User", type text}, {"EntryID", Int64.Type}, {"Entry", type text}, {"Status", type text}, {"Time", type time}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "FirstTimeEntry", 
        each List.Min( 
            Table.SelectRows(
                #"Changed Type",
                (r) => r[User] = [User] and r[Entry] = [Entry]
            )[Time] 
        ) = [Time]),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([FirstTimeEntry] = true)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"FirstTimeEntry"})
in
    #"Removed Columns"

 

Best

D

Anonymous
Not applicable

Thanks @Anonymous, had to have a little play around and add in r[entry id] = [entry id] but it's working well now 👍 cheers again

Anonymous
Not applicable

Yeah... That's a perfect job for Power Query. You can do it in DAX as well as a calculated table but it really should be performed in PQ as this is the data-munging tool. I can create some sample M code for you so that you can see how such cleaning is done...

Best
D
Pragati11
Super User
Super User

Hi @Anonymous ,

 

Check if this existing thread helps:

https://community.powerbi.com/t5/Power-Query/Remove-duplicates-keeping-the-most-recent-row/m-p/757837

 

Thanks,

Pragati

Best Regards,

Pragati Jain


MVP logo


LinkedIn | Twitter | Blog YouTube 

Did I answer your question? Mark my post as a solution! This will help others on the forum!

Appreciate your Kudos!!

Proud to be a Super User!!

Anonymous
Not applicable

Thanks @Pragati11 , the buffer got me halfway there! Now it's just removing the wrong duplicate, hopefully the other reply will resolve this 🙂 

Helpful resources

Announcements
New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

March Power BI Update Carousel

Power BI Community Update - March 2026

Check out the March 2026 Power BI update to learn about new features.