Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.

Reply
Anonymous
Not applicable

Remove Duplicates From Specific Columns but keep the latest datetime value

Hello everyone, i have a rating table

i selected 3 columns, customer ID, Date and Movie as Table.Distinct to remove their duplicates

however some customers rated twice or more in the same movie on the same day but different time, i need to tell powerquery to keep the latest time of one single Date

so when i remove duplicated for the columns i stated, the created date isnt the latest date

so in theory i have total of 5 columns

Customer id, created date(date time format), Date(Date format), rating and Movie

what i need is to show unique rating of a customer of a movie for one day, 

example 

id || created date || date || rating || movie

1090 || 1/1/2024 11:00:00 AM || 1/1/2024 || 10 || Titanic
1090 || 1/1/2024 11:42:30 AM || 1/1/2024 || 9 || Titanic
1090 || 1/2/2024 10:00:00 PM || 1/2/2024 || 9.5 || Titanic

what i want is to keep the lates created date of 1 date

result

id || created date || date || rating || movie

1090 || 1/1/2024 11:42:30 AM || 1/1/2024 || 9 || Titanic
1090 || 1/2/2024 10:00:00 PM || 1/2/2024 || 9.5 || Titanic

any help would be appreciated


2 ACCEPTED SOLUTIONS

Your expected result does not match with description: you asked for latest "time" (you probably meant [date] column), but you have in your expected result 2 rows for same movie and same id (but you want only latest...)

 

I've edited your sample data a bit:

 

Before:

dufoq3_0-1713978607234.png

 

After

dufoq3_1-1713978620342.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjSwsFTSUTLUN9Q3MjAyUTA0tDIwACIFR18kYRDTAEiEZJYk5mUmK8XqgHRaGqDpNDGyMsbUaYlLoxFUowHUygBfJGGQRj1TZK2xAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [id = _t, #"created date" = _t, date = _t, rating = _t, movie = _t]),
    ChangedTypeUS = Table.TransformColumnTypes(Source,{{"created date", type datetime}, {"date", type date}}, "en-US"),
    GroupedRows = Table.Group(ChangedTypeUS, {"id", "movie"}, {{"Latest Date", each Table.SelectRows(_, (x)=> x[date] = List.Max([date])), type table}}),
    FilteredLatestDate = Table.Combine(GroupedRows[Latest Date])
in
    FilteredLatestDate

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

Oh, I thought you want to consider just date. Replace that [date] twice in Grouped Rows step in my query and it should work.


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

5 REPLIES 5
dufoq3
Super User
Super User

Hi @Anonymous, you should provide sample data. You can check this query and edit with your needs.

 

Before

dufoq3_2-1713978134544.png

 

After

dufoq3_1-1713978057422.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLUN9Q3MjAyATJNEExjIHZUitXBUGOMpByXGiMkJhA7YVNjhmCawtWAVBsaIGQMDbG5CEMVmn3YVVmgOhyHhQYYzooFAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Customer ID" = _t, #"Created Date" = _t, Date = _t, Rating = _t, Movie = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Created Date", type date}, {"Date", type date}}),
    GroupedRows = Table.Group(ChangedType, {"Customer ID","Movie"}, {{"Latest Date", each Table.SelectRows(_, (x)=> x[Date] = List.Max([Date])), type table}}),
    CombinedLatestDate = Table.Combine(GroupedRows[Latest Date])
in
    CombinedLatestDate

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Anonymous
Not applicable

hello @dufoq3 , thank you for sharing, i added a bit more detail to my original post if you want to check what i need

example 

id || created date || date || rating || movie

1090 || 1/1/2024 11:00:00 AM || 1/1/2024 || 10 || Titanic
1090 || 1/1/2024 11:42:30 AM || 1/1/2024 || 9 || Titanic
1090 || 1/2/2024 10:00:00 PM || 1/2/2024 || 9.5 || Titanic

what i want is to keep the lates created date of 1 date

expected result

id || created date || date || rating || movie

1090 || 1/1/2024 11:42:30 AM || 1/1/2024 || 9 || Titanic
1090 || 1/2/2024 10:00:00 PM || 1/2/2024 || 9.5 || Titanic


Your expected result does not match with description: you asked for latest "time" (you probably meant [date] column), but you have in your expected result 2 rows for same movie and same id (but you want only latest...)

 

I've edited your sample data a bit:

 

Before:

dufoq3_0-1713978607234.png

 

After

dufoq3_1-1713978620342.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjSwsFTSUTLUN9Q3MjAyUTA0tDIwACIFR18kYRDTAEiEZJYk5mUmK8XqgHRaGqDpNDGyMsbUaYlLoxFUowHUygBfJGGQRj1TZK2xAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [id = _t, #"created date" = _t, date = _t, rating = _t, movie = _t]),
    ChangedTypeUS = Table.TransformColumnTypes(Source,{{"created date", type datetime}, {"date", type date}}, "en-US"),
    GroupedRows = Table.Group(ChangedTypeUS, {"id", "movie"}, {{"Latest Date", each Table.SelectRows(_, (x)=> x[date] = List.Max([date])), type table}}),
    FilteredLatestDate = Table.Combine(GroupedRows[Latest Date])
in
    FilteredLatestDate

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Anonymous
Not applicable

almost but why did you change the first id to 1089 i meant by the example, all are 1090 wit hsame movie but different date and create date

example 

id || created date || date || rating || movie

1090 || 1/1/2024 11:00:00 AM || 1/1/2024 || 10 || Titanic
1090 || 1/1/2024 11:42:30 AM || 1/1/2024 || 9 || Titanic
1090 || 1/2/2024 10:00:00 PM || 1/2/2024 || 9.5 || Titanic

what i want is to keep the lates created date of 1 date

expected result

id || created date || date || rating || movie

1090 || 1/1/2024 11:42:30 AM || 1/1/2024 || 9 || Titanic
1090 || 1/2/2024 10:00:00 PM || 1/2/2024 || 9.5 || Titanic

Oh, I thought you want to consider just date. Replace that [date] twice in Grouped Rows step in my query and it should work.


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!

FebPBI_Carousel

Power BI Monthly Update - February 2025

Check out the February 2025 Power BI update to learn about new features.

March2025 Carousel

Fabric Community Update - March 2025

Find out what's new and trending in the Fabric community.