gigotomo
Frequent Visitor

How to remove duplicates based on latest dates

I am trying to remove rows with a duplicate "INC_NUM" so that only the row with the latest date (Submit) remains, along with its summary.

How do I do it?


Capture.JPG

2 ACCEPTED SOLUTIONS
thedatahiker
Microsoft Employee

@gigotomo 
Here is one way to do this:

  1. Open Power Query Editor.
  2. Duplicate this table in Power Query.
  3. Delete the Summary column from the new table.
  4. Right-click INC_NUM and select Group By.
  5. In the Group By dialog, set New column name to "Submit", set Operation to "Max", and select Submit under Column. Click OK.
  6. Duplicate the INC_NUM and Submit columns.
  7. Both duplicates go in the current (new) table, giving "INC_NUM - Copy" and "Submit - Copy".
  8. Select both "INC_NUM - Copy" and "Submit - Copy", right-click, and select Merge Columns.
  9. Click OK in the Merge Columns dialog.
  10. Return to the original table and repeat steps 6 - 8.
  11. Now that both tables have a shared key, you can merge the queries. From the original table, click "Merge Queries" on the Home ribbon.
  12. The top query will be your original table; select the new aggregated table as the second table. Then click the merged column in each table as the join column. Under Join Kind, select "Inner (only matching rows)".
  13. That's it. The result keeps only the row with the latest date for each INC_NUM.
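
For anyone who wants to see the logic behind these steps outside the UI, here is a minimal sketch in plain Python. It is not Power Query code, only an illustration of "group by, take the max date, then inner join on the combined key". The sample rows and the `keep_latest` helper are made up for the example; only the column names come from the question.

```python
from datetime import date

# Hypothetical sample rows; column names follow the thread, values are invented.
rows = [
    {"INC_NUM": "INC001", "Submit": date(2020, 1, 5), "Summary": "First report"},
    {"INC_NUM": "INC001", "Submit": date(2020, 2, 9), "Summary": "Follow-up"},
    {"INC_NUM": "INC002", "Submit": date(2020, 1, 7), "Summary": "Printer offline"},
]

def keep_latest(rows):
    # Steps 3-5: group by INC_NUM and take the max Submit per group.
    latest = {}
    for row in rows:
        key = row["INC_NUM"]
        if key not in latest or row["Submit"] > latest[key]:
            latest[key] = row["Submit"]
    # Steps 6-9: the merged "INC_NUM + Submit" key of the aggregated table.
    keys = set(latest.items())
    # Steps 10-12: inner join of the original rows on that combined key.
    return [row for row in rows if (row["INC_NUM"], row["Submit"]) in keys]

result = keep_latest(rows)
for row in result:
    print(row["INC_NUM"], row["Submit"], row["Summary"])
```

One thing the sketch makes visible: if two rows share the same INC_NUM and the same latest Submit value, the inner join keeps both of them, so an extra Remove Duplicates step may still be needed in that case.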


Tahreem24
Super User

@gigotomo ,

Please refer to the thread below for a solution.

https://community.powerbi.com/t5/Desktop/Drop-duplicate-rows-retaining-latest-date/m-p/878537

 


8 REPLIES
IDoLogistics
Frequent Visitor

Hi All,

 

I know this has already been answered, but it looks like Pragmatic Works has a simpler solution to the problem involving "Table.Buffer".

https://www.google.com/search?q=power+query+how+to+sort+by+newest+date+then+remove+older+duplicates&...
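
I can't reproduce the linked material here, so this is only a sketch of the idea the search terms describe: sort by newest date first, then remove duplicates so the first (newest) row per INC_NUM survives. As far as I understand it, the role of Table.Buffer in Power Query is to hold the sorted table in memory so the sort order is still in effect when duplicates are removed. The plain Python below illustrates the logic only; the sample rows and the `dedupe_keep_newest` helper are made up.

```python
from datetime import date

# Hypothetical sample rows; column names follow the thread, values are invented.
rows = [
    {"INC_NUM": "INC001", "Submit": date(2020, 1, 5), "Summary": "First report"},
    {"INC_NUM": "INC002", "Submit": date(2020, 1, 7), "Summary": "Printer offline"},
    {"INC_NUM": "INC001", "Submit": date(2020, 2, 9), "Summary": "Follow-up"},
]

def dedupe_keep_newest(rows):
    # Sort newest first; sorted() returns a new, fully materialised list,
    # which is the part Table.Buffer is assumed to play in Power Query.
    ordered = sorted(rows, key=lambda r: r["Submit"], reverse=True)
    seen = set()
    kept = []
    for row in ordered:
        # Keep only the first row seen per INC_NUM, i.e. the newest one.
        if row["INC_NUM"] not in seen:
            seen.add(row["INC_NUM"])
            kept.append(row)
    return kept

kept = dedupe_keep_newest(rows)
for row in kept:
    print(row["INC_NUM"], row["Submit"], row["Summary"])
```

Compared with the group-and-merge steps, this needs no helper key column, but it depends entirely on the sort order being respected at the moment duplicates are dropped.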

This worked for me also!

This was the most helpful for me.


Brilliant!!!

Super !!!

Anonymous
Not applicable

Thanks for the post

 

It took me a while to get my head around it, but once I got it, it is actually really simple and effective.

 

I found it helpful to convert the dates to numbers. My other challenge was that I had two date columns: a start date and a termination date that was sometimes blank. I had to create a new custom column that gives a blank termination date today's date, so that I could combine the start and termination dates and get the biggest value to group by.
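
In case it helps someone with the same two-date situation, here is a rough sketch of that logic in plain Python (not Power Query). All names and sample values are made up for illustration: `None` stands for a blank termination date, and `today` is fixed so the example is repeatable.

```python
from datetime import date

# Hypothetical records; None stands for a blank termination date.
records = [
    {"id": "A", "start": date(2020, 1, 1), "termination": date(2020, 6, 30)},
    {"id": "A", "start": date(2021, 3, 1), "termination": None},
    {"id": "B", "start": date(2019, 5, 1), "termination": date(2019, 9, 1)},
]

def effective_date(record, today):
    # Custom-column step: a blank termination date is replaced by today's date.
    end = record["termination"] if record["termination"] is not None else today
    # The bigger of the two dates is the value to group by.
    return max(record["start"], end)

def latest_per_id(records, today):
    # Keep, for each id, the record with the largest effective date.
    best = {}
    for record in records:
        current = best.get(record["id"])
        if current is None or effective_date(record, today) > effective_date(current, today):
            best[record["id"]] = record
    return best

today = date(2024, 11, 1)  # fixed for the example; date.today() in practice
latest = latest_per_id(records, today)
for key, record in latest.items():
    print(key, record["start"], record["termination"])
```

Filling the blank with today's date means a still-open record always counts as newer than any closed one with a past termination date, which matches what I wanted but may not suit every dataset.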

 

Cheers
