Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
pawelj795
Post Prodigy
Post Prodigy

Remove duplicates with grouping by, leave most recent value

Hi,
I'm currently working on filtering out old BOM's which my company replaced with new ones.
Below, there is a table with BOM lines.

pawelj795_0-1617102685879.png


The same BOM always has the same parent_product_tmpl_id, but different BOM_id.

With that being said in the above scenario, I want to keep the most recent rows, which is row 3 and 4.

I tried solutions from the below videos, but unsuccessful.
I think I should mix these 2 solutions in my case, but I really don't know how.

 

https://www.youtube.com/watch?v=9dX9PK_KlSA
https://www.youtube.com/watch?v=S3X_HK7yl1w

 

Sample file:
https://drive.google.com/file/d/1ybE1_ju1nm8NZBMmlQ6_7PeBUc3izb4p/view?usp=sharing


1 ACCEPTED SOLUTION
PhilipTreacy
Super User
Super User

Hi @pawelj795 

 

Download example PBIX file and data

 

I'm not sure I've followed your logic totally as I end up with a LOT of rows for each parent_product_tmpl_id.

But according to what you've described I soudl be grouping on the parent_product_tmpl_id field and not the bom_id field.

boms.png

 

In my file above, check the Merge1 query for this results table.  I had to recreate your data from your tables in the Data Model as I couldn't open your query and access your SQL source.  The data is with the PBIX file above.

 

Regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


View solution in original post

6 REPLIES 6
EBrusse
Frequent Visitor

Hello,


You can solve this problem in both DAX and Power Query in which grouping by is not neccesary.

You want to define the latest record as the current record and define records with an older date as history.

 

The solution in DAX would look like:

 

COLUMN = 
VAR LatestRecordParentID = 
    CALCULATE(
        MAX( table[create_date] ) ,
        ALLEXCEPT( table , [parent_product_tmpl_id] ) ,
    )

RETURN
    IF(
        table[create_date] = LatestRecordParentID ,
        "Current record" ,
        "History record"
    )

You can use this column to filter out previous records and only keep the current.

If you want to solve this problem in Power Query you can apply the same logic of the DAX formula using MAX and FOR EACH.

 

If you need any help, please let me know!

PhilipTreacy
Super User
Super User

Hi @pawelj795 

 

Download example PBIX file and data

 

I'm not sure I've followed your logic totally as I end up with a LOT of rows for each parent_product_tmpl_id.

But according to what you've described I soudl be grouping on the parent_product_tmpl_id field and not the bom_id field.

boms.png

 

In my file above, check the Merge1 query for this results table.  I had to recreate your data from your tables in the Data Model as I couldn't open your query and access your SQL source.  The data is with the PBIX file above.

 

Regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


@PhilipTreacy 
Well, your result seems okay.
But, could you explain to me step by step, how you've done it?

Hi @pawelj795 

1. The query called Table1 loads the data.

2. I duplicated this and the new query is called Table 1 (2). 

3. I grouped the data by the parent_product_tmpl_id column and set the operation to pick the Max value (the latest date) from the create_date column

grp2.png

This results in a table that has 1 row per parent_product_tmpl_id showing the latest date associated with that id.

 

5. Merge the 2 queries (Merge as new to create a 3rd query) using these settings

merge2.png

What this does is pull the rows from Table1 that have the (latest) date shown in the Table1 (2) query for each parent_product_tmpl_id.  All the other rows, the ones with earlier dates, are ignored.

 

6. Expand the Tables in the merged query to show the other columns you want.

 

Regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


@PhilipTreacy

Thanks, but I don't see your file

@pawelj795 Sorry, try again, I had to update the link.

https://d13ot9o61jdzpp.cloudfront.net/files/pbiforum/pawel.zip

Phil

 



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.