Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
afmcjarre
Helper I
Helper I

Find Most Recent Date Relative to another Date Column in Power Query

I have a large set of data where for each Number/Max Date Reported I need to identify the most recent dateAdded. As you can see, the score and dateAdded for each contactID is repeated for each Number/Max Date Reported. This is a tricky problem and I have not been able to figure out how to accomplish this in Power Query. 

 

Here is a subset of data for one contactID:

 

 

contactIdchangeIdpolicyIdNumberMax Date ReportedscoredateAdded
ABC123111-222-3332023ABCABC-2015-1179/14/201568712/29/2016 11:35
ABC123111-222-3332023ABCABC-2015-1179/14/201563012/20/2017 14:05
ABC123111-222-3332023ABCABC-2015-1179/14/20159093/13/2012 13:38
ABC123111-222-3332023ABCABC-2015-1179/14/20156781/4/2016 12:11
ABC123111-222-3332023ABCABC-2015-1179/14/20156871/3/2017 13:07
ABC123222-233-6662023ABCABC-2022-31611/22/202268712/29/2016 11:35
ABC123222-233-6662023ABCABC-2022-31611/22/202263012/20/2017 14:05
ABC123222-233-6662023ABCABC-2022-31611/22/20229093/13/2012 13:38
ABC123222-233-6662023ABCABC-2022-31611/22/20226781/4/2016 12:11
ABC123222-233-6662023ABCABC-2022-31611/22/20226871/3/2017 13:07
ABC123885-555-5552023ABCABC-2019-1035/16/201968712/29/2016 11:35
ABC123885-555-5552023ABCABC-2019-1035/16/201963012/20/2017 14:05
ABC123885-555-5552023ABCABC-2019-1035/16/20199093/13/2012 13:38
ABC123885-555-5552023ABCABC-2019-1035/16/20196781/4/2016 12:11
ABC123885-555-5552023ABCABC-2019-1035/16/20196871/3/2017 13:07

 

And what the end result should look like:

 

contactIdchangeIdpolicyIdNumberMax Date ReportedscoredateAdded
ABC123111-222-3332023ABCABC-2015-1179/14/20156781/4/2016 12:11
ABC123222-233-6662023ABCABC-2022-31611/22/202263012/20/2017 14:05
ABC123885-555-5552023ABCABC-2019-1035/16/201963012/20/2017 14:05

 

Please note that the dateAdded should always be the most recent date in the past relative to the Max Date Reported.

 

Thank you!

2 ACCEPTED SOLUTIONS
MarkLaf
Super User
Super User

I'm assuming that the dateAdded: 1/4/2016 12:11 for Max Date Reported: 9/14/2015 is a mistake given the max dateAdded is 12/20/2017 14:05 like for the two other Max Date Reported groups.

 

let
    Source = Original,
    PreSort = Table.Sort(
        Source,
        {{"contactId", Order.Ascending}, {"Max Date Reported", Order.Ascending}, {"dateAdded", Order.Descending}}
    ),
    LocalGroupFirstRow = Table.Group(
        PreSort,
        {"Number", "Max Date Reported"},
        {{"FirstRow", each Table.FirstN(_, 1), Value.Type(Source)}},
        GroupKind.Local
    ),
    CombineGroups = Table.Combine(LocalGroupFirstRow[FirstRow])
in
    CombineGroups

 

MarkLaf_1-1754581032544.png

 

 

View solution in original post

Thank you for pointing that out my mistake.  The dateAdded for the Max Date Reported  of 9/14/2015 should actually be 3/13/2012, so the result would look like this:

afmcjarre_1-1754583544083.png

I do not want the score/DateAdded for the Max Date Reported of 9/14/2015 to be 12/20/2017. The score must be the most recent one based on the Max Date Reported.

Thank you!

View solution in original post

9 REPLIES 9
MarkLaf
Super User
Super User

I'm assuming that the dateAdded: 1/4/2016 12:11 for Max Date Reported: 9/14/2015 is a mistake given the max dateAdded is 12/20/2017 14:05 like for the two other Max Date Reported groups.

 

let
    Source = Original,
    PreSort = Table.Sort(
        Source,
        {{"contactId", Order.Ascending}, {"Max Date Reported", Order.Ascending}, {"dateAdded", Order.Descending}}
    ),
    LocalGroupFirstRow = Table.Group(
        PreSort,
        {"Number", "Max Date Reported"},
        {{"FirstRow", each Table.FirstN(_, 1), Value.Type(Source)}},
        GroupKind.Local
    ),
    CombineGroups = Table.Combine(LocalGroupFirstRow[FirstRow])
in
    CombineGroups

 

MarkLaf_1-1754581032544.png

 

 

Thank you! This looks like it is working in my report. Could you explain exactly what you are doing in the last 2 steps of your query? I can see it working but cannot understand exactly WHY it is working. 

 

Sure. Including solution + explanation for comprehension's sake.

 

The problem is:

 

Get the row per [Number] and [Max Dat Reported] with closest earlier dateAdded relative to Max Date Reported. Logic for dateAdded vs Max Date Reported:

  • target dateAdded must be equal to or earlier than Max Date Reported
    AND
  • target dateAdded should have smallest difference with Max Date Reported

 

The solution is:

 

 

let
    Source = Original,
    RemoveAddedAfterMax = Table.SelectRows(
        Source, 
        each [dateAdded] <= DateTime.From([Max Date Reported])
    ),
    AddDiff = Table.AddColumn(
        RemoveAddedAfterMax, "Diff", 
        each DateTime.From([Max Date Reported]) - [dateAdded], 
        type duration
    ),
    SortDiff = Table.Sort(AddDiff, {{"Diff", Order.Ascending}}),
    GetSmallestDiffRow = Table.Group(
        SortDiff,
        {"contactId", "Max Date Reported"},
        {{"group", each Table.RemoveColumns(Table.FirstN(_, 1), "Diff"), Value.Type(Source)}}
    ),
    CombineSmallestDiffRows = Table.Combine(GetSmallestDiffRow[group])
in
    CombineSmallestDiffRows

 

 

MarkLaf_5-1754650536593.png

 

And, as requested, here is an explanation on last two steps.

 

GetSmallestDiffRow:

 

The objective of the step is to find the row that meets our criteria by every [Number] and [Max Dat Reported] pair. In scenarios like this, Group By is the route to take.

 

First, consider the following simple "Group By" UI operation on the SortDiff step from the solution.

 

MarkLaf_2-1754647991670.png

 

This produces the following

 

output (with preview of All Rows output per group row) 

 

MarkLaf_3-1754649151619.png

 

and M code (check out here for good explanation on Table.Group).

 

MarkLaf_4-1754649200290.png

 

Now, back to explaining solution code.

 

In both UI code and solution code, you can see we are taking SortDiff table and grouping by columns {Number, Max Date Reported}. The difference is the aggregation code in the third argument. Compare: 

 

UI All Rows Solution Comment
"AllRows" "group" Name of agg column to add
each _ each Table.RemoveColumns( Table.FirstN( _, 1 ), "Diff" ) Rather than simply return all the grouped rows (with otb function "each _", where _ = grouped table rows), we use a custom function to (a) grab the first row  of _ with Table.FirstN which gets us desired row with smallest Diff given we sorted Diff ascending in previous step, and (b) remove the extra Diff column with Table.RemoveColumns
type table [ ... ] Value.Type( Source )

Similar to Table.AddColumn and Table.TransformColumns, this part tells Power Query the output column type. Rather than hard-code the table type like the UI, we use Value.Type to dynamically use the Source's table type, which should match exactly the grouped rows given we are removing the only difference: the Diff column we added.

 

And, actually, as I was reviewing as part of this response, I'm realizing we don't need to specify the output type here at all. The resulting output of the next step (CombineSmallestDiffRows) retains proper table (and column) types without needing to specify output type here. Ie, the solution agg column definition could just follow syntax "{agg column name, agg function}" and ignore output type altogether.

 

In conclusion for this step, we are producing a grouped agg column "group" where each row is a single row table with the target "closest earlier dateAdded relative to Max Date Reported"

 

CombineSmallestDiffRows:

 

So, we have what we want from the previous step. The problem is that they are nested inside a table column. Rather than this column of separated target rows per [Number] and [Max Date Reported] pair, we want a single table that is the combination of all those single-row tables. Our objective for this step is to perform this transformation (column of tables -> single table combination of tables in column).

 

Again, let's first consider a simple UI example.

 

Take

 

Table1

Column1 Column2
1 2
3 4

 

and

 

Table2

Column1 Column2
5 6
7 8

 

Now, let's Append these through the UI.

 

MarkLaf_0-1754654675072.png

 

Here is resulting output

 

MarkLaf_1-1754654729656.png

 

and resulting code.

 

MarkLaf_2-1754654805745.png

 

So, we append tables together with Table.Combine, which takes as its only argument a list of tables.

 

Now, back to the solution step.

 

As discussed just above, at a high level, the previous step added a column of tables. And when you reference a table column like "Table[Column]", the output is a list of the values of the column. In the case of our "group" column that we added, the values are tables (single target row). In other words, "GetSmallestDiffRow[group]" gives us a list tables, which is exactly the input for Table.Combine that we want in order to get our final output.

Thank you for pointing that out my mistake.  The dateAdded for the Max Date Reported  of 9/14/2015 should actually be 3/13/2012, so the result would look like this:

afmcjarre_1-1754583544083.png

I do not want the score/DateAdded for the Max Date Reported of 9/14/2015 to be 12/20/2017. The score must be the most recent one based on the Max Date Reported.

Thank you!

I'm having trouble understanding the logic - I assumed "most recent dateAdded" meant absolute latest, but that clearly is not correct based on your response.

 

By most recent dateAdded "based on the Max Date Reported", do you mean closest earlier dateAdded relative to Max Date Reported? Ie is logic:

  • target dateAdded must be equal to or earlier than Max Date Reported
    AND
  • target dateAdded should have smallest difference with Max Date Reported

Yes, that is the logic. I need the most recent credit score/Date Added for each Max Date Reported. 

I think this meets your reqs:

 

let
    Source = Original,
    RemoveAddedAfterMax = Table.SelectRows(
        Source, 
        each [dateAdded] <= DateTime.From([Max Date Reported])
    ),
    AddDiff = Table.AddColumn(
        RemoveAddedAfterMax, "Diff", 
        each DateTime.From([Max Date Reported]) - [dateAdded], 
        type duration
    ),
    SortDiff = Table.Sort(AddDiff, {{"Diff", Order.Ascending}}),
    GetSmallestDiffRow = Table.Group(
        SortDiff,
        {"contactId", "Max Date Reported"},
        {{"group", each Table.RemoveColumns(Table.FirstN(_, 1), "Diff"), Value.Type(Source)}}
    ),
    CombineSmallestDiffRows = Table.Combine(GetSmallestDiffRow[group])
in
    CombineSmallestDiffRows

 

MarkLaf_0-1754589266848.png

 

Wow, thank you so much for this detailed explanation. This has taken report building to a new level for me, and I appreciate it greatly!

afmcjarre
Helper I
Helper I

Here are screenshots of the sample data for readability. 

Current Data

afmcjarre_0-1754578739574.png

 

End Goal

afmcjarre_1-1754578774353.png

 

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors