Solved: Grouping overlapping contracts with first/last dat...

Anonymous · ‎06-12-2020

I have an export of contract data where overlapping contracts need to be grouped into continuous dates, with the first/last dates for each group. Individual IDs may have more than one group of contracts, as not all contracts will overlap.

e.g.

ID	Contract_Start	Contract_End
100	01/01/2018	30/04/2018
100	01/04/2018	31/12/2018
100	01/06/2018	31/10/2018
100	01/04/2019	31/10/2019
100	01/06/2019	31/12/2019
101	01/01/2018	30/06/2018
101	01/05/2018	30/11/2018
101	01/01/2019	31/05/2019
101	01/02/2019	30/06/2019
102	01/01/2018	31/08/2018
102	01/04/2019	31/08/2019
102	01/05/2019	30/06/2019

would become:

ID	Continuous_Start	Continuous_End
100	01/01/2018	31/12/2018
100	01/04/2019	31/12/2019
101	01/01/2018	30/11/2018
101	01/01/2019	30/06/2019
102	01/01/2018	31/08/2018
102	01/04/2019	31/08/2019

I have grouped all contracts with the start/end of all contracts for an ID using a merge into new table and then grouping by ID, but I haven't been able to find a way to split these into groups of overlapping/continuous dates, with a new grouping for an ID where there has been a gap between contracts.

Please can you help?

Anonymous · ‎06-13-2020

Here a complete solution (to test, of course, with more significant data).

let

    #"Raggruppate righe" = Table.Group(yourTable, {"ID"}, {{"Conteggio", each Table.RowCount(_), type number}, {"startend", each CSEsorted( _[Contract_Start], _[Contract_End])}}),
    #"Tabella startend espansa" = Table.ExpandListColumn(#"Raggruppate righe", "startend"),
    #"Valori estratti" = Table.TransformColumns(#"Tabella startend espansa", {"startend", each Text.Combine(List.Transform(_, Text.From), ":"), type text})
in
    #"Valori estratti"

this modified function does not rely on the order of identifiers.

let 
CS_CE=(start as list, end as list)=>
let

startend=List.Sort(List.Zip({start,end}),(x,y)=>Value.Compare(x{0},y{0})),
unionInt=List.Accumulate(startend, [CS={List.Min(start)},CE={List.Min(start)}], 
                                             (s,c)=> [CS= if c{0}>List.Last(s[CE]) then s[CS]&{c{0}} else s[CS],
                                                      CE= if c{0}>List.Last(s[CE]) then s[CE]&{c{1}} else 
                                                  if c{1}>List.Last(s[CE]) then List.RemoveLastN(s[CE],1)&{c{1}} else s[CE]])
in

List.Zip(Record.ToList(unionInt))

in CS_CE

in this case the result seems what you expect, but I suggest you test other situations as well

View solution in original post

Anonymous · ‎06-13-2020

A solution that does not use the table.group:

    let
    cp = Table.AddColumn(yourtab, "grpIdx", each Table.PositionOf(tab1[[ID],[Contract_Start],[Contract_End]],[[ID],[Contract_Start],[Contract_End]],Occurrence.All,(x,y)=> x[ID]=y[ID] and 
                                                                   IntervalIntersect({x[Contract_Start],x[Contract_End]},{y[Contract_Start],y[Contract_End]})  )),
    ap = Table.AddColumn(cp, "union", each let un=List.Accumulate(cp[grpIdx],{},(s,c)=> if not List.IsEmpty(List.Intersect({_[grpIdx],c})) then List.Union({s,c}) else s) 
                    in Text.Combine(List.Transform({cp[Contract_Start]{List.Min(un)},cp[Contract_End]{List.Max(un)}}, Text.From), ":")),
    #"Rimossi duplicati" = Table.Distinct(ap)
in
    #"Rimossi duplicati"

where IntervalIntersect is the following function:

= (x ,y)=>
let ovl= x{1}>y{0} and x{0}<y{1} in ovl

Anonymous · ‎06-14-2020

in an attempt to solve this problem, I thought it might be useful (even just as an exercise) to have a function that, from a generic list of intervals, obtains the union-of-these-intervals-list.

Here the code:

let 
union=(lstInt as list)=>
let
sortU=List.Sort(lstInt,(x,y)=>Value.Compare(x{0},y{0})),
UI=List.Accumulate(sortU, {{null,null}}, 
                        (s,c)=> if IntIntersection(c,List.Last(s)) 
                                then List.Union({List.RemoveLastN(s,1),{{List.Min({c{0},List.Last(s){0}}),List.Max({c{1},List.Last(s){1}})}}})
                                else List.Union({s , {c}})
                   )
in UI
in union

the function IntIntersection, to check if two intervals intersect

let
overlapInt=(x ,y)=>
let  ovl= x{1}>=y{0} and x{0}<=y{1} , 
OVL=if ovl=null then true else ovl
in OVL
in overlapInt

here how you can use it in the problem:

...

    grp = Table.Group(youTab, {"ID"}, {"StartEnd", each IntUnion(List.Zip( {_[Contract_Start], _[Contract_End]}))}),
    te = Table.ExpandListColumn(grp, "StartEnd"),
    ve = Table.TransformColumns(te, {"StartEnd", each Text.Combine(List.Transform(_, Text.From), ":"), type text})
in
    ve

Anonymous · ‎06-13-2020

Try using this function for each group:

let 
CS_CE=(start as list, end as list)=>
let

aaa=List.Accumulate(List.Zip({start, end}), [CS={List.Min(start)},CE={List.Min(start)}], 
                                             (s,c)=> [CS= if c{0}>List.Last(s[CE]) then s[CS]&{c{0}} else s[CS],
                                                      CE= if c{0}>List.Last(s[CE]) then s[CE]&{c{1}} else 
                                                  if c{1}>List.Last(s[CE]) then List.RemoveLastN(s[CE],1)&{c{1}} else s[CE]])
in

aaa

in CS_CE

At the moment I don't have enough time to write a complete solution.
If it is needed and the function is considered valid (tests can be done by providing ad hoc lists), I will do it later.

Anonymous · ‎06-13-2020

Here a complete solution (to test, of course, with more significant data).

let

    #"Raggruppate righe" = Table.Group(yourTable, {"ID"}, {{"Conteggio", each Table.RowCount(_), type number}, {"startend", each CSEsorted( _[Contract_Start], _[Contract_End])}}),
    #"Tabella startend espansa" = Table.ExpandListColumn(#"Raggruppate righe", "startend"),
    #"Valori estratti" = Table.TransformColumns(#"Tabella startend espansa", {"startend", each Text.Combine(List.Transform(_, Text.From), ":"), type text})
in
    #"Valori estratti"

this modified function does not rely on the order of identifiers.

let 
CS_CE=(start as list, end as list)=>
let

startend=List.Sort(List.Zip({start,end}),(x,y)=>Value.Compare(x{0},y{0})),
unionInt=List.Accumulate(startend, [CS={List.Min(start)},CE={List.Min(start)}], 
                                             (s,c)=> [CS= if c{0}>List.Last(s[CE]) then s[CS]&{c{0}} else s[CS],
                                                      CE= if c{0}>List.Last(s[CE]) then s[CE]&{c{1}} else 
                                                  if c{1}>List.Last(s[CE]) then List.RemoveLastN(s[CE],1)&{c{1}} else s[CE]])
in

List.Zip(Record.ToList(unionInt))

in CS_CE

in this case the result seems what you expect, but I suggest you test other situations as well

lbendlin · ‎06-12-2020

Interesting problem. Does it have to be done in Power Query? If this data would reside in SQL server then it would be much easier to do the logic there.

In your sample data some durations are embedded in others (for example row 3 for ID 100 is fully covered by row 2). Is that intended?

Anonymous · ‎06-15-2020

Thanks for your response @lbendlin .

Currently the data is not able to be supplied via SQL server, but it is my intention to move to this in the longer term where, as you point out, the logic would be much easier.

Also, some contracts are embedded within earlier contracts, due to the way the contracts have been issued historically, so the sample data is intended to replicate this issue.

Grouping overlapping contracts with first/last dates

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - June 2025

Fabric Community Update - June 2025

Join the #PBI10 DataViz contest

Grouping overlapping contracts with first/last dates

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - June 2025

Fabric Community Update - June 2025