Solved: Re: Group rows with same dates and consecutive dat... - Page 2

JoMont · ‎08-06-2025

Hi,

I have data with medical registration numbers, discipline types, start and end dates and FTE equivalent, about medical training placements. There are other columns, but they are not material to the issue (although I do want them to appear in the final table as well).

Index	Medical Registration Number	Discipline Type	Start Date	End Date	FTE equivalent
1	MED0143	General Medicine	15/10/2023	28/10/2023	1.86
2	MED0143	General Medicine	29/10/2023	6/01/2024	10
3	MED0497	Obstetrics and Gynaecology	8/01/2023	15/04/2023	10.5
4	MED0497	Obstetrics and Gynaecology	8/01/2023	15/04/2023	3.5
5	MED0629	Paediatrics	8/01/2023	15/04/2023	14
6	MED0629	Obstetrics and Gynaecology	16/04/2023	22/07/2023	14
7	MED0629	Intensive Care Unit	23/07/2023	14/10/2023	12

I need help to group the rows

a) where a medical number is doing two different placements at the same time but in the same discipline (eg rows 3 and 4)

AND also

b) where a medical number is doing two different placements in the same discipline, but the date period is continuous (eg rows 1 and 2)

What I need for output is all the same columns, with the FTE Equivalent summed, and the date ranges showing the earliest and the latest dates eg

Index	Medical Registration Number	Discipline Type	Start Date	End Date	FTE equivalent
1	MED0143	General Medicine	15/10/2023	6/01/2024	11.86
2	MED0497	Obstetrics and Gynaecology	8/01/2023	15/04/2023	14
3	MED0629	Paediatrics	8/01/2023	15/04/2023	14
4	MED0629	Obstetrics and Gynaecology	16/04/2023	22/07/2023	14
5	MED0629	Intensive Care Unit	23/07/2023	14/10/2023	12

I have searched online extensively for an answer and have tried a lot of different PQ solutions but somehow I'm never quite able to adapt the code to work for me 😞

Looking forward to a reasonable solution that works well on a larger dataset than provided above.

Balakrishnan_J · ‎08-14-2025

Hi @JoMont,
Hope you are doing well.

Open advanced editor in power query and copy paste the below M-code.

Duplicate Entry is removed and all columns are retained.

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZRNbsIwEIWvErEmyJ78LyuKaBdpK1Gpi5ZFGiwUKXFQEpC4Tc/Sk3VMgdgmfyA2yI7w+2bmPfvzcxTOHgmho/FozjgrotQI2SqJE87w01NebpIqSnFpUZMSEwgAbqhjEio2Fm4+HvAnDEOxPpw9/J/S0XJ8VBdHZhkr1ozH+9v1f39knQbGqYO3IoqrJGbNn6griVvEpNDUyTTPsi1PKsGxg4nveNSuadZgGoE+mqPRwJkEgDyV1je/fo6tTe+SY9/Hpw6XRK+v32XFqiKJSyPiK2O+5xGL8zRf71USWCaBAYmQ9L62GARXk7TdCXoHwW1VKGk5V2FfXYUDEyzCd+sq3GGu0jOd2MeBCPrL+z/d1acNNcEbfic8yVXitbgqp5RS/U74d3hDoPUNCQZMS/EKcGP1ZpOohGmacAE3FqzY4WiM52xT5DuWMV5p6fQkefSIuI3vYVLGLE0jzvJt2dDR4BSCBABiEr+J1pdBvdtut4BKGAiO1nW8+FepiyfqLEjtpnvVlgUE1eqLLZpZaOPCWRHoNEcETRw9qfuqem/SFD9oix9q0rT7P4ijSOPqckpwySF4La17cxr6OXKWfw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Medical Registration Number" = _t, #"Discipline Type" = _t, #"Facility Type" = _t, #"Start Date" = _t, #"End Date" = _t, State = _t, MMM = _t, #"Discipline Group" = _t, #"FTE equivalent" = _t]),

    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Facility Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"State", type text}, {"MMM", type text}, {"Discipline Group", type text}, {"FTE equivalent", type number}}),

// Sorting by Medical Registration Number and Start Date is important here.
// Without applying sorting logic, sometimes gives you different result.

Sort_Logic = Table.Sort(#"Changed Type",{{"Medical Registration Number", Order.Ascending}, {"Start Date", Order.Ascending}}),

Duplicate_Logic = Table.Distinct(Sort_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "Discipline Group", "FTE equivalent"}),


Group_Logic_1 = Table.Group(Duplicate_Logic, {"Medical Registration Number", "Discipline Type","Facility Type", "State", "MMM", "Discipline Group"}, {{"AllRows", (x) => x}}, GroupKind.Local),

//Assuming your data has continuous date period where a medical number is doing two different placements in the same discipline.
// If Date range is not continous, sometimes gives you different result.

Transform_Logic = Table.TransformColumns(Group_Logic_1, {{"AllRows", (x) => Table.TransformColumns(x, {{"Start Date", (y) => List.Min(x[Start Date])},{"End Date", (y) => List.Max(x[End Date])}})}}),

Combine_Logic = Table.Combine(Transform_Logic[AllRows]),

Group_Logic_2 = Table.Group(Combine_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "MMM", "Discipline Group"}, {{"FTE equivalent", each List.Sum([FTE equivalent]), type nullable number}}),
Output = Table.TransformColumnTypes(Group_Logic_2,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"FTE equivalent", type number}})
in
    Output

Regards,
Balakrishnan_J

Did I answer your question? If Yes,
Then mark my post as a solution and click on the Thumbs Up 👍 to give Kudos.
Remember: You can mark multiple answers as a solution...

View solution in original post

AlienSx · ‎08-07-2025

just in case you have overlapping periods...

let
    fx_same_group = (x, y, max_date) => x{1} = y{1} and x{2} = y{2} and (x{3} <= max_date  + #duration(1, 0, 0, 0)),
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    sort = Table.Sort(Source, {"Medical Registration Number","Discipline Type", "Start Date"}),
    rows = List.Buffer(Table.ToList(sort, (x) => x)), 
    gen = List.Generate(
        () => [i = 0, group_num = 1, dmax = rows{0}{4}],
        (x) => x[i] < List.Count(rows),
        (x) => [
            i = x[i] + 1,
            group_num = x[group_num] + Number.From(not fx_same_group(rows{i}, rows{x[i]}, x[dmax])),
            dmax = if group_num = x[group_num] then List.Max({x[dmax], rows{i}{4}}) else rows{i}{4}
        ],
        (x) => rows{x[i]} & {x[group_num]}
    ),
    tbl = Table.FromList(gen, (x) => x, Table.ColumnNames(Source) & {"group"}),
    result = Table.Group(
        tbl, 
        {"group", "Medical Registration Number","Discipline Type"},
        {
            {"Start Date", (x) => List.Min(x[Start Date])},
            {"End Date", (x) => List.Max(x[End Date])},
            {"FTE equivalent", (x) => List.Sum(x[FTE equivalent])}
        }
    )
in
    result

v-agajavelly · ‎08-06-2025

Hi @JoMont ,

I understand you're trying to group rows in two scenarios:

When the same medical registration number and discipline have overlapping date ranges (e.g. same placement split across rows).
When the same person and discipline have continuous dates (i.e., the next placement starts the day after the last one ends).

Here's a working solution in Power Query that handles both cases reliably even on large datasets:

Steps in Power Query.

Load your table into Power Query.
Make sure Start Date and End Date are of type data.
Paste the following code into the Advanced Editor:

let
    Source = YourTableNameHere,  // replace with actual step/table
    ChangedTypes = Table.TransformColumnTypes(Source, {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number}}),
    Sorted = Table.Sort(ChangedTypes, {{"Medical Registration Number", Order.Ascending},{"Discipline Type", Order.Ascending},{"Start Date", Order.Ascending}}),
    AddIndex = Table.AddIndexColumn(Sorted, "IndexHelper", 0, 1),
    AddPrevious = Table.AddColumn(AddIndex, "Previous", each try AddIndex{[IndexHelper]-1} otherwise null),
    FlagNewGroup = Table.AddColumn(AddPrevious, "NewGroup", each
    let
    curr = [Medical Registration Number],
    prev = [Previous][Medical Registration Number],
    sameDisc = [Discipline Type] = [Previous][Discipline Type],
    isContiguous = [Start Date] <= Date.AddDays([Previous][End Date], 1),
    samePerson = curr = prev
    in
if samePerson and sameDisc and isContiguous then 0 else 1),
RemovePrev = Table.RemoveColumns(FlagNewGroup, {"Previous"}),
AddRunningGroup = Table.AddIndexColumn(RemovePrev, "RowIndex", 0, 1),
AddGroupID = Table.AddColumn(AddRunningGroup, "GroupID", each List.Sum(List.FirstN(RemovePrev[NewGroup], [RowIndex]+1))),
Grouped = Table.Group(AddGroupID,{"Medical Registration Number", "Discipline Type", "GroupID"}, {{"Start Date", each List.Min([Start Date]), type date},
{"End Date", each List.Max([End Date]), type date},{"FTE equivalent", each List.Sum([FTE equivalent]), type number}}),
AddFinalIndex = Table.AddIndexColumn(Grouped, "Index", 1, 1)
    in
        AddFinalIndex

After all above steps follow bellow steps.

Merges rows for the same person + same discipline where dates either overlap or are continuous.
Sums the FTE equivalent.
Keeps only one record per logical placement block.

Let me know if you'd like to retain other columns happy to help adapt further.

Regards,
Akhil.

JoMont · ‎08-07-2025

Thank you for the code. So far it throws this error:

I tried amending the changed type code to the below - to ensure there were no 'Null" types

ChangedTypes = Table.TransformColumnTypes(#"Removed Columns1", {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number},{"Medical Registration Number",type text},{"Discipline Type",type text}}),

But I still can't figure out where the type Null is coming from? Grateful for your help.

FYI, I am doing this for work, and live in Australia. So it might be a little while until I get back to you while I enjoy my weekend!

Group rows with same dates and consecutive dates

Helpful resources

Fabric Data Days

Power BI Monthly Update - October 2025

FabCon Atlanta 2026

Fabric Data Days starts November 4th!

Group rows with same dates and consecutive dates

Helpful resources

Fabric Data Days

Power BI Monthly Update - October 2025

FabCon Atlanta 2026