Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
JoMont
Frequent Visitor

Group rows with same dates and consecutive dates

Hi,

I have data with medical registration numbers, discipline types, start and end dates and FTE equivalent, about medical training placements. There are other columns, but they are not material to the issue (although I do want them to appear in the final table as well).

Index

Medical Registration Number

Discipline Type

Start Date

End Date

FTE equivalent

1

MED0143

General Medicine

15/10/2023

28/10/2023

1.86

2

MED0143

General Medicine

29/10/2023

6/01/2024

10

3

MED0497

Obstetrics and Gynaecology

8/01/2023

15/04/2023

10.5

4

MED0497

Obstetrics and Gynaecology

8/01/2023

15/04/2023

3.5

5

MED0629

Paediatrics

8/01/2023

15/04/2023

14

6

MED0629

Obstetrics and Gynaecology

16/04/2023

22/07/2023

14

7

MED0629

Intensive Care Unit

23/07/2023

14/10/2023

12

 

I need help to group the rows 

a) where a medical number is doing two different placements at the same time but in the same discipline (eg rows 3 and 4)

 

AND also

 

b) where a medical number is doing two different placements in the same discipline, but the date period is continuous (eg rows 1 and 2)

 What I need for output is all the same columns, with the FTE Equivalent summed, and the date ranges showing the earliest and the latest dates eg

Index

Medical Registration Number

Discipline Type

Start Date

End Date

FTE equivalent

1

MED0143

General Medicine

15/10/2023

6/01/2024

11.86

2

MED0497

Obstetrics and Gynaecology

8/01/2023

15/04/2023

14

3

MED0629

Paediatrics

8/01/2023

15/04/2023

14

4

MED0629

Obstetrics and Gynaecology

16/04/2023

22/07/2023

14

5

MED0629

Intensive Care Unit

23/07/2023

14/10/2023

12

 

I have searched online extensively for an answer and have tried a lot of different PQ solutions but somehow I'm never quite able to adapt the code to work for me 😞

 

Looking forward to a reasonable solution that works well on a larger dataset than provided above. 

 

1 ACCEPTED SOLUTION

Hi @JoMont,
Hope you are doing well.

 

Open advanced editor in power query and copy paste the below M-code.

Duplicate Entry is removed and all columns are retained.

 

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZRNbsIwEIWvErEmyJ78LyuKaBdpK1Gpi5ZFGiwUKXFQEpC4Tc/Sk3VMgdgmfyA2yI7w+2bmPfvzcxTOHgmho/FozjgrotQI2SqJE87w01NebpIqSnFpUZMSEwgAbqhjEio2Fm4+HvAnDEOxPpw9/J/S0XJ8VBdHZhkr1ozH+9v1f39knQbGqYO3IoqrJGbNn6griVvEpNDUyTTPsi1PKsGxg4nveNSuadZgGoE+mqPRwJkEgDyV1je/fo6tTe+SY9/Hpw6XRK+v32XFqiKJSyPiK2O+5xGL8zRf71USWCaBAYmQ9L62GARXk7TdCXoHwW1VKGk5V2FfXYUDEyzCd+sq3GGu0jOd2MeBCPrL+z/d1acNNcEbfic8yVXitbgqp5RS/U74d3hDoPUNCQZMS/EKcGP1ZpOohGmacAE3FqzY4WiM52xT5DuWMV5p6fQkefSIuI3vYVLGLE0jzvJt2dDR4BSCBABiEr+J1pdBvdtut4BKGAiO1nW8+FepiyfqLEjtpnvVlgUE1eqLLZpZaOPCWRHoNEcETRw9qfuqem/SFD9oix9q0rT7P4ijSOPqckpwySF4La17cxr6OXKWfw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Medical Registration Number" = _t, #"Discipline Type" = _t, #"Facility Type" = _t, #"Start Date" = _t, #"End Date" = _t, State = _t, MMM = _t, #"Discipline Group" = _t, #"FTE equivalent" = _t]),

    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Facility Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"State", type text}, {"MMM", type text}, {"Discipline Group", type text}, {"FTE equivalent", type number}}),

// Sorting by Medical Registration Number and Start Date is important here.
// Without applying sorting logic, sometimes gives you different result.

Sort_Logic = Table.Sort(#"Changed Type",{{"Medical Registration Number", Order.Ascending}, {"Start Date", Order.Ascending}}),

Duplicate_Logic = Table.Distinct(Sort_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "Discipline Group", "FTE equivalent"}),


Group_Logic_1 = Table.Group(Duplicate_Logic, {"Medical Registration Number", "Discipline Type","Facility Type", "State", "MMM", "Discipline Group"}, {{"AllRows", (x) => x}}, GroupKind.Local),

//Assuming your data has continuous date period where a medical number is doing two different placements in the same discipline.
// If Date range is not continous, sometimes gives you different result.

Transform_Logic = Table.TransformColumns(Group_Logic_1, {{"AllRows", (x) => Table.TransformColumns(x, {{"Start Date", (y) => List.Min(x[Start Date])},{"End Date", (y) => List.Max(x[End Date])}})}}),

Combine_Logic = Table.Combine(Transform_Logic[AllRows]),

Group_Logic_2 = Table.Group(Combine_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "MMM", "Discipline Group"}, {{"FTE equivalent", each List.Sum([FTE equivalent]), type nullable number}}),
Output = Table.TransformColumnTypes(Group_Logic_2,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"FTE equivalent", type number}})
in
    Output

 

 

Regards,
Balakrishnan_J

Did I answer your question? If Yes,
Then mark my post as a solution and click on the Thumbs Up 👍 to give Kudos.
Remember: You can mark multiple answers as a solution...

View solution in original post

23 REPLIES 23
AlienSx
Super User
Super User

just in case you have overlapping periods...

let
    fx_same_group = (x, y, max_date) => x{1} = y{1} and x{2} = y{2} and (x{3} <= max_date  + #duration(1, 0, 0, 0)),
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    sort = Table.Sort(Source, {"Medical Registration Number","Discipline Type", "Start Date"}),
    rows = List.Buffer(Table.ToList(sort, (x) => x)), 
    gen = List.Generate(
        () => [i = 0, group_num = 1, dmax = rows{0}{4}],
        (x) => x[i] < List.Count(rows),
        (x) => [
            i = x[i] + 1,
            group_num = x[group_num] + Number.From(not fx_same_group(rows{i}, rows{x[i]}, x[dmax])),
            dmax = if group_num = x[group_num] then List.Max({x[dmax], rows{i}{4}}) else rows{i}{4}
        ],
        (x) => rows{x[i]} & {x[group_num]}
    ),
    tbl = Table.FromList(gen, (x) => x, Table.ColumnNames(Source) & {"group"}),
    result = Table.Group(
        tbl, 
        {"group", "Medical Registration Number","Discipline Type"},
        {
            {"Start Date", (x) => List.Min(x[Start Date])},
            {"End Date", (x) => List.Max(x[End Date])},
            {"FTE equivalent", (x) => List.Sum(x[FTE equivalent])}
        }
    )
in
    result

 

v-agajavelly
Community Support
Community Support

Hi @JoMont ,

I understand you're trying to group rows in two scenarios:

  1. When the same medical registration number and discipline have overlapping date ranges (e.g. same placement split across rows).
  2. When the same person and discipline have continuous dates (i.e., the next placement starts the day after the last one ends).

Here's a working solution in Power Query that handles both cases reliably  even on large datasets:

Steps in Power Query.

  1. Load your table into Power Query.
  2. Make sure Start Date and End Date are of type data.
  3. Paste the following code into the Advanced Editor:
let
    Source = YourTableNameHere,  // replace with actual step/table
ChangedTypes = Table.TransformColumnTypes(Source, {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number}}),
Sorted = Table.Sort(ChangedTypes, {{"Medical Registration Number", Order.Ascending},{"Discipline Type", Order.Ascending},{"Start Date", Order.Ascending}}),
AddIndex = Table.AddIndexColumn(Sorted, "IndexHelper", 0, 1),
AddPrevious = Table.AddColumn(AddIndex, "Previous", each try AddIndex{[IndexHelper]-1} otherwise null),
FlagNewGroup = Table.AddColumn(AddPrevious, "NewGroup", each
let
curr = [Medical Registration Number],
prev = [Previous][Medical Registration Number],
sameDisc = [Discipline Type] = [Previous][Discipline Type],
isContiguous = [Start Date] <= Date.AddDays([Previous][End Date], 1),
samePerson = curr = prev
in
if samePerson and sameDisc and isContiguous then 0 else 1),
RemovePrev = Table.RemoveColumns(FlagNewGroup, {"Previous"}),
AddRunningGroup = Table.AddIndexColumn(RemovePrev, "RowIndex", 0, 1),
AddGroupID = Table.AddColumn(AddRunningGroup, "GroupID", each List.Sum(List.FirstN(RemovePrev[NewGroup], [RowIndex]+1))),
Grouped = Table.Group(AddGroupID,{"Medical Registration Number", "Discipline Type", "GroupID"}, {{"Start Date", each List.Min([Start Date]), type date},
{"End Date", each List.Max([End Date]), type date},{"FTE equivalent", each List.Sum([FTE equivalent]), type number}}),
AddFinalIndex = Table.AddIndexColumn(Grouped, "Index", 1, 1)
in
AddFinalIndex


After all above steps follow bellow steps.

  • Merges rows for the same person + same discipline where dates either overlap or are continuous.
  • Sums the FTE equivalent.
  • Keeps only one record per logical placement block.

Let me know if you'd like to retain other columns happy to help adapt further.

Regards,
Akhil.

Thank you for the code. So far it throws this error:

JoMont_0-1754616972910.png

I tried amending the changed type code to the below - to ensure there were no 'Null" types

 

ChangedTypes = Table.TransformColumnTypes(#"Removed Columns1", {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number},{"Medical Registration Number",type text},{"Discipline Type",type text}}),

 

But I still can't figure out where the type Null is coming from? Grateful for your help.

 

FYI, I am doing this for work, and live in Australia. So it might be a little while until I get back to you while I enjoy my weekend!

 

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.