Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredGet Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now
Hi,
I have data with medical registration numbers, discipline types, start and end dates and FTE equivalent, about medical training placements. There are other columns, but they are not material to the issue (although I do want them to appear in the final table as well).
Index | Medical Registration Number | Discipline Type | Start Date | End Date | FTE equivalent |
1 | MED0143 | General Medicine | 15/10/2023 | 28/10/2023 | 1.86 |
2 | MED0143 | General Medicine | 29/10/2023 | 6/01/2024 | 10 |
3 | MED0497 | Obstetrics and Gynaecology | 8/01/2023 | 15/04/2023 | 10.5 |
4 | MED0497 | Obstetrics and Gynaecology | 8/01/2023 | 15/04/2023 | 3.5 |
5 | MED0629 | Paediatrics | 8/01/2023 | 15/04/2023 | 14 |
6 | MED0629 | Obstetrics and Gynaecology | 16/04/2023 | 22/07/2023 | 14 |
7 | MED0629 | Intensive Care Unit | 23/07/2023 | 14/10/2023 | 12 |
I need help to group the rows
a) where a medical number is doing two different placements at the same time but in the same discipline (eg rows 3 and 4)
AND also
b) where a medical number is doing two different placements in the same discipline, but the date period is continuous (eg rows 1 and 2)
What I need for output is all the same columns, with the FTE Equivalent summed, and the date ranges showing the earliest and the latest dates eg
Index | Medical Registration Number | Discipline Type | Start Date | End Date | FTE equivalent |
1 | MED0143 | General Medicine | 15/10/2023 | 6/01/2024 | 11.86 |
2 | MED0497 | Obstetrics and Gynaecology | 8/01/2023 | 15/04/2023 | 14 |
3 | MED0629 | Paediatrics | 8/01/2023 | 15/04/2023 | 14 |
4 | MED0629 | Obstetrics and Gynaecology | 16/04/2023 | 22/07/2023 | 14 |
5 | MED0629 | Intensive Care Unit | 23/07/2023 | 14/10/2023 | 12 |
I have searched online extensively for an answer and have tried a lot of different PQ solutions but somehow I'm never quite able to adapt the code to work for me 😞
Looking forward to a reasonable solution that works well on a larger dataset than provided above.
Solved! Go to Solution.
Hi @JoMont,
Hope you are doing well.
Open advanced editor in power query and copy paste the below M-code.
Duplicate Entry is removed and all columns are retained.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZRNbsIwEIWvErEmyJ78LyuKaBdpK1Gpi5ZFGiwUKXFQEpC4Tc/Sk3VMgdgmfyA2yI7w+2bmPfvzcxTOHgmho/FozjgrotQI2SqJE87w01NebpIqSnFpUZMSEwgAbqhjEio2Fm4+HvAnDEOxPpw9/J/S0XJ8VBdHZhkr1ozH+9v1f39knQbGqYO3IoqrJGbNn6griVvEpNDUyTTPsi1PKsGxg4nveNSuadZgGoE+mqPRwJkEgDyV1je/fo6tTe+SY9/Hpw6XRK+v32XFqiKJSyPiK2O+5xGL8zRf71USWCaBAYmQ9L62GARXk7TdCXoHwW1VKGk5V2FfXYUDEyzCd+sq3GGu0jOd2MeBCPrL+z/d1acNNcEbfic8yVXitbgqp5RS/U74d3hDoPUNCQZMS/EKcGP1ZpOohGmacAE3FqzY4WiM52xT5DuWMV5p6fQkefSIuI3vYVLGLE0jzvJt2dDR4BSCBABiEr+J1pdBvdtut4BKGAiO1nW8+FepiyfqLEjtpnvVlgUE1eqLLZpZaOPCWRHoNEcETRw9qfuqem/SFD9oix9q0rT7P4ijSOPqckpwySF4La17cxr6OXKWfw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Medical Registration Number" = _t, #"Discipline Type" = _t, #"Facility Type" = _t, #"Start Date" = _t, #"End Date" = _t, State = _t, MMM = _t, #"Discipline Group" = _t, #"FTE equivalent" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Facility Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"State", type text}, {"MMM", type text}, {"Discipline Group", type text}, {"FTE equivalent", type number}}),
// Sorting by Medical Registration Number and Start Date is important here.
// Without applying sorting logic, sometimes gives you different result.
Sort_Logic = Table.Sort(#"Changed Type",{{"Medical Registration Number", Order.Ascending}, {"Start Date", Order.Ascending}}),
Duplicate_Logic = Table.Distinct(Sort_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "Discipline Group", "FTE equivalent"}),
Group_Logic_1 = Table.Group(Duplicate_Logic, {"Medical Registration Number", "Discipline Type","Facility Type", "State", "MMM", "Discipline Group"}, {{"AllRows", (x) => x}}, GroupKind.Local),
//Assuming your data has continuous date period where a medical number is doing two different placements in the same discipline.
// If Date range is not continous, sometimes gives you different result.
Transform_Logic = Table.TransformColumns(Group_Logic_1, {{"AllRows", (x) => Table.TransformColumns(x, {{"Start Date", (y) => List.Min(x[Start Date])},{"End Date", (y) => List.Max(x[End Date])}})}}),
Combine_Logic = Table.Combine(Transform_Logic[AllRows]),
Group_Logic_2 = Table.Group(Combine_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "MMM", "Discipline Group"}, {{"FTE equivalent", each List.Sum([FTE equivalent]), type nullable number}}),
Output = Table.TransformColumnTypes(Group_Logic_2,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"FTE equivalent", type number}})
in
Output
Regards,
Balakrishnan_J
Did I answer your question? If Yes,
Then mark my post as a solution and click on the Thumbs Up 👍 to give Kudos.
Remember: You can mark multiple answers as a solution...
just in case you have overlapping periods...
let
fx_same_group = (x, y, max_date) => x{1} = y{1} and x{2} = y{2} and (x{3} <= max_date + #duration(1, 0, 0, 0)),
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
sort = Table.Sort(Source, {"Medical Registration Number","Discipline Type", "Start Date"}),
rows = List.Buffer(Table.ToList(sort, (x) => x)),
gen = List.Generate(
() => [i = 0, group_num = 1, dmax = rows{0}{4}],
(x) => x[i] < List.Count(rows),
(x) => [
i = x[i] + 1,
group_num = x[group_num] + Number.From(not fx_same_group(rows{i}, rows{x[i]}, x[dmax])),
dmax = if group_num = x[group_num] then List.Max({x[dmax], rows{i}{4}}) else rows{i}{4}
],
(x) => rows{x[i]} & {x[group_num]}
),
tbl = Table.FromList(gen, (x) => x, Table.ColumnNames(Source) & {"group"}),
result = Table.Group(
tbl,
{"group", "Medical Registration Number","Discipline Type"},
{
{"Start Date", (x) => List.Min(x[Start Date])},
{"End Date", (x) => List.Max(x[End Date])},
{"FTE equivalent", (x) => List.Sum(x[FTE equivalent])}
}
)
in
result
Hi @JoMont ,
I understand you're trying to group rows in two scenarios:
Here's a working solution in Power Query that handles both cases reliably even on large datasets:
Steps in Power Query.
let
Source = YourTableNameHere, // replace with actual step/table
ChangedTypes = Table.TransformColumnTypes(Source, {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number}}),
Sorted = Table.Sort(ChangedTypes, {{"Medical Registration Number", Order.Ascending},{"Discipline Type", Order.Ascending},{"Start Date", Order.Ascending}}),
AddIndex = Table.AddIndexColumn(Sorted, "IndexHelper", 0, 1),
AddPrevious = Table.AddColumn(AddIndex, "Previous", each try AddIndex{[IndexHelper]-1} otherwise null),
FlagNewGroup = Table.AddColumn(AddPrevious, "NewGroup", each
let
curr = [Medical Registration Number],
prev = [Previous][Medical Registration Number],
sameDisc = [Discipline Type] = [Previous][Discipline Type],
isContiguous = [Start Date] <= Date.AddDays([Previous][End Date], 1),
samePerson = curr = prev
in
if samePerson and sameDisc and isContiguous then 0 else 1),
RemovePrev = Table.RemoveColumns(FlagNewGroup, {"Previous"}),
AddRunningGroup = Table.AddIndexColumn(RemovePrev, "RowIndex", 0, 1),
AddGroupID = Table.AddColumn(AddRunningGroup, "GroupID", each List.Sum(List.FirstN(RemovePrev[NewGroup], [RowIndex]+1))),
Grouped = Table.Group(AddGroupID,{"Medical Registration Number", "Discipline Type", "GroupID"}, {{"Start Date", each List.Min([Start Date]), type date},
{"End Date", each List.Max([End Date]), type date},{"FTE equivalent", each List.Sum([FTE equivalent]), type number}}),
AddFinalIndex = Table.AddIndexColumn(Grouped, "Index", 1, 1)
in
AddFinalIndex
After all above steps follow bellow steps.
Let me know if you'd like to retain other columns happy to help adapt further.
Regards,
Akhil.
Thank you for the code. So far it throws this error:
I tried amending the changed type code to the below - to ensure there were no 'Null" types
ChangedTypes = Table.TransformColumnTypes(#"Removed Columns1", {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number},{"Medical Registration Number",type text},{"Discipline Type",type text}}),
But I still can't figure out where the type Null is coming from? Grateful for your help.
FYI, I am doing this for work, and live in Australia. So it might be a little while until I get back to you while I enjoy my weekend!
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!
Check out the October 2025 Power BI update to learn about new features.
| User | Count |
|---|---|
| 8 | |
| 7 | |
| 7 | |
| 4 | |
| 3 |