The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hi,
I have data with medical registration numbers, discipline types, start and end dates and FTE equivalent, about medical training placements. There are other columns, but they are not material to the issue (although I do want them to appear in the final table as well).
Index | Medical Registration Number | Discipline Type | Start Date | End Date | FTE equivalent |
1 | MED0143 | General Medicine | 15/10/2023 | 28/10/2023 | 1.86 |
2 | MED0143 | General Medicine | 29/10/2023 | 6/01/2024 | 10 |
3 | MED0497 | Obstetrics and Gynaecology | 8/01/2023 | 15/04/2023 | 10.5 |
4 | MED0497 | Obstetrics and Gynaecology | 8/01/2023 | 15/04/2023 | 3.5 |
5 | MED0629 | Paediatrics | 8/01/2023 | 15/04/2023 | 14 |
6 | MED0629 | Obstetrics and Gynaecology | 16/04/2023 | 22/07/2023 | 14 |
7 | MED0629 | Intensive Care Unit | 23/07/2023 | 14/10/2023 | 12 |
I need help to group the rows
a) where a medical number is doing two different placements at the same time but in the same discipline (eg rows 3 and 4)
AND also
b) where a medical number is doing two different placements in the same discipline, but the date period is continuous (eg rows 1 and 2)
What I need for output is all the same columns, with the FTE Equivalent summed, and the date ranges showing the earliest and the latest dates eg
Index | Medical Registration Number | Discipline Type | Start Date | End Date | FTE equivalent |
1 | MED0143 | General Medicine | 15/10/2023 | 6/01/2024 | 11.86 |
2 | MED0497 | Obstetrics and Gynaecology | 8/01/2023 | 15/04/2023 | 14 |
3 | MED0629 | Paediatrics | 8/01/2023 | 15/04/2023 | 14 |
4 | MED0629 | Obstetrics and Gynaecology | 16/04/2023 | 22/07/2023 | 14 |
5 | MED0629 | Intensive Care Unit | 23/07/2023 | 14/10/2023 | 12 |
I have searched online extensively for an answer and have tried a lot of different PQ solutions but somehow I'm never quite able to adapt the code to work for me 😞
Looking forward to a reasonable solution that works well on a larger dataset than provided above.
If your data is sorted as you show, so that rows that should be together are listed together, then you can create the Grouping logic using the fourth and fifth arguments of the Table.Group function:
let
//Replace next line with your actual data source
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
//set the column data types
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Index", Int64.Type}, {"Medical Registration Number", type text},
{"Discipline Type", type text}, {"Start Date", type date},
{"End Date", type date}, {"FTE equivalent", type number}}),
//Add a shifted end date column for comparisons
#"Add Shifted End Date" = Table.FromColumns(
Table.ToColumns(#"Changed Type")
& {{null} & List.RemoveLastN(#"Changed Type"[End Date],1)},
type table[Index=Int64.Type, Medical Registration Number=text, Discipline Type=text,
Start Date=date, End Date=date, FTE equivalent=number, Shifted End Date=date]),
//Group rows by the columns we will use to determine the local groups
#"Grouped Rows" = Table.Group(#"Add Shifted End Date",
{"Medical Registration Number", "Discipline Type", "Start Date", "End Date","Shifted End Date"}, {
{"FTE Equivalent", each List.Sum([FTE equivalent]), type number},
{"Start Date.1", each List.Min([Start Date]), type date},
{"End Date.1", each List.Max([End Date]), type date}
}, GroupKind.Local,(x,y)=>Number.From(
//logic to replicate the groupings
((x[Medical Registration Number] <> y[Medical Registration Number]) or (x[Discipline Type] <> y[Discipline Type])
or (Duration.Days(y[Start Date] - y[Shifted End Date]) <> 1))
and (List.RemoveLastN(Record.FieldValues(x),1) <> List.RemoveLastN(Record.FieldValues(y),1))
)),
//Remove unneeded columns
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Start Date", "End Date", "Shifted End Date"}),
//Rename the Date columns
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Start Date.1", "Start Date"}, {"End Date.1", "End Date"}}),
//Add index column for numbering
#"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 1, 1, Int64.Type),
//Re-arrange the column order
#"Reordered Columns" = Table.ReorderColumns(#"Added Index",{"Index", "Medical Registration Number", "Discipline Type", "Start Date", "End Date", "FTE Equivalent"})
in
#"Reordered Columns"
Your data
Results from PQ:
Hi @JoMont,
Hope you are doing well!
I have used Grouping and Transformation Technique to handle the issue. Find the below M-Code.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZGxasMwEIZfRWiOwt1JtuW5LSFDaJdOJoPqiCAICtimkLePpaiplILbodv9Et/H/VLXceQrvnt5BlRynjbW28Gc2M4eXO+8nY+wEgiCgMI96SzgWtd8v+o4/eagNsOgFoAhqOCAaJDJoNpmnl4/xslOg+tHZvyBbS7e2P58Oh8vAdcJl7flQN0DrKtoU/9hk0lWJVlN7Ty9mbmVibblXVSE6wJe3ATrDCcS0Dy4msK19ZP1o/u07MkMlr17NwVOFlz+WRQl+ksi4WcbFED3DfKQfqkt4MU2ocA33giQD20Q/i5DnfNt/k43Fy60IshgCSW8vwI=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Index = _t, #"Medical Registration Number" = _t, #"Discipline Type" = _t, #"Start Date" = _t, #"End Date" = _t, #"FTE equivalent" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"FTE equivalent", type number}}),
// Sorting by Medical Registration Number and Start Date is important here.
// Without applying sorting logic, sometimes gives you different result.
Sort_Logic = Table.Sort(#"Changed Type",{{"Medical Registration Number", Order.Ascending}, {"Start Date", Order.Ascending}}),
Group_Logic_1 = Table.Group(Sort_Logic, {"Medical Registration Number", "Discipline Type"}, {{"AllRows", (x) => x}}, GroupKind.Local),
//Assuming your data has continuous date period where a medical number is doing two different placements in the same discipline.
// If Date range is not continous, sometimes gives you different result.
Transform_Logic = Table.TransformColumns(Group_Logic_1, {{"AllRows", (x) => Table.TransformColumns(x, {{"Start Date", (y) => List.Min(x[Start Date])},{"End Date", (y) => List.Max(x[End Date])}})}}),
Combine_Logic = Table.Combine(Transform_Logic[AllRows]),
Group_Logic_2 = Table.Group(Combine_Logic, {"Medical Registration Number", "Discipline Type", "Start Date", "End Date"}, {{"FTE equivalent", each List.Sum([FTE equivalent]), type nullable number}}),
Output = Table.TransformColumnTypes(Group_Logic_2,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"FTE equivalent", type number}})
in
Output
Output Snapshot:
Regards,
Balakrishnan_J
Did I answer your question? If Yes,
Then mark my post as a solution and click on the Thumbs Up 👍 to give Kudos.
Remember: You can mark multiple answers as a solution...
Hi,
Thank you, but I need some additional help. I have this existing code, which creates the column "FTE Equivalent" and I'm not sure how to work in your 'let t = ' statements in the Source line, once I've created the "FTE Equivalent" column.
Here is my existing code:
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Removed Columns" = Table.RemoveColumns(Source,{"Placement Number", "Column1", "Facility Name", "Address Line 1", "Address Line 2", "Street Address / Address Line 3", "Suburb/town", "Postcode#(tab)"}),
#"Trimmed Text" = Table.TransformColumns(#"Removed Columns",{{"Discipline Type", Text.Trim, type text}}),
#"Added Conditional Column" = Table.AddColumn(#"Trimmed Text", "Discipline Group", each if [Discipline Type] = "Acute Medicine" then "Medical" else if [Discipline Type] = "Cardiology" then "Medical" else if [Discipline Type] = "General Medicine" then "Medical" else if [Discipline Type] = "Gastroenterology" then "Medical" else if [Discipline Type] = "Geriatrics" then "Medical" else if [Discipline Type] = "Internal Medicine" then "Medical" else if [Discipline Type] = "Medical or Ward Call" then "Medical" else if [Discipline Type] = "Paediatrics" then "Medical" else if [Discipline Type] = "Palliative Care" then "Medical" else if [Discipline Type] = "Respiratory" then "Medical" else if [Discipline Type] = "Rehabilitation" then "Medical" else if [Discipline Type] = "Rural Medicine" then "Medical" else if [Discipline Type] = "Sub Acute" then "Medical" else if [Discipline Type] = "Anaesthetics" then "Anaesthetics" else if [Discipline Type] = "ENT" then "Surgical" else if [Discipline Type] = "General Surgery" then "Surgical" else if [Discipline Type] = "General surgery" then "Surgical" else if [Discipline Type] = "Neurosurgery" then "Surgical" else if [Discipline Type] = "Orthopaedics" then "Surgical" else if [Discipline Type] = "Plastic Surgery" then "Surgical" else if [Discipline Type] = "Emergency Medicine" then "Emergency" else if [Discipline Type] = "Intensive Care Unit" then "ICU" else if [Discipline Type] = "Psychiatry" then "Psychiatry" else if [Discipline Type] = "Mental Health" then "Psychiatry" else if [Discipline Type] = "General Practice" then "Community" else if [Discipline Type] = "Rural Community" then "Community" else if [Discipline Type] = "Residency Program" then "Community" else if [Discipline Type] = "PIERCE" then "Community" else if [Discipline Type] = "Primary Health" then "Community" else if [Discipline Type] = "John Flynn Prevocational Doctor Program" then "Community" else if [Discipline Type] = "Rural Junior Doctor Training Innovation Fund" then "Community" else if [Discipline Type] = "Rural Community Residency Program" then "Community" else if [Discipline Type] = "Obstetrics and Gynaecology" then "Obstetrics & Gynaecology" else if [Discipline Type] = "Radiology" then "Radiology" else if [Discipline Type] = "Clinical Service Improvement" then "Miscellaneous" else if [Discipline Type] = "Relieving" then "Miscellaneous" else if [Discipline Type] = "Research" then "Miscellaneous" else if [Discipline Type] = "Composite" then "Miscellaneous" else if [Discipline Type] = "Medical Education" then "Miscellaneous" else if [Discipline Type] = "Oncology" then "Medical" else if [Discipline Type] = "Neurology" then "Medical" else null),
#"Added Custom" = Table.AddColumn(#"Added Conditional Column", "FTE equivalent", each if [Full time Weeks] > 1 then [Full time Weeks] else ((Duration.Days ([End Date] - [Start Date])+1)/7)*[Full time Weeks]),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Full time Weeks"})
in
#Removed Columns1
Hi,
Thanks so much for the amended code, it works perfectly. However, I lose some of the columns I need, such as Discipline Group, State, MMM (which I had to leave out of my original request to avoid overrunning the character limit)
This info should just be carried across, no need to group (they should be the same for both circumstances of grouping but I think there's one or two that aren't...)
but do I just add these columns to the group statement?
For privacy reasons I can't share the spreadsheet sorry
This message shows you responding to yourself. For clarity, it would be helpful if you could includ the name of the person you are responding to. If you type "@" you will see some choices.
And perhaps you could add sample columns to your data sample with made up data that are representative of your actual data layout. That way things can be tested without violating confidentiality.
Thanks for the tips, you can tell I haven't used these kind of forums much before 🙂
Hi @JoMont,
Hope you are doing well.
Open advanced editor in power query and copy paste the below M-code.
Duplicate Entry is removed and all columns are retained.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZRNbsIwEIWvErEmyJ78LyuKaBdpK1Gpi5ZFGiwUKXFQEpC4Tc/Sk3VMgdgmfyA2yI7w+2bmPfvzcxTOHgmho/FozjgrotQI2SqJE87w01NebpIqSnFpUZMSEwgAbqhjEio2Fm4+HvAnDEOxPpw9/J/S0XJ8VBdHZhkr1ozH+9v1f39knQbGqYO3IoqrJGbNn6griVvEpNDUyTTPsi1PKsGxg4nveNSuadZgGoE+mqPRwJkEgDyV1je/fo6tTe+SY9/Hpw6XRK+v32XFqiKJSyPiK2O+5xGL8zRf71USWCaBAYmQ9L62GARXk7TdCXoHwW1VKGk5V2FfXYUDEyzCd+sq3GGu0jOd2MeBCPrL+z/d1acNNcEbfic8yVXitbgqp5RS/U74d3hDoPUNCQZMS/EKcGP1ZpOohGmacAE3FqzY4WiM52xT5DuWMV5p6fQkefSIuI3vYVLGLE0jzvJt2dDR4BSCBABiEr+J1pdBvdtut4BKGAiO1nW8+FepiyfqLEjtpnvVlgUE1eqLLZpZaOPCWRHoNEcETRw9qfuqem/SFD9oix9q0rT7P4ijSOPqckpwySF4La17cxr6OXKWfw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Medical Registration Number" = _t, #"Discipline Type" = _t, #"Facility Type" = _t, #"Start Date" = _t, #"End Date" = _t, State = _t, MMM = _t, #"Discipline Group" = _t, #"FTE equivalent" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Facility Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"State", type text}, {"MMM", type text}, {"Discipline Group", type text}, {"FTE equivalent", type number}}),
// Sorting by Medical Registration Number and Start Date is important here.
// Without applying sorting logic, sometimes gives you different result.
Sort_Logic = Table.Sort(#"Changed Type",{{"Medical Registration Number", Order.Ascending}, {"Start Date", Order.Ascending}}),
Duplicate_Logic = Table.Distinct(Sort_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "Discipline Group", "FTE equivalent"}),
Group_Logic_1 = Table.Group(Duplicate_Logic, {"Medical Registration Number", "Discipline Type","Facility Type", "State", "MMM", "Discipline Group"}, {{"AllRows", (x) => x}}, GroupKind.Local),
//Assuming your data has continuous date period where a medical number is doing two different placements in the same discipline.
// If Date range is not continous, sometimes gives you different result.
Transform_Logic = Table.TransformColumns(Group_Logic_1, {{"AllRows", (x) => Table.TransformColumns(x, {{"Start Date", (y) => List.Min(x[Start Date])},{"End Date", (y) => List.Max(x[End Date])}})}}),
Combine_Logic = Table.Combine(Transform_Logic[AllRows]),
Group_Logic_2 = Table.Group(Combine_Logic, {"Medical Registration Number", "Discipline Type", "Facility Type", "Start Date", "End Date", "State", "MMM", "Discipline Group"}, {{"FTE equivalent", each List.Sum([FTE equivalent]), type nullable number}}),
Output = Table.TransformColumnTypes(Group_Logic_2,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"FTE equivalent", type number}})
in
Output
Regards,
Balakrishnan_J
Did I answer your question? If Yes,
Then mark my post as a solution and click on the Thumbs Up 👍 to give Kudos.
Remember: You can mark multiple answers as a solution...
Hi @JoMont,
Hope you are doing well.
Open Advanced Editor of Power Query and copy paste the below M-code,
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Removed Columns" = Table.RemoveColumns(Source,{"Placement Number", "Column1", "Facility Name", "Address Line 1", "Address Line 2", "Street Address / Address Line 3", "Suburb/town", "Postcode#(tab)"}),
#"Trimmed Text" = Table.TransformColumns(#"Removed Columns",{{"Discipline Type", Text.Trim, type text}}),
#"Added Conditional Column" = Table.AddColumn(#"Trimmed Text", "Discipline Group", each if [Discipline Type] = "Acute Medicine" then "Medical" else if [Discipline Type] = "Cardiology" then "Medical" else if [Discipline Type] = "General Medicine" then "Medical" else if [Discipline Type] = "Gastroenterology" then "Medical" else if [Discipline Type] = "Geriatrics" then "Medical" else if [Discipline Type] = "Internal Medicine" then "Medical" else if [Discipline Type] = "Medical or Ward Call" then "Medical" else if [Discipline Type] = "Paediatrics" then "Medical" else if [Discipline Type] = "Palliative Care" then "Medical" else if [Discipline Type] = "Respiratory" then "Medical" else if [Discipline Type] = "Rehabilitation" then "Medical" else if [Discipline Type] = "Rural Medicine" then "Medical" else if [Discipline Type] = "Sub Acute" then "Medical" else if [Discipline Type] = "Anaesthetics" then "Anaesthetics" else if [Discipline Type] = "ENT" then "Surgical" else if [Discipline Type] = "General Surgery" then "Surgical" else if [Discipline Type] = "General surgery" then "Surgical" else if [Discipline Type] = "Neurosurgery" then "Surgical" else if [Discipline Type] = "Orthopaedics" then "Surgical" else if [Discipline Type] = "Plastic Surgery" then "Surgical" else if [Discipline Type] = "Emergency Medicine" then "Emergency" else if [Discipline Type] = "Intensive Care Unit" then "ICU" else if [Discipline Type] = "Psychiatry" then "Psychiatry" else if [Discipline Type] = "Mental Health" then "Psychiatry" else if [Discipline Type] = "General Practice" then "Community" else if [Discipline Type] = "Rural Community" then "Community" else if [Discipline Type] = "Residency Program" then "Community" else if [Discipline Type] = "PIERCE" then "Community" else if [Discipline Type] = "Primary Health" then "Community" else if [Discipline Type] = "John Flynn Prevocational Doctor Program" then "Community" else if [Discipline Type] = "Rural Junior Doctor Training Innovation Fund" then "Community" else if [Discipline Type] = "Rural Community Residency Program" then "Community" else if [Discipline Type] = "Obstetrics and Gynaecology" then "Obstetrics & Gynaecology" else if [Discipline Type] = "Radiology" then "Radiology" else if [Discipline Type] = "Clinical Service Improvement" then "Miscellaneous" else if [Discipline Type] = "Relieving" then "Miscellaneous" else if [Discipline Type] = "Research" then "Miscellaneous" else if [Discipline Type] = "Composite" then "Miscellaneous" else if [Discipline Type] = "Medical Education" then "Miscellaneous" else if [Discipline Type] = "Oncology" then "Medical" else if [Discipline Type] = "Neurology" then "Medical" else null),
#"Added Custom" = Table.AddColumn(#"Added Conditional Column", "FTE equivalent", each if [Full time Weeks] > 1 then [Full time Weeks] else ((Duration.Days ([End Date] - [Start Date])+1)/7)*[Full time Weeks]),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Full time Weeks"}),
// Sorting by Medical Registration Number and Start Date is important here.
// Without applying sorting logic, sometimes gives you different result.
Sort_Logic = Table.Sort(#"Removed Columns1",{{"Medical Registration Number", Order.Ascending}, {"Start Date", Order.Ascending}}),
Group_Logic_1 = Table.Group(Sort_Logic, {"Medical Registration Number", "Discipline Type"}, {{"AllRows", (x) => x}}, GroupKind.Local),
//Assuming your data has continuous date period where a medical number is doing two different placements in the same discipline.
// If Date range is not continous, sometimes gives you different result.
Transform_Logic = Table.TransformColumns(Group_Logic_1, {{"AllRows", (x) => Table.TransformColumns(x, {{"Start Date", (y) => List.Min(x[Start Date])},{"End Date", (y) => List.Max(x[End Date])}})}}),
Combine_Logic = Table.Combine(Transform_Logic[AllRows]),
Group_Logic_2 = Table.Group(Combine_Logic, {"Medical Registration Number", "Discipline Type", "Start Date", "End Date"}, {{"FTE equivalent", each List.Sum([FTE equivalent]), type nullable number}}),
Output = Table.TransformColumnTypes(Group_Logic_2,{{"Medical Registration Number", type text}, {"Discipline Type", type text}, {"Start Date", type date}, {"End Date", type date}, {"FTE equivalent", type number}})
in
Output
Note: Kindly share excel file if the above code does not work.
Regards,
Balakrishnan_J
Did I answer your question? If Yes,
Then mark my post as a solution and click on the Thumbs Up 👍 to give Kudos.
Remember: You can mark multiple answers as a solution...
Thanks so much for the amended code, it works perfectly. However, I lose some of the columns I need, such as Discipline Group, State, MMM. I tried to put this in my original post but it told me weirdly I was over the character limit.
My Data set actually looks like this
Medical Registration Number | Discipline Type | Facility Type | Start Date | End Date | State | MMM | Discipline Group | FTE equivalent |
MED001 | General Medicine | Hospital | 31/10/2022 | 15/01/2023 | WA | MMM3 | Medical | 11 |
MED002 | Emergency Medicine | Hospital | 31/10/2022 | 15/01/2023 | WA | MMM3 | Emergency | 11 |
MED002 | General Practice | General Practice | 16/01/2023 | 30/12/2023 | WA | MMM3 | Community | 49.85714 |
MED003 | General Practice | General Practice | 2/01/2023 | 30/12/2023 | WA | MMM5 | Community | 25.92857 |
MED003 | Emergency Medicine | Hospital | 2/01/2023 | 30/12/2023 | WA | MMM4 | Emergency | 25.92857 |
MED004 | Emergency Medicine | Hospital | 31/10/2022 | 15/01/2023 | WA | MMM3 | Emergency | 11 |
MED005 | Obstetrics and Gynaecology | Hospital | 23/02/2022 | 15/01/2023 | WA | MMM3 | Obstetrics & Gynaecology | 46.71429 |
MED005 | Obstetrics and Gynaecology | Hospital | 16/01/2023 | 15/01/2024 | WA | MMM3 | Obstetrics & Gynaecology | 52.14286 |
MED006 | Emergency Medicine | Hospital | 2/11/2023 | 4/02/2024 | NT | MMM6 | Emergency | 12 |
MED007 | General Practice | General Practice | 17/10/2022 | 7/01/2023 | WA | MMM3 | Community | 11.85714 |
MED008 | General Medicine | Hospital | 31/10/2022 | 15/01/2023 | WA | MMM2 | Medical | 11 |
MED009 | Emergency Medicine | Hospital | 16/01/2023 | 26/03/2023 | WA | MMM3 | Emergency | 10 |
MED009 | Clinical Service Improvement | Hospital | 27/03/2023 | 11/06/2023 | WA | MMM3 | Miscellaneous | 11 |
MED009 | Obstetrics and Gynaecology | Hospital | 12/06/2023 | 20/08/2023 | WA | MMM3 | Obstetrics & Gynaecology | 10 |
MED009 | General Medicine | Hospital | 21/08/2023 | 29/10/2023 | WA | MMM3 | Medical | 10 |
MED009 | General Medicine | Hospital | 30/10/2023 | 14/01/2024 | WA | MMM3 | Medical | 11 |
MED010 | General Surgery | Hospital | 6/02/2023 | 11/06/2023 | WA | MMM2 | Surgical | 18 |
MED010 | Emergency Medicine | Hospital | 12/06/2023 | 21/08/2023 | WA | MMM3 | Emergency | 10.14286 |
MED010 | Emergency Medicine | Hospital | 21/08/2023 | 8/01/2024 | WA | MMM2 | Emergency | 10.07143 |
MED010 | Emergency Medicine | Hospital | 21/08/2023 | 8/01/2024 | WA | MMM3 | Emergency | 10.07143 |
1. How do I get the extra columns carried across in the grouping?
2. I note that MED010 has two placements of Emergency Medicine at the same time across two different MMM (must be a border region). This will cause issues for grouping. If it's a one off in the data set I can just change it manually, but how do I check the entire dataset for this kind of occurrence before I group?
Thanks for your help
can't delete this reply
@JoMont You can achieve the result bu using Group by Rows as well. Refer below code
et
Source = Table,
Result = let
varGroupedRows =
Table.Group(Source, {"Medical Registration Number", "Discipline Type"}, {{"Start Date", each List.Min([Start Date]), type nullable date}, {"End Date", each List.Max([End Date]), type nullable date}, {"FTE Equivalent", each List.Sum([FTE equivalent]), type nullable number}}),
varAddIndex = Table.AddIndexColumn(varGroupedRows,"Index",1,1),
varResult = Table.ReorderColumns(varAddIndex,{"Index"} & Table.ColumnNames(varGroupedRows))
in
varResult
in
Result
Below screenshot
Regards,
sanalytics
Thank you for this assistance. I have already tried this approach, and it grouped non-consecutive occurences as well so I had to abandon this process.
For instance, where the trainee had done a paediactric rotation 01/02/23 - 21/02/03, a GP rotation 22/02/23 - 17/03/23 and then another paediactric rotation 18/03/23 - 19/04/23, which should count as 3 rotations. Your approach results in 2 rotations with a paediatric rotation of 01/02/23 - 19/04/23 and a GP rotation 22/02/23. This also then confuses the calculation of how many FTE the paediatric rotation is, as it's calculated based on the date duration.
I note that I didn't add this info into the context of my question. I had to draft my question several times as it kept telling me I was over the character limit.
just in case you have overlapping periods...
let
fx_same_group = (x, y, max_date) => x{1} = y{1} and x{2} = y{2} and (x{3} <= max_date + #duration(1, 0, 0, 0)),
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
sort = Table.Sort(Source, {"Medical Registration Number","Discipline Type", "Start Date"}),
rows = List.Buffer(Table.ToList(sort, (x) => x)),
gen = List.Generate(
() => [i = 0, group_num = 1, dmax = rows{0}{4}],
(x) => x[i] < List.Count(rows),
(x) => [
i = x[i] + 1,
group_num = x[group_num] + Number.From(not fx_same_group(rows{i}, rows{x[i]}, x[dmax])),
dmax = if group_num = x[group_num] then List.Max({x[dmax], rows{i}{4}}) else rows{i}{4}
],
(x) => rows{x[i]} & {x[group_num]}
),
tbl = Table.FromList(gen, (x) => x, Table.ColumnNames(Source) & {"group"}),
result = Table.Group(
tbl,
{"group", "Medical Registration Number","Discipline Type"},
{
{"Start Date", (x) => List.Min(x[Start Date])},
{"End Date", (x) => List.Max(x[End Date])},
{"FTE equivalent", (x) => List.Sum(x[FTE equivalent])}
}
)
in
result
Hi @JoMont ,
I understand you're trying to group rows in two scenarios:
Here's a working solution in Power Query that handles both cases reliably even on large datasets:
Steps in Power Query.
let
Source = YourTableNameHere, // replace with actual step/table
ChangedTypes = Table.TransformColumnTypes(Source, {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number}}),
Sorted = Table.Sort(ChangedTypes, {{"Medical Registration Number", Order.Ascending},{"Discipline Type", Order.Ascending},{"Start Date", Order.Ascending}}),
AddIndex = Table.AddIndexColumn(Sorted, "IndexHelper", 0, 1),
AddPrevious = Table.AddColumn(AddIndex, "Previous", each try AddIndex{[IndexHelper]-1} otherwise null),
FlagNewGroup = Table.AddColumn(AddPrevious, "NewGroup", each
let
curr = [Medical Registration Number],
prev = [Previous][Medical Registration Number],
sameDisc = [Discipline Type] = [Previous][Discipline Type],
isContiguous = [Start Date] <= Date.AddDays([Previous][End Date], 1),
samePerson = curr = prev
in
if samePerson and sameDisc and isContiguous then 0 else 1),
RemovePrev = Table.RemoveColumns(FlagNewGroup, {"Previous"}),
AddRunningGroup = Table.AddIndexColumn(RemovePrev, "RowIndex", 0, 1),
AddGroupID = Table.AddColumn(AddRunningGroup, "GroupID", each List.Sum(List.FirstN(RemovePrev[NewGroup], [RowIndex]+1))),
Grouped = Table.Group(AddGroupID,{"Medical Registration Number", "Discipline Type", "GroupID"}, {{"Start Date", each List.Min([Start Date]), type date},
{"End Date", each List.Max([End Date]), type date},{"FTE equivalent", each List.Sum([FTE equivalent]), type number}}),
AddFinalIndex = Table.AddIndexColumn(Grouped, "Index", 1, 1)
in
AddFinalIndex
After all above steps follow bellow steps.
Let me know if you'd like to retain other columns happy to help adapt further.
Regards,
Akhil.
Thank you for the code. So far it throws this error:
I tried amending the changed type code to the below - to ensure there were no 'Null" types
ChangedTypes = Table.TransformColumnTypes(#"Removed Columns1", {{"Start Date", type date},{"End Date", type date},{"FTE equivalent", type number},{"Medical Registration Number",type text},{"Discipline Type",type text}}),
But I still can't figure out where the type Null is coming from? Grateful for your help.
FYI, I am doing this for work, and live in Australia. So it might be a little while until I get back to you while I enjoy my weekend!