Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi,
I have a table that I need to restructure based on row values of the table. I found a long convoluted way to do it in Power Query but it is causing long refreshes and sometimes timing out (uses sort, buffer, merges, multiple index columns...)
I was wondering if someone can help me find a more elegant, memory efficient way to execute this.
My table looks like this:
people_id | el_actual_date | program_status |
clientA | 1/1/2020 | M |
clientA | 2/1/2020 | I |
clientA | 3/1/2020 | |
clientB | 1/1/2020 | M |
clientB | 1/1/2021 | |
clientB | 2/1/2021 | M |
clientB | 1/1/2022 | I |
clientC | 1/1/2020 | O |
I want to group by client and give an episode-of-care(EOC) start and end date. However, if a client has nothing for 90+ days, then it starts a new grouping. Each group would get what number EOC it is, and it's own start and end date. I do not need the dates in between the start and end date of each EOC. And finally, for program status, I only want to include a program status if it is M or O, at any point in the EOC (it will never be both).
The desired results would be:
people_id | episode_of_care_num | eoc_start_date | eoc_end_date | M_or_O_status |
clientA | 1 | 1/1/2020 | 3/1/2020 | M |
clientB | 1 | 1/1/2020 | 1/1/2020 | M |
clientB | 2 | 1/1/2021 | 2/1/2021 | M |
clientB | 3 | 1/1/2022 | 1/1/2022 | |
clientC | 1 | 1/1/2020 | 1/1/2020 | O |
I can do it in Power Query or DAX (or both) but want to do it in the most efficient way possible.
TIA
Solved! Go to Solution.
It's not super simple, but List.Generate might be faster than what you've tried.
Example:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSs7JTM0rcVTSUTLUN9Q3MjAyADJ9lWJ1kKWMEFKeaFLGCCkFJCkn3AYiSRli0WWEkMKhywjNGc6odvkrxcYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [people_id = _t, el_actual_date = _t, program_status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"people_id", type text}, {"el_actual_date", type date}, {"program_status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"people_id"}, {{"SubTable",
(A) => Table.Skip(Table.FromRecords(
List.Generate(
() => [t = [people_id = null, el_actual_date = null, program_status = null],
i = 0, j = 0, d = #date(1900,1,1)],
each [i] <= Table.RowCount(A),
each [
i = [i] + 1,
d = A{[i]}[el_actual_date],
j = (if d < Date.AddDays([d], 90) then [j] else [j] + 1),
t = A{[i]}
],
each [t] & [episode_of_care_num = [j]]
)))
}}),
#"Expanded SubTable" = Table.ExpandTableColumn(#"Grouped Rows", "SubTable", {"el_actual_date", "program_status", "episode_of_care_num"}, {"el_actual_date", "program_status", "episode_of_care_num"}),
#"Remove Non-MO" = Table.TransformColumns(#"Expanded SubTable",{{"program_status", each if _ = "M" or _ = "O" then _ else "", type text}}),
#"Grouped Rows1" = Table.Group(#"Remove Non-MO", {"people_id", "episode_of_care_num"}, {{"eoc_start_date", each List.Min([el_actual_date]), type date}, {"eoc_end_date", each List.Max([el_actual_date]), type date}, {"M_or_O_status", each List.Max([program_status]), type text}})
in
#"Grouped Rows1"
It's not super simple, but List.Generate might be faster than what you've tried.
Example:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSs7JTM0rcVTSUTLUN9Q3MjAyADJ9lWJ1kKWMEFKeaFLGCCkFJCkn3AYiSRli0WWEkMKhywjNGc6odvkrxcYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [people_id = _t, el_actual_date = _t, program_status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"people_id", type text}, {"el_actual_date", type date}, {"program_status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"people_id"}, {{"SubTable",
(A) => Table.Skip(Table.FromRecords(
List.Generate(
() => [t = [people_id = null, el_actual_date = null, program_status = null],
i = 0, j = 0, d = #date(1900,1,1)],
each [i] <= Table.RowCount(A),
each [
i = [i] + 1,
d = A{[i]}[el_actual_date],
j = (if d < Date.AddDays([d], 90) then [j] else [j] + 1),
t = A{[i]}
],
each [t] & [episode_of_care_num = [j]]
)))
}}),
#"Expanded SubTable" = Table.ExpandTableColumn(#"Grouped Rows", "SubTable", {"el_actual_date", "program_status", "episode_of_care_num"}, {"el_actual_date", "program_status", "episode_of_care_num"}),
#"Remove Non-MO" = Table.TransformColumns(#"Expanded SubTable",{{"program_status", each if _ = "M" or _ = "O" then _ else "", type text}}),
#"Grouped Rows1" = Table.Group(#"Remove Non-MO", {"people_id", "episode_of_care_num"}, {{"eoc_start_date", each List.Min([el_actual_date]), type date}, {"eoc_end_date", each List.Max([el_actual_date]), type date}, {"M_or_O_status", each List.Max([program_status]), type text}})
in
#"Grouped Rows1"
Hi @AlexisOlson ,
The parameters of this dataset have changed and I am stuck again!
I tried adjusting your code but I do not have a good grasp of List.Generate; I would appreciate your help.
I am still trying to list people by episode of care number, with a start date and end date.
I no longer need program_status.
I have 3 new columns:
[keepforLOS] which has each row categorized as either "session", "intake" or "discharge".
[source] which is either "Accumed" or "Netsmart"
[providerID]
if, for a particular client, the row is [source] = "Accumed" and the row directly before was [keepforLOS] "Intake" that is the start date of a new Episode of Care (EOC).
any sessions following that date for that client are part of the same EOC until there is a new row [keepforLOS]="intake".
if the [source] = "Netsmart" and there are no rows for that client previously, then that is a new EOC. the EOC continues until there is a "discharge", which ends the EOC and the next row would start a new EOC.
1 exception (to complicate matters even more!!) If the session row following Discharge is the same provider as the last Session before Discharge, AND it has been less than 90 days between the 2 sessions, then there is no new episode of care.
ex:
client | date | keepforLOS | source | Provider
A | 1/1/2015| session | Accumed | A
A | 2/1/2015 | intake | Accumed | B
A | 3/1/2015 | session | Accumed | B
A | 1/1/2020 | session | Netsmart | B
A | 2/1/2020 | discharge | Netsmart | B
A | 3/1/2020 | session | Netsmart | C
B | 1/1/2020| session | Netsmart | A
B| 2/1/2020 | discharge | Netsmart | A
B | 3/1/2020 | session | Netsmart | B
B | 4/1/2020 | sesion | Netsmart | C
C | 1/1/2020 | session | Netsmart | A
C | 2/1/2020 | discharge | Netsmart | A
C | 2/15/2020 | session | Netsmart | A
expected results:
Client | EOC# | start_date | end_date
A | 1 | 2/1/2015 | 1/1/2020
A | 2 | 3/1/2020 | 3/1/2020
B | 1 | 1/1/2020 | 1/1/2020
B | 2 | 3/1/2020 | 4/1/2020
C | 1 | 1/1/2020 | 2/15/2020
Can your code be adjusted for these new parameters?
TIA!
EF
@AlexisOlson it works!!
Thank you so much!
I didn't even mention but I had lost a lot of my work so you saved me hours of work AND gave me a much better solution!
Check out the September 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
114 | |
94 | |
90 | |
35 | |
35 |
User | Count |
---|---|
153 | |
99 | |
82 | |
63 | |
54 |