Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
I was kindly supplied a query to calculate running totals for call data, original post here. This works perfectly for the data shown in the example however this is not scaling and I suspect thats because of the recursive method used.
Are there any more efficent ways to acheive what I want to do? The current query shown below, for clarity the data is 1,760,813 rows at the step marked in red;
let
Source = Sql.Databases("REDACTED"),
REDACTED = Source{[Name="REDACTED"]}[Data],
REDACTED = REDACTED{[Schema="dbo",Item="REDACTED"]}[Data],
#"Filtered CallDate" = Table.SelectRows(REDACTED, each Date.IsInPreviousNDays([CallDate], 365)),
#"Filtered AccountNumber" = Table.SelectRows(#"Filtered CallDate", each [AccountNumber] <> "REDACTED" and [AccountNumber] <> "INTERNAL"),
#"Added IHT" = Table.AddColumn(#"Filtered AccountNumber", "IHT", each if [AccountDuration] <> null then [AccountDuration] else if [IncidentDuration] <> null then [IncidentDuration] else if [ClassificationDuration] <> null then [ClassificationDuration] else 0),
#"Removed Duplicates" = Table.Distinct(#"Added IHT", {"CallDate", "OpenAccount", "CreateStaffDetailId"}),
#"Extracted Text After Delimiter" = Table.TransformColumns(#"Removed Duplicates", {{"RegDescription", each Text.AfterDelimiter(_, " - ", {0, RelativePosition.FromEnd}), type text}}),
#"Changed Type" = Table.TransformColumnTypes(#"Extracted Text After Delimiter",{{"CallDate", type date}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"IncidentdetailId", "AccountNumber", "CallDate", "IHT"}),
fnGetAllCallsWithinSevenDays = (tbl as table, accountNumber as any, curDate as date, curIncidentdetailIds as list) as table =>
let
// filter out all rows having max distance of 7 days of the same AccountNumber
FilteredTable = Table.SelectRows(
tbl,
each
[AccountNumber] = accountNumber
and Number.Abs(Duration.Days([CallDate] - curDate)) <= 7
and not List.Contains(curIncidentdetailIds, [IncidentdetailId])
),
Result =
if Table.IsEmpty(FilteredTable) then
FilteredTable // return from recursion
else
let
// a list of all visited incidents
newIncidentdetailIds = List.Combine({curIncidentdetailIds, Table.Column(FilteredTable, "IncidentdetailId")}),
// recursive all incidents before minimum CallDate of already found incidents
checkBeforeCurrentDate = @fnGetAllCallsWithinSevenDays(
tbl,
accountNumber,
Table.Min(FilteredTable, "CallDate")[CallDate],
newIncidentdetailIds
),
// recursive all incidents after maximum CallDate of already found incidents
checkAfterCurrentDate = @fnGetAllCallsWithinSevenDays(
tbl,
accountNumber,
Table.Max(FilteredTable, "CallDate")[CallDate],
newIncidentdetailIds
),
// combine all tables coming from recursion
combinedResult = Table.Combine({checkBeforeCurrentDate, FilteredTable, checkAfterCurrentDate})
in
combinedResult
in
Result,
// unique records
fnGetUniqueCallsWithinSevenDays = (tbl as table, accountNumber as any, curDate as date) as table =>
let
distinctTable = Table.Distinct(fnGetAllCallsWithinSevenDays(tbl, accountNumber, curDate, {}), "IncidentdetailId"),
indexedTable = Table.AddIndexColumn(distinctTable, "Index", 1)
in
indexedTable,
// unique records within seven days
runningTotalsTable = Table.AddColumn(#"Removed Other Columns", "runningTotalsTable", each fnGetUniqueCallsWithinSevenDays(#"Changed Type", [AccountNumber], [CallDate])),
// aggregations
#"Added TotalCallsInRepeatPeriod" = Table.AddColumn(runningTotalsTable, "TotalCallsInRepeatPeriod", each Table.RowCount([runningTotalsTable]), Int64.Type),
#"Added CallNumber" = Table.AddColumn(#"Added TotalCallsInRepeatPeriod", "CallNumber", (rootRecord) => Table.SelectRows(rootRecord[runningTotalsTable], (childRecord) => rootRecord[IncidentdetailId] = childRecord[IncidentdetailId]){0}[Index]),
#"Added TotalIHTInRepeatPeriod" = Table.AddColumn(#"Added CallNumber", "TotalIHTInRepeatPeriod", each List.Sum(Table.Column([runningTotalsTable], "IHT"))),
#"Added RunningTotalIHTInRepeatPeriod" = Table.AddColumn(#"Added TotalIHTInRepeatPeriod", "RunningTotalIHTInRepeatPeriod", (rootRecord) => List.Sum(Table.SelectRows(rootRecord[runningTotalsTable], (childRecord) => rootRecord[IncidentdetailId] >= childRecord[IncidentdetailId])[IHT])),
#"Added FirstIncidentDetailId" = Table.AddColumn(#"Added RunningTotalIHTInRepeatPeriod", "FirstIncidentDetailId", each List.Min(Table.Column([runningTotalsTable], "IncidentdetailId")))
in
#"Added FirstIncidentDetailId"
Current Data
| IncidentDetailId | AccountName | AccountNumber | AccountType | CallDate | CallTime | MajorComment | MinorComment | ThirdTier | AccountDuration |
| 33059568 | Example Company | 98765432 | Customer | 04/12/2018 | 16:20:26 | Payroll | RTI | FPS | 728 |
| 33925147 | Example Company | 98765432 | Customer | 03/05/2019 | 09:50:39 | Payroll | Installation and Activation | Install Program | 735 |
| 33928425 | Example Company | 98765432 | Customer | 03/05/2019 | 14:13:58 | Payroll | Installation and Activation | Install Program | 828 |
| 33928851 | Example Company | 98765432 | Customer | 03/05/2019 | 14:46:08 | Payroll | Installation and Activation | Install Program | 429 |
| 33929819 | Example Company | 98765432 | Customer | 03/05/2019 | 16:13:20 | Payroll | Installation and Activation | Install Program | 1255 |
| 33930022 | Example Company | 98765432 | Customer | 03/05/2019 | 16:29:46 | Payroll | Installation and Activation | Install Program | 561 |
| 33930533 | Example Company | 98765432 | Customer | 03/05/2019 | 17:40:37 | Payroll | Installation and Activation | Install Program | 1952 |
| 33935776 | Example Company | 98765432 | Customer | 04/05/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 |
| 33936491 | Example Company | 98765432 | Customer | 04/05/2019 | 16:58:43 | Payroll | Processing | Correcting a mistake | 3359 |
| 33942334 | Example Company | 98765432 | Customer | 05/05/2019 | 16:34:21 | Payroll | Processing | Correcting a mistake | 3767 |
| 33951852 | Example Company | 98765432 | Customer | 11/05/2019 | 13:32:33 | Payroll | Processing | Calculations | 266 |
| 33935776 | Example Company | 98765432 | Customer | 10/06/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 |
What I want to achieve in bold
| IncidentDetailId | AccountName | AccountNumber | AccountType | CallDate | CallTime | MajorComment | MinorComment | ThirdTier | AccountDuration | TotalCallsInRepeatPeriod | CallNumber | TotalIHTInRepeatPeriod | RunningTotalIHTInRepeatPeriod | FirstIncidentDetailId |
| 33059568 | Example Company | 98765432 | Customer | 04/12/2018 | 16:20:26 | Payroll | RTI | FPS | 728 | 1 | 1 | 728 | 728 | 33059568 |
| 33925147 | Example Company | 98765432 | Customer | 03/05/2019 | 09:50:39 | Payroll | Installation and Activation | Install Program | 735 | 10 | 1 | 13358 | 735 | 33925147 |
| 33928425 | Example Company | 98765432 | Customer | 03/05/2019 | 14:13:58 | Payroll | Installation and Activation | Install Program | 828 | 10 | 2 | 13358 | 1563 | 33925147 |
| 33928851 | Example Company | 98765432 | Customer | 03/05/2019 | 14:46:08 | Payroll | Installation and Activation | Install Program | 429 | 10 | 3 | 13358 | 1992 | 33925147 |
| 33929819 | Example Company | 98765432 | Customer | 03/05/2019 | 16:13:20 | Payroll | Installation and Activation | Install Program | 1255 | 10 | 4 | 13358 | 3247 | 33925147 |
| 33930022 | Example Company | 98765432 | Customer | 03/05/2019 | 16:29:46 | Payroll | Installation and Activation | Install Program | 561 | 10 | 5 | 13358 | 3808 | 33925147 |
| 33930533 | Example Company | 98765432 | Customer | 03/05/2019 | 17:40:37 | Payroll | Installation and Activation | Install Program | 1952 | 10 | 6 | 13358 | 5760 | 33925147 |
| 33935776 | Example Company | 98765432 | Customer | 04/05/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 | 10 | 7 | 13358 | 5966 | 33925147 |
| 33936491 | Example Company | 98765432 | Customer | 04/05/2019 | 16:58:43 | Payroll | Processing | Correcting a mistake | 3359 | 10 | 8 | 13358 | 9325 | 33925147 |
| 33942334 | Example Company | 98765432 | Customer | 05/05/2019 | 16:34:21 | Payroll | Processing | Correcting a mistake | 3767 | 10 | 9 | 13358 | 13092 | 33925147 |
| 33951852 | Example Company | 98765432 | Customer | 11/05/2019 | 13:32:33 | Payroll | Processing | Calculations | 266 | 10 | 10 | 13358 | 13358 | 33925147 |
| 33935776 | Example Company | 98765432 | Customer | 10/06/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 | 1 | 1 | 206 | 207 | 33935776 |
Solved! Go to Solution.
Hi @mark_carlisle,
I'm still thinking how to improve the performance. I've implemented another way without recursion, check it out and please test with your big dataset.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pdRLi9swEADgv2JyXoI0oxk9bmVpYQ+F0O1tyUGkJoT6EWynNP++kh27cjEFxweLkR/yx2hGHx87REGW2Oxedp9/+/Ja5NlrXV59dQ93rNFMCiGEr7e2q8u8CaFQewl7EDJ+JNmBcMAhPPh7UxdFiL59fwvjl8N7GDX0rz2uYTaM06+PL9FhgaTSaxy4FxQdNk6sI+HQzhxvVdv5ovDdpa4yX/3IPp26y69++vdpdmjqc+PL6EKKTjGCEclMtyfhyDUK6FmuVE6i61d/nmuG1EYuJFxJjIteQ3KDV7ET27wK7OjF1GstLHmt6X/9nJdjfkFs8kqgqR5UAkbo6/RfMAoBsAEMNuR4E5hYjl5KvUaYRS8hPu3VToV+09sSbAlGMCdg0iyWwKQ1rzuoEjA5AkdqBn7Pu9u1l371lT/ncfGwdH3PYwiCR51OdZZ5ScfKruovNd9+Mk7hTBcSdcrb9lKd45d10+QhndU581l5Can8mfcKmprKJEiLsHRoKUBUa5A0R6JyINcjNesRadPOR7HY+iQNrekkKVMlOgSH/0ulL063oUrbuM08bfMwTLpH+2wsQyn2greX4eMaZiAeJ9CAOR7/AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [IncidentId = _t, AccountName = _t, AccountNumber = _t, AccountType = _t, CallDate = _t, CallTime = _t, MajorComment = _t, MinorComment = _t, ThirdTier = _t, AccountDuration = _t, TotalCalls = _t, CallNumber = _t, TotalDuration = _t, SeqDuration = _t, FirstIncidentId = _t]),
#"Removed Columns" = Table.RemoveColumns(Source,{"AccountName", "AccountType", "MajorComment", "MinorComment", "ThirdTier", "CallTime", "TotalCalls", "CallNumber", "TotalDuration", "SeqDuration", "FirstIncidentId"}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Removed Columns", {{"CallDate", type date}}, "de-DE"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"AccountDuration", Int64.Type}}),
fnSummarizeIncidents = (tbl as table) as table =>
let
SortByCallDate = Table.Sort(tbl, {{"CallDate", Order.Ascending}}),
IndexColumn = Table.AddIndexColumn(SortByCallDate, "Index", 0, 1),
// a new serie starts if the count of days between 2 incidents of one account is longer than days
IncidentSerieStart = Table.AddColumn(IndexColumn, "IncidentSerieStart", each
if [Index] = 0 then
1
else if Duration.Days([CallDate] - tbl[CallDate]{[Index]-1}) >= 7 then
1
else
0
),
// sum all previous IncidentSerieStarts to have an ID for a serie
IncidentSerieNumber = Table.AddColumn(IncidentSerieStart, "IncidentSerieNumber",
(parentRecord) =>
List.Sum(
Table.Column(
Table.SelectRows(
IncidentSerieStart,
(childRecord) => parentRecord[Index] >= childRecord[Index]
),
"IncidentSerieStart"
)
)
),
// group by series
Groups = Table.Group(IncidentSerieNumber, {"IncidentSerieNumber"}, {{"IncidentSerie", each _}}),
// remove incident serie number
RemoveIncidentSerieNumber = Table.RemoveColumns(Groups,{"IncidentSerieNumber"}),
// add an inner index for every group
GroupInnerIndex = Table.TransformColumns(RemoveIncidentSerieNumber, {"IncidentSerie", each Table.AddIndexColumn(_, "SerieIndex", 1, 1)}),
// calculate a running total of AccountDuration
GroupInnerRunningTotalOfAccountDuration = Table.TransformColumns(GroupInnerIndex, {"IncidentSerie", (tbl) => Table.AddColumn(tbl, "Sequence Duration", (rec) =>
List.Sum(Table.Column(Table.SelectRows(tbl, each [SerieIndex] <= rec[SerieIndex]), "AccountDuration"))
)}),
// aggregations
TotalCalls = Table.AddColumn(GroupInnerRunningTotalOfAccountDuration, "Total Calls", each Table.RowCount([IncidentSerie]), Int64.Type),
TotalDuration = Table.AddColumn(TotalCalls, "Total Duration", each List.Sum(Table.Column([IncidentSerie], "AccountDuration"))),
FirstIncident = Table.AddColumn(TotalDuration, "First Incident", each List.Min(Table.Column([IncidentSerie], "IncidentId"))),
// expand group
ExpandTable = Table.ExpandTableColumn(
FirstIncident,
"IncidentSerie",
{"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration"}
)
in
ExpandTable,
Groups = Table.Group(#"Changed Type", {"AccountNumber"}, {{"PerAccount", each _}}),
RemoveAccountNumber = Table.RemoveColumns(Groups, "AccountNumber"),
GroupsSummarized = Table.TransformColumns(RemoveAccountNumber, {"PerAccount", fnSummarizeIncidents}),
ExpandTable = Table.ExpandTableColumn(GroupsSummarized, "PerAccount", {"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration", "Total Calls", "Total Duration", "First Incident"})
in
ExpandTable
Hi @mark_carlisle,
I've tried some optimizations but I've appended them to the old post: https://community.powerbi.com/t5/Power-Query/Dates-in-period-for-a-dynamic-period/m-p/714354/highlig...
Hi @mark_carlisle,
I'm still thinking how to improve the performance. I've implemented another way without recursion, check it out and please test with your big dataset.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pdRLi9swEADgv2JyXoI0oxk9bmVpYQ+F0O1tyUGkJoT6EWynNP++kh27cjEFxweLkR/yx2hGHx87REGW2Oxedp9/+/Ja5NlrXV59dQ93rNFMCiGEr7e2q8u8CaFQewl7EDJ+JNmBcMAhPPh7UxdFiL59fwvjl8N7GDX0rz2uYTaM06+PL9FhgaTSaxy4FxQdNk6sI+HQzhxvVdv5ovDdpa4yX/3IPp26y69++vdpdmjqc+PL6EKKTjGCEclMtyfhyDUK6FmuVE6i61d/nmuG1EYuJFxJjIteQ3KDV7ET27wK7OjF1GstLHmt6X/9nJdjfkFs8kqgqR5UAkbo6/RfMAoBsAEMNuR4E5hYjl5KvUaYRS8hPu3VToV+09sSbAlGMCdg0iyWwKQ1rzuoEjA5AkdqBn7Pu9u1l371lT/ncfGwdH3PYwiCR51OdZZ5ScfKruovNd9+Mk7hTBcSdcrb9lKd45d10+QhndU581l5Can8mfcKmprKJEiLsHRoKUBUa5A0R6JyINcjNesRadPOR7HY+iQNrekkKVMlOgSH/0ulL063oUrbuM08bfMwTLpH+2wsQyn2greX4eMaZiAeJ9CAOR7/AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [IncidentId = _t, AccountName = _t, AccountNumber = _t, AccountType = _t, CallDate = _t, CallTime = _t, MajorComment = _t, MinorComment = _t, ThirdTier = _t, AccountDuration = _t, TotalCalls = _t, CallNumber = _t, TotalDuration = _t, SeqDuration = _t, FirstIncidentId = _t]),
#"Removed Columns" = Table.RemoveColumns(Source,{"AccountName", "AccountType", "MajorComment", "MinorComment", "ThirdTier", "CallTime", "TotalCalls", "CallNumber", "TotalDuration", "SeqDuration", "FirstIncidentId"}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Removed Columns", {{"CallDate", type date}}, "de-DE"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"AccountDuration", Int64.Type}}),
fnSummarizeIncidents = (tbl as table) as table =>
let
SortByCallDate = Table.Sort(tbl, {{"CallDate", Order.Ascending}}),
IndexColumn = Table.AddIndexColumn(SortByCallDate, "Index", 0, 1),
// a new serie starts if the count of days between 2 incidents of one account is longer than days
IncidentSerieStart = Table.AddColumn(IndexColumn, "IncidentSerieStart", each
if [Index] = 0 then
1
else if Duration.Days([CallDate] - tbl[CallDate]{[Index]-1}) >= 7 then
1
else
0
),
// sum all previous IncidentSerieStarts to have an ID for a serie
IncidentSerieNumber = Table.AddColumn(IncidentSerieStart, "IncidentSerieNumber",
(parentRecord) =>
List.Sum(
Table.Column(
Table.SelectRows(
IncidentSerieStart,
(childRecord) => parentRecord[Index] >= childRecord[Index]
),
"IncidentSerieStart"
)
)
),
// group by series
Groups = Table.Group(IncidentSerieNumber, {"IncidentSerieNumber"}, {{"IncidentSerie", each _}}),
// remove incident serie number
RemoveIncidentSerieNumber = Table.RemoveColumns(Groups,{"IncidentSerieNumber"}),
// add an inner index for every group
GroupInnerIndex = Table.TransformColumns(RemoveIncidentSerieNumber, {"IncidentSerie", each Table.AddIndexColumn(_, "SerieIndex", 1, 1)}),
// calculate a running total of AccountDuration
GroupInnerRunningTotalOfAccountDuration = Table.TransformColumns(GroupInnerIndex, {"IncidentSerie", (tbl) => Table.AddColumn(tbl, "Sequence Duration", (rec) =>
List.Sum(Table.Column(Table.SelectRows(tbl, each [SerieIndex] <= rec[SerieIndex]), "AccountDuration"))
)}),
// aggregations
TotalCalls = Table.AddColumn(GroupInnerRunningTotalOfAccountDuration, "Total Calls", each Table.RowCount([IncidentSerie]), Int64.Type),
TotalDuration = Table.AddColumn(TotalCalls, "Total Duration", each List.Sum(Table.Column([IncidentSerie], "AccountDuration"))),
FirstIncident = Table.AddColumn(TotalDuration, "First Incident", each List.Min(Table.Column([IncidentSerie], "IncidentId"))),
// expand group
ExpandTable = Table.ExpandTableColumn(
FirstIncident,
"IncidentSerie",
{"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration"}
)
in
ExpandTable,
Groups = Table.Group(#"Changed Type", {"AccountNumber"}, {{"PerAccount", each _}}),
RemoveAccountNumber = Table.RemoveColumns(Groups, "AccountNumber"),
GroupsSummarized = Table.TransformColumns(RemoveAccountNumber, {"PerAccount", fnSummarizeIncidents}),
ExpandTable = Table.ExpandTableColumn(GroupsSummarized, "PerAccount", {"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration", "Total Calls", "Total Duration", "First Incident"})
in
ExpandTable
Thanks for all this. I've optimised by initial query to the SQL DB to take advantage of folding right up the point of the function now so that's a bit faster.
This version of the function, with some additional steps after it to complete the query, is managing to complete for a date range of 01/05/2019 - 17/06/2019 (approx 185,000 rows) in around 15 minutes on my laptop. The recursive version was not able to do a single day so this is a definite improvement.
I'm now working on add the whole thing to PBI Dataflows to automate the whole process.
Thanks again for the help.
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!