Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! It's time to submit your entry. Live now!
I was kindly supplied a query to calculate running totals for call data, original post here. This works perfectly for the data shown in the example however this is not scaling and I suspect thats because of the recursive method used.
Are there any more efficent ways to acheive what I want to do? The current query shown below, for clarity the data is 1,760,813 rows at the step marked in red;
let
Source = Sql.Databases("REDACTED"),
REDACTED = Source{[Name="REDACTED"]}[Data],
REDACTED = REDACTED{[Schema="dbo",Item="REDACTED"]}[Data],
#"Filtered CallDate" = Table.SelectRows(REDACTED, each Date.IsInPreviousNDays([CallDate], 365)),
#"Filtered AccountNumber" = Table.SelectRows(#"Filtered CallDate", each [AccountNumber] <> "REDACTED" and [AccountNumber] <> "INTERNAL"),
#"Added IHT" = Table.AddColumn(#"Filtered AccountNumber", "IHT", each if [AccountDuration] <> null then [AccountDuration] else if [IncidentDuration] <> null then [IncidentDuration] else if [ClassificationDuration] <> null then [ClassificationDuration] else 0),
#"Removed Duplicates" = Table.Distinct(#"Added IHT", {"CallDate", "OpenAccount", "CreateStaffDetailId"}),
#"Extracted Text After Delimiter" = Table.TransformColumns(#"Removed Duplicates", {{"RegDescription", each Text.AfterDelimiter(_, " - ", {0, RelativePosition.FromEnd}), type text}}),
#"Changed Type" = Table.TransformColumnTypes(#"Extracted Text After Delimiter",{{"CallDate", type date}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"IncidentdetailId", "AccountNumber", "CallDate", "IHT"}),
fnGetAllCallsWithinSevenDays = (tbl as table, accountNumber as any, curDate as date, curIncidentdetailIds as list) as table =>
let
// filter out all rows having max distance of 7 days of the same AccountNumber
FilteredTable = Table.SelectRows(
tbl,
each
[AccountNumber] = accountNumber
and Number.Abs(Duration.Days([CallDate] - curDate)) <= 7
and not List.Contains(curIncidentdetailIds, [IncidentdetailId])
),
Result =
if Table.IsEmpty(FilteredTable) then
FilteredTable // return from recursion
else
let
// a list of all visited incidents
newIncidentdetailIds = List.Combine({curIncidentdetailIds, Table.Column(FilteredTable, "IncidentdetailId")}),
// recursive all incidents before minimum CallDate of already found incidents
checkBeforeCurrentDate = @fnGetAllCallsWithinSevenDays(
tbl,
accountNumber,
Table.Min(FilteredTable, "CallDate")[CallDate],
newIncidentdetailIds
),
// recursive all incidents after maximum CallDate of already found incidents
checkAfterCurrentDate = @fnGetAllCallsWithinSevenDays(
tbl,
accountNumber,
Table.Max(FilteredTable, "CallDate")[CallDate],
newIncidentdetailIds
),
// combine all tables coming from recursion
combinedResult = Table.Combine({checkBeforeCurrentDate, FilteredTable, checkAfterCurrentDate})
in
combinedResult
in
Result,
// unique records
fnGetUniqueCallsWithinSevenDays = (tbl as table, accountNumber as any, curDate as date) as table =>
let
distinctTable = Table.Distinct(fnGetAllCallsWithinSevenDays(tbl, accountNumber, curDate, {}), "IncidentdetailId"),
indexedTable = Table.AddIndexColumn(distinctTable, "Index", 1)
in
indexedTable,
// unique records within seven days
runningTotalsTable = Table.AddColumn(#"Removed Other Columns", "runningTotalsTable", each fnGetUniqueCallsWithinSevenDays(#"Changed Type", [AccountNumber], [CallDate])),
// aggregations
#"Added TotalCallsInRepeatPeriod" = Table.AddColumn(runningTotalsTable, "TotalCallsInRepeatPeriod", each Table.RowCount([runningTotalsTable]), Int64.Type),
#"Added CallNumber" = Table.AddColumn(#"Added TotalCallsInRepeatPeriod", "CallNumber", (rootRecord) => Table.SelectRows(rootRecord[runningTotalsTable], (childRecord) => rootRecord[IncidentdetailId] = childRecord[IncidentdetailId]){0}[Index]),
#"Added TotalIHTInRepeatPeriod" = Table.AddColumn(#"Added CallNumber", "TotalIHTInRepeatPeriod", each List.Sum(Table.Column([runningTotalsTable], "IHT"))),
#"Added RunningTotalIHTInRepeatPeriod" = Table.AddColumn(#"Added TotalIHTInRepeatPeriod", "RunningTotalIHTInRepeatPeriod", (rootRecord) => List.Sum(Table.SelectRows(rootRecord[runningTotalsTable], (childRecord) => rootRecord[IncidentdetailId] >= childRecord[IncidentdetailId])[IHT])),
#"Added FirstIncidentDetailId" = Table.AddColumn(#"Added RunningTotalIHTInRepeatPeriod", "FirstIncidentDetailId", each List.Min(Table.Column([runningTotalsTable], "IncidentdetailId")))
in
#"Added FirstIncidentDetailId"
Current Data
| IncidentDetailId | AccountName | AccountNumber | AccountType | CallDate | CallTime | MajorComment | MinorComment | ThirdTier | AccountDuration |
| 33059568 | Example Company | 98765432 | Customer | 04/12/2018 | 16:20:26 | Payroll | RTI | FPS | 728 |
| 33925147 | Example Company | 98765432 | Customer | 03/05/2019 | 09:50:39 | Payroll | Installation and Activation | Install Program | 735 |
| 33928425 | Example Company | 98765432 | Customer | 03/05/2019 | 14:13:58 | Payroll | Installation and Activation | Install Program | 828 |
| 33928851 | Example Company | 98765432 | Customer | 03/05/2019 | 14:46:08 | Payroll | Installation and Activation | Install Program | 429 |
| 33929819 | Example Company | 98765432 | Customer | 03/05/2019 | 16:13:20 | Payroll | Installation and Activation | Install Program | 1255 |
| 33930022 | Example Company | 98765432 | Customer | 03/05/2019 | 16:29:46 | Payroll | Installation and Activation | Install Program | 561 |
| 33930533 | Example Company | 98765432 | Customer | 03/05/2019 | 17:40:37 | Payroll | Installation and Activation | Install Program | 1952 |
| 33935776 | Example Company | 98765432 | Customer | 04/05/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 |
| 33936491 | Example Company | 98765432 | Customer | 04/05/2019 | 16:58:43 | Payroll | Processing | Correcting a mistake | 3359 |
| 33942334 | Example Company | 98765432 | Customer | 05/05/2019 | 16:34:21 | Payroll | Processing | Correcting a mistake | 3767 |
| 33951852 | Example Company | 98765432 | Customer | 11/05/2019 | 13:32:33 | Payroll | Processing | Calculations | 266 |
| 33935776 | Example Company | 98765432 | Customer | 10/06/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 |
What I want to achieve in bold
| IncidentDetailId | AccountName | AccountNumber | AccountType | CallDate | CallTime | MajorComment | MinorComment | ThirdTier | AccountDuration | TotalCallsInRepeatPeriod | CallNumber | TotalIHTInRepeatPeriod | RunningTotalIHTInRepeatPeriod | FirstIncidentDetailId |
| 33059568 | Example Company | 98765432 | Customer | 04/12/2018 | 16:20:26 | Payroll | RTI | FPS | 728 | 1 | 1 | 728 | 728 | 33059568 |
| 33925147 | Example Company | 98765432 | Customer | 03/05/2019 | 09:50:39 | Payroll | Installation and Activation | Install Program | 735 | 10 | 1 | 13358 | 735 | 33925147 |
| 33928425 | Example Company | 98765432 | Customer | 03/05/2019 | 14:13:58 | Payroll | Installation and Activation | Install Program | 828 | 10 | 2 | 13358 | 1563 | 33925147 |
| 33928851 | Example Company | 98765432 | Customer | 03/05/2019 | 14:46:08 | Payroll | Installation and Activation | Install Program | 429 | 10 | 3 | 13358 | 1992 | 33925147 |
| 33929819 | Example Company | 98765432 | Customer | 03/05/2019 | 16:13:20 | Payroll | Installation and Activation | Install Program | 1255 | 10 | 4 | 13358 | 3247 | 33925147 |
| 33930022 | Example Company | 98765432 | Customer | 03/05/2019 | 16:29:46 | Payroll | Installation and Activation | Install Program | 561 | 10 | 5 | 13358 | 3808 | 33925147 |
| 33930533 | Example Company | 98765432 | Customer | 03/05/2019 | 17:40:37 | Payroll | Installation and Activation | Install Program | 1952 | 10 | 6 | 13358 | 5760 | 33925147 |
| 33935776 | Example Company | 98765432 | Customer | 04/05/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 | 10 | 7 | 13358 | 5966 | 33925147 |
| 33936491 | Example Company | 98765432 | Customer | 04/05/2019 | 16:58:43 | Payroll | Processing | Correcting a mistake | 3359 | 10 | 8 | 13358 | 9325 | 33925147 |
| 33942334 | Example Company | 98765432 | Customer | 05/05/2019 | 16:34:21 | Payroll | Processing | Correcting a mistake | 3767 | 10 | 9 | 13358 | 13092 | 33925147 |
| 33951852 | Example Company | 98765432 | Customer | 11/05/2019 | 13:32:33 | Payroll | Processing | Calculations | 266 | 10 | 10 | 13358 | 13358 | 33925147 |
| 33935776 | Example Company | 98765432 | Customer | 10/06/2019 | 15:52:54 | Payroll | Setup and Manage | Employee | 206 | 1 | 1 | 206 | 207 | 33935776 |
Solved! Go to Solution.
Hi @mark_carlisle,
I'm still thinking how to improve the performance. I've implemented another way without recursion, check it out and please test with your big dataset.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pdRLi9swEADgv2JyXoI0oxk9bmVpYQ+F0O1tyUGkJoT6EWynNP++kh27cjEFxweLkR/yx2hGHx87REGW2Oxedp9/+/Ja5NlrXV59dQ93rNFMCiGEr7e2q8u8CaFQewl7EDJ+JNmBcMAhPPh7UxdFiL59fwvjl8N7GDX0rz2uYTaM06+PL9FhgaTSaxy4FxQdNk6sI+HQzhxvVdv5ovDdpa4yX/3IPp26y69++vdpdmjqc+PL6EKKTjGCEclMtyfhyDUK6FmuVE6i61d/nmuG1EYuJFxJjIteQ3KDV7ET27wK7OjF1GstLHmt6X/9nJdjfkFs8kqgqR5UAkbo6/RfMAoBsAEMNuR4E5hYjl5KvUaYRS8hPu3VToV+09sSbAlGMCdg0iyWwKQ1rzuoEjA5AkdqBn7Pu9u1l371lT/ncfGwdH3PYwiCR51OdZZ5ScfKruovNd9+Mk7hTBcSdcrb9lKd45d10+QhndU581l5Can8mfcKmprKJEiLsHRoKUBUa5A0R6JyINcjNesRadPOR7HY+iQNrekkKVMlOgSH/0ulL063oUrbuM08bfMwTLpH+2wsQyn2greX4eMaZiAeJ9CAOR7/AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [IncidentId = _t, AccountName = _t, AccountNumber = _t, AccountType = _t, CallDate = _t, CallTime = _t, MajorComment = _t, MinorComment = _t, ThirdTier = _t, AccountDuration = _t, TotalCalls = _t, CallNumber = _t, TotalDuration = _t, SeqDuration = _t, FirstIncidentId = _t]),
#"Removed Columns" = Table.RemoveColumns(Source,{"AccountName", "AccountType", "MajorComment", "MinorComment", "ThirdTier", "CallTime", "TotalCalls", "CallNumber", "TotalDuration", "SeqDuration", "FirstIncidentId"}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Removed Columns", {{"CallDate", type date}}, "de-DE"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"AccountDuration", Int64.Type}}),
fnSummarizeIncidents = (tbl as table) as table =>
let
SortByCallDate = Table.Sort(tbl, {{"CallDate", Order.Ascending}}),
IndexColumn = Table.AddIndexColumn(SortByCallDate, "Index", 0, 1),
// a new serie starts if the count of days between 2 incidents of one account is longer than days
IncidentSerieStart = Table.AddColumn(IndexColumn, "IncidentSerieStart", each
if [Index] = 0 then
1
else if Duration.Days([CallDate] - tbl[CallDate]{[Index]-1}) >= 7 then
1
else
0
),
// sum all previous IncidentSerieStarts to have an ID for a serie
IncidentSerieNumber = Table.AddColumn(IncidentSerieStart, "IncidentSerieNumber",
(parentRecord) =>
List.Sum(
Table.Column(
Table.SelectRows(
IncidentSerieStart,
(childRecord) => parentRecord[Index] >= childRecord[Index]
),
"IncidentSerieStart"
)
)
),
// group by series
Groups = Table.Group(IncidentSerieNumber, {"IncidentSerieNumber"}, {{"IncidentSerie", each _}}),
// remove incident serie number
RemoveIncidentSerieNumber = Table.RemoveColumns(Groups,{"IncidentSerieNumber"}),
// add an inner index for every group
GroupInnerIndex = Table.TransformColumns(RemoveIncidentSerieNumber, {"IncidentSerie", each Table.AddIndexColumn(_, "SerieIndex", 1, 1)}),
// calculate a running total of AccountDuration
GroupInnerRunningTotalOfAccountDuration = Table.TransformColumns(GroupInnerIndex, {"IncidentSerie", (tbl) => Table.AddColumn(tbl, "Sequence Duration", (rec) =>
List.Sum(Table.Column(Table.SelectRows(tbl, each [SerieIndex] <= rec[SerieIndex]), "AccountDuration"))
)}),
// aggregations
TotalCalls = Table.AddColumn(GroupInnerRunningTotalOfAccountDuration, "Total Calls", each Table.RowCount([IncidentSerie]), Int64.Type),
TotalDuration = Table.AddColumn(TotalCalls, "Total Duration", each List.Sum(Table.Column([IncidentSerie], "AccountDuration"))),
FirstIncident = Table.AddColumn(TotalDuration, "First Incident", each List.Min(Table.Column([IncidentSerie], "IncidentId"))),
// expand group
ExpandTable = Table.ExpandTableColumn(
FirstIncident,
"IncidentSerie",
{"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration"}
)
in
ExpandTable,
Groups = Table.Group(#"Changed Type", {"AccountNumber"}, {{"PerAccount", each _}}),
RemoveAccountNumber = Table.RemoveColumns(Groups, "AccountNumber"),
GroupsSummarized = Table.TransformColumns(RemoveAccountNumber, {"PerAccount", fnSummarizeIncidents}),
ExpandTable = Table.ExpandTableColumn(GroupsSummarized, "PerAccount", {"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration", "Total Calls", "Total Duration", "First Incident"})
in
ExpandTable
Hi @mark_carlisle,
I've tried some optimizations but I've appended them to the old post: https://community.powerbi.com/t5/Power-Query/Dates-in-period-for-a-dynamic-period/m-p/714354/highlig...
Hi @mark_carlisle,
I'm still thinking how to improve the performance. I've implemented another way without recursion, check it out and please test with your big dataset.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pdRLi9swEADgv2JyXoI0oxk9bmVpYQ+F0O1tyUGkJoT6EWynNP++kh27cjEFxweLkR/yx2hGHx87REGW2Oxedp9/+/Ja5NlrXV59dQ93rNFMCiGEr7e2q8u8CaFQewl7EDJ+JNmBcMAhPPh7UxdFiL59fwvjl8N7GDX0rz2uYTaM06+PL9FhgaTSaxy4FxQdNk6sI+HQzhxvVdv5ovDdpa4yX/3IPp26y69++vdpdmjqc+PL6EKKTjGCEclMtyfhyDUK6FmuVE6i61d/nmuG1EYuJFxJjIteQ3KDV7ET27wK7OjF1GstLHmt6X/9nJdjfkFs8kqgqR5UAkbo6/RfMAoBsAEMNuR4E5hYjl5KvUaYRS8hPu3VToV+09sSbAlGMCdg0iyWwKQ1rzuoEjA5AkdqBn7Pu9u1l371lT/ncfGwdH3PYwiCR51OdZZ5ScfKruovNd9+Mk7hTBcSdcrb9lKd45d10+QhndU581l5Can8mfcKmprKJEiLsHRoKUBUa5A0R6JyINcjNesRadPOR7HY+iQNrekkKVMlOgSH/0ulL063oUrbuM08bfMwTLpH+2wsQyn2greX4eMaZiAeJ9CAOR7/AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [IncidentId = _t, AccountName = _t, AccountNumber = _t, AccountType = _t, CallDate = _t, CallTime = _t, MajorComment = _t, MinorComment = _t, ThirdTier = _t, AccountDuration = _t, TotalCalls = _t, CallNumber = _t, TotalDuration = _t, SeqDuration = _t, FirstIncidentId = _t]),
#"Removed Columns" = Table.RemoveColumns(Source,{"AccountName", "AccountType", "MajorComment", "MinorComment", "ThirdTier", "CallTime", "TotalCalls", "CallNumber", "TotalDuration", "SeqDuration", "FirstIncidentId"}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Removed Columns", {{"CallDate", type date}}, "de-DE"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"AccountDuration", Int64.Type}}),
fnSummarizeIncidents = (tbl as table) as table =>
let
SortByCallDate = Table.Sort(tbl, {{"CallDate", Order.Ascending}}),
IndexColumn = Table.AddIndexColumn(SortByCallDate, "Index", 0, 1),
// a new serie starts if the count of days between 2 incidents of one account is longer than days
IncidentSerieStart = Table.AddColumn(IndexColumn, "IncidentSerieStart", each
if [Index] = 0 then
1
else if Duration.Days([CallDate] - tbl[CallDate]{[Index]-1}) >= 7 then
1
else
0
),
// sum all previous IncidentSerieStarts to have an ID for a serie
IncidentSerieNumber = Table.AddColumn(IncidentSerieStart, "IncidentSerieNumber",
(parentRecord) =>
List.Sum(
Table.Column(
Table.SelectRows(
IncidentSerieStart,
(childRecord) => parentRecord[Index] >= childRecord[Index]
),
"IncidentSerieStart"
)
)
),
// group by series
Groups = Table.Group(IncidentSerieNumber, {"IncidentSerieNumber"}, {{"IncidentSerie", each _}}),
// remove incident serie number
RemoveIncidentSerieNumber = Table.RemoveColumns(Groups,{"IncidentSerieNumber"}),
// add an inner index for every group
GroupInnerIndex = Table.TransformColumns(RemoveIncidentSerieNumber, {"IncidentSerie", each Table.AddIndexColumn(_, "SerieIndex", 1, 1)}),
// calculate a running total of AccountDuration
GroupInnerRunningTotalOfAccountDuration = Table.TransformColumns(GroupInnerIndex, {"IncidentSerie", (tbl) => Table.AddColumn(tbl, "Sequence Duration", (rec) =>
List.Sum(Table.Column(Table.SelectRows(tbl, each [SerieIndex] <= rec[SerieIndex]), "AccountDuration"))
)}),
// aggregations
TotalCalls = Table.AddColumn(GroupInnerRunningTotalOfAccountDuration, "Total Calls", each Table.RowCount([IncidentSerie]), Int64.Type),
TotalDuration = Table.AddColumn(TotalCalls, "Total Duration", each List.Sum(Table.Column([IncidentSerie], "AccountDuration"))),
FirstIncident = Table.AddColumn(TotalDuration, "First Incident", each List.Min(Table.Column([IncidentSerie], "IncidentId"))),
// expand group
ExpandTable = Table.ExpandTableColumn(
FirstIncident,
"IncidentSerie",
{"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration"}
)
in
ExpandTable,
Groups = Table.Group(#"Changed Type", {"AccountNumber"}, {{"PerAccount", each _}}),
RemoveAccountNumber = Table.RemoveColumns(Groups, "AccountNumber"),
GroupsSummarized = Table.TransformColumns(RemoveAccountNumber, {"PerAccount", fnSummarizeIncidents}),
ExpandTable = Table.ExpandTableColumn(GroupsSummarized, "PerAccount", {"IncidentId", "AccountNumber", "CallDate", "AccountDuration", "SerieIndex", "Sequence Duration", "Total Calls", "Total Duration", "First Incident"})
in
ExpandTable
Thanks for all this. I've optimised by initial query to the SQL DB to take advantage of folding right up the point of the function now so that's a bit faster.
This version of the function, with some additional steps after it to complete the query, is managing to complete for a date range of 01/05/2019 - 17/06/2019 (approx 185,000 rows) in around 15 minutes on my laptop. The recursive version was not able to do a single day so this is a definite improvement.
I'm now working on add the whole thing to PBI Dataflows to automate the whole process.
Thanks again for the help.
| User | Count |
|---|---|
| 20 | |
| 10 | |
| 8 | |
| 8 | |
| 7 |