I have a table similar to the one below, but with tens of thousands of rows. In Power Query, for each row I want to count the number of rows in which the Name appears in the previous 12 months, and add this as a new column.
e.g. for the last entry, 'Jane' occurs twice in the 12-month period 01-10-2022 to 30-09-2023.
I tried this by creating two new columns for the previous 12 months' start and finish dates, then adding a column that uses a custom function to calculate the number of occurrences in that window. But this takes a long time to run, so I'm hoping there is a faster way to do it.
Custom Function
(nametable as table, start_date as date, finish_date as date, name as text) =>
let
    Source = nametable,
    #"Filtered Rows1" = Table.SelectRows(Source, each [Name] = name),
    #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each [Date] >= start_date and [Date] <= finish_date),
    #"Row Count" = Table.RowCount(#"Filtered Rows2")
in
    #"Row Count"
Name table
Date | Name |
---|---|
01/08/2021 | John |
31/08/2021 | Jane |
30/09/2021 | John |
30/10/2021 | Jane |
29/11/2021 | Jim |
29/12/2021 | John |
28/01/2022 | John |
27/02/2022 | Jim |
29/03/2022 | John |
28/04/2022 | Jim |
28/05/2022 | John |
27/06/2022 | John |
27/07/2022 | Jim |
26/08/2022 | John |
25/09/2022 | John |
25/10/2022 | Jim |
24/11/2022 | John |
24/12/2022 | John |
23/01/2023 | Jim |
22/02/2023 | John |
24/03/2023 | John |
23/04/2023 | John |
23/05/2023 | Jane |
22/06/2023 | Jim |
22/07/2023 | Jane |
21/08/2023 | Jim |
20/09/2023 | John |
20/10/2023 | Jane |
Hi @russell80,
Sorry, I re-wrote my post as I found a much better solution.
This is a very interesting problem. The way to improve it is to reduce both the number of iterations and the size of the data being manipulated.
The code below runs against 20k rows in about 3 seconds on my laptop, which is fairly new, so I would guess this is somewhere near in-cloud performance. It goes more or less along the same lines as @wdx223_Daniel's answer, but uses grouping to reduce the number of rows we have to deal with in each period.
Ignore everything before the f step; those steps just set up test data.
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("HcVJDQAwCARAL7yrgjuAA4J/G012PrNLzEz3lkQEqyo2M+zuOCJwZuKqwt2NZ4buPg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}}),
    Dates = List.Dates(#date(2018,1,1), 2000, #duration(1, 0,0,0)),
    Custom1 = Table.AddColumn(#"Changed Type", "Date", each Dates),
    #"Expanded Date" = Table.ExpandListColumn(Custom1, "Date"),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded Date",{"Date", "Name"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Reordered Columns",{{"Date", type date}}),
    // For each Name group: buffer the dates, then walk them with List.Accumulate,
    // building one output record per date with its trailing-12-month count.
    f = (t as table) as table =>
        let
            mList = List.Buffer(t[Date]),
            out = List.Accumulate(mList, {}, (a, n)=>
                let
                    limit = Date.AddMonths(n, -12),
                    // earlier dates within the window, plus 1 for the current row
                    out = a & {[Date = n, Count = 1 + List.Count(List.Select(mList, each _ < n and _ > limit))]}
                in out)
        in Table.FromRecords(out),
    #"Grouped Rows" = Table.Group(#"Changed Type1", {"Name"}, {{"Data", f}}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"Date", "Count"}, {"Date", "Count"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Expanded Data",{{"Date", type date}})
in
    #"Changed Type2"
Cheers,
John
Thanks @jbwtp, I've managed to get that running on my side. One thing I'd also like to know: my source table actually has 26 columns. How can I keep all of those columns so they are returned at the end, when the table is expanded in #"Expanded Data"? Is that possible?
Hi @russell80,
Yes, this is possible. Sorry, I should have thought of this beforehand.
This is the code that does it - a slight modification of the previous version:
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("HcVJDQAwCARAL7yrgjuAA4J/G012PrNLzEz3lkQEqyo2M+zuOCJwZuKqwt2NZ4buPg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}}),
    Dates = List.Dates(#date(2018,1,1), 2000, #duration(1, 0,0,0)),
    Custom1 = Table.AddColumn(#"Changed Type", "Date", each Dates),
    #"Expanded Date" = Table.ExpandListColumn(Custom1, "Date"),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded Date",{"Date", "Name"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Reordered Columns",{{"Date", type date}}),
    // Same idea as before, but the group is converted to records so every
    // original column is carried through to the output.
    f = (t as table) as table =>
        let
            mList = Table.ToRecords(t),
            out = List.Accumulate(mList, {}, (a, n)=>
                let
                    limit = Date.AddMonths(n[Date], -12),
                    out = a & {n & [Count = 1 + List.Count(List.Select(mList, each _[Date] < n[Date] and _[Date] > limit))]}
                in out)
        in Table.FromRecords(out),
    #"Grouped Rows" = Table.Group(#"Changed Type1", {"Name"}, {{"Data", f}}),
    // Table.Combine with an explicit table type restores the original column
    // types and adds the new Count column.
    Custom2 = Table.Combine(#"Grouped Rows"[Data], Value.Type(Table.AddColumn(#"Changed Type1", "Count", each null, type number)))
in
    Custom2
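The key differences from the previous version: f now iterates over Table.ToRecords(t), so every column of the group (all 26 in your case) is carried through into the output records, and Table.Combine is given an explicit table type so the column order and types come back correctly, with Count appended. Applied to your own query it would look roughly like this, with NameTable standing in for your real query name:

// Sketch only: substitute your real query name for NameTable.
#"Grouped Rows" = Table.Group(NameTable, {"Name"}, {{"Data", f}}),
Result = Table.Combine(#"Grouped Rows"[Data], Value.Type(Table.AddColumn(NameTable, "Count", each null, type number)))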
Kind regards,
John
Hope this can help:
= let
    // sort by Date and buffer the rows as lists ({Date, Name} per row)
    a = List.Buffer(Table.ToRows(Table.Sort(Source, "Date")))
in
    #table(
        Table.ColumnNames(Source) & {"Count"},
        // accumulator state: {output rows so far, record mapping each Name to its dates inside the 12-month window}
        List.Accumulate(a, {{}, []}, (x, y) =>
            let
                // for the current Name (y{1}): append the current Date (y{0}) and drop dates older than 12 months
                aa = Record.TransformFields(x{1}, {y{1}, each List.Skip((_ ?? {}) & {y{0}}, each _ < Date.AddMonths(y{0}, -12))}, 2)
            in
                // emit the row plus the count of dates left in the window, and pass the state on
                {x{0} & {y & {List.Count(Record.Field(aa, y{1}))}}, aa}){0})
This should work. No need for a function.
let
    Source = NameTable,
    #"Filtered Rows" = Table.SelectRows(Source, each Date.IsInPreviousNMonths([Date], 12)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Name"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(Source, {"Name"}, #"Grouped Rows", {"Name"}, "Count", JoinKind.LeftOuter),
    #"Expanded Count" = Table.ExpandTableColumn(#"Merged Queries", "Count", {"Count"}, {"Count"})
in
    #"Expanded Count"
Thanks @talbot, but that's not quite giving me the result I need. Here's a smaller table showing how it should look.
To get the result, I used the following COUNTIFS formula in Cell C2 and dragged it down:
=COUNTIFS($C$2:$C$10,"="&C2,$B$2:$B$10,"<"&EOMONTH(B2,-1)+1,$B$2:$B$10,">"&EOMONTH(B2,-13))
A | B | C | |
---|---|---|---|
1 | Date | Name | Countifs |
2 | 28/09/2022 | John | 2 |
3 | 28/09/2022 | Jane | 1 |
4 | 01/09/2022 | John | 2 |
5 | 01/09/2022 | Jane | 1 |
6 | 01/08/2022 | John | 2 |
7 | 01/09/2021 | John | 1 |
8 | 01/09/2021 | Jane | 1 |
9 | 31/08/2021 | John | 0 |
10 | 31/08/2021 | Jane | 0 |
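(For clarity, the COUNTIFS above counts earlier rows with the same Name whose Date falls in the 12 complete calendar months before the current row's month. A rough, unoptimised Power Query sketch of the same logic, assuming the query is called NameTable and has Date and Name columns, would be:)

// Sketch only: NameTable and the column names are assumptions about my query.
let
    Source = NameTable,
    AddCount = Table.AddColumn(
        Source,
        "Countifs",
        (r) =>
            let
                monthStart = Date.StartOfMonth(r[Date]),        // first day of the current row's month
                windowStart = Date.AddMonths(monthStart, -12)   // first day of the month 12 months back
            in
                // rows with the same Name dated within the 12 full months before the current month
                Table.RowCount(
                    Table.SelectRows(
                        Source,
                        each [Name] = r[Name] and [Date] < monthStart and [Date] >= windowStart)),
        Int64.Type)
in
    AddCount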