Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Vote for your favorite vizzies from the Power BI Dataviz World Championship submissions. Vote now!

Reply
ChemEnger
Advocate V
Advocate V

VERY Slow Group By

I have a query (from a Power BI Dataflow) with approximately 14000 rows.  The Primary key ([Parent], Int64.Type) can have multiple records ([Batch Number], type text) associated with it.  I need to count the number of [Batch Number]s associated with each [Parent] so that I can exclude [Parent] records with more than 20 [Batch Number] records.

 

No matter what I try, any aggregation step is VERY slow.  I have also tried buffering the data but this step is itself very slow.

 

My current steps (including re-expanding the original records are:

#"Parents Only" = Table.SelectColumns(#"Previous Step", {"Parent"}),
//Group the records, with a count **THIS IS THE SLOW STEP**
#"Grouped Rows" = Table.Group(#"Parents Only", {"Parent"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
//Add a key **Not sure I need this but thought it would speed things up??**
#"Added [Parent] Key" = Table.AddKey(#"Grouped Rows", {"Parent"}, true),
// Filtered to Parents with NGT 20 batches listed
#"Filtered Rows" = Table.SelectRows(#"Added [Parent] Key", each [Count] <= 20),
// Remove count, just leave Parent ID
#"Removed Count" = Table.RemoveColumns(#"Filtered Rows",{"Count"}),
// Re-join the list of Parent ID numbers to the filtered table, now only with Parents with NGT 20 batches
#"Re-join" = Table.NestedJoin(#"Removed Count", {"Parent"}, #"Added [Parent] Key", {"Parent"}, "Original Record"),
#"Expanded Original Record" = Table.ExpandTableColumn(#"Re-join", "Original Record", {"Batch Number"}, {"Batch Number"}),

I have tried the Table.AddKey step before the Table.Group but this but this doesn't speed up the process and is also fairly slow itself.

I can't create a separate query of just the [Parent] records, as some contain e.g. multiple [Batch Number]s separated by commas that need de-aggregating (Table.ExpandListColumn...) to multiple records within the same query.

1 ACCEPTED SOLUTION
ChemEnger
Advocate V
Advocate V

SOLVED.

 

The speed (or lack thereof) was due to a previous step in the query - I was filtering the the first part of the [Batch Number] field against a list of known [Batch Prefix]s.  What was happening (obvious now!) was that the call was being made to this list multiple times.  A List.Buffer wrapper therefore solved it in one go!

 

//Buffer the list of Batch Prefixes to prevent multiple calls
#"Batch Prefixes" = List.Buffer(#"Prefixes"[Batch Prefix]),
//Select only rows with batch prefixes that match the [Batch Number] Prefix
#"Filter for Known Batches" = Table.SelectRows(#"Table with Batch Prefixes", each List.Contains(#"Batch Prefixes", [Batch Prefix])),

 

One point to note is that I did try to create a list query of the Batch Prefixes up front and reference that but got a Formula.Firewall error, hence creating the list from a table within this query (#"Prefixes"[Batch Prefix])

View solution in original post

2 REPLIES 2
ChemEnger
Advocate V
Advocate V

SOLVED.

 

The speed (or lack thereof) was due to a previous step in the query - I was filtering the the first part of the [Batch Number] field against a list of known [Batch Prefix]s.  What was happening (obvious now!) was that the call was being made to this list multiple times.  A List.Buffer wrapper therefore solved it in one go!

 

//Buffer the list of Batch Prefixes to prevent multiple calls
#"Batch Prefixes" = List.Buffer(#"Prefixes"[Batch Prefix]),
//Select only rows with batch prefixes that match the [Batch Number] Prefix
#"Filter for Known Batches" = Table.SelectRows(#"Table with Batch Prefixes", each List.Contains(#"Batch Prefixes", [Batch Prefix])),

 

One point to note is that I did try to create a list query of the Batch Prefixes up front and reference that but got a Formula.Firewall error, hence creating the list from a table within this query (#"Prefixes"[Batch Prefix])

You are a hero @ChemEnger . This post helped me SO MUCH. Thank you!

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

Vote for your favorite vizzies from the Power BI World Championship submissions!

Sticker Challenge 2026 Carousel

Join our Community Sticker Challenge 2026

If you love stickers, then you will definitely want to check out our Community Sticker Challenge!

January Power BI Update Carousel

Power BI Monthly Update - January 2026

Check out the January 2026 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.