Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
ChemEnger
Advocate V
Advocate V

VERY Slow Group By

I have a query (from a Power BI Dataflow) with approximately 14000 rows.  The Primary key ([Parent], Int64.Type) can have multiple records ([Batch Number], type text) associated with it.  I need to count the number of [Batch Number]s associated with each [Parent] so that I can exclude [Parent] records with more than 20 [Batch Number] records.

 

No matter what I try, any aggregation step is VERY slow.  I have also tried buffering the data but this step is itself very slow.

 

My current steps (including re-expanding the original records are:

#"Parents Only" = Table.SelectColumns(#"Previous Step", {"Parent"}),
//Group the records, with a count **THIS IS THE SLOW STEP**
#"Grouped Rows" = Table.Group(#"Parents Only", {"Parent"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
//Add a key **Not sure I need this but thought it would speed things up??**
#"Added [Parent] Key" = Table.AddKey(#"Grouped Rows", {"Parent"}, true),
// Filtered to Parents with NGT 20 batches listed
#"Filtered Rows" = Table.SelectRows(#"Added [Parent] Key", each [Count] <= 20),
// Remove count, just leave Parent ID
#"Removed Count" = Table.RemoveColumns(#"Filtered Rows",{"Count"}),
// Re-join the list of Parent ID numbers to the filtered table, now only with Parents with NGT 20 batches
#"Re-join" = Table.NestedJoin(#"Removed Count", {"Parent"}, #"Added [Parent] Key", {"Parent"}, "Original Record"),
#"Expanded Original Record" = Table.ExpandTableColumn(#"Re-join", "Original Record", {"Batch Number"}, {"Batch Number"}),

I have tried the Table.AddKey step before the Table.Group but this but this doesn't speed up the process and is also fairly slow itself.

I can't create a separate query of just the [Parent] records, as some contain e.g. multiple [Batch Number]s separated by commas that need de-aggregating (Table.ExpandListColumn...) to multiple records within the same query.

1 ACCEPTED SOLUTION
ChemEnger
Advocate V
Advocate V

SOLVED.

 

The speed (or lack thereof) was due to a previous step in the query - I was filtering the the first part of the [Batch Number] field against a list of known [Batch Prefix]s.  What was happening (obvious now!) was that the call was being made to this list multiple times.  A List.Buffer wrapper therefore solved it in one go!

 

//Buffer the list of Batch Prefixes to prevent multiple calls
#"Batch Prefixes" = List.Buffer(#"Prefixes"[Batch Prefix]),
//Select only rows with batch prefixes that match the [Batch Number] Prefix
#"Filter for Known Batches" = Table.SelectRows(#"Table with Batch Prefixes", each List.Contains(#"Batch Prefixes", [Batch Prefix])),

 

One point to note is that I did try to create a list query of the Batch Prefixes up front and reference that but got a Formula.Firewall error, hence creating the list from a table within this query (#"Prefixes"[Batch Prefix])

View solution in original post

2 REPLIES 2
ChemEnger
Advocate V
Advocate V

SOLVED.

 

The speed (or lack thereof) was due to a previous step in the query - I was filtering the the first part of the [Batch Number] field against a list of known [Batch Prefix]s.  What was happening (obvious now!) was that the call was being made to this list multiple times.  A List.Buffer wrapper therefore solved it in one go!

 

//Buffer the list of Batch Prefixes to prevent multiple calls
#"Batch Prefixes" = List.Buffer(#"Prefixes"[Batch Prefix]),
//Select only rows with batch prefixes that match the [Batch Number] Prefix
#"Filter for Known Batches" = Table.SelectRows(#"Table with Batch Prefixes", each List.Contains(#"Batch Prefixes", [Batch Prefix])),

 

One point to note is that I did try to create a list query of the Batch Prefixes up front and reference that but got a Formula.Firewall error, hence creating the list from a table within this query (#"Prefixes"[Batch Prefix])

You are a hero @ChemEnger . This post helped me SO MUCH. Thank you!

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors