Solved: Re: Agg Table Efficiency Min Date

codyraptor · ‎07-16-2022

I'm trying to create an Agg table to take advantage of query folding as well as increase efficiency. I can aggregate the table excluding the inv-date and it's extremely fast. Millions of rows consolidate within seconds. My issue is...I need the invoice date added and I only need the 'min' Inv-Date for every ID. Everything I do seems to force it to look at every row (20M+ records)...I've tried Table.Buffer on sorts...I've tried grouping and adding an index. I've also turned off the background preview. All of my fields are intergers...super basic. I'm also doing my grouping at the end after Filtering my data down. It seems I'm losing the folding once I try to group by the min date. Any help with this structure is greatly appreciated!

Below is an example of the data. I need the Agg results. Oh..I can also do this in SQL...but my source is Oracle and query folding isn't available...so still an efficiency issue.

codyraptor · ‎07-22-2022

In my specific case...the view is the fastest method to refreshing data and getting the groupings necessary. Unfortunately the underlying data has several columns that need grouping and Power Query wasn't providing the needed efficiency. I've improved the sql code for the view and can run sql directly in the model for 2 fact tables ~20mil rows with about 25 columns in 52mins. Hoping this will improve even more once the view is in place and I can push changes to Oracle. Thanks @lbendlin for trying to assist!!

View solution in original post

codyraptor · ‎07-18-2022

I've now have the SQL as an aggregated view table on our warehouse...so PBI is just pulling the aggregated view. Every change I make in Power Query seems to trigger another call back to the warehouse such as removing time from a date field. Is this normal behavior? Also..I lost query folding after extracting the date. I was under the impression query folding would continue since I'm not running the native query in PBI. Thoughts?

codyraptor · ‎07-22-2022

In my specific case...the view is the fastest method to refreshing data and getting the groupings necessary. Unfortunately the underlying data has several columns that need grouping and Power Query wasn't providing the needed efficiency. I've improved the sql code for the view and can run sql directly in the model for 2 fact tables ~20mil rows with about 25 columns in 52mins. Hoping this will improve even more once the view is in place and I can push changes to Oracle. Thanks @lbendlin for trying to assist!!

lbendlin · ‎07-19-2022

You need to distinguish your developer experience from the end user experience when they interact with the report.

You're not using Direct Query, right?

codyraptor · ‎07-19-2022

@lbendlin correct...I'm using import. Our end user slices and dices the data for analytics...it's not just support for specific visuals. We have slicers that allow the end user to look at the data several different ways. That said...One report page could have 7 slicers coming from 7 different columns in the data....state, vendor, product, date, inv date, trade, company for example. It seems pulling the view is just as slow to refresh as if I were using sql directly in PBI.

lbendlin · ‎07-17-2022

Not sure what Query folding has to do with all of this. Can you create a view in Oracle that provides the aggregations?

codyraptor · ‎07-17-2022

@lbendlin I can't unfortunately. I don't have access to the source. That would definitely be the easiest thing. I'm trying to keep PBI from reading every row. Like I said...count is super fast. It's only when I try to fix the min date issue that I lose efficiency in my agg table. I can use native sql...but that seems to be very inefficient from an oracle db

lbendlin · ‎07-17-2022

"I can use native sql...but that seems to be very inefficient from an oracle db"

please elaborate.

codyraptor · ‎07-17-2022

Oracle doesn't allow query folding at this time...so I can't get the benefit of query folding using Power Query to make adjustments after sql. When I aggregate data using the sql statement...it takes almost an hour to run around 8m rows. My sql is doing a count distinct. That's literally the only aggregation in this table. My 'group by' is 7 Keys from the relational db..all intergers. I have 3 sub queries where I have to group things a certain way because the source table is at a more granular level. Sub queries are pretty basic but I'm open to ideas if that's causing my issue.

codyraptor · ‎07-17-2022

@lbendlin quick sql Example...

select count( distinct q.id),

Q.status

from(

select a.id,

case when a.id in (

select b.id

from b

where b.id = a.id and

b.status = active) then active else other end as status

from a) q

group by q.status

Agg Table Efficiency Min Date

Helpful resources

Power BI Dataviz World Championships

Power BI Monthly Update - December 2025

FabCon Atlanta 2026

FabCon is coming to Atlanta

Agg Table Efficiency Min Date

Helpful resources

Power BI Dataviz World Championships

Power BI Monthly Update - December 2025

FabCon Atlanta 2026