Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
LD2022
Helper II
Helper II

Limit Dimension Data based on Fact Table in Power Query

There are 2 tables:

1. Dimension Table D - EMPNO (25,000,000 rows) sourced from Oracle DB

2. Fact Table F - Excel source

join on D.EMPNO=F.EMPNO using Merge queries in Power Query.

 

The performance is extremely slow using Import/DirectQuery mode. I tried to limit the rows in Dimension table by following the below steps:

1. Buffered List to extract unique list of EMPNO in F - EmpList 

Emplist = let
Source = List.Distinct(F[EMPNO]),
#"Buffered List" = List.Buffer(Source)
in
#"Buffered List"

2. Filter Dimension D  i.e. Table.SelectRows(#"Changed Type",each List.Contains(EmpList, [EMPNO]))

 

It works but it is extremely slow when "Enable Load" is enabled for Dimension D. Although the Step (2) returns few rows(129 rows), when I apply the changes in Power Query, it loads all the rows for Dimension D.

 

I was hoping to get some guidance if there is anything wrong or can be done differently.

 

 

2 ACCEPTED SOLUTIONS

 

Ok. Your current query setup won't allow query folding.

Can you try connecting without using an explicit SQL statement please?

 

Connect to the DB again, and leave the SQL statement blank:

BA_Pete_0-1658503499145.png

 

Just hit OK, then select your dim_employee table from the table list and import.

 

Assuming this goes as planned, select the [EMPNO] column in your new dimension table and filter just one value. This should be fast to implement and sets us up a new step with the correct structure.

Edit that new step in the formula bar, so it uses our streaming list filter, something like this:

= Table.SelectRows(
    previousStepName,
    each List.Contains( List.Buffer( factTableName[EMPNO] ), [EMPNO] )
)

 

This setup should allow PQ to stream the [EMPNO] values from your Excel source into the SQL native query sent to the SQL source for your dim_employee table.

 

Pete 



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




View solution in original post

Hi @LD2022 ,

 

If you right-click your #"Filtered Rows" step, is 'Native Query' now selectable, or is it still greyed-out?

If it's still greyed out, try changing one of the column data types to the same type, but not a text column. For example, change a date type column to date type, or a decimal type column to decimal. This is purely to force PQ to re-evaluate the query and hopefully force it to generate a native query to source.

If native query IS selectable, then you're probably at the limit of optimisation. The Oracle Data Access Client (ODAC) requirement isn't the best in this regard.

 

Pete

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




View solution in original post

22 REPLIES 22

 

No problem, glad we got there in the end. I happily accept kudos as payment 😉

Regarding additional changes, you should be able to make any additional transformations that can be converted into Oracle SQL code. Changing column names should be fine, as this will just be converted to 'AS' aliases in the SQL, but quite often changing data type to text, for example, will fail, as this generates an ambiguous VARCHAR conversion, which PQ won't take a risk on.

There's plenty of resources which outline in detail which transformations will fold to your source or not, but good old trial and error with your favourite/most-used functions is probably the best way. I'm sure, going forward, you now see the immense benefit you can get by ensuring your queries fold to the source.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Hi @BA_Pete 

I will try to establish relationship and see if it is any faster.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors