Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Number of rows loading into Power BI after completing Query steps much higher than the real number

Hi everyone thank you for taking the time to look into my question. I am quite new to Power BI and I have been running into an issue which I don't understand the origin. 
I have a table with initially a large amount of rows (6 million) which I have filtered down to a couple of hundred. As I need for each row information from the previous row , I added an index column and, based on it,  I retrieve into two additional columns two pieces of data from the previous line. I also added an if condition to avoid retrieving information if one cell of the previous row displays a specific value. All was done using Power Query formulas. In the Power Query preview, it displays extacly the information I want to see. Yet when I try to load my table in Power BI, the loading of rows never ends and displays a number of rows currently loaded far greater than what it is supposed to be in that table. The loading becomes gradually slower as the number of rows loaded increases. . If I try to load the table without the added columns, it works completely fine. Would anyone know what it could be due to ?

1 REPLY 1
BA_Pete
Super User
Super User

Hi @Anonymous ,

 

This is due to your code requiring your table to be scanned multiple times to achieve the result you want.

By referring back to the table itself in order to pick up the previous row, you are requiring Power Query to load that table again so that it can be scanned to find rows in itself that meet your matching criteria.

 

For example, let's say your table starts at 6M rows, then your create a filter step called 'filterStep' to bring it down to 200 rows. In your next step where you want to find the previous row values (let's call it 'prevRowStep'), you refer to 'filterStep' as the table you want to grab your previous row values from. Here's what Power Query is doing:

 

1) Source = Import 6M rows

2) filterStep = Filter to 200 rows

3) prevRowStep = Get previous row values from filterStep table, therefore has to reload up to filterStep:

    3a) Source(temp) = Import 6M rows

    3b) filterStep(temp) = Filter to 200 rows

    3c) prevRowStep = Apply comparison criteria between filterStep and filterStep(temp) to display output

 

This is likely simplified versus what you are actually implementing (e.g. additional 'if' statements etc.), but you can already see that when you apply this structure to the data model, you are essentially loading 12M rows in order for this transformation to be applied. The 'if' statement itself doesn't reduce this at all, because it has to load all the data to enable the 'if' comparison to take place. It can't just discount rows that don't meet the 'if' criteria without knowing what's in them i.e. loading them.

 

Some things you can do to help:

 

1) If you are working on an SQL Server or another 'foldable' source, ensure that your query steps are folding right up to the filterStep query step. This will allow PQ to offload the initial steps to the server so it's only working with two tables of 200 rows for this last transformation (the server will do the 6M > 200 bit, so PQ doesn't have to).

 

2) You MAY find some mileage in using Buffer functions (Table.Buffer, List.Buffer etc.) in order to put a static version of your query into memory before performing futher transformations on it. PQ can then just refer to the buffered table rather than keep going back to the source to reload it each time you reference it. Without seeing your code, I can't definitely say whether this would work for you or not, but it's a fairly standard way of limiting source scans in PQ.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors