Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Reply
campelliann
Post Patron
Post Patron

When to use Table.Buffer

Hi there, 

I would like to know in each steps or before each steps it is recommended to use Table.Buffer to improve performance.

Many thanks

1 ACCEPTED SOLUTION
BA_Pete
Super User
Super User

Hi @campelliann ,

 

This is an enormous question with many, many variables. However, very broadly, once you understand what Table.Buffer and List.Buffer do, this will often inform you of when is a good time to use them.

At their core, these functions load the entire target into memory. Therefore, one key use is to use them when Power Query would otherwise make multiple scans of the target.

For example:

If you want to do a conditional join in PQ (in this example to merge values from an SCD table), you may find yourself creating a column something like this to achieve it:

= Table.AddColumn(
  previousStep,
  "columnName",
  (OT) => Table.SelectRows(
    Table2,  // The 'Target'
    each OT[Field] = [Field]
      and OT[dateField] >= [table2StartDate]
      and OT[matchField2] <= [table2EndDate]
    ){0}[valueToMerge],
  type number
)

 

In this instance, PQ will need to keep rescanning Table2 for the conditions on each row. Therefore, if we buffer Table2 in the Table1 query (T2Buffer = Table.Buffer(Table2) ), then swap the target from Table2 to T2Buffer, PQ can just hit the memory for what it needs each time.

 

List.Buffer example:

If you want to filter a table by the values in another table, you may find yourself righting code like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    FilterTable[ColumnWithFilterValues]),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

In this instance, PQ will scan FilterTable in order to check the values against each row value in TableToFilter.

Therefore, if you buffer the FilterTable values first, PQ can just hit the memory for these values. You can do this inline (we created a separate step for this previously), like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    List.Buffer(FilterTable[ColumnWithFilterValues])),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

This example is particularly important when applying this technique on foldable sources, as doing it this way means you can stream non-foldable lists to foldable queries and maintain folding 👍

 

Hope this helps?

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




View solution in original post

3 REPLIES 3
campelliann
Post Patron
Post Patron

@BA_Pete your posts are always insightful :). 

But for instance creating a Table.Buffer on an excel Sharepoint File, after the "call" to sharepoint. Or let's say I retrieve data from an API and in the step imediatly after the call, could that improve performance (which is affected by many variables as you say).

 

Generally, I'd say no, it wouldn't improve performance significantly, if at all. But, again, there's many variables.

If you bring in a 200 million row dataset with 50 columns you could potentially hurt performance by trying to buffer it as you may get close to, or exceed, your memory limit. This could slow down all of the mashup container processes, cause the query to fail/crash, or may even limit the functioning of the processing machine itself.

Furthermore, I don't think buffering prevents PQ making calls to the source if it needs to, it just helps to limit/prevent multiple table scans internally.

As a rule of thumb, I'd keep buffering limited to relatively small utility tables and lists, such as in my examples. As above, used too generously or incorrectly, buffering data has the potential to cause more harm than good.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




BA_Pete
Super User
Super User

Hi @campelliann ,

 

This is an enormous question with many, many variables. However, very broadly, once you understand what Table.Buffer and List.Buffer do, this will often inform you of when is a good time to use them.

At their core, these functions load the entire target into memory. Therefore, one key use is to use them when Power Query would otherwise make multiple scans of the target.

For example:

If you want to do a conditional join in PQ (in this example to merge values from an SCD table), you may find yourself creating a column something like this to achieve it:

= Table.AddColumn(
  previousStep,
  "columnName",
  (OT) => Table.SelectRows(
    Table2,  // The 'Target'
    each OT[Field] = [Field]
      and OT[dateField] >= [table2StartDate]
      and OT[matchField2] <= [table2EndDate]
    ){0}[valueToMerge],
  type number
)

 

In this instance, PQ will need to keep rescanning Table2 for the conditions on each row. Therefore, if we buffer Table2 in the Table1 query (T2Buffer = Table.Buffer(Table2) ), then swap the target from Table2 to T2Buffer, PQ can just hit the memory for what it needs each time.

 

List.Buffer example:

If you want to filter a table by the values in another table, you may find yourself righting code like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    FilterTable[ColumnWithFilterValues]),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

In this instance, PQ will scan FilterTable in order to check the values against each row value in TableToFilter.

Therefore, if you buffer the FilterTable values first, PQ can just hit the memory for these values. You can do this inline (we created a separate step for this previously), like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    List.Buffer(FilterTable[ColumnWithFilterValues])),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

This example is particularly important when applying this technique on foldable sources, as doing it this way means you can stream non-foldable lists to foldable queries and maintain folding 👍

 

Hope this helps?

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Helpful resources

Announcements
Fabric Community Conference

Microsoft Fabric Community Conference

Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.

February 2024 Update Carousel

Power BI Monthly Update - February 2024

Check out the February 2024 Power BI update to learn about new features.

Fabric Career Hub

Microsoft Fabric Career Hub

Explore career paths and learn resources in Fabric.

Fabric Partner Community

Microsoft Fabric Partner Community

Engage with the Fabric engineering team, hear of product updates, business opportunities, and resources in the Fabric Partner Community.

Top Solution Authors
Top Kudoed Authors