JP8991
Helper IV

PowerBI.Dataflows vs PowerPlatform.Dataflows

Hey All.

It would be great to have some explanation regarding the performance of the two Dataflow connectors and how they differ technically.


I have been doing some testing with semi-large Dataflows (35,000,000 rows) in Power BI Desktop and have concluded that the PowerBI.Dataflows connector is much faster than the PowerPlatform.Dataflows connector: the former finishes 80-85% quicker than the latter when using the refresh button.

For context in Power Query:

• After the navigation step, the rows are filtered by a 0/1 column to exclude some data.
• Then I use a remove other columns step to reduce the columns from 26 to 5.
• Then a replace value step converts a 1-5 column into text results, e.g. 1 = Red, 2 = Blue, etc.
• Then a grouped rows step is applied to reduce the rows from 35,000,000 to 2,500,000.
• Then a conditional column is added which acts as a numerical sort column for another column.
• Finally a changed type step to ensure all the data is in the right format.


Both queries are identical bar the initial connector. I would love to know why there is such a performance difference, as I am very reluctant to move my company over to the new connector when the performance is so poor.

6 REPLIES
nxlu
Frequent Visitor

Bump.

 

We've been through a similar exercise and have found that the legacy PowerBI.Dataflows connector is significantly faster than the new PowerPlatform.Dataflows connector. We're at the point of reverting to the PowerBI.Dataflows connector, but we're trying to understand whether there are any real advantages to using the PowerPlatform.Dataflows connector, and what the future holds for the 'Legacy' PowerBI.Dataflows connector overall.

 

@nxlu OP here.

After much testing I can confirm the following.

Generally speaking, for tables that return more than 2,000,000 records, PowerBI.Dataflows is faster; below that, the differences are negligible.

Where PowerPlatform.Dataflows comes into its own is when query folding is used. This can be enabled by ensuring the Dataflow itself has its Enhanced compute engine setting set to On.

[Screenshot: JP8991_0-1732591587141.png]

 

When this is set, your Power Query steps are translated into SQL that is sent back to the source (query folding). This can drastically improve performance because the data is requested with these conditions already in place, e.g. a filter that reduces the number of rows or a remove columns step that reduces the number of columns.

 

PowerBI.Dataflows does not query fold; instead it downloads the data first and then applies the subsequent steps. So if your source table has 30,000,000 records but you filter and aggregate it down to 10,000 rows, the process still has to download all 30,000,000 records first. PowerPlatform.Dataflows, however, would return the 10,000 rows without having to download the 30,000,000.
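
To make the shape of a fold-friendly query concrete, here is a minimal sketch. It uses a small inline table as a hypothetical stand-in for the dataflow (the sample rows are made up for illustration), so it runs anywhere but does not itself fold. The assumption is that, against a dataflow with the Enhanced compute engine on, steps written like this are the kind that can be pushed back to the source; whether a given step actually folds would need to be checked in your own environment.

```powerquery
let
    // Hypothetical stand-in for a dataflow table; a real query would start from
    // PowerPlatform.Dataflows(null) followed by its navigation steps.
    Source = #table(
        type table [#"Centre Code" = text, #"Closure Date Filter" = Int64.Type, Bookings = Int64.Type],
        {
            {"A", 0, 3},
            {"A", 0, 2},
            {"A", 1, 9},
            {"B", 0, 4}
        }
    ),
    // Row filter first: on a source that supports folding, this can become a WHERE clause.
    #"Filtered Rows" = Table.SelectRows(Source, each [Closure Date Filter] = 0),
    // Column reduction next: fewer columns are requested from the source.
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows", {"Centre Code", "Bookings"}),
    // Aggregation last: when folded, this can become a GROUP BY.
    #"Grouped Rows" = Table.Group(
        #"Removed Other Columns",
        {"Centre Code"},
        {{"Bookings", each List.Sum([Bookings]), type nullable number}}
    )
in
    #"Grouped Rows"
```

With the sample rows above, the result is two rows: A with 5 bookings and B with 4. The point is only the ordering: filter, then trim columns, then aggregate, so that as much work as possible can happen before any data is downloaded.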

I hope this makes sense. In summary, it depends...

JP8991
Helper IV

After continued testing using identical query logic (after the navigation step) I can confirm the Power Platform Dataflow Connector is much slower than the Power BI Dataflow Connector, in some instances by 80%.

Below is my code (with the navigation step IDs removed).

 

let
    Source = PowerPlatform.Dataflows(null),
    #"Bookings Detail" = *NAVIGATION STEPS*,
    #"Filtered Rows" = Table.SelectRows(#"Bookings Detail", each ([Closure Date Filter] = 0)),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Centre Code", "Booking Date", "Booking Type", "Room Type", "Bookings"}),
    #"Replaced Value" = Table.ReplaceValue(#"Removed Other Columns",each [Room Type], each if [Room Type] = 1 then "Nursery" else if [Room Type] = 2 then "Toddler" else if [Room Type] = 3 then "Junior Kindy" else if [Room Type] = 4 then "Kindy" else if [Room Type] = 5 then "Pre-school" else if [Room Type] = 6 then "Before School Care" else if [Room Type] = 7 then "After School Care" else if [Room Type] = 8 then "Vacation Care" else if [Room Type] = 9 then "Before/After School Care" else null,Replacer.ReplaceValue,{"Room Type"}),
    #"Grouped Rows" = Table.Group(#"Replaced Value", {"Centre Code", "Booking Date", "Booking Type", "Room Type"}, {{"Bookings", each List.Sum([Bookings]), type nullable number}}),
    #"Added Conditional Column" = Table.AddColumn(#"Grouped Rows", "Room Type Sort", each if [Room Type] = "Toddler" then 2 else if [Room Type] = "Kindy" then 4 else if [Room Type] = "Pre-school" then 5 else if [Room Type] = "Nursery" then 1 else if [Room Type] = "Junior Kindy" then 3 else if [Room Type] = "After School Care" then 7 else if [Room Type] = "Before School Care" then 6 else if [Room Type] = "Vacation Care" then 9 else if [Room Type] = null then 10 else if [Room Type] = "Before/After School Care" then 8 else null, type number),
    #"Changed Type" = Table.TransformColumnTypes(#"Added Conditional Column",{{"Centre Code", type text}, {"Booking Date", type date}, {"Booking Type", type text}, {"Room Type", type text}, {"Bookings", Int64.Type}, {"Room Type Sort", Int64.Type}})
in
    #"Changed Type"

 

It appears the Power Platform Dataflow Connector loads the data by rows, whereas the Power BI Dataflow Connector loads it by data and is much faster. It is worth mentioning that I have a gigabit (1000/50) internet connection, so I can download from Dataflows very fast.

 

[Screenshot: JP8991_0-1675050500103.png]

 

It is very disappointing that Microsoft has made the Power Platform Dataflow connector the default, as the performance is clearly not as good.

It would be great to have an explanation as to why there is a big performance difference.

JP8991
Helper IV

Thanks for your reply; however, I am not really sure that answers anything. I have been using Power BI Dataflows since inception, so I know them pretty well.

What I am trying to understand is why they are so much slower, at least in my case.

Hey JP, 

Did you ever get to the bottom of this? I would also like to understand exactly what the difference between these two is, and why.

Would love to hear from the product team for their insights.


-Aaron

v-stephen-msft
Community Support

Hi @JP8991 ,

 

There are several relationships between Power Platform dataflows and Power BI dataflows. Please refer to:

Power Platform dataflows - Power Platform Release Plan | Microsoft Learn

If you want to know more about Power Platform dataflows, there is a document for your reference.

Create and use dataflows in Microsoft Power Platform - Power Query | Microsoft Learn

According to the documentation, Power Platform dataflows should actually be easier and faster.

 

 

Best Regards,

Stephen Tao

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.
