Solved: Is it necessary to use a Select statement when cho...

crmorgan7 · ‎10-01-2025

I am fairly new to Power BI dataflow building, so bare with me.

One of our vendors told me that when I am in the workspace and I go to create a new ODBC dataflow on the "Connect to Data Source" page I should click on "Advanced Options" and use the Select * "table name" because it is more efficient than using the folder tree. Instead - on the "Connect to Data Source" page I wrote in my DSN and then clicked Next and used the folder tree to select the table. I was able to filter/transform the data and the dashboard I'm working on is working as expected.

Is there really a difference between using the Select statement and just using the folder tree to select the table?

I tried to add the Select statement after the fact but it throws all kinds of errors, hence my question of why I should even be concerned about this if I have a working dashboard. I appreciate the insight!

DataNinja777 · ‎10-05-2025

Hi @crmorgan7 ,

No, it's not strictly necessary to use a SELECT statement when creating your dataflow. As you've discovered, using the folder tree in the Navigator gets the job done and results in a working dashboard. The primary difference between the two methods isn't about the final outcome, but rather about performance and efficiency, especially when you're dealing with large amounts of data. Your vendor's advice points towards a best practice for optimization.

Think of it like a trip to the supermarket. Using the Navigator is like going to the store, grabbing an entire aisle's worth of products, bringing them all home, and then sorting through them in your kitchen to find what you need. It works, but it's inefficient. Using a SELECT statement, on the other hand, is like giving a detailed shopping list to a store employee at the entrance. They retrieve only the specific items you requested and bring them to you. This approach is far more efficient because the heavy lifting is done at the source (the database), and only the necessary data is transported. This concept of pushing the workload back to the database is known as Query Folding.

The method you choose often depends on the situation. The Navigator is perfectly fine when you're exploring a new dataset or when the tables are small and performance isn't a concern. However, for very large tables with millions of rows, using a SELECT statement from the start is significantly better. It reduces network traffic and the processing load on Power BI, leading to much faster data refresh times.

The errors you encountered when trying to add the statement later are quite common. Power Query records each transformation as a separate, sequential step. When you tried to edit the initial "Source" step, you changed its fundamental output, which broke the subsequent steps that were dependent on the original structure. The SELECT statement is meant to be part of the initial connection setup, not added after the fact.

Ultimately, if your current dashboard is performing well, there's no pressing need to rebuild it. You've found a solution that works. Just keep your vendor's advice in mind for future projects, as starting with a SELECT statement can be a lifesaver when you're working with massive datasets.

Best regards,

View solution in original post

v-priyankata · ‎10-06-2025

Hi @crmorgan7

Thank you for reaching out to the Microsoft Fabric Forum Community.

@Royel @lbendlin @DataNinja777 Thanks for the inputs

I hope the information provided by users was helpful. If you still have questions, please don't hesitate to reach out to the community.

DataNinja777 · ‎10-05-2025

Hi @crmorgan7 ,

No, it's not strictly necessary to use a SELECT statement when creating your dataflow. As you've discovered, using the folder tree in the Navigator gets the job done and results in a working dashboard. The primary difference between the two methods isn't about the final outcome, but rather about performance and efficiency, especially when you're dealing with large amounts of data. Your vendor's advice points towards a best practice for optimization.

Think of it like a trip to the supermarket. Using the Navigator is like going to the store, grabbing an entire aisle's worth of products, bringing them all home, and then sorting through them in your kitchen to find what you need. It works, but it's inefficient. Using a SELECT statement, on the other hand, is like giving a detailed shopping list to a store employee at the entrance. They retrieve only the specific items you requested and bring them to you. This approach is far more efficient because the heavy lifting is done at the source (the database), and only the necessary data is transported. This concept of pushing the workload back to the database is known as Query Folding.

The method you choose often depends on the situation. The Navigator is perfectly fine when you're exploring a new dataset or when the tables are small and performance isn't a concern. However, for very large tables with millions of rows, using a SELECT statement from the start is significantly better. It reduces network traffic and the processing load on Power BI, leading to much faster data refresh times.

The errors you encountered when trying to add the statement later are quite common. Power Query records each transformation as a separate, sequential step. When you tried to edit the initial "Source" step, you changed its fundamental output, which broke the subsequent steps that were dependent on the original structure. The SELECT statement is meant to be part of the initial connection setup, not added after the fact.

Ultimately, if your current dashboard is performing well, there's no pressing need to rebuild it. You've found a solution that works. Just keep your vendor's advice in mind for future projects, as starting with a SELECT statement can be a lifesaver when you're working with massive datasets.

Best regards,

Royel · ‎10-04-2025

Hi @crmorgan7,

When it comes to results, there’s no difference between the two methods you can use either one. However, each has its own advantages depending on your situation.

Use Folder Tree when:

You’re new to Power BI and want a simpler setup
Your data source supports automatic query folding
You don’t need complex filtering at the start
Your dataset is moderately sized

Use SELECT Statement when:

You’re working with large datasets and need better performance
You want to apply specific SQL optimizations
You need early filtering at the database level
You want to leverage special features of your ODBC driver

Note:
Using a SELECT statement is generally more efficient, especially for performance tuning.
However, the Folder Tree method also supports query folding so with proper setup, it can achieve similar performance. You can learn more about query folding from here

Find this helpful? ✔ Give a Kudo • Mark as Solution – help others too!

lbendlin · ‎10-01-2025

If in doubt consult the query diagnostics. It's possible that both options result in the same level of query folding.

Is it necessary to use a Select statement when choosing a table for my data flow?

Helpful resources

Power BI Monthly Update - November 2025

Fabric Data Days

FabCon Atlanta 2026

FabCon is coming to Atlanta

Is it necessary to use a Select statement when choosing a table for my data flow?

Helpful resources

Power BI Monthly Update - November 2025

Fabric Data Days

FabCon Atlanta 2026