Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
ThisIsFrank
Frequent Visitor

SQL Query or Remove Columns

Hello,

 

I've got a question that I haven't been able to figure out on my own, Using PBI Desktop, asfter selecting a data source, you're given the option in Advanced Options to write an SQL statement to precisely select the columns/parameters you'd like to have load in your dataset. Lets assume I do the following:

 

SELCECT COLUMN-A, COLUMN-B

FROM TABLE-1

So I'll only get 2 columns from my 100 column table. 

 

Let's say that I did NOT write this statement and opted to load the entire report. Now I go to Edit Queries and Remove all Columns except COLUMN-A, COLUMN-B using the PBI UI. 

 

  • Did I just accomplish the same thing?
    • If I choose to Remove Columns, will the removed columns still be queried, but just not displayed in the dataset? 
  • If so, what is the equivalent to a WHERE statement in Query Editor? Some sort of DAX measure?

The end goal here is to limit load times. Our connection to a certain Oracle DB is very slow and I'm trying to make my dataset as lean as possible.

3 REPLIES 3
v-shex-msft
Community Support
Community Support

Hi ThisIsFrank,

 

>> If I choose to Remove Columns, will the removed columns still be queried, but just not displayed in the dataset?

I think the columns will be removed from query, but the query editor has stored the applied steps. You can cancel this step to recover this column.

 

>> If so, what is the equivalent to a WHERE statement in Query Editor? Some sort of DAX measure?

In query editor, WHERE statement is equivalent to 'FilteredRows' or ‘SelectedRows’; In dax measure, it same as filter function.

 

In summary, if you operate the data on query editor, it directly operate on the query, but in Dax formula, it use the filter to hide/display specify records.

 

Reference:

 

Remove columns (Power Query)

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.
KGrice
Memorable Member
Memorable Member

Yes and no, and maybe depending on what your next steps are 🙂

 

From the basic example you listed, you're getting the exact same result, and it wouldn't really matter one way or another. However, when you use Get Data to pull from certain sources, Power Query uses something called query folding. If you import the whole table from SQL, then remove everything but the two columns in the Query Editor, Power Query is smart enough to realize what you've done, and essentially write back your SQL to only pull those two columns. The first load to Preview might take a while since you started with all 100 columns, but refreshes after that will know you only wanted two columns and will grab those, pushing that "smarter" request back to the database.

 

Filtering is your WHERE clause. You could filter Column A to only include values of XYZ, and again, query folding will push that back to the database to do the work instead.

 

However, certain things will break query folding and prevent it from pushing the heavy lifting of your query back to the database. For example, writing custom SQL automatically prevents query folding from taking place. If you used SQL to specify those two columns, then used the query editor to filter Column A for XYZ values, that filter can't be pushed back to the database as a WHERE clause.

 

Which should you use? If you know SQL well, and you know you can get your data the way you want it without further transformation (or at least without further transformation that would prevent query folding anyway), you might as well use SQL. But if you think Power Query could write the SQL better for you, just pull in the whole table and do all of your filtering, transformations, etc., from there. Power Query will use query folding as much as possible to help you out and push the processing back on the database.

 

If you want to know more, here's a pretty good breakdown of query folding to get you started. Fair warning, it's not been updated since May 2015, so there may have been changes or improvements since the time of the article.

I am often using this select in sql during importing huge table

(select columnA,columnB from Table)

 

so...I dont need to write this? I can just load for once whole table, then remove all other columns and this will result, that power query will always send ask to sql just for importing that two columns? Isnt it just a step which powerbi do later, will it really help me if I dont want load all big table from sql?

 

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.