eddd83
Resolver I

How to find the most computationally expensive calculated columns

So in the data model that I inherited, there are a number of calculated columns in the main table. By chance, I noticed that if I delete one particular "end customer" column, it cuts the time to open the .pbix in half (~six minutes to three minutes).

 

According to the metrics in DAX Studio, there isn't anything particularly notable about this "end customer" column. For example, in terms of size and cardinality, it's much smaller than other columns in the same table. However, the logic in this "end customer" column is poorly written (it checks whether RELATED(other_table_column) is blank and, if not, returns that value), hence my theory that eliminating this calculation would dramatically speed up the model.

 

My main question: aside from deleting columns one by one and manually timing how long the file takes to open, is there a faster way to find these computationally expensive columns? As I mentioned, nothing jumps out at me in the DAX Studio metrics.

27 REPLIES
lbendlin
Super User

First, you can use Performance Analyzer to see the impact on the UX.

 

lbendlin_0-1704223318117.png

 

And then you can use DAX Studio to check the query plan for your column (or measure)

lbendlin_1-1704223379362.png
lbendlin_2-1704223419460.png
lbendlin_3-1704223474055.png
lbendlin_4-1704223495070.png

 

Check the length of the query plan but also the max number of records.

So I did what you suggested. Here's the problematic calculated column (end_customer).

end customer dax studio.PNG

 

However, here's another column (transaction_id_order). Removing it from the data model makes almost no difference in speed, whereas removing end_customer makes a big difference. Again, I see nothing different between the two query plans aside from the problematic end_customer having fewer records.

transaction id order dax studio.PNG

 

Here's what the end_customer logic looks like. Basically, it checks RELATED(XXX) for a number of tables and returns the first one that isn't blank.

end customer logic.PNG

I am surprised that the first query performs worse than the second. What do the engine timings look like for both?

By engine timings, did you mean server timings? Again, the top one (end_customer) is the CPU-intensive calculation.

0104 server timings.PNG

 

I'm not 100% sure I did this correctly, but all I did was drop each calculated column into a table visual (a separate table for each column), did a "Refresh visuals", then a "Copy query", pasted the DAX into DAX Studio, and hit "Run" with "Query Plan" and "Server Timings" turned on.

 

0104 performance analyzer.PNG

Yes, that is the correct process.  But you are supposed to evaluate the DAX for the calculated column, not the final result.

 

Rather intriguing that both queries spend all their time in the formula engine.  A simple column pull should be done entirely in the storage engine.
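A "simple column pull" here just means a query that returns the stored values of the column. As a minimal, hedged sketch (assuming the calculated column lives on the Fact_Table table that appears later in this thread):

// Distinct stored values of the calculated column. A query of this
// shape should be answered almost entirely by the storage engine.
EVALUATE
VALUES ( Fact_Table[End_Customer] )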

But how do I evaluate the DAX for the column in DAX Studio?

 

End_Customer =
VAR customer_number = [Customer Number]
VAR int_sales_brid_0_ship_to_num =
    RELATED ( Internal_Sales_Bridge[Ship_To_Number] )
VAR lift_table_brid_0_cus_num =
    RELATED ( Lift_Table_Bridge[Customer Number] )
VAR output =
    // Return the first non-blank value, in order of preference
    IF (
        ISBLANK ( int_sales_brid_0_ship_to_num ),
        IF (
            ISBLANK ( lift_table_brid_0_cus_num ),
            customer_number,
            lift_table_brid_0_cus_num
        ),
        int_sales_brid_0_ship_to_num
    )
RETURN
    output

 

For example, how do I put the above into DAX Studio? (FYI, this is a short version of a similar formula from a related model.)

Everything in DAX is a table.  You can use 

 

EVALUATE ROW("column name", <your DAX>)

 

or 

EVALUATE  {<your DAX>}

 

or any other variant that produces a table.
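As a trivial illustration of the two forms (the 1 + 1 is just a placeholder scalar, not your column logic):

EVALUATE ROW ( "Test", 1 + 1 )    // one-row table with a column named "Test"

or, using the table constructor (the single column is auto-named "Value"):

EVALUATE { 1 + 1 }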

 

Like so:

EVALUATE {
VAR customer_number = [Customer Number]
VAR int_sales_brid_0_ship_to_num =
    RELATED ( Internal_Sales_Bridge[Ship_To_Number] )
VAR lift_table_brid_0_cus_num =
    RELATED ( Lift_Table_Bridge[Customer Number] )
RETURN
    IF (
        ISBLANK ( int_sales_brid_0_ship_to_num ),
        IF (
            ISBLANK ( lift_table_brid_0_cus_num ),
            customer_number,
            lift_table_brid_0_cus_num
        ),
        int_sales_brid_0_ship_to_num
    )
}

0105 failure eval 02.PNG
0105 failure eval 01.PNG

 

Tried both options and I'm getting two different error messages.

EVALUATE ROW needs a column name.

 

EVALUATE ROW("column name", <your DAX>)

 

I'm stuck getting the same error.

0115 same old error.PNG

VAR checkSource = VALUES ( Fact_Table[Source] )

Same error. I've only included a portion of the actual logic.

0116 same old error.PNG

You are attempting to return a table inside a row.

 

Remove lines 2, 3, and 12.
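As a generic, hedged illustration of the table-versus-scalar distinction (reusing Fact_Table[Source] from the snippet above): hand a table expression straight to EVALUATE, or reduce it to a single value before placing it inside ROW or { }. Two separate queries:

// Table result: EVALUATE itself expects a table expression.
EVALUATE
VALUES ( Fact_Table[Source] )

// Scalar result: COUNTROWS collapses the table to one value,
// which can then sit inside a row.
EVALUATE
ROW ( "Source count", COUNTROWS ( VALUES ( Fact_Table[Source] ) ) )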

So when I add back in the RELATED variables, I get this message. (FYI, this is only one of many RELATED variables in this logic.)

 

0116 -2 same old error.PNG

Remember that EVALUATE must return a table. RELATED will always return a scalar value for a single column. You are mixing too many concepts at once.

 

Check your data model, then formulate the right DAX. You can also let the "DAX Query View" do the heavy lifting for you when you ask it to create the code for a calculated column, a measure, etc.
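One way to give RELATED the row context it needs in a DAX Studio query is to iterate the host table and evaluate the column expression per row. A minimal sketch, assuming the calculated column lives on Fact_Table and that the relationships to Internal_Sales_Bridge and Lift_Table_Bridge exist in the model:

EVALUATE
// ADDCOLUMNS iterates Fact_Table, so RELATED has a row context,
// just as it does inside the original calculated column.
ADDCOLUMNS (
    Fact_Table,
    "End_Customer_test",
        VAR int_sales_brid_0_ship_to_num =
            RELATED ( Internal_Sales_Bridge[Ship_To_Number] )
        VAR lift_table_brid_0_cus_num =
            RELATED ( Lift_Table_Bridge[Customer Number] )
        RETURN
            IF (
                ISBLANK ( int_sales_brid_0_ship_to_num ),
                IF (
                    ISBLANK ( lift_table_brid_0_cus_num ),
                    Fact_Table[Customer Number],
                    lift_table_brid_0_cus_num
                ),
                int_sales_brid_0_ship_to_num
            )
)

Running something like this with Query Plan and Server Timings enabled approximates the cost of computing the expression itself, which is closer to what the calculated column pays at refresh time than a query over the already-stored values. On a very large table, consider restricting Fact_Table with FILTER or CALCULATETABLE first so the result set stays manageable.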

The whole point of this exercise was to compare different columns and understand why some columns have a greater effect on the model's efficiency/speed. The above logic is the correct DAX for my business case.

Thanks. I have been busy this past week. I will try again when I am free.

lbendlin
Super User

In DAX Studio, check the query plan for that DAX, especially the Records column. High numbers there indicate Cartesian joins.

I'm not familiar with using the query plan. Is there a particular video or webpage that explains how I go about using the query plan to analyze calculated columns?

eddd83
Resolver I

Yes, I understand that using measures is preferable, but this data model is a mess (i.e., it is 82 columns wide, and more than 50% of the columns are calculated columns). Ideally, my original question would be answered and then I would know which columns to prioritize for possible conversion into measures.
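If some of these columns do end up becoming measures, the end_customer pattern above could, as a rough sketch, look something like the following. This is an assumption-heavy illustration (SELECTEDVALUE only returns a value at the right granularity, and the business logic may not translate one-to-one), not a drop-in replacement:

End Customer (measure) =
// Return the first non-blank value, mirroring the calculated column's
// order of preference; COALESCE skips blanks from left to right.
VAR ship_to  = SELECTEDVALUE ( Internal_Sales_Bridge[Ship_To_Number] )
VAR lift_cus = SELECTEDVALUE ( Lift_Table_Bridge[Customer Number] )
VAR cust_num = SELECTEDVALUE ( Fact_Table[Customer Number] )
RETURN
    COALESCE ( ship_to, lift_cus, cust_num )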
