Solved: Unexpected Results Using LINESTX for Binary Variab...

mh20221111 · ‎09-03-2025

Hello Community,

I am trying to perform a simple linear regression in Power BI using the LINESTX function. My data is:

Cause A: a binary variable (0 or 1)
Downtime(Hrs): a numeric variable (accident downtime per case)
There are 1,428 entries in total.
In the Cause A column, there are 86 entries with the value "1," and the remaining rows contain "0."
There are 10 other columns in the table besides Cause A, but they are not used in the analysis.

I want to regress Downtime(Hrs) on Cause A only.

When I calculate the slope and intercept manually using the least squares method (Excel), I get a slope of about -6.64 and an intercept of about 9.05.

However, when I use the following DAX:
　LINESTX(
　SELECTCOLUMNS(DowntimeData, [Cause A], [Downtime(Hrs)]),
　[Cause A],
　[Downtime(Hrs)]
　)

the returned slope is -0.0000581, which is almost zero and very different from the manual calculation.

I have confirmed that:

Both columns are numeric.
There are no filters or slicers applied.
The same calculation in Excel gives the expected result.

Why does LINESTX return such a small slope in this case?
Is there something about the function's behavior with binary variables or the way it handles the data that I am missing?
Is this a bug, or am I misunderstanding the usage?

Any advice or explanation would be greatly appreciated!

Thank you.

DataNinja777 · ‎09-03-2025

Hi @mh20221111 ,

The issue you're encountering is a subtle but common problem with DAX evaluation context, not a bug in LINESTX or an issue with binary variables. Your problem stems from using SELECTCOLUMNS to create the table for LINESTX to iterate over.

When you wrap your table in SELECTCOLUMNS, the subsequent expressions for your X and Y variables ([Cause A] and [Downtime(Hrs)]) are not being evaluated row-by-row over that new, temporary table as you'd expect. Instead, they are evaluated in the outer filter context, which can cause them to be treated as a single, constant value. When the regression algorithm receives nearly constant inputs, it correctly calculates a slope that is virtually zero.

The correct and more direct approach is to apply LINESTX to your original data table. This ensures a proper row context is established, and the function evaluates the X and Y expressions for each individual row of your data. You don't need SELECTCOLUMNS, as LINESTX will only use the columns you specify anyway.

You should use this DAX formula instead, which will give you the expected slope of approximately -6.64:

LINESTX(
    DowntimeData,
    DowntimeData[Downtime(Hrs)],
    DowntimeData[Cause A]
)

It's also helpful to remember what these coefficients mean in your specific case. The intercept (
approx9.05) is the average Downtime(Hrs) for cases where Cause A is 0. The slope (
approx−6.64) represents the difference in average downtime when moving from the Cause A = 0 group to the Cause A = 1 group. So, your analysis correctly shows that cases associated with Cause A have, on average, 6.64 fewer hours of downtime.

Best regards,

View solution in original post

Shahid12523 · ‎09-03-2025

VAR RegressionTable =
ADDCOLUMNS(
DowntimeData,
"X", DowntimeData[Cause A],
"Y", DowntimeData[Downtime(Hrs)]
)

RETURN
LINESTX(
RegressionTable,
[Y],
[X]
)

This ensures that [X] and [Y] are evaluated within the row context of the table being passed to LINESTX.

Shahed Shaikh

View solution in original post

Shahid12523 · ‎09-03-2025

VAR RegressionTable =
ADDCOLUMNS(
DowntimeData,
"X", DowntimeData[Cause A],
"Y", DowntimeData[Downtime(Hrs)]
)

RETURN
LINESTX(
RegressionTable,
[Y],
[X]
)

This ensures that [X] and [Y] are evaluated within the row context of the table being passed to LINESTX.

Shahed Shaikh

mh20221111 · ‎09-04-2025

I believe I made a basic mistake. Thank you for pointing it out.

DataNinja777 · ‎09-03-2025

Hi @mh20221111 ,

The issue you're encountering is a subtle but common problem with DAX evaluation context, not a bug in LINESTX or an issue with binary variables. Your problem stems from using SELECTCOLUMNS to create the table for LINESTX to iterate over.

When you wrap your table in SELECTCOLUMNS, the subsequent expressions for your X and Y variables ([Cause A] and [Downtime(Hrs)]) are not being evaluated row-by-row over that new, temporary table as you'd expect. Instead, they are evaluated in the outer filter context, which can cause them to be treated as a single, constant value. When the regression algorithm receives nearly constant inputs, it correctly calculates a slope that is virtually zero.

The correct and more direct approach is to apply LINESTX to your original data table. This ensures a proper row context is established, and the function evaluates the X and Y expressions for each individual row of your data. You don't need SELECTCOLUMNS, as LINESTX will only use the columns you specify anyway.

You should use this DAX formula instead, which will give you the expected slope of approximately -6.64:

LINESTX(
    DowntimeData,
    DowntimeData[Downtime(Hrs)],
    DowntimeData[Cause A]
)

It's also helpful to remember what these coefficients mean in your specific case. The intercept (
approx9.05) is the average Downtime(Hrs) for cases where Cause A is 0. The slope (
approx−6.64) represents the difference in average downtime when moving from the Cause A = 0 group to the Cause A = 1 group. So, your analysis correctly shows that cases associated with Cause A have, on average, 6.64 fewer hours of downtime.

Best regards,

mh20221111 · ‎09-04-2025

Using the formula you provided resolved the issue. Thank you.

Unexpected Results Using LINESTX for Binary Variable Regression in Power BI

Helpful resources

Power BI Monthly Update - August 2025

Fabric Community Update - August 2025

How to Get Your Question Answered Quickly

Join us at FabCon Vienna from September 15-18, 2025

Unexpected Results Using LINESTX for Binary Variable Regression in Power BI

Helpful resources

Power BI Monthly Update - August 2025

Fabric Community Update - August 2025

How to Get Your Question Answered Quickly