Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Anonymous
Not applicable

Linear regression - Challenges with trend line at year change

Hi all,

In power Bi desktop, I have been using a dax expression to model the trend line of production output per week (bellow).
It was working just fine, before the year change.
However, when including weeks from different years (in this case, weeks 52 and 1, from year 2021 and 2022 respectively) my trend line has split into three diferent slopes, like so:

 

LinaSilveira_0-1673534975758.png

One slope for the 2021's weeks until 52 (decreasing), another slope for between 52 and 1 (increasing), and another for selected weeks in 2022 (decreasing again, different slope).

 

My goal is to have the trend line show up as a continuum, with one slope only, to model the complete selected period, regardless of year.

 

The measure I'm using for trend line is the following:

 

Trend line =
VAR Known =
FILTER (
SELECTCOLUMNS (
ALLSELECTED ( DateTable[Week] ),
"Known[X]", DateTable[Week],
"Known[Y]", CALCULATE ( SUM ( ProductionTable[Output] ) )
),
AND ( NOT ( ISBLANK ( Known[X] ) ), NOT ( ISBLANK ( Known[Y] ) ) )
)
VAR Count_Items =
COUNTROWS ( Known )
VAR Sum_X =
SUMX ( Known, Known[X] )
VAR Sum_X2 =
SUMX ( Known, Known[X] * Known[X] )
VAR Sum_Y =
SUMX ( Known, Known[Y] )
VAR Sum_XY =
SUMX ( Known, Known[X] * Known[Y] )
VAR Average_X =
AVERAGEX ( Known, Known[X] )
VAR Average_Y =
AVERAGEX ( Known, Known[Y] )
VAR Slope =
DIVIDE (
Count_Items * Sum_XY - Sum_X * Sum_Y,
Count_Items * Sum_X2 - Sum_X * Sum_X
)
VAR Intercept = Average_Y - Slope * Average_X
RETURN
SUMX ( DISTINCT ( DateTable[Week] ), Intercept + Slope * DateTable[Week] )

 

Other important details:

ProductionTable[Output] is measured per day,

ProductionTable has it's own [Output_Date] column (that is the date when output was measured).

ProductionTable is connected to DateTable through a many to one relationship using columns ProductionTable[Output_Date] and DateTable[Date]. Cross filter is single.

DateTable contains a 'Date Hierarchy', of which Week is not a part of.

'Date Hierarchy' contains Year, Quarter, Month,Day.

 

LinaSilveira_3-1673537270599.png

 

 

The visualization should show Trendline for output, per week of the year. These are the fiels in my axis':

 

LinaSilveira_1-1673536427598.png

My DateTable has the following columns/data:

 

LinaSilveira_2-1673537201464.png

 

Any tips on what I can do to make the trend line have one slope for the whole selected period, regardless of year?

 

Thanks for any help,

I can provide more info if you have questions

 

6 REPLIES 6
johnt75
Super User
Super User

Instead of having both year and week on the axis use the [Year - Week] column from your date table, and change the trend line measure to use that column instead of the week column.

You may need to create another column in the date table for sorting the Year - Week by.

Anonymous
Not applicable

Hi johnt75, thanks for your input.

 

I did what you suggested.

Changed the X-Axis to Year-Week instead (which has datatype string)

 

It now looks like so:

LinaSilveira_5-1673614494593.png

 

Have also sorted it by WeekofyearKey, which has data type wholenumber:

 

LinaSilveira_1-1673613622784.png

LinaSilveira_2-1673613682438.png

Finally, have changed the measure for trendline to use Year-Week, and called it Trend line 2:

 

Trend line 2=
VAR Known =
FILTER (
SELECTCOLUMNS (
ALLSELECTED ( DateTable[Year - Week] ),
"Known[X]", DateTable[Year - Week],
"Known[Y]", CALCULATE ( SUM ( ProductionTable[Output] ) )
),
AND ( NOT ( ISBLANK ( Known[X] ) ), NOT ( ISBLANK ( Known[Y] ) ) )
)
VAR Count_Items =
COUNTROWS ( Known )
VAR Sum_X =
SUMX ( Known, Known[X] )
VAR Sum_X2 =
SUMX ( Known, Known[X] * Known[X] )
VAR Sum_Y =
SUMX ( Known, Known[Y] )
VAR Sum_XY =
SUMX ( Known, Known[X] * Known[Y] )
VAR Average_X =
AVERAGEX ( Known, Known[X] )
VAR Average_Y =
AVERAGEX ( Known, Known[Y] )
VAR Slope =
DIVIDE (
Count_Items * Sum_XY - Sum_X * Sum_Y,
Count_Items * Sum_X2 - Sum_X * Sum_X
)
VAR Intercept = Average_Y - Slope * Average_X
RETURN
SUMX ( DISTINCT ( DateTable[Year - Week] ), Intercept + Slope * DateTable[Year - Week] )

 

Now measures Trend line and Trend line 2 present different behaviours in the visualization:

 

Trendline is now overlapping values of ProductionTable[Output] , like so:

 

LinaSilveira_4-1673614287706.png

As for Trend line 2 

 It generates an error message saying in ProductionTable[Trend line 2] : The function SUMX cannot work with values of type string

 

So the problem seems to be, in order to create the trend line through linear regression, The X axys needs to be either a date or a whole number. Strings don't work for that purpose.

How do I achieve Year -week as a date or whole number?

Or, if the problem is just function SUMX, are there any alternatives to it that can use strings?

 

I have tried something else, that also didn't work :

  • Changed the X-axis to WeekofYearKey

As WeekofYearKey has datatype whole number (it is a justaposition of year and week with the format YYYYWW) I thought I could give it a try and see how it would look like if I used it as XAxis and in the Trend line as well;

 

Basically have changed the trend measure to use WeekofYearKey instead of the Year - Week

and used these in the axis like so:

LinaSilveira_6-1673617411435.png

 

It turned out terribly wrong, the X axis assumes there must be a period between 202152 and 202201 that should be accounted for in the visualization AND in top of that, the trend line also has 3 different slopes:

  1. one before the period without data entries 
  2. one in the imaginary "weeks" between 202152 and 202201
  3. another from 202201 on... 

LinaSilveira_7-1673617770944.png

 

Any other ideas?

 

Thank you

 

 

 

You could create a week commencing column in the date table, like

[Date] - WEEKDAY([Date],2) + 1,

You can use that in the measure, not sure if you would have to use it on the axis as well or whether you could use the year week column.

Anonymous
Not applicable

That didn't work either.

What I am currently trying is:

Have created and index for all the Year-week's, in my time interval.

This index behaves as in the following example:

 

Year WeekSequencial index
2021 011
2021 022
......
2021 5252
2022 01 53
2022 0254
2022 0355
......
2022 51103
2022 52104

 

Now, I'm using this sequential index in the X axis, and as it is continuous, it allows me to use the default trendline available in the 'Analytics' panel.

LinaSilveira_2-1674042914206.png

LinaSilveira_3-1674042955316.png

 

 

 

However, now the challenge is, how do I make it show the corresponding Year Week, instead of the index?

 

How my DateTable looks now:

 

LinaSilveira_0-1674042570135.png

 

and how using the sequential index reflects on the Production output table and trendline:

 

LinaSilveira_1-1674042746851.png

 

Now I would just need to be able to show Year week in the X axis, in a purely cosmetic way, instead of this ugly index. Any ideas?

 

Thanks!

 

 

 

It might be possible to keep the sequential week number in the calculation and use the year week column on the chart axis

Anonymous
Not applicable

Wait, I have messed up in keeping X-Axis as type continuum. But changing it to categorical still didn't solve the trend slope part unfortunately.

 

LinaSilveira_1-1673618290195.png

 

 

 

LinaSilveira_0-1673618254040.png

 

 

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.