Betty888
Helper II

How to run a linear model by group in Python?

Dear all,

I’m new to Python and need to use it to fit a linear model to the dataset below:

Location   Y    X1   X2
1          32   1    1
1          44   1    2
1          58   1    3
1          76   2    1
1          73   2    2
1          37   2    3
1          52   3    1
1          78   3    2
1          60   3    3
2          93   1    1
2          78   1    2
2          25   1    3
2          97   2    1
2          85   2    2
2          60   2    3
2          70   3    1
2          62   3    2
2          95   3    3

My goal is to fit the following linear model:

Y ~ X1 + X2

The following code gives me exactly what I need (X1 and X2 are treated as categorical via C()):

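import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.formula.api import ols
import scipy.stats as stats

# 'dataset' is the DataFrame that Power BI passes to the Python visual
df = pd.DataFrame(dataset)
# Fit Y on X1 and X2, both treated as categorical factors
reg = ols('Y ~ C(X1) + C(X2)', data=df).fit()
df['fitted_values'] = reg.fittedvalues
# outlier_test() returns a table that includes the studentized residuals
result = reg.outlier_test()
df['student_resid'] = result.student_resid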

 

What I can’t work out is how to run this code separately for each ‘Location’, so that the 'fitted_values' and 'student_resid' columns come from that location's own model.

Any help is highly appreciated.

Thanks a lot in advance.

Regards,

1 REPLY
lbendlin
Super User

"What I’m not able to do is to run this code by ‘Location’, and get my columns 'fitted_values' and 'student_resid' accordingly."

All fields that you want to use in your script must be added to the Values well of the Python visual.
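
Once 'Location' is in the Values well it should show up as a column of `dataset`, and the per-location fit can then be done by splitting the frame on that column. The following is only a minimal sketch, not a tested Power BI solution: the helper name `fit_one_location` is made up for illustration, and the literal `dataset` frame is a stand-in built from the table in the question so the code also runs outside Power BI (inside the Python visual you would drop that block and use the `dataset` Power BI supplies).

```python
import pandas as pd
from statsmodels.formula.api import ols

# Stand-in for the `dataset` frame that Power BI would supply (assumption:
# same column names as in the question). Remove this block in the visual.
dataset = pd.DataFrame({
    "Location": [1] * 9 + [2] * 9,
    "Y": [32, 44, 58, 76, 73, 37, 52, 78, 60,
          93, 78, 25, 97, 85, 60, 70, 62, 95],
    "X1": [1, 1, 1, 2, 2, 2, 3, 3, 3] * 2,
    "X2": [1, 2, 3] * 6,
})


def fit_one_location(group):
    """Fit Y ~ C(X1) + C(X2) on one Location's rows (hypothetical helper)."""
    reg = ols("Y ~ C(X1) + C(X2)", data=group).fit()
    out = group.copy()
    out["fitted_values"] = reg.fittedvalues
    # outlier_test() returns a DataFrame indexed like the input rows
    out["student_resid"] = reg.outlier_test()["student_resid"]
    return out


# Fit one model per Location, then stack the pieces back in the original row order
df = pd.concat(
    [fit_one_location(group) for _, group in dataset.groupby("Location")]
).sort_index()

print(df)
```

Each location gets its own coefficients here, so the fitted values and studentized residuals of Location 1 are not influenced by the rows of Location 2. With 9 rows and 5 parameters per location there are only 4 residual degrees of freedom in each fit, so the studentized residuals should be read with some caution.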
