lchinelli
New Member

How to run PySpark code directly from a GitHub repo?

I'd like to create a data pipeline and run PySpark code directly from a GitHub repo. Is that possible?

5 REPLIES
v-ssriganesh
Community Support

Hello @lchinelli,
Thank you for reaching out to the Microsoft Fabric Forum Community.

I reproduced your scenario in Microsoft Fabric and got the desired outcome. You can run PySpark code directly from a GitHub repo by using a Fabric notebook that fetches the script at run time with requests.get() and executes it with exec(). The notebook can then be triggered from a Data Factory pipeline through a Notebook activity.

How It Works:

  • Your .py file is stored in GitHub (public or private).
  • The Fabric notebook reads and executes that code using the raw GitHub URL.
  • A pipeline triggers the notebook and runs the code.
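
The steps above can be sketched roughly as follows. This is a minimal sketch, not the exact notebook used in this reply: the function names `fetch_script` and `run_script`, the placeholder URL, and the `Authorization: token ...` header for private repos are assumptions made for illustration. In a Fabric notebook the `spark` session already exists, so it is handed to the script through the namespace.

```python
from typing import Optional


def fetch_script(raw_url: str, token: Optional[str] = None) -> str:
    """Download the text of a script from a raw GitHub URL."""
    import requests  # imported here so run_script() is usable without requests

    # Assumption: a private repo accepts a personal access token in this header.
    headers = {"Authorization": f"token {token}"} if token else {}
    response = requests.get(raw_url, headers=headers, timeout=30)
    response.raise_for_status()  # stop on 401/404 instead of exec-ing an error page
    return response.text


def run_script(source: str, namespace: dict, origin: str = "<github-script>") -> dict:
    """Compile and execute script text, returning the namespace it ran in."""
    code = compile(source, origin, "exec")  # origin shows up in tracebacks
    exec(code, namespace)
    return namespace


# Usage inside a Fabric notebook cell (hypothetical URL, not run here):
# RAW_URL = "https://raw.githubusercontent.com/<owner>/<repo>/<branch>/script.py"
# run_script(fetch_script(RAW_URL), {"spark": spark})
```

Because exec() runs whatever the URL returns with the notebook's own permissions, pointing the URL at a specific commit rather than a moving branch is a reasonable precaution.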


Example GitHub Code Used:

data = [
    ("Microsoft Fabric", 2025),
    ("Power BI", 2024),
    ("Synapse", 2023)
]
columns = ["Product", "Year"]
df = spark.createDataFrame(data, columns)
df.show()


Here’s a successful pipeline run in Microsoft Fabric using a notebook that fetches a PySpark script from GitHub:

[Screenshot: successful pipeline run (vssriganesh_0-1753082115969.png)]

If this information is helpful, please “Accept as solution” and give a "kudos" to assist other community members in resolving similar issues more efficiently.
Thank you.

Is it possible to run code that lives in other folders by importing it into a main.py file or a main.ipynb notebook? I ask because my code is object-oriented (OOP).
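
For the multi-folder case asked about here, one possible approach (not confirmed anywhere in this thread) is to download the whole repository as a zip archive, extract it, and put the extracted folder on `sys.path` so that ordinary `import` statements work. The sketch below assumes the archive has a single top-level folder, as GitHub branch archives do; the helper names, the URL placeholders, and the module name `main` are hypothetical.

```python
import importlib
import io
import sys
import zipfile
from pathlib import Path


def extract_repo_archive(archive_bytes: bytes, target_dir: str) -> Path:
    """Unpack a repository zip and return its single top-level folder."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        archive.extractall(target_dir)
        top_level = {name.split("/")[0] for name in archive.namelist()}
    if len(top_level) != 1:
        raise ValueError(f"expected one top-level folder, found {sorted(top_level)}")
    return Path(target_dir) / top_level.pop()


def import_from_repo(repo_root: Path, module_name: str):
    """Make the extracted repo importable and import one module from it."""
    root = str(repo_root)
    if root not in sys.path:
        sys.path.insert(0, root)
    importlib.invalidate_caches()  # the files appeared after the interpreter started
    return importlib.import_module(module_name)


# Usage in a notebook (hypothetical URL and module name, not run here):
# import requests, tempfile
# ZIP_URL = "https://github.com/<owner>/<repo>/archive/refs/heads/<branch>.zip"
# archive = requests.get(ZIP_URL, timeout=60).content
# repo_root = extract_repo_archive(archive, tempfile.mkdtemp())
# main = import_from_repo(repo_root, "main")
```

This only makes the modules importable in the notebook's driver process; code that Spark ships to executors, such as UDFs defined in those modules, may need additional packaging.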

BhaveshPatel
Community Champion

As far as I know, you cannot run program code such as PySpark directly from a GitHub repo. The GitHub repo integration is meant for CI/CD. By the way, why do you need to do this?

Instead, I would use Dataflow Gen2 or Python notebooks.

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.

For better version control, and to import modules from other folders.

KevinChant
Super User

Do you mean run a notebook from a GitHub repo using a GitHub workflow? If so then absolutely.

I did a post that shows how you can do it with Azure DevOps, you can port the logic over:
https://www.kevinrchant.com/2025/01/31/authenticate-as-a-service-principal-to-run-a-microsoft-fabric... 
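
As a rough idea of what a workflow step could call, the sketch below starts a notebook run through the Fabric REST API using a service principal. It is not taken from the linked post. It assumes the Microsoft Entra client-credentials token endpoint and the Fabric job scheduler endpoint with `jobType=RunNotebook`, and that the service principal has been granted access to the workspace; all IDs and function names are placeholders.

```python
import json
import urllib.parse
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"
FABRIC_SCOPE = "https://api.fabric.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Build the client-credentials token request for a service principal."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": FABRIC_SCOPE,
    }).encode()
    return urllib.request.Request(url, data=form, method="POST")


def build_run_notebook_request(workspace_id: str, notebook_id: str, access_token: str):
    """Build the request that asks Fabric to start an on-demand notebook job."""
    url = (f"{FABRIC_API}/workspaces/{workspace_id}/items/{notebook_id}"
           "/jobs/instances?jobType=RunNotebook")
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers, method="POST")


def trigger_notebook(tenant_id, client_id, client_secret, workspace_id, notebook_id):
    """Get a token, start the notebook, and return the job's polling URL if given."""
    token_request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(token_request) as response:
        access_token = json.load(response)["access_token"]
    run_request = build_run_notebook_request(workspace_id, notebook_id, access_token)
    with urllib.request.urlopen(run_request) as response:
        return response.headers.get("Location")
```

In a GitHub workflow, the tenant ID, client ID, and client secret would typically come from repository secrets exposed to the step as environment variables.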
