Hello,
I am interested in converting a Fabric notebook that I’ve created, containing multiple transformation steps, into a Spark job. However, I’m struggling to find comprehensive documentation on the additional code I need to include (such as building a Spark session) and how to create a reference file if necessary (along with the required code). Additionally, I’m unsure about the modifications needed in my notebook to enable downloading it as a .py file and running it as a Spark job.
The motivation behind this transition is that I want the transformation code to execute every 2 hours, and from what I’ve read, using Spark jobs may offer better performance for this use case.
I would greatly appreciate it if someone could provide an example of how to achieve this.
Thank you in advance! 😊
Hello!
In the meantime, we decided to run the Fabric notebooks via pipelines for this use case.
Thank you anyway for the help! 🙂
Hello @Dinosauris,
Just to provide a bit more clarity here. Since the transformation is already in the notebook, you can add it to a pipeline (as @Anonymous called out above), or you can schedule it from the notebook pane itself. The snapshot below should help.
[screenshot: the notebook's Schedule option]
Thanks
Himanshu
Hello Himanshu,
thanks for the info. I am aware of this, and it is my backup option (and will probably end up being the solution). However, I still wanted to understand why I cannot make the notebook work as a Spark job. I have a lot of display() calls in there and don't know whether that could be an issue. No mssparkutils APIs are called.
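For illustration, here is how a display() call could be swapped out in the .py version. display() is provided by the notebook environment and will typically not be defined in a plain Spark script, so df.show() is the usual substitute (a minimal sketch; the DataFrame here is just a placeholder):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("DisplayExample").getOrCreate()
df = spark.range(5)          # placeholder DataFrame for illustration

# display(df)                # notebook-only helper; raises NameError in a plain .py script
df.show(truncate=False)      # standard PySpark equivalent: prints rows to stdout

spark.stop()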
Thanks & BR 🙂
Hi @Dinosauris
Thanks for using Fabric Community.
Please refer to these documents:
https://learn.microsoft.com/en-us/fabric/data-engineering/create-spark-job-definition
https://www.red-gate.com/simple-talk/databases/sql-server/bi-sql-server/using-spark-jobs-for-multipl...
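For reference, the main definition file of a Spark Job Definition is just a standalone .py script. A minimal sketch of what the converted notebook's entry point might look like is below (the app name, table names, and column are hypothetical placeholders, not from the docs):

# main.py - hypothetical entry point for a Spark Job Definition
from pyspark.sql import SparkSession

if __name__ == "__main__":
    # Unlike a notebook, a submitted script has no pre-created session,
    # so build (or get) one explicitly.
    spark = SparkSession.builder.appName("MySparkJob").getOrCreate()

    # Placeholder transformation: read a lakehouse table, filter, write back.
    df = spark.read.table("source_table")            # hypothetical table name
    cleaned = df.filter(df["amount"] > 0)            # hypothetical column
    cleaned.write.mode("overwrite").saveAsTable("cleaned_table")

    spark.stop()

Helper .py modules imported by the main script can be uploaded as reference files alongside it; the first link above covers the upload steps.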
Hope this helps. Please let me know if you have any further questions.
Thanks for the quick reply!
Finally, I did two things. I added the following code at the beginning of my notebook to start the Spark session:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("MySparkJob") \
    .getOrCreate()
And the last code line stops the Spark session:
spark.stop()
However, when I try to run the Spark job, I get this error message: "Execution is not supported for Spark Job Definitions that do not have content."
Hi @Dinosauris
Apologies for the issue you have been facing.
You can also run your notebooks using the Notebook activity in pipelines. Pipelines have more powerful features, and some mssparkutils APIs are not supported in Spark Job Definitions.
For more information please refer to this link:
Notebook activity - Microsoft Fabric | Microsoft Learn
Scheduling Notebooks in Microsoft Fabric + Reading JSON from Dynamic File Paths (youtube.com)
Hope this helps. Please let me know if you have any further questions.
Hi @Dinosauris
Apologies for the delay in response.
Please go ahead and raise a support ticket to reach our support team: Link
After creating the support ticket, please share the ticket number here, as it would help us track the issue.
Thanks.