Hello,
I am interested in converting a Fabric notebook that I’ve created, containing multiple transformation steps, into a Spark job. However, I’m struggling to find comprehensive documentation on the additional code I need to include (such as building a Spark session) and how to create a reference file if necessary (along with the required code). Additionally, I’m unsure about the modifications needed in my notebook to enable downloading it as a .py file and running it as a Spark job.
The motivation behind this transition is that I want the transformation code to execute every 2 hours, and from what I’ve read, using Spark jobs may offer better performance for this use case.
I would greatly appreciate it if someone could provide an example of how to achieve this.
Thank you in advance! 😊
Hello!
In the meantime, we decided to run the Fabric notebooks via pipelines for this use case.
Thank you anyway for the help! 🙂
Hello @Dinosauris,
Just to provide a bit more clarity here. Since the transformation is already in the notebook, you can add it to a pipeline (as @Anonymous called out above), or you can schedule it from the notebook pane itself. The snapshot below should help.
[screenshot: the notebook's Schedule option]
Thanks
Himanshu
Hello Himanshu,
thanks for the info. I am aware of this, and it is my backup option (and will probably end up being the solution). However, I still wanted to understand why I cannot make the notebook work as a Spark job. I have a lot of display() calls in there and don't know whether that could be an issue. No mssparkutils APIs are called.
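For illustration, here is how a display() call could be swapped out in the .py version. display() is provided by the notebook environment and will typically not be defined in a plain Spark script, so df.show() is the usual substitute (a minimal sketch; the DataFrame here is just a placeholder):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("DisplayExample").getOrCreate()
df = spark.range(5)          # placeholder DataFrame for illustration

# display(df)                # notebook-only helper; raises NameError in a plain .py script
df.show(truncate=False)      # standard PySpark equivalent: prints rows to stdout

spark.stop()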
Thanks & BR 🙂
Hi @Dinosauris
Thanks for using Fabric Community.
Please refer to these documents:
https://learn.microsoft.com/en-us/fabric/data-engineering/create-spark-job-definition
https://www.red-gate.com/simple-talk/databases/sql-server/bi-sql-server/using-spark-jobs-for-multipl...
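For reference, the main definition file of a Spark Job Definition is just a standalone .py script. A minimal sketch of what the converted notebook's entry point might look like is below (the app name, table names, and column are hypothetical placeholders, not from the docs):

# main.py - hypothetical entry point for a Spark Job Definition
from pyspark.sql import SparkSession

if __name__ == "__main__":
    # Unlike a notebook, a submitted script has no pre-created session,
    # so build (or get) one explicitly.
    spark = SparkSession.builder.appName("MySparkJob").getOrCreate()

    # Placeholder transformation: read a lakehouse table, filter, write back.
    df = spark.read.table("source_table")            # hypothetical table name
    cleaned = df.filter(df["amount"] > 0)            # hypothetical column
    cleaned.write.mode("overwrite").saveAsTable("cleaned_table")

    spark.stop()

Helper .py modules imported by the main script can be uploaded as reference files alongside it; the first link above covers the upload steps.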
Hope this helps. Please let me know if you have any further questions.
Thanks for the quick reply!
Finally, I did two things. I added the following code at the beginning of my notebook to start the Spark session:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("MySparkJob") \
    .getOrCreate()
And the last code line stops the Spark session:
spark.stop()
However, when I try to run the Spark job, I get this error message: "Execution is not supported for Spark Job Definitions that do not have content."
Hi @Dinosauris
Apologies for the issue you have been facing.
You can also run your notebooks using the Notebook activity in pipelines. Pipelines have more powerful features, and some mssparkutils APIs are not supported in Spark Job Definitions.
For more information please refer to this link:
Notebook activity - Microsoft Fabric | Microsoft Learn
Scheduling Notebooks in Microsoft Fabric + Reading JSON from Dynamic File Paths (youtube.com)
Hope this helps. Please let me know if you have any further questions.
Hi @Dinosauris
Apologies for the delay in response.
Please go ahead and raise a support ticket to reach our support team: Link
After creating the support ticket, please share the ticket number here, as it would help us track the issue.
Thanks.