Hello,
I have developed some functions in Python, packaged them as a .whl, and imported the package into my Fabric environment.
The functions use spark to perform read and write actions.
This is a simplified version of the function in the package:
import spark

def get_file(outputFilePath: str):
    df = spark.read.option("multiline", "true").json(outputFilePath)
    return df
In the package, spark is defined in a spark.py file like this:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("GlobalSpark") \
    .master("local[*]") \
    .getOrCreate()
This code works locally because the builder notices there is no existing SparkSession and creates one.
However, when calling this function from a Fabric notebook, I get: NameError: name 'spark' is not defined.
Since I will be executing this code on the Fabric runtime, I expected the SparkSession (spark) to also be available within these functions; that is why I did not pass spark explicitly to the function.
(I tried both import methods: through a custom library as well as direct installation via the built-in resources.)
I know one option is to refactor my whole codebase in order to pass spark explicitly.
Before doing that, I wanted to check whether this is the intended behavior or whether I am configuring something incorrectly.
Kind regards,
Anissa
When you develop locally, your spark.py file explicitly creates a SparkSession using SparkSession.builder. This works because you control the full Python environment and are expected to instantiate Spark manually.
However, in Microsoft Fabric Notebooks, a SparkSession is already created and provided implicitly by the runtime. You can access it simply via the spark object, but you should not re-instantiate or create a new SparkSession.
Don't define your own spark.py module
Rename your module to something else, such as spark_utils.py or file_io.py. This prevents import spark from resolving to your own module and shadowing the notebook's built-in spark object.
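To illustrate why the rename matters, here is a minimal stdlib-only sketch of module shadowing (the spark module is simulated; no real pyspark is involved). Once a module named spark is registered, import spark binds that module object, so Spark-style attribute access fails:

```python
import sys
import types

# Simulate a package shipping its own spark.py by registering a
# module named 'spark' as if it had already been imported.
fake = types.ModuleType("spark")
sys.modules["spark"] = fake

import spark  # binds the shadowing module, not a SparkSession

print(type(spark).__name__)    # module
print(hasattr(spark, "read"))  # False: no DataFrameReader here

del sys.modules["spark"]  # clean up the simulation
```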
Use the implicit spark from Fabric
Remove the following from your code entirely:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("GlobalSpark") \
    .master("local[*]") \
    .getOrCreate()
Instead, for code defined directly in the notebook, rely on the pre-provided spark:
def get_file(outputFilePath: str):
    df = spark.read.option("multiline", "true").json(outputFilePath)
    return df
Avoid import spark altogether
If you must package utilities, structure it like this:
# file_io.py
def get_file(outputFilePath: str):
    from pyspark.sql import SparkSession

    spark = SparkSession.getActiveSession()
    if spark is None:
        raise RuntimeError("No active SparkSession found. This function must be run within a Spark environment.")
    df = spark.read.option("multiline", "true").json(outputFilePath)
    return df
But in Fabric, SparkSession.getActiveSession() should return the running session just fine.
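As a quick illustration of how that guard behaves, here is a runnable stand-in that replaces pyspark with a hypothetical StubSession class so it executes anywhere; the control flow mirrors the snippet above:

```python
# StubSession is a hypothetical stand-in for pyspark's SparkSession,
# used only to demonstrate the getActiveSession guard.
class StubSession:
    _active = None

    @classmethod
    def getActiveSession(cls):
        return cls._active

def get_file(path):
    spark = StubSession.getActiveSession()
    if spark is None:
        raise RuntimeError("No active SparkSession found.")
    return ("read", path)  # stand-in for the real read

# Outside a Spark environment, the guard fails fast:
try:
    get_file("data.json")
except RuntimeError as e:
    print("guarded:", e)

# With an active session, the call proceeds:
StubSession._active = StubSession()
print(get_file("data.json"))  # ('read', 'data.json')
```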
Please mark this post as solution if it helps you. Appreciate Kudos.
Hi @fAnissa,
since we haven’t heard back from you, we wanted to kindly follow up and check whether the solution provided resolved your issue. Let us know if you need any further assistance here.
Thanks,
Prashanth Are
MS Fabric community support
If this post helps, please consider accepting it as the solution to help other members find it more quickly, and give Kudos if it helped you resolve your query.
Hello Andrew,
Thank you for your reply.
Indeed, I performed the following actions:
- I removed the spark.py utility from my package
- I added this to my function :
spark = SparkSession.getActiveSession()
Once these two actions were performed, Fabric recognizes spark and no longer throws a NameError.
However...
If I don't explicitly call getActiveSession(), the problem still persists.
Why don't we have direct access to the SparkSession (spark) in these functions without defining it first?
Thanks so much for your help so far !
Kind regards,
Anissa
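For what it's worth, this follows from Python's namespace rules rather than anything Fabric-specific: Fabric injects spark into the notebook's own global namespace, and every imported module has its own separate global namespace, so functions defined in a package never see the notebook's variables. A stdlib-only sketch (both modules are simulated with hypothetical names):

```python
import types

# Simulate the notebook's namespace: Fabric injects 'spark' here.
notebook = types.ModuleType("notebook_sim")
notebook.spark = object()  # stand-in for the injected SparkSession

# Simulate a packaged module: its globals are a separate dict,
# so names from the notebook are invisible to its functions.
util = types.ModuleType("file_io_sim")
exec(
    "def get_file(path):\n"
    "    return spark  # resolved against file_io_sim's globals\n",
    util.__dict__,
)

try:
    util.get_file("some.json")
except NameError as e:
    print(e)  # name 'spark' is not defined
```

This is why SparkSession.getActiveSession() is needed inside packaged code: it asks Spark itself for the session instead of relying on a notebook-local variable.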