SamKum
Frequent Visitor

Lakehouse table error in Spark notebook in Fabric

Even though I have set a Lakehouse as the default for my notebook, tables are not recognized when I refer to them directly in the Spark notebook. Please see the attached screenshot and suggest what the reason could be.

Note: As shown in the attached screenshot, creating a dataframe with a SELECT on a lakehouse table does work. (Attachment: Lakehouse_Table_Error_in_Spark_Notebook.png)

1 ACCEPTED SOLUTION
DataBard
Resolver II

For your code above @SamKum, the join function requires a dataframe to join to, not a Delta table name.
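For instance, a minimal sketch of the corrected pattern (the table and column names here are assumptions based on the join discussed in the replies, not your actual screenshot): build the dataframes first, then join the dataframes themselves.

# Hypothetical table names; create dataframes first
sales_df = spark.sql("SELECT * FROM sales")
product_df = spark.sql("SELECT * FROM product")

# Join the dataframes, not the Delta table names
joined_df = sales_df.join(product_df, sales_df.product_ID == product_df.product_ID, "inner")
display(joined_df)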

 

With the default Lakehouse attached to your notebook, you can reference a Delta table to build a dataframe via the absolute path:

 

 

absolute_path = 'abfss://00000000-0000-0000-0000-000000000000@onelake.dfs.fabric.microsoft.com/00000000-0000-0000-0000-000000000000/Tables/DimCustomer'

#Can I connect absolutely?
absolute_df = spark.read.format("delta").load(absolute_path)
display(absolute_df)

 

 

Or the relative path:

 

 

relative_path = 'Tables/DimCustomer'
relative_df = spark.read.format("delta").load(relative_path)
display(relative_df)

 

 

Or via a Spark SQL query against the default lakehouse:

 

 

#Default way from drag-and-drop
df = spark.sql("SELECT * FROM Databard_Demo.DimCustomer LIMIT 1000")
display(df)

 

 

You are correct, using the relative path and Spark SQL is simpler. However, it requires attaching your notebook to a default lakehouse, which must then be managed should you deploy your notebook to different environments (e.g., Dev/Test/Prod). Without a default lakehouse, Spark SQL and relative paths won't work, and you would have to reference the absolute path.
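As an illustration, here is a minimal sketch of parameterizing the absolute path so the same notebook can point at Dev/Test/Prod lakehouses; the IDs and the idea of supplying them per environment are assumptions, not part of the original answer:

# Hypothetical per-environment IDs; these could come from a
# parameter cell or a deployment pipeline rule
workspace_id = "<dev-workspace-id>"
lakehouse_id = "<dev-lakehouse-id>"

# Build the absolute OneLake path from the parameters
table_path = f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}/Tables/DimCustomer"
env_df = spark.read.format("delta").load(table_path)
display(env_df)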

 

As far as I'm aware, there is no way to reference Delta tables directly as in your example, at least not without first loading the Delta table into a variable in PySpark. The closest I've seen is loading the Delta tables into their own dataframes and then using Spark SQL to join them. Here's an example:

 

#These show absolute paths, but you can do relative with a default lakehouse
delta_table_path_1 = "abfss://<your-workspace-id>@onelake.dfs.fabric.microsoft.com/<your-lakehouse-id>/<your-delta-table-path-1>"
delta_table_path_2 = "abfss://<your-workspace-id>@onelake.dfs.fabric.microsoft.com/<your-lakehouse-id>/<your-delta-table-path-2>"

# Load the Delta Tables
delta_table_df_1 = spark.read.format("delta").load(delta_table_path_1)
delta_table_df_2 = spark.read.format("delta").load(delta_table_path_2)

# Create temporary views
delta_table_df_1.createOrReplaceTempView("delta_table_1")
delta_table_df_2.createOrReplaceTempView("delta_table_2")

# Execute a SQL query to join the tables
# Assumes Customer data for demo purposes, but could be any query
joined_df = spark.sql("""
    SELECT 
        t1.CustomerKey, 
        t1.FirstName, 
        t1.LastName, 
        t2.OrderKey, 
        t2.OrderDate
    FROM 
        delta_table_1 t1
    JOIN 
        delta_table_2 t2
    ON 
        t1.CustomerKey = t2.CustomerKey
""")

# Show the result
joined_df.show()

 

 

Hope this helps! If it does, be sure to accept an answer so others can learn from this conversation.


6 REPLIES
v-saisrao-msft
Community Support

Hi @SamKum 
I wanted to check if you had the opportunity to review the information. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank you.

v-saisrao-msft
Community Support

Hi @SamKum ,
Thank you for reaching out to the Microsoft forum community.


I wanted to check if you had the opportunity to review the information provided by @DataBard . Please feel free to contact us if you have any further questions. If the response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.


Thank you.


govindarajan_d
Super User
Super User

Hi @SamKum,

 

The code should be:

joined_df = sales_example.join(product_example, product_example.product_ID == sales_example.product_ID, "inner")

 

You should refer to the dataframe name rather than the table name.

DataBard
Resolver II

Hi @SamKum ,

Adding a default lakehouse makes it easier to use the UI to create dataframes, and it also lets you refer to a table by its relative path instead of having to reference the absolute path. Either way, you typically have to refer to the path along with the table itself (or lakehouse.schema.table, as in line 1 of your code).
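For instance, a minimal sketch of that fully qualified reference in Spark SQL; the lakehouse and schema names here are placeholders, not taken from your notebook:

# Hypothetical lakehouse.schema.table reference
df = spark.sql("SELECT * FROM MyLakehouse.dbo.products LIMIT 100")
display(df)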

 

In your case, instead of referencing 'products', either use the dataframe you are creating in the above example, or use a Spark function to reference the Delta table like this:

# Read Delta tables
delta_table_1 = spark.read.format("delta").load("/path/to/delta_table_1")
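And a hedged sketch of then using that dataframe in place of the direct 'products' reference; the second table's path and the join column are assumptions:

# Hypothetical second table, loaded the same way
delta_table_2 = spark.read.format("delta").load("/path/to/delta_table_2")

# Join the dataframes rather than the raw table names
result_df = delta_table_1.join(delta_table_2, "product_ID", "inner")
display(result_df)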

 

Hopefully this helps! If it does, be sure to accept the answer so others in the community can learn from your question.

@DataBard, if creating a dataframe is the only way to go, is the path to the delta table in your solution an absolute path? If yes, my approach of creating a dataframe with a select query is much simpler.

 

I saw a Udemy video where they demonstrated directly referring to a delta table name from the lakehouse attached to the notebook. Do you know if this was allowed in an earlier version, or was this never a feature?
