
Reading and Writing to Fabric Lakehouse with Azure Databricks

To get started, let's prepare our environment. To enable communication between Databricks and Fabric, the first step is to create an Azure Databricks Premium Tier resource, and the second is to ensure two things on our cluster:

 

1) Use an “unrestricted” or “power user compute” policy.

[Screenshot: cluster policy selection]

 

2) Make sure that Databricks can pass our credentials through Spark (credential passthrough). This can be enabled in the cluster's advanced options.

[Screenshot: advanced options with credential passthrough enabled]

NOTE: I won’t go into further detail about cluster creation. I’ll leave the remaining compute options for you to explore, or assume you’re already familiar with them if you’re reading this post. If you can't see those settings, make sure to turn off the "Simple form".

[Screenshot: turning off the "Simple form" option]

Once our cluster is created, we’ll create a notebook and start reading data from Fabric.
We’ll achieve this using ABFS (Azure Blob File System) paths; the ABFS driver is included in Azure Databricks.

[Screenshot: creating a notebook in Azure Databricks]

The address should look similar to the following string:

oneLakePath ='abfss://myWorkspaceId@onelake.dfs.fabric.microsoft.com/myLakehouse.lakehouse/Files/'
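As a quick sanity check, here is a minimal sketch that lists the Files area of the Lakehouse to confirm the cluster can reach OneLake; the workspace and lakehouse names are placeholders you would replace with your own:

# Hypothetical OneLake path: replace the workspace and lakehouse with your own
oneLakePath = "abfss://myWorkspaceId@onelake.dfs.fabric.microsoft.com/myLakehouse.lakehouse/Files/"

# List the contents of the Files area; dbutils is available by default in Databricks notebooks,
# and our passed-through credentials authenticate the call
display(dbutils.fs.ls(oneLakePath))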

Knowing this path, we can start working as usual.
Let’s look at a simple notebook to read a Parquet file in Fabric Lakehouse:

[Screenshot: notebook reading a Parquet file from the Fabric Lakehouse]
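For reference, a minimal sketch of what that cell might look like; the file name sales.parquet is only an assumption for illustration:

# Hypothetical Parquet file sitting under the Lakehouse Files area
oneLakePath = "abfss://myWorkspaceId@onelake.dfs.fabric.microsoft.com/myLakehouse.lakehouse/Files/"

# Read the Parquet file straight from OneLake into a Spark DataFrame
df = spark.read.parquet(oneLakePath + "sales.parquet")
display(df)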

 

Thanks to the cluster configuration, reading is as simple as calling spark.read.

Writing is just as simple.

[Screenshot: notebook writing a table to the Fabric Lakehouse]
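A minimal sketch of that write, assuming the df DataFrame from the previous step; the dropped column names and the silver_sales table name are hypothetical:

# Drop columns we don't need for the silver layer (column names are hypothetical)
df_silver = df.drop("ingestion_notes", "raw_payload")

# Write the cleaned DataFrame as a Delta table under the Lakehouse Tables area,
# where Fabric will surface it as a Lakehouse table
tablePath = "abfss://myWorkspaceId@onelake.dfs.fabric.microsoft.com/myLakehouse.lakehouse/Tables/silver_sales"
df_silver.write.format("delta").mode("overwrite").save(tablePath)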

After cleaning up unnecessary columns and calling a simple [dataframe].write, we’ll have a clean silver table. If we then go to Fabric, we’ll find it in our Lakehouse.

[Screenshot: the resulting table in the Fabric Lakehouse]

This concludes our Databricks processing against Fabric’s Lakehouse, but not the article. We haven’t yet covered the other type of Fabric storage on this blog, but let’s mention what’s relevant to this post.

Fabric Warehouses are also built on the same next-generation lake structure. Their main difference is that they offer a 100% SQL-based user experience, as if we were working in a traditional relational database. Behind the scenes, however, we’ll find Delta tables exposed through a Spark catalog or metastore.

[Screenshot: Fabric Warehouse tables]

 

The path should look something like this:

path_dw = "abfss://WorkspaceName@onelake.dfs.fabric.microsoft.com/WarehouseName.Datawarehouse/Tables/dbo/"

Since Fabric stores Delta-format content both in its Lakehouse Spark catalog (Tables) and in its Warehouse, we’ll read it as shown in the following example:

[Screenshot: notebook reading a Warehouse table from Databricks]
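A minimal sketch of that read, assuming a hypothetical dbo table called customers; workspace and warehouse names are placeholders:

# OneLake path to the Warehouse tables (workspace, warehouse, and table names are hypothetical)
path_dw = "abfss://WorkspaceName@onelake.dfs.fabric.microsoft.com/WarehouseName.Datawarehouse/Tables/dbo/"

# Warehouse tables are stored as Delta, so we read them with the Delta format
df_customers = spark.read.format("delta").load(path_dw + "customers")
display(df_customers)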

Now this does conclude our article, showing how we can use Databricks to work with Fabric’s storage options.

Original post from LaDataWeb, in Spanish.