A Lakehouse in Fabric is a hybrid data architecture combining the best of data lakes (flexibility and scale) and data warehouses (performance and structure). It stores both raw files (CSV, JSON, Excel, etc.) and structured tables in Delta Parquet format, enabling SQL-based querying and powerful compute engine compatibility.
Whether you're loading 100 rows or 100 million, the Lakehouse is designed to scale, organize, and serve data efficiently across analytics workloads.
Lakehouse Architecture
To begin, go to fabric.microsoft.com and sign in. Within your workspace, you can create a Lakehouse with just a few clicks. Assign a trial Fabric capacity or use an existing capacity (like F2), and you're ready to go.
Fabric offers a familiar environment for Power BI users: workspaces, datasets, and now Lakehouses that live inside your workspace and interact with all your other artifacts.
Fabric Workspace with Lakehouse
Each Lakehouse has two key folders:
Files Folder: This is where you upload raw data (CSV, JSON, Excel, etc.). Files stored here are not immediately queryable via SQL.
Tables Folder: This is where your files are converted to Delta Parquet and become fully SQL-queryable, version-controlled, and accessible across engines.
You can upload a file (like items.csv - https://github.com/IlgarZ/Newsletter-Fabric/blob/main/Items.csv), preview it, and then load it into the Tables folder as a structured table. Make sure to clean your column names first: spaces are not allowed!
Files vs Tables in Lakehouse
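Before loading a file into the Tables folder, it helps to normalize the header row so every column name is Delta-friendly. Here is a minimal sketch in plain Python; the sample header is illustrative, not the actual contents of Items.csv:

```python
import csv
import io
import re

def clean_column_name(name: str) -> str:
    """Replace spaces and other disallowed characters with underscores."""
    return re.sub(r"[^0-9A-Za-z_]", "_", name.strip())

# Illustrative sample standing in for the header row of a CSV upload
raw_csv = "Item ID,Item Name,Unit Price\n1,Widget,9.99\n"

reader = csv.reader(io.StringIO(raw_csv))
header = next(reader)
cleaned = [clean_column_name(col) for col in header]

print(cleaned)  # ['Item_ID', 'Item_Name', 'Unit_Price']
```

Running a pass like this over your header before the load avoids the "invalid column name" errors you would otherwise hit when Fabric converts the file to a Delta table.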
Lakehouses support Shortcuts, an innovative feature in Fabric that allows you to reference data from external locations without duplicating it. You can link to:
Another Lakehouse in OneLake
Azure Data Lake Storage Gen2
Amazon S3
This keeps your data unified and in place, eliminating unnecessary data movement and improving performance.
Fabric Shortcuts: Unified Data Access
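Shortcuts themselves are created through the Lakehouse UI (or the Fabric APIs), but the data they expose is addressable through OneLake paths that follow a predictable pattern. A minimal sketch of how such a path is composed; all names below are placeholders, not real resources:

```python
# Sketch: composing a OneLake ABFS-style path to a Lakehouse table,
# the kind of location a shortcut can reference or resolve to.
# Workspace, lakehouse, and table names are placeholders.
workspace = "MyWorkspace"   # placeholder workspace name
lakehouse = "MyLakehouse"   # placeholder Lakehouse name
table = "products"          # placeholder table name

onelake_path = (
    f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
    f"{lakehouse}.Lakehouse/Tables/{table}"
)
print(onelake_path)
```

Because every engine in Fabric resolves the same OneLake path, a shortcut created once is immediately usable from notebooks, pipelines, and the SQL endpoint alike.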
Once your data is loaded into tables, a SQL endpoint is automatically generated. You can use this to connect with:
SQL Server Management Studio (SSMS)
Azure Data Studio
Power BI
Python Notebooks
Spark Jobs
These tools, often referred to as compute engines, can now work seamlessly with your structured Lakehouse data using SQL or code.
SQL Endpoint in Action
Let’s say you upload items.csv with 400,000 rows into the Files folder. You can preview its structure and then load it into the Tables folder under a name like products. The table is stored in Delta Parquet format, fully trackable and queryable.
From this point forward, it's accessible across Fabric. You can create Power BI visuals, run Spark transformations, or use notebooks—all reading from the same source of truth.
From CSV to Delta Table
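With the products table in place, the auto-generated SQL endpoint can be queried from client tools such as Python. The sketch below only builds the connection string and query (the server and database names are placeholders; copy the real SQL connection string from your Lakehouse settings); the actual pyodbc call is shown commented out since it requires a live endpoint:

```python
# Sketch: querying the Lakehouse SQL endpoint from Python via pyodbc.
# Server and database values are placeholders, not real resources.
server = "your-endpoint.datawarehouse.fabric.microsoft.com"  # placeholder
database = "YourLakehouse"                                   # placeholder

conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    f"Server={server};Database={database};"
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;"
)

query = "SELECT TOP 10 * FROM dbo.products;"

# With pyodbc installed and access to the endpoint, you would run:
# import pyodbc
# with pyodbc.connect(conn_str) as conn:
#     rows = conn.cursor().execute(query).fetchall()

print(conn_str)
print(query)
```

The same query works unchanged from SSMS, Azure Data Studio, or Power BI's SQL connector, since they all talk to the same endpoint.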
The Lakehouse is not just a data storage location—it's a gateway to unified, scalable, and real-time analytics. With the ability to manage both raw and structured data, support external references, and integrate with multiple compute engines, it’s one of the cornerstones of Microsoft Fabric.
Stay tuned for the next article, where we’ll start querying the Lakehouse using SQL and explore best practices for building efficient analytical pipelines!
Enjoyed this post?
Subscribe to my newsletter for more deep dives into Microsoft Fabric, Power BI, and modern data tools:
👉https://www.linkedin.com/newsletters/ilgar-zarbaliyev-s-newsletter-6912079812832452608/