02-10-2026 07:14 AM - last edited 02-10-2026 07:15 AM
Version Controlled Lakehouse is a Microsoft Fabric extensibility workload that brings Git-like version control semantics to data files stored in OneLake, directly inside the Fabric experience.
It enables users to explore, query, snapshot, and version CSV and Parquet files in a Lakehouse using familiar concepts such as repositories, branches, commits, and commit history.
Repository: https://github.com/ugljesanovak/fabric-lakehouse-versioning
Architecture: https://github.com/ugljesanovak/fabric-lakehouse-versioning/blob/master/docs/items/LakehouseGitFS/a...
Demo: https://www.youtube.com/watch?v=T8huw007CPw
Version Controlled Lakehouse is a browser-first, metadata-driven versioning system for lakehouse files.
It allows users to treat a set of CSV and Parquet files in OneLake as a versioned repository, enabling:
The design is inspired by Git workflows and data versioning systems such as lakeFS, while leveraging DuckDB WASM to provide interactive SQL querying directly in the browser.
All logic runs in the client, and all persisted state is stored using Fabric item metadata and OneLake storage.
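As a rough illustration of the in-browser querying idea, the sketch below builds a preview query for a CSV or Parquet file. The helper name `previewQuery` and the default row limit are assumptions made for this example, not part of the project. It relies only on DuckDB's `read_csv_auto` and `read_parquet` table functions, and it assumes the file has already been made visible to the DuckDB WASM instance under the given name.

```typescript
// Hypothetical helper (not taken from the project): builds a DuckDB SQL
// statement that previews the first rows of a CSV or Parquet file.
function previewQuery(fileName: string, limit: number = 100): string {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error("limit must be a positive integer");
  }
  // Escape single quotes so the file name is a valid SQL string literal.
  const quoted = `'${fileName.replace(/'/g, "''")}'`;
  const lower = fileName.toLowerCase();
  if (lower.endsWith(".parquet")) {
    return `SELECT * FROM read_parquet(${quoted}) LIMIT ${limit}`;
  }
  if (lower.endsWith(".csv")) {
    return `SELECT * FROM read_csv_auto(${quoted}) LIMIT ${limit}`;
  }
  throw new Error(`unsupported file type: ${fileName}`);
}
```

The resulting string would then be passed to whatever query call the DuckDB WASM connection exposes; that wiring is left out here because it depends on how the workload initializes DuckDB.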
Microsoft Fabric provides Git integration for certain workspace and item metadata, but it does not natively support:
Version Controlled Lakehouse addresses this gap by introducing a Git-like mental model for lakehouse data, allowing users to:
This is particularly valuable for data engineering, analytics exploration, and reproducibility scenarios.
This workload is designed for:
It is also intended as a reference extensibility implementation, demonstrating how advanced data lifecycle concepts can be built on top of Fabric using the Extensibility Toolkit.
The current implementation focuses on a frontend-only MVP, with the following capabilities:
Each commit stores a snapshot of files under:
/Files/.gitfs/{item_id}/Data/{commit_id}
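The snapshot layout above can be expressed as a small path builder. This is a minimal sketch: the function names and the validation rule are hypothetical, and only the path template itself comes from the description above.

```typescript
// Hypothetical helper: rejects values that would break the folder layout.
function assertPathSegment(name: string, value: string): void {
  if (value.length === 0 || value.includes("/")) {
    throw new Error(`${name} must be a non-empty path segment without '/'`);
  }
}

// Builds the snapshot folder for a commit, following the
// /Files/.gitfs/{item_id}/Data/{commit_id} layout.
function commitSnapshotPath(itemId: string, commitId: string): string {
  assertPathSegment("itemId", itemId);
  assertPathSegment("commitId", commitId);
  return `/Files/.gitfs/${itemId}/Data/${commitId}`;
}

// Hypothetical helper: path of a single file inside a commit snapshot.
function snapshotFilePath(itemId: string, commitId: string, fileName: string): string {
  assertPathSegment("fileName", fileName);
  return `${commitSnapshotPath(itemId, commitId)}/${fileName}`;
}
```

For example, `commitSnapshotPath("item-1", "c42")` returns `/Files/.gitfs/item-1/Data/c42`. Because every commit gets its own folder, earlier snapshots are never overwritten by later ones.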
Metadata describing repositories, branches, commits, and files is stored in the Fabric item definition.
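To make the metadata model more concrete, here is one way the relationships between branches, commits, and files might be represented as plain JSON-serializable objects. All type and function names below are assumptions for illustration; the project's actual item definition schema may look different.

```typescript
// Hypothetical metadata shapes, not the project's confirmed schema.
interface FileEntry { name: string; format: "csv" | "parquet"; }
interface Commit { id: string; parentId: string | null; message: string; files: FileEntry[]; }
interface Repository {
  commits: Record<string, Commit>;
  branches: Record<string, string | null>; // branch name -> head commit id
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function createRepository(defaultBranch: string = "main"): Repository {
  return { commits: {}, branches: { [defaultBranch]: null } };
}

// Records a new commit on a branch and moves the branch head to it.
function addCommit(repo: Repository, branch: string, id: string,
                   message: string, files: FileEntry[]): Repository {
  if (!hasOwn(repo.branches, branch)) throw new Error(`unknown branch: ${branch}`);
  if (hasOwn(repo.commits, id)) throw new Error(`duplicate commit id: ${id}`);
  const commit: Commit = { id, parentId: repo.branches[branch], message, files };
  return {
    commits: { ...repo.commits, [id]: commit },
    branches: { ...repo.branches, [branch]: id },
  };
}

// Creates a branch that starts at the current head of another branch.
function createBranch(repo: Repository, name: string, fromBranch: string): Repository {
  if (!hasOwn(repo.branches, fromBranch)) throw new Error(`unknown branch: ${fromBranch}`);
  if (hasOwn(repo.branches, name)) throw new Error(`branch already exists: ${name}`);
  return {
    commits: repo.commits,
    branches: { ...repo.branches, [name]: repo.branches[fromBranch] },
  };
}

// Walks parent links from the branch head back to the first commit.
function commitHistory(repo: Repository, branch: string): Commit[] {
  if (!hasOwn(repo.branches, branch)) throw new Error(`unknown branch: ${branch}`);
  const history: Commit[] = [];
  let cursor: string | null = repo.branches[branch];
  while (cursor !== null) {
    const commit: Commit = repo.commits[cursor];
    history.push(commit);
    cursor = commit.parentId;
  }
  return history;
}
```

In this sketch the metadata only holds pointers (commit ids, parent links, file names), while the file contents live in the per-commit snapshot folders, which mirrors the separation between metadata and data storage described in this post.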
Version Controlled Lakehouse follows a three-layer architecture:
This architecture enables rapid iteration, simple deployment, and a clear separation between metadata and data storage.
The following capabilities are explicitly out of scope for the MVP and planned as future enhancements:
These enhancements would evolve the project from a frontend-driven prototype into a more production-ready data versioning system.
Version Controlled Lakehouse demonstrates how data can be treated like code inside Microsoft Fabric, using familiar Git workflows applied to lakehouse files. It showcases the power of Fabric extensibility and highlights how advanced data lifecycle management patterns can be implemented with minimal infrastructure.