The Problem:
Currently, Microsoft Fabric supports Git integration for workspace items (metadata). However, developers cannot "branch" the actual underlying data in OneLake. A developer who wants to test new ETL logic must manually copy entire datasets to a "Dev" lakehouse, which is slow and consumes extra capacity.
The Idea:
Introduce Zero-Copy Data Branching directly within the Fabric UI. Building on the Delta Lake transaction log (the same mechanism that powers "Time Travel"), Fabric should allow users to create a "Data Branch" in a workspace.
1. Branching: A developer creates a "Dev Branch" of a Lakehouse. This creates a virtual pointer to the existing Parquet files without duplicating them.
2. Isolated Testing: The developer runs new Spark jobs or Pipelines on this branch. Only changed data is written to new files.
3. Merging: Once the code is verified, the user "merges" the data branch back to the "Main" Lakehouse, similar to a Git Pull Request for code.
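To make the three steps above concrete, here is a minimal toy sketch in Python. It is not a Fabric or Delta Lake API: the Lakehouse class, its methods, and the file names are all hypothetical, and a table is modeled simply as a manifest that maps partitions to immutable file names. It assumes a branch only needs a copy of that manifest, and it ignores deletions and schema changes.

```python
from dataclasses import dataclass, field


@dataclass
class Lakehouse:
    """Hypothetical toy model: a manifest of partition -> immutable file name."""
    name: str
    manifest: dict = field(default_factory=dict)
    base: dict = field(default_factory=dict)  # manifest as it was at branch time

    def branch(self, name):
        # Step 1: copy only the pointers; no data file is duplicated.
        return Lakehouse(name, dict(self.manifest), dict(self.manifest))

    def write(self, partition, file_name):
        # Step 2: copy-on-write. The partition now points at a new file,
        # and the file the parent still references is left untouched.
        self.manifest[partition] = file_name

    def changed(self):
        # Partitions whose pointer differs from the branch point.
        return {p: f for p, f in self.manifest.items() if self.base.get(p) != f}

    def merge_into(self, target):
        # Step 3: apply the changed pointers to the target, refusing to merge
        # if the target modified the same partition after the branch was made.
        changes = self.changed()
        conflicts = [p for p in changes
                     if target.manifest.get(p) != self.base.get(p)]
        if conflicts:
            raise ValueError(f"conflicting partitions: {conflicts}")
        target.manifest.update(changes)


main = Lakehouse("Main", {"sales/2024": "f1.parquet", "sales/2025": "f2.parquet"})
dev = main.branch("Dev")
dev.write("sales/2025", "f3.parquet")      # isolated change on the branch
unchanged_on_main = main.manifest["sales/2025"]
dev.merge_into(main)                       # "pull request" for data
```

In this sketch the branch costs one small dictionary copy regardless of table size, and Main keeps pointing at its original file until the merge. A real implementation would also need to handle deletes, concurrent writers, and file retention.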
Why this makes Fabric better:
• DevOps Excellence: It brings true "Data-as-Code" to Fabric.
• Cost Efficiency: Reduces the need to store multiple physical copies of data for development and testing.
• Safety: Prevents accidental corruption of production data by allowing a full "Sandbox" environment that mirrors production data perfectly.