Zero-copy access to OneLake in Azure Databricks is a new preview feature that allows Azure Databricks to query data stored in OneLake without copying or moving it. In practice, this means Azure Databricks users can directly access and analyze data in Microsoft Fabric’s OneLake (the unified data lake in Microsoft Fabric) through Databricks’ own Unity Catalog, treating Fabric’s data as external tables. This capability eliminates the need to build ETL pipelines or duplicate data between the Fabric and Databricks environments. It represents a significant step toward a “single source of truth” for analytics data across multiple platforms, enabling truly multi-engine analytics on a shared dataset.
How Zero-Copy Access Works
From a technical standpoint, zero-copy access is achieved through a feature called OneLake catalog federation, part of Azure Databricks’ new Lakehouse Federation capabilities. Instead of physically moving data, Azure Databricks establishes a connection at the metadata/catalog level to the OneLake data. The high-level process is:

1. An Azure administrator provisions a credential (an Access Connector or service principal) and grants it read access to the target Fabric workspace.
2. A Databricks administrator registers that credential in Unity Catalog and creates a connection to OneLake.
3. A foreign catalog is created from the connection, exposing the Fabric Lakehouse or Warehouse tables as external tables in Unity Catalog.
4. Users query those tables with standard Spark or SQL; the data is read directly from OneLake at query time.
Under the hood, when a query is executed, the Databricks engine reads the data directly from OneLake’s storage, using the credentials provided. OneLake’s storage layer is compatible with ADLS Gen2 APIs and the Delta Lake format, so Databricks can treat Fabric tables like external Delta tables.

The integration bridges what was previously an incompatibility between OneLake’s managed data layer and Databricks’ expectation of direct storage access. In the past, Databricks could not natively read OneLake’s onelake.dfs.fabric.microsoft.com endpoint because OneLake abstracts the underlying storage accounts for security. The new federation feature overcomes this by using the Unity Catalog open interface: Databricks communicates with OneLake via a secure API and storage credential rather than requiring a raw storage mount. This preserves OneLake’s governance “control plane” within Fabric while still allowing Databricks to perform distributed queries against the data.

All data access in this preview is read-only: Databricks cannot modify or delete OneLake data through the federation. Write-back capabilities may come in future enhancements, but for now OneLake remains the system of record.
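The catalog-level connection described above can be sketched in Databricks SQL. This is an illustrative sketch only: the connection type, option names, and all identifiers (`onelake_conn`, `fabric_sales`, `sales_lakehouse`) are hypothetical placeholders, and the exact preview syntax should be taken from the current Microsoft and Databricks documentation.

```sql
-- Sketch of the Lakehouse Federation pattern applied to OneLake.
-- All names and OPTIONS below are hypothetical placeholders.

-- 1. Register a connection in Unity Catalog using the credential
--    (Access Connector or service principal) prepared by the Azure admin.
CREATE CONNECTION onelake_conn
  TYPE onelake                                 -- assumed type name for the preview
  OPTIONS (workspace 'my_fabric_workspace');   -- hypothetical option

-- 2. Expose a Fabric Lakehouse as a read-only foreign catalog.
CREATE FOREIGN CATALOG fabric_sales
  USING CONNECTION onelake_conn
  OPTIONS (item 'sales_lakehouse');            -- hypothetical option

-- 3. Query the federated tables like any other Unity Catalog tables.
SELECT region, SUM(revenue) AS total_revenue
FROM fabric_sales.dbo.orders
GROUP BY region;
```

The `CREATE CONNECTION` / `CREATE FOREIGN CATALOG` pair is the standard Lakehouse Federation pattern; only the OneLake-specific connection type and options are new in this preview.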
Key Benefits Over Traditional Approaches
In traditional multi-platform setups, organizations often resort to copying data between lakes or warehouses so that different tools can work with it, leading to redundant data copies, complex pipelines, and data governance headaches. Zero-copy integration eliminates these issues. The key benefits of this approach are:
No data duplication: a single copy of the data in OneLake serves both Fabric and Databricks workloads.
Less pipeline overhead: no ETL jobs exist solely to shuttle data between the two platforms.
Fresher results: every engine queries the latest data rather than a periodically synced copy.
Centralized governance: OneLake remains the governed system of record, while Unity Catalog controls access on the Databricks side.
Integration with Microsoft Fabric Ecosystem
One of the biggest advantages of OneLake is that it enables a range of analytics tools to operate on the same data. Zero-copy access in Databricks extends this capability beyond the core Fabric services. In Microsoft Fabric, a variety of engines (T-SQL, Spark, Power BI, real-time analytics) already share data via OneLake, using the Delta Parquet format as a common denominator. For example, data engineers might use Fabric’s integrated Spark engine to write data to a Lakehouse, which stores files in OneLake, and then data scientists can immediately analyze that data using Azure Databricks notebooks. At the same time, business analysts could connect to the same OneLake-hosted data via Power BI’s Direct Lake mode for dashboards, all without separate exports or imports.
This new integration effectively adds Azure Databricks as another first-class consumer of OneLake. It complements existing features like Fabric’s Data Activator (event-driven alerts and actions on data) and mirrored tables (which allow Fabric to pull in external Delta tables from Databricks or other sources), creating a two-way street for data sharing. The end result is a more open analytics ecosystem: organizations can choose the best tool for each job (e.g. Fabric’s built-in SQL for BI, or Databricks for ML and data science) without worrying about data silos. All engines access a common data lake, which encourages a modular, interoperable analytics architecture.
Preview Status, Limitations, and Considerations
As of early 2026, zero-copy OneLake access from Azure Databricks is in public preview. Data engineers and architects should be aware of the following limitations and considerations before using this feature in an enterprise setting:
Read-Only (No Writes): In the current preview, Databricks can only perform read operations (SELECT queries) on OneLake data. INSERT/UPDATE/MERGE from Databricks to OneLake are not supported. OneLake remains the authoritative data store (for example, ETL pipelines or ELT processes that populate OneLake must still run within Fabric or other tools; Databricks can then be used for reading and analyzing that data).
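The read-only behavior can be illustrated with a couple of statements against a hypothetical federated catalog (`fabric_sales` is a placeholder name):

```sql
-- Reads against the foreign catalog succeed:
SELECT COUNT(*) FROM fabric_sales.dbo.orders;

-- Any write is rejected in the preview, since OneLake stays the system of record:
-- INSERT INTO fabric_sales.dbo.orders VALUES (...);      -- fails: catalog is read-only
-- UPDATE fabric_sales.dbo.orders SET status = 'closed';  -- fails: catalog is read-only
```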
Supported Data Sources: The federation is limited to Fabric Lakehouse and Warehouse tables (the primary data items in Fabric that store files in OneLake). Other Fabric item types (e.g. KQL databases, Data Science datasets, etc.) are not yet directly accessible via this feature. In practice, this means most structured data in Fabric’s open data lake (Delta tables in lakehouses or SQL warehouses) can be federated to Databricks.
Databricks Runtime and Feature Requirements: Zero-copy OneLake access requires newer versions of Azure Databricks. It is supported on Databricks Runtime 18.0 and above, with clusters in Standard (shared) mode. Single-user (dedicated) clusters and serverless SQL warehouses are not supported in the preview. Additionally, if using Databricks SQL warehouses, the warehouse must be Pro SKU (not the classic or serverless tier) and version 2025.35 or higher. These requirements ensure the presence of Unity Catalog and the necessary Federation features.
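As a rough illustration, a Clusters API payload for a compliant test cluster might look like the following. The `spark_version` string and `node_type_id` are placeholders to adapt to your workspace; `USER_ISOLATION` is the API value that corresponds to Standard (shared) access mode:

```json
{
  "cluster_name": "onelake-federation-test",
  "spark_version": "18.0.x-scala2.13",
  "data_security_mode": "USER_ISOLATION",
  "node_type_id": "Standard_DS3_v2",
  "num_workers": 2
}
```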
Setup Overhead: Initial configuration involves Azure admin work (setting up an Access Connector or service principal with correct permissions) and Unity Catalog admin work. While straightforward, this does introduce some overhead. Data teams should plan for coordination between Azure admin, Fabric admin, and Databricks admin roles to configure the connection securely. Microsoft’s documentation provides step-by-step guidance for this setup, and the process should be tested in a development environment first.
Performance Testing: Because this is a preview, organizations should validate performance with their own datasets. Factors like network throughput, data sizes, and query complexity can impact performance. Monitor query execution and storage IO during testing, and consider using Databricks caching or workload management strategies if needed. (At this time, there are no known performance bugs specific to OneLake federation, but caution is warranted with any preview feature.) Ensuring data is appropriately partitioned and using Delta best practices will help achieve optimal performance on read-heavy workloads.
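When validating performance, even a simple timing harness helps make repeated runs comparable. The sketch below is plain Python with a stand-in workload; on Databricks, the callable would wrap the actual federated query (for example, a `spark.sql(...).collect()`):

```python
import statistics
import time

def benchmark(run_query, repeats=5):
    """Time a query callable over several runs and summarize wall-clock results.

    `run_query` stands in for whatever executes the federated query,
    e.g. a function wrapping spark.sql(...).collect() on Databricks.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return {
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }

# Stand-in workload: aggregate some numbers locally instead of querying OneLake.
result = benchmark(lambda: sum(x * x for x in range(100_000)), repeats=3)
print(result)  # summary with min, median, and max run times in seconds
```

Running the same harness before and after partitioning or caching changes gives a quick, repeatable signal on whether the change helped.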
Known Issues: Users should review the latest preview release notes for newly reported issues. As of the current preview, the limitations above are the main constraints. Additionally, certain Unity Catalog features may not fully apply to foreign catalogs in preview (for example, lineage tracking or row-level security on federated tables may not be available yet; these are areas to watch as the integration matures). It is recommended to use this feature for evaluation and development purposes initially, and to engage Microsoft support or community forums about any unexpected behavior.
Example use cases: This zero-copy integration is especially beneficial in hybrid analytics environments where some teams use Fabric and others use Databricks. For instance, an enterprise might use Fabric’s Data Warehouse (Synapse) to consolidate and clean transactional data into a Lakehouse, and then have a data science team using Azure Databricks to train machine learning models on that same data. With zero-copy access, the data scientists can directly query the curated Lakehouse tables in OneLake from Databricks notebooks, without the BI team having to export data or maintain a parallel data lake for them. Likewise, if the organization has a central data lake in OneLake feeding Power BI dashboards, an expert data engineering team could leverage Databricks to perform more complex transformations or ML feature engineering on this data, then write results back into OneLake (writing would require a different mechanism until that capability is added to federation). This pattern fits well with a data mesh or hub-and-spoke architecture – OneLake acts as the hub for shared data products, and various spokes (Fabric engines, Azure Databricks, etc.) all tap into it as needed. Teams can thus choose the best tool for each analytics task without worrying about data availability, and with confidence that everyone is working from the latest data.
In summary, zero-copy access to OneLake from Azure Databricks is a promising development for enterprises looking to simplify their data landscape. It offers a path to break down the traditional barriers between data platforms, fostering a more open and collaborative ecosystem. While the feature is still in preview – with some limitations around write capabilities and environment prerequisites – it demonstrates Microsoft’s and Databricks’ commitment to “open lakehouse” interoperability. Data engineers and architects should begin evaluating this feature in non-production environments, assess its performance on representative workloads, and keep an eye on its evolution toward general availability. If the preview’s promise holds, full production-grade support is likely to follow, unlocking new scenarios for hybrid analytics without data silos. This will enable organizations to maximize the value of Microsoft Fabric’s OneLake as a one-stop, governed data lake, while leveraging the advanced analytics power of Azure Databricks on the same data – all with zero copies and zero hassle.
Thanks to Anuj Pandey and the Data & AI Fabric Community.