The Native Execution Engine for Fabric Data Engineering workloads is now generally available (GA) as part of Fabric Runtime 1.3. This C++-based vectorized engine (built on Apache Gluten and Velox) runs Spark workloads directly on the lakehouse, requiring no code changes or new libraries. It supports Spark 3.5 APIs and both Parquet and Delta Lake formats, so your existing Spark queries simply run faster. In internal tests, the engine has delivered dramatic speedups – Microsoft’s benchmarks showed roughly 4× faster queries on a 1 TB TPC-DS workload versus vanilla Spark, and in our own Fabric GA trials we have seen up to 6× end-to-end performance gains on representative big-data jobs. The Native Execution Engine is well-suited for a wide range of workloads (from batch ETL to interactive data science) because it processes data in columnar form and minimizes JVM overhead.
Getting started with the Native Execution Engine in Fabric is straightforward. You can enable it in your Spark environment or session in several ways:
- Environment setting: In the Fabric portal, create a new Environment item from the New item option, navigate to the Acceleration tab, and check Enable native execution engine. Once the environment is saved and published, all Spark Job Definitions and notebooks that use it inherit the setting.
- Session-level configuration: For a single notebook or job, set the Spark property in your session configuration. For example, in a notebook cell add:
%%configure
{ "conf": { "spark.native.enabled": "true" } }
If you are using a Spark Job Definition, include the same property in the job’s configuration. The change takes effect immediately, without restarting your Spark session.
- Workspace default environment: As part of workspace setup, you can attach an environment as the workspace default (via Workspace settings → Data Engineering/Science → Environment). By making that environment the default, all new Spark workloads in the workspace will automatically use the native engine without per-job configuration.
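When job definitions or notebooks are generated by scripts, the session-level property shown above can also be assembled programmatically. The following is a minimal sketch in Python. The helper names `native_engine_conf` and `configure_cell` are hypothetical and not part of any Fabric API; only the `spark.native.enabled` property and the `conf` wrapper come from the example above.

```python
import json


def native_engine_conf(enabled: bool = True) -> dict:
    """Build a config payload shaped like the body of the %%configure cell.

    Hypothetical helper for illustration; not a Fabric API.
    """
    # Spark properties are passed as strings, hence "true" / "false".
    return {"conf": {"spark.native.enabled": "true" if enabled else "false"}}


def configure_cell(enabled: bool = True) -> str:
    """Render the text of a notebook cell that applies the setting."""
    return "%%configure\n" + json.dumps(native_engine_conf(enabled))


if __name__ == "__main__":
    # Prints the magic line followed by the JSON payload.
    print(configure_cell())
```

The rendered text can be pasted into a notebook cell, and the dictionary returned by `native_engine_conf` can be merged into whatever configuration structure your own tooling uses for job definitions.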
In this GA release we have incorporated a series of optimizations and new features for the Native Execution Engine. Key improvements include:
- Native Delta write acceleration
- Optimized Delta snapshot creation
- Deletion vectors support
- Expanded Delta operations support
Together, these enhancements plug feature gaps from preview and unlock further acceleration across common data engineering workload patterns.
With the Native Execution Engine enabled, users can expect substantial end-to-end speedups. For example, in internal benchmarks on typical data aggregation and join queries, we have observed up to 6× faster runtimes compared to the standard Spark engine.
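To check how your own workloads compare with these figures, you can time the same job with and without the engine and compute the ratio. The sketch below is one simple way to do that and is not the methodology behind the benchmarks quoted above. The helpers `time_run` and `speedup` are hypothetical names, and the job is passed in as a plain Python callable so the snippet makes no assumptions about any Spark API.

```python
import time
from typing import Callable


def time_run(job: Callable[[], object], repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs of `job`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        job()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Speedup factor: how many times faster the accelerated run was."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds


if __name__ == "__main__":
    # Illustrative numbers only: a job that drops from 60 s to 10 s is 6x faster.
    print(f"{speedup(60.0, 10.0):.1f}x")
```

In a notebook, `job` would typically be a function that runs your query and forces it to execute, called once in a session with the engine enabled and once in a session without it. Taking the best of several runs reduces noise from caching and cluster warm-up.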
Importantly, the Fabric Native Execution Engine is included at no additional cost – just enable it and your existing Spark credit rates apply. Customers benefit from the dramatically faster execution without changing their spending plan: effectively you pay less for the same work.
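As a rough illustration of "pay less for the same work", the saving can be estimated from the speedup factor. This sketch assumes that consumption scales linearly with job runtime at an unchanged rate, which is a simplification and not a statement of how Fabric billing is calculated. The function name `estimated_saving` is hypothetical.

```python
def estimated_saving(speedup_factor: float) -> float:
    """Fraction of cost saved if cost is proportional to runtime.

    Assumption: same rate per unit of time, so cost shrinks by 1/speedup.
    """
    if speedup_factor <= 0:
        raise ValueError("speedup_factor must be positive")
    return 1.0 - 1.0 / speedup_factor


if __name__ == "__main__":
    # Speedup factors taken from the figures quoted in this post.
    for factor in (4.0, 6.0):
        print(f"{factor:.0f}x faster: about {estimated_saving(factor):.0%} less consumption")
```

Under that assumption, a job that runs 4× faster consumes about a quarter of what it did before. Actual savings depend on how much of a job the engine can accelerate.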
To learn more about the performance updates included in this General Availability release, see our Native execution engine for Fabric Data Engineering documentation.