Nick_Salch

Answers to common questions about Fabric Data Factory

As the Data Integration Customer Advisory Team (CAT) lead, I spent a lot of time talking to customers at the recent FabCon/SQLCon about Fabric Data Factory, and I came away with a clear picture of what's on customers' minds when it comes to the future of data integration. Many of the same questions came up in every booth conversation, Ask the Experts discussion, and hallway chat.

In this post, I'm going to answer those questions directly, one by one.

I'm already on ADF. Should I move to Fabric Data Factory?

This is the one I heard most often. It's a fair question if you are already successfully running your workload on ADF, especially if you don't have immediate plans to adopt other Fabric workloads or you're using ADF today to move data across multiple clouds.

Here's the short answer: you don't have to move, but you'll want to.

Microsoft continues to fully support ADF, so you can keep running it with confidence. But all new data integration innovation is landing in Fabric Data Factory. That means new connectors, new capabilities, AI-assisted data integration with Copilot, and new experiences.

If you move to Fabric Data Factory, you'll get less engineering overhead, AI-assisted development, pro-developer and low-code tooling that doesn't exist in ADF, and a unified platform that replaces the sprawl of stitching multiple services together.

Fabric Data Factory isn't a lateral move; it's a meaningful upgrade. Your existing pipelines come over without rewrites, and your connector knowledge, pipeline logic, and operational patterns all carry over. What you gain with Fabric Data Factory is access to everything that ADF can't do: Copilot-assisted development, Mirroring, Copy Jobs, Apache Airflow, dbt, unified CI/CD, and a platform whose ceiling is already higher and will continue to grow. I will walk through many of these advantages in more detail in this post.

Fabric Data Factory is newer. How do I know it’s mature enough for my production workloads?

It's a fair question. ADF has been the backbone of enterprise data integration on Azure for more than a decade, with a track record that gives customers confidence running their production workloads.

That same trust can be placed in Fabric Data Factory. It is built on the same core infrastructure as ADF: the same execution engine, the same connector library, and the same hardened reliability model that enterprise teams depend on today. The core orchestration and data movement capabilities are one and the same. If your pipelines depend on ADF's uptime, scale, and predictable execution, you'll find that consistency fully reflected in Fabric Data Factory.

Figure: Reasons for upgrading to Fabric Data Factory.

What changes is the new functionality around that core—a modern SaaS experience, unified governance, AI assistance, and a broader set of tools made possible by Fabric’s SaaS model.

What does Fabric Data Factory give me that ADF doesn't?

A lot, actually. I answered this question in nearly every conversation. There are numerous things you can get only in Fabric Data Factory:

Fabric Data Factory enables you to unify your data from any cloud, on-premises, and more

Mirroring is the killer feature that came up most frequently, and it's probably the easiest on-ramp to Fabric Data Factory available today. Mirroring gives you continuous, low-latency replication from operational databases and cloud warehouses (SQL Server, Snowflake, Google BigQuery, Cosmos DB, and more) directly into OneLake. There are no pipelines to build, configure, or maintain. And best of all, Fabric compute and OneLake storage for mirroring are free up to a capacity-based limit. It's one of the most compelling data replication capabilities on the platform. For customers who are currently paying to orchestrate continuous replication in ADF, that alone can justify the move.

Copy Jobs deliver a configuration-first approach for bulk copy, watermark-based incremental sync, and native CDC scenarios, including SCD Type 2 support for full history tracking with built-in audit columns. The result is less engineering overhead: many scenarios that require pipeline-building in ADF don’t require it in Fabric Data Factory.

Fabric also extends this integration story to enterprise application estates through Microsoft’s strategic partnership with SAP. Through collaboration with SAP and the Business Data Cloud (BDC) ecosystem, Fabric Data Factory introduces native, configuration‑driven integration patterns for SAP‑sourced operational data. This allows organizations to make analytics‑ready SAP data from systems like SAP S/4HANA available alongside other cloud and on‑premises sources in OneLake without maintaining custom ingestion pipelines or intermediate staging environments.

Beyond replication and ingestion, Fabric Data Factory includes a broad connector library for both sources and destinations. In fact, Fabric Data Factory already has the broadest connector set Microsoft has offered to date. Copy Jobs support controlled, configurable movement to destinations including Snowflake, Oracle, Google BigQuery, Amazon S3, and Databricks, while Dataflows Gen2 can write transformed data to platforms such as Snowflake and BigQuery when downstream consumption remains outside of OneLake.

For organizations that prefer not to move data at all, OneLake Shortcuts allow you to connect to data where it already lives (in Azure, AWS, or Google Cloud) without moving or copying it. Permissions and credentials are managed centrally through OneLake, and because shortcuts work against data in place, you eliminate edge copies and the staging pipelines that create them.

The result is a unified data integration platform that works across Azure, AWS, Google Cloud, SAP landscapes, and on‑premises environments without requiring organizations to redesign their existing data estate before they begin benefiting from Fabric.

Change data made easy

Processing change data can be hard. Traditional batch pipelines struggle with late-arriving data, out-of-order events, and schema drift, leaving a lot of work for the downstream system. Teams end up building complex watermark logic, custom merge code, and fragile orchestration to stay close to real time.
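To make the pain point concrete, here is a minimal sketch of the kind of hand-rolled watermark logic teams often end up maintaining. It is not Fabric or ADF code: the function name, the row shape, and the `modified_at` column are assumptions made purely for illustration.

```python
from datetime import datetime


def incremental_copy(source_rows, destination, last_watermark):
    """Upsert rows modified after last_watermark and return the new watermark.

    Hypothetical helper: rows are dicts with 'id' and 'modified_at' keys.
    """
    changed = [row for row in source_rows if row["modified_at"] > last_watermark]
    for row in changed:
        destination[row["id"]] = dict(row)  # upsert by primary key
    # Keep the old watermark when nothing changed
    return max((row["modified_at"] for row in changed), default=last_watermark)


source = [
    {"id": 1, "name": "alpha", "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "name": "beta", "modified_at": datetime(2024, 1, 3)},
]
destination = {}

# First run copies everything
watermark = incremental_copy(source, destination, datetime.min)

# Second run picks up only the updated row
source[0] = {"id": 1, "name": "alpha-v2", "modified_at": datetime(2024, 1, 5)}
watermark = incremental_copy(source, destination, watermark)

# A late-arriving row stamped before the watermark is silently skipped
source.append({"id": 3, "name": "late", "modified_at": datetime(2024, 1, 2)})
watermark = incremental_copy(source, destination, watermark)
```

The last run shows the fragility described above: the late row never reaches the destination unless you add lookback windows or reconciliation logic on top.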

Mirroring and Copy Jobs in Fabric Data Factory remove that complexity. With Mirroring, you get continuous replication where inserts, updates, and deletes are handled automatically without any custom pipeline logic, so your data is always up to date. For scenarios where full history matters, Copy Jobs with native CDC and SCD Type 2 support make it straightforward to retain prior versions of records with audit columns and effective dates. AI models benefit from data that is fresh, complete, and historically accurate. Fabric shortens the gap between operational systems and analytics consumption to ensure your data is AI-ready.
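If SCD Type 2 is new to you, the idea can be sketched in a few lines. This is a conceptual illustration only, not how Copy Jobs implement it: the column names (`valid_from`, `valid_to`, `is_current`) and the `apply_change` helper are assumptions, and deletes are left out for brevity.

```python
from datetime import datetime

OPEN_END = datetime(9999, 12, 31)  # sentinel meaning "still the current version"


def apply_change(history, key, attributes, changed_at):
    """Close the current version for key (if any) and append the new one."""
    for version in history:
        if version["key"] == key and version["is_current"]:
            if version["attributes"] == attributes:
                return  # nothing changed, so no new version is recorded
            version["valid_to"] = changed_at
            version["is_current"] = False
    history.append({
        "key": key,
        "attributes": attributes,
        "valid_from": changed_at,
        "valid_to": OPEN_END,
        "is_current": True,
    })


history = []
apply_change(history, 42, {"city": "Seattle"}, datetime(2024, 1, 1))
apply_change(history, 42, {"city": "Denver"}, datetime(2024, 6, 1))
apply_change(history, 42, {"city": "Denver"}, datetime(2024, 7, 1))  # no-op
```

After these calls, `history` holds two rows for key 42: the Seattle version closed on 2024-06-01 and the Denver version still open. That is the "prior versions with effective dates" behavior described above.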

AI-accelerated development

AI that helps you build faster. Fabric Data Factory brings Copilot directly into the authoring experience—a capability introduced with Fabric. With Copilot, teams can:
  • Generate pipelines from natural language. Describe what you want to move or orchestrate and Copilot builds the structure, significantly cutting time to first run.
  • Get expression suggestions and plain-language explanations of complex pipeline steps.
  • Troubleshoot inline. Copilot identifies errors and recommends fixes without switching context.
  • Expose Copy Jobs as MCP endpoints, making every data movement task programmable and AI-accessible from any tool in your stack.

The Data Factory MCP server takes this further, enabling pipelines, dataflows, and data movements to be built and operationalized entirely through AI.

Low-code transformation at scale

For low-code and no-code users, Mirroring and Copy Jobs offer configuration-driven data movement with no pipeline authoring required. And Dataflows Gen2 brings the next generation of Power Query to the cloud. It's the go-to tool for teams that want powerful, citizen-led data transformation without writing code. Built on the familiar Power Query experience that millions of analysts already know from Excel and Power BI, Dataflows Gen2 connects to hundreds of data sources and supports over 300 built-in transformations—all through a visual, no-code interface.

The Fabric version goes well beyond the traditional Power Query experience. Dataflows Gen2 now supports multiple output destinations—Lakehouse, Warehouse, Azure SQL, Snowflake, ADLS Gen2, and more—so your transformed data lands exactly where your downstream consumers need it. Partitioned execution and the Modern Query Evaluator deliver dramatically better performance and significantly lower compute costs compared to earlier versions. Variable library integration enables true environment promotion across development, testing, and production without hardcoding connection details, making Dataflows Gen2 a legitimate CI/CD-ready tool—not just a self-service one.
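The environment-promotion idea can be illustrated with a small sketch. It does not call any Fabric API and is not the variable library's actual schema: the `VARIABLES` layout, the variable names, and the `resolve` helper are all hypothetical, and only show the pattern of a default value with per-environment overrides.

```python
# Hypothetical variables: each has a default plus optional per-environment overrides
VARIABLES = {
    "warehouse_server": {
        "default": "dev-sql.example.internal",
        "test": "test-sql.example.internal",
        "prod": "prod-sql.example.internal",
    },
    "batch_size": {
        "default": 1000,
        "prod": 50000,
    },
}


def resolve(name, environment):
    """Return the value for this environment, falling back to the default."""
    values = VARIABLES[name]
    return values.get(environment, values["default"])


# The same definition resolves differently in each stage
dev_server = resolve("warehouse_server", "dev")
prod_server = resolve("warehouse_server", "prod")
test_batch = resolve("batch_size", "test")
```

Because the transformation logic refers to `warehouse_server` by name and never to a literal address, promoting it from development to production changes only which values are active, not the logic itself.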

For organizations with both citizen integrators and professional data engineers, Dataflows Gen2 sits right in the middle: approachable enough for an analyst to build and schedule a transformation pipeline on their own, and capable enough to handle enterprise-scale ingestion workloads.

Pro-developer tooling

For teams with more advanced engineering requirements, Fabric has you covered. Pipelines remain the backbone for complex, multi-step orchestration, conditional branching, approval gates, and REST integrations, but Fabric Data Factory also introduces two new first-class capabilities with no external infrastructure to manage: Apache Airflow Jobs and native dbt Jobs.

Apache Airflow Jobs in Fabric provide a fully managed, SaaS-native Airflow experience that runs directly within the Fabric platform. There is no cluster provisioning, no scheduler management, and no separate security or networking model to wire up. DAGs run alongside pipelines, Dataflows Gen2, and Copy Jobs in the same workspace, with shared monitoring and governance.

For teams that already have Airflow investments, this means you can bring existing DAGs forward without re-platforming, while gaining tighter integration with Fabric data assets in OneLake. For teams adopting Airflow for the first time, it removes the operational overhead that traditionally comes with standing up and maintaining Airflow infrastructure.

On the transformation side, SQL-based transformation has become the standard for analytics engineering, and dbt is the tool many teams standardize on. Fabric Data Factory introduces native dbt Jobs that let teams operationalize dbt without managing separate compute, storage, or authentication layers.

dbt Jobs in Fabric are tightly integrated with Lakehouse and Data Warehouse, use Git-backed version control, and support environment-specific execution without custom scripting. This makes it easier to promote models from development to production while keeping transformations close to the data they operate on.

More importantly, dbt Jobs sit alongside low-code transformations, pipelines, and Airflow workflows in a single platform. That allows organizations to mix approaches without fragmenting their architecture. Analysts can rely on governed tables produced by dbt models, while engineers maintain full control over SQL logic and testing.

Together, Airflow Jobs and dbt Jobs give pro developers Python-first orchestration and SQL-first transformation, fully managed and directly integrated with Fabric’s storage, security, and monitoring layers. Teams keep their preferred tools but lose the operational overhead that usually comes with running them at enterprise scale.

How do I migrate? Where do I start?

If you are ready to consider moving your ADF workload into Fabric, our built-in migration experience is designed to be incremental, guided, and low risk. And it starts right inside your existing Azure Data Factory.

From the ADF authoring experience, you can launch a built-in migration flow that walks you through the steps to bring workloads into Fabric:

  • Assess: the assessment available inside the ADF UX helps you understand which pipelines are ready, which need review, and which have activities that are coming soon in Fabric.
  • Migrate selectively and in phases: choose which pipelines to move first, prioritize by business value, and move in controlled waves.
  • No rewrites required: existing ADF pipelines are brought into Fabric Data Factory as-is and immediately gain access to Copilot, Copy Jobs, Mirroring, dbt, Airflow, and native OneLake integration.
  • Continue running ADF during the transition: validate results, preserve output parity, and move at your own pace.

The migration experience maps ADF linked services directly to Fabric connections, handles the assessment automatically, and gives you a clear plan before you move anything.
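As a rough illustration of planning phased waves from an assessment, consider the sketch below. The assessment records, the status labels, the `business_value` score, and the `plan_waves` helper are all made up for this example. The built-in migration experience produces its own output, and this does not reproduce its format.

```python
# Hypothetical assessment results; statuses mirror the three buckets described above
assessment = [
    {"pipeline": "daily_sales_load", "status": "ready", "business_value": 9},
    {"pipeline": "hr_snapshot", "status": "ready", "business_value": 4},
    {"pipeline": "inventory_sync", "status": "needs_review", "business_value": 8},
    {"pipeline": "finance_close", "status": "ready", "business_value": 7},
    {"pipeline": "legacy_export", "status": "coming_soon", "business_value": 2},
]


def plan_waves(results, wave_size):
    """Group ready pipelines into waves, highest business value first."""
    ready = sorted(
        (r for r in results if r["status"] == "ready"),
        key=lambda r: r["business_value"],
        reverse=True,
    )
    return [
        [r["pipeline"] for r in ready[i:i + wave_size]]
        for i in range(0, len(ready), wave_size)
    ]


waves = plan_waves(assessment, wave_size=2)
# Pipelines that are not ready stay on ADF until they are reviewed or supported
deferred = [r["pipeline"] for r in assessment if r["status"] != "ready"]
```

Here the first wave contains the two highest-value ready pipelines, the second wave takes the remaining ready one, and the deferred list keeps running on ADF in the meantime.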

I hope this helps answer some of your questions about Fabric Data Factory. Moving to it is not a rip-and-replace; it's a natural progression built on the same foundation you already trust, with a much larger surface area for what's possible.

The question isn't really whether you want to migrate to Fabric Data Factory; it's when and how. And the answer to both is: sooner than you think, and easier than you expect.

Next steps