arpost
Kudo Collector

Is it better to load straight to Data Warehouse or to Lakehouse first in Fabric?

Greetings, all. I'm exploring using Microsoft Fabric for an enterprise-scale data warehousing solution but have a question. We have a lot of raw data files in CSV format that we want to load into staging and then transform in a Data Warehouse.

 

Questions

My questions are:

 

  1. Is it best practice to load straight to the data warehouse, or should the files first be dropped into a Lakehouse and then copied into the DW?
  2. Does loading into a Lakehouse and then loading into a Data Warehouse create separate copies of the same data?

 

Ingestion Example

I've seen the following pattern presented as the way to approach this kind of medallion architecture (use Lakehouse as Bronze, DW as Silver, etc.):

  1. Load files into Lakehouse.
  2. Load Lakehouse file data into Lakehouse tables.
  3. Load Lakehouse table data into Data Warehouse tables.

My concern with #2 is that new columns can be added to the files, which would require the Delta tables to change. Plus, it sounds like copying into the Lakehouse and then copying into a DW creates two "copies" of the same data.
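One way to make the schema-drift concern concrete is to compare each incoming file's header with the columns the Delta table already has before loading it. The snippet below is only a minimal sketch using the Python standard library; the function name `detect_new_columns` and the sample column names are hypothetical and not part of Fabric or of the pattern described above.

```python
import csv
import io


def detect_new_columns(csv_text: str, known_columns: list[str]) -> list[str]:
    """Return header columns in the CSV that the target table does not have yet."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    known = {c.lower() for c in known_columns}
    # Compare case-insensitively and ignore blank header cells.
    return [
        col
        for col in (h.strip() for h in header)
        if col and col.lower() not in known
    ]


# Hypothetical sample: the file gained a "currency" column.
sample = "order_id,amount,currency\n1,9.99,USD\n"
print(detect_new_columns(sample, ["order_id", "amount"]))
```

A check like this could run before step 2 so that a new column triggers a deliberate table change rather than a failed or silently truncated load.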

 

Anyone have ideas or helpful suggestions on this?

1 ACCEPTED SOLUTION
v-nikhilan-msft
Community Support

Hi @arpost ,
Thanks for using Fabric Community.

Whether to load directly to the data warehouse or to load to a lakehouse first and then copy to the data warehouse depends on a few factors, including:

  • Data volume and processing requirements: If you have a large volume of data or complex processing requirements, it may be more efficient to load to a lakehouse first. Lakehouses are designed to handle large volumes of data and can run complex processing tasks such as data transformation and enrichment.
  • Data governance and compliance requirements: If you have strict data governance and compliance requirements, you may want to load directly to the data warehouse, since data warehouses are typically designed with those requirements in mind.

 

Here are some additional things to consider:

  • Lakehouses offer more flexibility: They can store structured, semi-structured, and unstructured data, which makes them well suited to data from varied sources such as IoT devices, social media, and customer relationship management (CRM) systems.
  • Data warehouses offer better performance: Data warehouses are typically optimized for SQL queries, so they tend to perform better than lakehouses for analytical workloads.

Yes, loading into a lakehouse and then loading into a data warehouse creates separate copies of the same data, because the lakehouse and the data warehouse are two separate systems. The lakehouse is typically used for storing and processing raw data, while the data warehouse is typically used for storing and analyzing structured data.
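To illustrate where the second copy comes from, here is a rough sketch that builds two T-SQL statements as strings. It assumes the Warehouse and the Lakehouse are in the same workspace, so the Warehouse can reference the Lakehouse table by a three-part name. The first statement (CREATE TABLE AS SELECT) materializes the data into a new Warehouse table, which is the extra copy; the second only defines a view over the Lakehouse table. The names `BronzeLakehouse`, `dbo`, and `sales` and the helper functions are hypothetical, so verify the statements in your own environment.

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a T-SQL identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def source_name(lakehouse: str, table: str, schema: str = "dbo") -> str:
    """Three-part name of the Lakehouse table as seen from the Warehouse."""
    return ".".join(quote_ident(part) for part in (lakehouse, schema, table))


def ctas_copy_sql(lakehouse: str, table: str, schema: str = "dbo") -> str:
    """Statement that writes a second, physical copy into the Warehouse."""
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return f"CREATE TABLE {target} AS SELECT * FROM {source_name(lakehouse, table, schema)};"


def view_reference_sql(lakehouse: str, table: str, schema: str = "dbo") -> str:
    """Statement that only references the Lakehouse table (no extra copy)."""
    target = f"{quote_ident(schema)}.{quote_ident(table + '_v')}"
    return f"CREATE VIEW {target} AS SELECT * FROM {source_name(lakehouse, table, schema)};"


# Hypothetical names, for illustration only.
print(ctas_copy_sql("BronzeLakehouse", "sales"))
print(view_reference_sql("BronzeLakehouse", "sales"))
```

Whether the copy is worth it depends on the factors listed above: a materialized Warehouse table can be transformed and tuned independently, while a view keeps a single copy but depends on the Lakehouse table's schema staying stable.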


Please refer to these links for more information:
Link1 
Link2 
Link3 

Hope this helps. Please let us know if you have any further queries.

 


2 REPLIES
v-nikhilan-msft
Community Support

Hi @arpost ,
Glad that your issue got resolved. Please continue using Fabric Community for any help regarding your queries.
