NagaRK
Advocate I

Best way to process log files and store the data in a warehouse as fast as possible

Hi Team,

 

I have hundreds of files coming from our sources every day. I would like to process these files in a Fabric notebook and store the extracted data in a warehouse, spread across roughly 50 to 100 tables. I want the processing and loading to finish within 5 to 10 minutes, basically as fast as possible. The staging layer can be in either the lakehouse or the warehouse, whichever works. Once the data is staged, I would like to move it from the staging tables to the final tables with the INSERT statements wrapped in a transaction block. I am aware that transaction blocks are not possible in a Fabric notebook, so I need to rely on a stored procedure for this.

 

1) What is the best way to stage the data? Is it faster to stage in the lakehouse or the warehouse from a notebook? We have tried staging in Lakehouse tables, but inserting the data into Lakehouse tables takes around 20 minutes in a Fabric notebook, whereas writing the same data to the Lakehouse as files takes just 2 minutes. The challenge then is loading the staged data from Lakehouse files into the final Warehouse tables in a transactional manner.

 

2) How can data be transferred from Lakehouse files (Parquet format) to a Warehouse table inside a transaction block?

 

Any suggestions?

 

1 ACCEPTED SOLUTION
v-dineshya
Community Support

Hi @NagaRK ,

Thank you for reaching out to the Microsoft Community Forum.

 

You are processing hundreds of log files daily using Microsoft Fabric notebooks and loading the data into 50–100 warehouse tables within 5–10 minutes. You have already observed that writing to Lakehouse tables is slow (around 20 minutes), while storing the data in the Lakehouse as Parquet files is much faster (around 2 minutes). You also want to transfer the staged data to the final warehouse tables using transactional logic via stored procedures.

 

1. Fabric notebooks excel at batch processing and complex transformations, but they do not inherently support real-time ingestion. For optimized storage and access, the recommendation is to write data into Delta tables, which are supported in Lakehouse environments.

 

Solution: Writing Parquet files to the Lakehouse is fast; writing to Lakehouse tables is slower due to metadata sync and Spark overhead. Warehouse ingestion is faster and better suited to transactional operations.

 

Note: Stage the data as Parquet files in the Lakehouse for speed. Use COPY INTO or a pipeline to load from the Lakehouse files into Warehouse staging tables, then use stored procedures for transactional inserts into the final tables.
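As a rough sketch, the COPY INTO load from staged Parquet files into a Warehouse staging table could look like this. The schema, table name, and file path are hypothetical placeholders; substitute the actual OneLake (or storage account) path to your staged files:

```sql
-- Hypothetical names: [stg].[web_logs] and the file path are placeholders.
COPY INTO [stg].[web_logs]
FROM 'https://<storage-or-onelake-path>/Files/logs/2025/08/*.parquet'
WITH (
    FILE_TYPE = 'PARQUET'
);
```

Running one COPY INTO per staging table (they can run in parallel from a pipeline) is typically much faster than row-by-row inserts from the notebook.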

 

2. Fabric notebooks don’t support transaction blocks, but stored procedures in the Warehouse do.

 

Solution:

 

Please follow the steps below.

 

1. Read the Parquet files in a notebook and compare their schema with the Warehouse schema using INFORMATION_SCHEMA.COLUMNS.

 

2. Write the schema differences or data to a staging table or file in OneLake.

 

3. Use a Fabric pipeline to read the staged data, pass parameters to a stored procedure, and execute the INSERT or ALTER TABLE statements inside a transaction block.
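The transactional hand-off in step 3 could be sketched as a Warehouse stored procedure along these lines. All table, column, and procedure names here are hypothetical; the point is that the inserts either all commit or all roll back:

```sql
-- Hypothetical names throughout; adapt to your schema.
CREATE PROCEDURE [dbo].[usp_load_from_staging]
AS
BEGIN
    SET XACT_ABORT ON;  -- roll back the whole batch on any runtime error
    BEGIN TRANSACTION;
    BEGIN TRY
        INSERT INTO [dbo].[web_logs_final] (log_date, source_name, message)
        SELECT log_date, source_name, message
        FROM   [stg].[web_logs];

        -- ...repeat for the other final tables in the same transaction...

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        THROW;  -- surface the error to the calling pipeline activity
    END CATCH
END;
```

The calling pipeline can then invoke this procedure with a stored procedure activity after the COPY INTO loads complete, so a failure in any insert leaves the final tables untouched.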


Please refer below links.

Optimal Loading Practices for Microsoft Fabric: The Data Professional’s Ultimate Guide | Microsoft C...

Load data into a Microsoft Fabric data warehouse - Training | Microsoft Learn

Efficiently Processing Massive Data Volumes using ... - Microsoft Fabric Community

 

I hope this information helps. Please let us know if you have any further queries.

 

Regards,

Dinesh


3 REPLIES
NagaRK
Advocate I

Thank you for your suggestion, @v-dineshya.

v-dineshya
Community Support


Hi @NagaRK ,

We haven’t heard from you since the last response and were just checking back to see if you have a resolution yet. If you have any further queries, do let us know.

 

Regards,

Dinesh
