Hussain_Abedi
New Member

Handling Pipeline Failures and Audit Layer for Data Ingestion in Microsoft Fabric Lakehouse

I am ingesting data from PostgreSQL and MySQL into the Fabric Lakehouse Bronze layer. In the Silver layer, the two tables are joined, and the combined data is stored in a Delta table, which I call the temp table.

From the temp table, the data is ingested into a warehouse. After ingestion, an inner join is performed between the temp table and the warehouse to identify the records that match in both places. These matched records are marked as loaded and saved into the Silver layer of the Lakehouse.
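In PySpark, this matching step looks roughly like the sketch below. The table and column names (temp_table, warehouse_orders, record_id, silver_matched) are simplified placeholders, and the warehouse rows are assumed to be readable from the notebook:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.getOrCreate()

# Combined data from the Bronze PostgreSQL and MySQL tables.
temp_df = spark.read.table("temp_table")

# Keys of the records that have already landed in the warehouse.
warehouse_ids = spark.read.table("warehouse_orders").select("record_id")

# The inner join keeps only the records present in both places.
matched_df = temp_df.join(warehouse_ids, on="record_id", how="inner")

# Mark the matched records as loaded and save them to the Silver layer.
(matched_df
    .withColumn("status", lit("loaded"))
    .write.format("delta")
    .mode("append")
    .saveAsTable("silver_matched"))
```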

Problem:

If the pipeline breaks after the data is ingested into the warehouse but before it is marked as loaded and saved into the Silver layer, the incoming new data will overwrite the temp table, causing a loss of unprocessed records from the previous run. This happens because the Silver Delta tables in Fabric are immutable, which led me to adopt this approach.

Questions:

  1. What should I use as an audit layer? How can I ensure that the records loaded into the warehouse are properly tracked, especially in the event of a pipeline failure?

  2. How can I keep track of which records have been loaded into the warehouse and which have not?

1 ACCEPTED SOLUTION
v-aatheeque
Community Support

Hi @Hussain_Abedi ,

Thanks for reaching out to the Microsoft Fabric Community forum.
To address your data ingestion challenges and prevent data loss during pipeline failures, implement these strategies:

1. Create a dedicated audit table that logs the status of each record processed during the pipeline. This table can include the following columns: record_id, ingestion_timestamp, status, error_message (a sketch of this table follows the list below).


2. Instead of directly overwriting the temp table, consider using a staging area where you can store the new incoming data. This allows you to keep the previous run's data until you confirm that the new data has been successfully processed (a sketch of this flow also follows the list):

  • Ingest new data into the staging area.
  • Perform an inner join between the temp table and the staging area to identify matching records.
  • Update the audit table to mark records that were successfully ingested into the warehouse as "loaded."
  • Overwrite the temp table with new data only after all records are processed and marked.
  • If a pipeline failure occurs, query the audit table to identify the records that have not been marked as "loaded" and reprocess them.
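As a minimal sketch of the audit table from point 1, assuming a Fabric notebook where a Spark session is available (the table name audit_log and the status values are illustrative, not prescribed):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Audit table with the columns suggested above. One row per record
# processed by the pipeline; 'status' tracks its progress.
spark.sql("""
    CREATE TABLE IF NOT EXISTS audit_log (
        record_id           STRING,
        ingestion_timestamp TIMESTAMP,
        status              STRING,    -- e.g. 'pending', 'loaded', 'failed'
        error_message       STRING
    ) USING DELTA
""")
```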

 
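And a sketch of the staging flow from point 2 in PySpark, under the same assumptions (bronze_incoming, staging_table, and warehouse_orders are hypothetical table names; the update uses the delta-spark DeltaTable merge API):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import current_timestamp, lit
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# 1. Land incoming data in a staging table instead of overwriting the
#    temp table, so the previous run's records survive a failure.
new_df = spark.read.table("bronze_incoming")
new_df.write.format("delta").mode("overwrite").saveAsTable("staging_table")

# 2. Register every staged record in the audit table as 'pending'.
(new_df.select("record_id")
    .withColumn("ingestion_timestamp", current_timestamp())
    .withColumn("status", lit("pending"))
    .withColumn("error_message", lit(None).cast("string"))
    .write.format("delta").mode("append").saveAsTable("audit_log"))

# 3. After the warehouse load succeeds, flip the matched records to 'loaded'.
loaded_ids = spark.read.table("warehouse_orders").select("record_id")
(DeltaTable.forName(spark, "audit_log").alias("a")
    .merge(loaded_ids.alias("w"), "a.record_id = w.record_id")
    .whenMatchedUpdate(set={"status": lit("loaded")})
    .execute())

# 4. On restart after a failure, anything still 'pending' was never
#    confirmed in the warehouse and should be reprocessed before the
#    staging table is overwritten again.
unprocessed = spark.sql(
    "SELECT record_id FROM audit_log WHERE status <> 'loaded'"
)
```

Only once the 'pending' set is empty is it safe to overwrite the temp table with the newly staged data.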

If this post was helpful, please consider marking Accept as solution to assist other members in finding it more easily.

If you continue to face issues, feel free to reach out to us for further assistance!


Hi @Hussain_Abedi ,

We haven’t heard from you on the last response and were just checking back to see if your query got resolved. If this post was helpful, please consider marking Accept as solution to assist other members in finding it more easily.

 

If you continue to face issues, feel free to reach out to us for further assistance!

Thanks for using Microsoft Fabric Community.

