Hi,
Our requirement is to invoke a pipeline whenever a blob is created in an ADLS folder. The JSON file is written from Databricks using the dbutils.fs.put API. We created a Reflex, chose BlobCreated as the event, and in the filter we put the folder name where the JSON arrives.
The problem we are facing is that when the Databricks notebook writes the file into ADLS, the Reflex detects the same JSON file twice and triggers the target pipeline twice with the same file. When we upload the same file to the same folder through the ADLS portal, the trigger is invoked only once. Could you please let us know how to resolve this?
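For reference, the write on the Databricks side is roughly the following (the path and contents below are placeholders, not our real values):

import json

# dbutils is available implicitly in the Databricks notebook context.
# Write the JSON file into the ADLS folder that the Reflex filter watches
# (abfss path and payload are illustrative only).
payload = json.dumps({"status": "ready"})
dbutils.fs.put(
    "abfss://container@account.dfs.core.windows.net/landing/signal.json",
    payload,
    True,  # overwrite if the file already exists
)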
I haven't set this up in Fabric pipelines yet, but I had a similar issue in a normal Azure Data Factory pipeline.
The issue was with a notebook writing a parquet file.
What happened is that the notebook (Scala, I think) would first create an empty file with one API call, then flush the rest of the data into it with a different API call. To the Azure Data Factory trigger, those were technically two blob-created events for the same file: the first one was empty and the second one had the data.
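If I remember correctly, for ADLS Gen2 both calls raise a Microsoft.Storage.BlobCreated event, and the main difference is the api field in the event data. Roughly like this (values are illustrative, not a captured payload):

# First event: the file is created but still empty
event_create = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "data": {"api": "CreateFile", "contentLength": 0, "url": ".../landing/signal.json"},
}
# Second event: the data is flushed and the file is closed
event_flush = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "data": {"api": "FlushWithClose", "contentLength": 1234, "url": ".../landing/signal.json"},
}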
In the Azure Data Factory storage event trigger there is an option called "Ignore empty blobs". I don't know whether the Fabric pipeline trigger has an equivalent, but that is what I'd look for first.
There was also another situation where the file was supposed to stay empty because it was just a signaling file. In that case I had to look at the body of the event that fired the trigger, match on the specific flush API call, and ignore the event from the call that opened the file.
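If the Fabric trigger doesn't expose either of those options, a workaround is to do the check yourself right after the trigger fires, e.g. in a first notebook step that receives the event payload. A minimal sketch, assuming the payload is handed over as a dict:

def should_process(event: dict) -> bool:
    """Return True only for the event that carries the finished, non-empty file."""
    data = event.get("data", {})
    # Skip the initial CreateFile event (empty placeholder);
    # act only on the flush call that actually wrote the data.
    return data.get("api") == "FlushWithClose" and data.get("contentLength", 0) > 0

# Example usage with the illustrative payloads above:
# should_process(event_flush)  -> True
# should_process(event_create) -> False

The pipeline would then only continue when this returns True, which effectively dedupes the two events for the same file.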