eurenergy
Frequent Visitor

Best practice for moving data from files to a Delta table in Fabric

Hi all,

I set up a weekly scheduled data pipeline which copies data from Azure Data Lake Storage Gen2 (ADLS Gen2) to the Fabric Lakehouse.

It works perfectly: it simply takes all files added in the last week and moves them to the Lakehouse. If a file is already in the Lakehouse, it is overwritten, which is fine.

 

However, I am struggling to find the best practice for moving the data from these files into a Delta table, as this can be done in several ways (a notebook, dataflows, a copy activity). It is important that data is only appended if it is not already in the Delta table.

 

What is the best practice way to do this? It is strange that I am not able to find much information about this, as it must be one of the most common scenarios in Fabric.
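
For reference, the kind of idempotent append I have in mind would look roughly like this in a notebook (just a sketch to illustrate the requirement; the path, table name, and key column are placeholders, not a working setup):

```python
from delta.tables import DeltaTable

# Read this week's files from the Lakehouse Files area
# (path, format, and schema are hypothetical examples).
incoming = (spark.read.format("csv")
            .option("header", "true")
            .load("Files/weekly_drop/*.csv"))

# Target Delta table, assumed to already exist (placeholder name).
target = DeltaTable.forName(spark, "my_table")

# Insert only rows whose key is not already present; matched rows are
# left untouched, so re-running the load is idempotent.
(target.alias("t")
 .merge(incoming.alias("s"), "t.record_id = s.record_id")  # placeholder key
 .whenNotMatchedInsertAll()
 .execute())
```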

 

1 ACCEPTED SOLUTION
spencer_sa
Super User

We do this in three ways:
1) If we have local copies of the files, we use a folder structure with an unprocessed folder and a processed folder. A pipeline with a Get Metadata activity on the unprocessed folder feeds a For Each activity; inside that is a Copy Data activity and some file copy/delete steps. (There is no Move File activity 😞)
2) If all we have is a shortcut and we don't want to copy the files, we maintain a list of processed files in a table (using a notebook). In a second notebook, we do a left anti join of the files in the shortcut against the processed list and output the list of unprocessed files. This then feeds a For Each activity to do the Copy Data, and finally the first notebook appends the newly processed files to the list. (See the sketch after this list.)
3) If we use a shortcut *and* keep a local copy of the processed files, we can do something like 2), just substituting a directory listing for the processed-file table.
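
For option 2, the anti-join notebook might look roughly like this (a minimal sketch, not our production code; the shortcut path, table name, and column names are illustrative assumptions):

```python
# Runs in a Fabric notebook, where `spark` and `mssparkutils` are predefined.

# List the files currently visible through the shortcut
# (the path is a hypothetical example).
files = [(f.path,) for f in mssparkutils.fs.ls("Files/adls_shortcut")]
current_df = spark.createDataFrame(files, ["file_path"])

# Table of already-processed files, maintained by the first notebook
# (hypothetical table name; assumed to exist with a file_path column).
processed_df = spark.read.table("processed_files")

# Left anti join: keep only files with no match in the processed list.
unprocessed_df = current_df.join(processed_df, on="file_path", how="left_anti")

# Return the list to the pipeline so a For Each activity can iterate over it.
unprocessed = [row.file_path for row in unprocessed_df.collect()]
mssparkutils.notebook.exit(str(unprocessed))
```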

If this helps, please consider Accepting as a solution to help other people find it more easily.


2 REPLIES
lbendlin
Super User

When you move files into the Lakehouse, are they not automatically written in Parquet/Delta format?

