ToddChitt
Super User

Lakehouse connection fails intermittently in Copy Data task

I have a simple pipeline with three tasks: Get Metadata (of a specified SFTP site and folder), a For Each loop, and inside the loop, a Copy Data task to copy the binary files from SFTP to a Lakehouse. There are three files in the SFTP source folder. I routinely get errors that look like this:

ToddChitt_0-1757349585174.png

The GUIDs and blurred-out portion point to the Lakehouse files and folder in the destination. This does not happen on all three files; usually only ONE of the three fails.

Question: WHY would it deny permissions on the SAME connection when it allowed the connection just a minute or so before?

Is this a BUG?

I even spaced out the Copy activity by injecting a WAIT of 60 seconds in the loop and that did not help.

 

Thanks in advance.









11 REPLIES
v-pnaroju-msft
Community Support

Hi ToddChitt,

We are following up to see if what we shared solved your issue. If you need more support, please reach out to the Microsoft Fabric community.

Thank you.

v-dineshya
Community Support

Hi @ToddChitt ,

Thank you for reaching out to the Microsoft Community Forum.

 

This error occurs during the upload to the Lakehouse and is related to an authentication failure: specifically, password-based authentication to the destination Lakehouse or intermediary storage is being denied.

 

Please check the following to fix the issue.

 

1. Check that the credentials used for Lakehouse access remain valid throughout the pipeline execution.

 

2. Turn on verbose logging in the pipeline to capture more details about the failure.

 

3. Wrap the Copy Data activity in a retry policy to see if transient failures resolve on retry (see the sketch after this list).

 

4. Run the pipeline for each file separately to isolate whether one specific file is causing the issue.

 

5. Verify that the destination folder in the Lakehouse allows write access for all files.

 

6. If you are using Azure Data Factory or Synapse, switch to Managed Identity for authentication instead of password-based access.
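
Regarding point 3, here is a minimal sketch of the retry-with-backoff pattern, assuming the per-file copy is driven from a notebook or script rather than the pipeline UI. `copy_fn` is a hypothetical placeholder for whatever actually performs the transfer; it is expected to raise an exception on failure.

```python
import random
import time

def copy_with_retry(copy_fn, max_attempts=4, base_delay_s=30):
    """Call copy_fn(), retrying with exponential backoff on failure.

    copy_fn is a hypothetical placeholder for whatever performs the
    per-file transfer; it should raise an exception when the copy fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return copy_fn()
        except Exception as err:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Back off with a little jitter so retries don't land in lockstep.
            delay = base_delay_s * 2 ** (attempt - 1) + random.uniform(0, 5)
            print(f"Attempt {attempt} failed ({err}); retrying in {delay:.0f}s")
            time.sleep(delay)
```
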

 

Please refer to the link below.

Failure happened on 'Source' side. Copy activity Error in Fabric - Microsoft Q&A

 

I hope this information helps. Please do let us know if you have any further queries.

 

Regards,

Dinesh

 

Hello @v-dineshya and thank you for the suggestions. To address them one by one:

1. Credentials do not change. It is using my OAuth credentials.

2. Where do I enable "Verbose logging", and after doing so, where would I find the log files?

3.  I will try that.

4. Through numerous tests, I have seen each of the three files fail at different times, and often with different errors. Example: File 1 fails due to SFTP authentication, then File 2 succeeds, then File 3 fails with Lakehouse authentication Permission Denied.

5. Yes, the pipeline has access to write to the lakehouse folder. It has even CREATED a destination folder when I did not specify the folder correctly.

6. I have a parallel pipeline set up in Azure Data Factory, and it does not exhibit this behavior. That uses a Linked Service with a Service Principal to the Lakehouse.

 

Commentary: I tried "Disable chunking" but that did not help.

Based on the fact that the failures seem to be inconsistent and from different files each time, I think the Retry is my best option for getting this to go. Each file has, at one time or another, made it through the Copy Data task. Each file has, at one time or another, failed with different error messages.

 

Through my Fabric journey, I'm starting to NOT put a lot of faith in the error messages I see, at least not all the time. I bet the actual failure is not related to Fabric Lakehouse permissions at all, but the best the coders could come up with was that message. 

 

I have seen one file get copied just fine, then 30 seconds later the second fails with an error about Lakehouse permissions, then the third file gets copied. You mean to tell me that permissions got dropped 30 seconds after one file, then were magically restored 30 seconds later? Makes no sense.

 









Hi @ToddChitt ,

Thank you for the response. Please try the following to fix the issue.

 

1. Enable verbose logging in the Copy activity. You can enable detailed logging on the Settings tab of the Copy activity: Enable fault tolerance (skip incompatible rows or files), Enable logging (logs copied files, skipped files, and rows), and Enable staging (optional, for intermediate storage).

 

vdineshya_0-1757496234503.png

 


2. After enabling logging, logs are stored in the linked storage account, such as Azure Blob or ADLS Gen2. You can find them from the Monitoring tab of the pipeline run. Logs include the timestamp, operation (read, write, skip), file name, and a message such as success or the reason for the skip (see the sketch after this list for one way to scan them).

 

3. Retrying is a reasonable strategy given the transient nature of your errors. While Fabric doesn’t yet expose a granular retry policy UI like ADF, you can use a custom retry loop inside your pipeline, add a Wait activity between retries, and log each attempt for traceability.
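
To illustrate point 2, a rough sketch of scanning a downloaded session log with pandas. The column names are assumptions taken from the description above (Timestamp, Operation, Message); adjust them to whatever the actual log schema turns out to be.

```python
import pandas as pd

# Hypothetical local copy of a Copy activity session log CSV; the real
# files land in the storage account configured when logging was enabled.
LOG_PATH = "copy_activity_session_log.csv"

log = pd.read_csv(LOG_PATH)

# Column names assumed from the description above; rename them if the
# actual log schema differs.
failures = log[log["Message"].str.contains("denied|fail", case=False, na=False)]
print(failures[["Timestamp", "Operation", "Message"]].to_string(index=False))
```
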

 

Please refer to the links below.

How to copy data using copy activity - Microsoft Fabric | Microsoft Learn

Session log in a Copy activity - Azure Data Factory | Microsoft Learn

Copy Job Activity in Data Factory Pipelines - Microsoft Fabric | Microsoft Learn

 

I hope this information helps. Please do let us know if you have any further queries.

 

Regards,

Dinesh

Hi @ToddChitt ,

We haven’t heard back from you on the last response and were just checking to see whether you have a resolution yet. If you have any further queries, do let us know.

 

Regards,

Dinesh

ToddChitt
Super User

Well, that was a good theory, but it failed on a different file this time.

<sigh>









Dang it, I thought you were on to something there. 

 

That sounded really promising. 

 

Is this data transfer going through a data gateway at all?
If so, could you turn on additional logging so we could get some advanced logs as to what is happening under the hood?

ToddChitt
Super User

Hello @tayloramy 

While this finding may not be conclusive, I checked the box on the Source tab of the Copy Data activity for "Disable chunking". One of the files was pretty large, like > 250MB. I think the Copy Data task was trying to pull it down in chunks, meaning parallel threads, and write all those threads to the same file. I was also getting errors from the SFTP side about "A file transfer is already in progress" or similar, even 10 or 15 minutes AFTER a failure. With "Disable chunking" set, it took like 30 minutes to complete that ONE file, but it DID complete, along with the other two. 

Ronald Reagan said it best: "Trust, but verify"
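
In that spirit, a rough sketch of a size check between the SFTP source and the Lakehouse copy, assuming paramiko for the SFTP side and OneLake's ADLS Gen2-compatible endpoint for the destination (requires the azure-identity and azure-storage-file-datalake packages). Every host, workspace, and path name below is a placeholder.

```python
import paramiko
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# --- placeholders; substitute your own values ---
SFTP_HOST, SFTP_USER, SFTP_PASS = "sftp.example.com", "user", "password"
SRC_PATH = "/outbound/bigfile.bin"
WORKSPACE = "MyWorkspace"
DEST_PATH = "MyLakehouse.Lakehouse/Files/landing/bigfile.bin"

# Size on the SFTP source.
transport = paramiko.Transport((SFTP_HOST, 22))
transport.connect(username=SFTP_USER, password=SFTP_PASS)
sftp = paramiko.SFTPClient.from_transport(transport)
src_size = sftp.stat(SRC_PATH).st_size
transport.close()

# Size of the copy in the Lakehouse, read through OneLake's ADLS-style endpoint.
svc = DataLakeServiceClient(
    "https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential(),
)
dest = svc.get_file_system_client(WORKSPACE).get_file_client(DEST_PATH)
dest_size = dest.get_file_properties().size

print(f"source={src_size} bytes, destination={dest_size} bytes, match={src_size == dest_size}")
```
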

 

Edit: from this Microsoft Learn page: Copy and transform data in SFTP server using Azure Data Factory or Azure Synapse Analytics - Azure D...

ToddChitt_0-1757359141065.png

 









tayloramy
Solution Sage

Hi @ToddChitt
This is very interesting and at first glance appears very much like a bug.

 

The only thing I can think of is: is there someone or something trying to modify the file while your pipeline is running that might be putting a lock on it? I've never encountered that in a Lakehouse, but I suppose it could be possible.

 

Is it consistently the same file that is causing problems? Could there be something with the file that the Lakehouse doesn't like and is causing random error codes? I've encountered things like this in the past where error codes are not at all descriptive of the error that was encountered.

 


@tayloramy Hello and thanks for the input. To answer your questions:

Nobody else is doing anything with the files in the lakehouse that might lock a file. The *only* thing I can think of that might be causing it is that lakehouses can take a while to solidify their contents after loading. I have seen a lag of as much as 5 minutes between "I KNOW I just loaded that table." and "Oh, THERE it is!" I try to wait a few minutes AFTER clearing out the lakehouse destination folder before trying again, just in case it is failing on the fact that there is already a file there by the same name.
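
For what it's worth, a quick sketch of checking whether the destination folder is actually clear before rerunning, again via OneLake's ADLS Gen2-compatible endpoint; the workspace and folder names are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

WORKSPACE = "MyWorkspace"                         # placeholder
DEST_DIR = "MyLakehouse.Lakehouse/Files/landing"  # placeholder

svc = DataLakeServiceClient(
    "https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential(),
)
fs = svc.get_file_system_client(WORKSPACE)

# List whatever is still sitting in the destination folder before rerunning.
leftovers = [p.name for p in fs.get_paths(path=DEST_DIR)]
if leftovers:
    print("Destination not empty yet:", leftovers)
else:
    print("Destination folder is clear; safe to rerun.")
```
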

I will have to track the individual files to see if one is a problem.









Hi @ToddChitt

Let me know what you find out. 

I can't think of any other cause for this. If you were using the SQL Endpoint, I would recommend refreshing the endpoint metadata, but you're loading binary files directly to OneLake.

 

I'm curious to see what the root cause is here. 

