TGG360
Frequent Visitor

Error Encountered When Copying Data from Bronze to Silver Lakehouse Using Dataflow Gen2

Hello,

I am encountering the following error when attempting to copy data from the Bronze Lakehouse to the Silver Lakehouse using Dataflow Gen2:

Error: Extracted_data_WriteToDataDestination: There was a problem refreshing the dataflow: "Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataSource.Error: Pipeline execution failed (runId: 02ddb959-4800-4b38-988b-0c58354041a3). Operation on target ca-47ecc963-4615-4f7f-8a72-dacd9d97d62a failed: ErrorCode=DeltaSnapshotError,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Delta table snapshot doesn't exist.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.Data.DeltaLake.DeltaLakeException,Message=table doesn't exist,Source=Microsoft.Data.DeltaLake,' Details: Reason = DataSource.Error;RunId = 02ddb959-4800-4b38-988b-0c58354041a3". Error code: 999999. (Request ID: 1ef47cdf-9734-4ef3-96eb-cd4748ae0adb).

Process Overview:

  1. Data is retrieved via API calls. After approximately 10 hours, the session times out.
  2. To address this, I implemented a checkpoint file that stores the IDs of records already copied to the Delta table. This allows the process to resume from where it stopped by reconciling IDs from the API with those in the checkpoint file, skipping processed records and continuing with the remaining ones (a sketch of the resume logic appears under Implementation Details below).
  3. When the session timed out, I created a Dataflow Gen2 to copy data from Bronze to Silver Lakehouse, which succeeded.
  4. After reconnecting the session and running the extraction again, it correctly skipped processed IDs and completed the remaining extraction.
  5. However, when I refreshed and ran the Dataflow Gen2 again, I encountered the error above. Creating a new Dataflow Gen2 resulted in the same error.

Note: The checkpoint mechanism was implemented for another extraction process, which completed successfully without session timeout. The Dataflow Gen2 is specifically copying data from Bronze to Silver Lakehouse.

Implementation Details:

  • Writing to Delta Table:

# Append the current batch to the Delta table
df_batch.write.format("delta").mode("append").saveAsTable(lakehouse_table_name)

# Track progress across batches
total_records_written += len(batch_data)
print(f"Batch written | Data: {len(processed_data_ids)}/{len(data_data)} | Total records: {total_records_written:,}")

 

  • Saving Checkpoint:

import os
import pickle

# Persist the set of processed IDs so the extraction can resume after a session timeout
try:
    checkpoint = {
        'processed_data_ids': list(processed_data_ids),
        'failed_data': failed_sites
    }

    os.makedirs(os.path.dirname(checkpoint_file), exist_ok=True)

    with open(checkpoint_file, 'wb') as f:
        pickle.dump(checkpoint, f)

    # Save data metadata (lastData tracker)
    with open(data_metadata_file, 'wb') as f:
        pickle.dump(data_last_data_tracker, f)
except Exception as e:  # handler assumed; the original snippet ends before the except clause
    print(f"Failed to save checkpoint: {e}")
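
  • Resuming from Checkpoint (for context, a minimal sketch of the resume side of this mechanism; it reuses the variable names above, and fetch_ids_from_api is a hypothetical helper standing in for the actual API calls):

import os
import pickle

# Load the checkpoint from a previous run, if one exists
processed_data_ids = set()
if os.path.exists(checkpoint_file):
    with open(checkpoint_file, 'rb') as f:
        checkpoint = pickle.load(f)
    processed_data_ids = set(checkpoint['processed_data_ids'])

# Reconcile API IDs against the checkpoint: skip what was already written
all_ids = fetch_ids_from_api()  # hypothetical helper returning every record ID
remaining_ids = [i for i in all_ids if i not in processed_data_ids]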

 

Could you please advise on what might be causing this issue and suggest a possible solution?

Thank you!

2 ACCEPTED SOLUTIONS
TGG360
Frequent Visitor

I am using a pipeline as an alternative. I haven't figured out the root cause of the issue. Thanks.


TGG360
Frequent Visitor

The issue with the Dataflow Gen2 integration has been resolved: the data source configuration should use the SQL connection string rather than the Lakehouse connector. After updating the data source to use the SQL connection string, all processes tested successfully.


7 REPLIES

Hi @TGG360,
Thanks for the update. If you face any issues again, feel free to reach out here.

Regards,
Community Support Team.

v-hjannapu
Community Support

Hi @TGG360,

I hope the information provided above helps you resolve the issue. If you have any additional questions or concerns, please don't hesitate to contact us; we're happy to help further.

Regards,
Community Support Team.

Hi @TGG360,
I hope the above details help you fix the issue. If you still have any questions or need more help, feel free to reach out. We are always here to support you.


Regards,
Community Support Team.

Mauro89
Super User

Hi,

It seems to me there might be an issue with the Delta table snapshot (as stated in the error message).

Try these:

In a notebook attached to the same lakehouse as the Dataflow source:

spark.sql("DESCRIBE HISTORY your_table_name").show()

If this fails, the Delta log is broken.

Then rebuild the table: delete it first and then recreate it (via notebook or your dataflow); a minimal sketch follows.
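
For illustration, one possible rebuild in a notebook (not from the original reply; the table name and paths are hypothetical placeholders, and this assumes the underlying Parquet files are still readable even though the Delta log is not):

# Copy the data out first: read the Parquet files directly (bypassing the broken log)
# and stage them elsewhere, because DROP TABLE on a managed table deletes its files
df = spark.read.parquet("Tables/your_table_name")
df.write.mode("overwrite").parquet("Files/recovery/your_table_name")

# Now drop the broken table and rebuild it as a fresh Delta table from the staged copy
spark.sql("DROP TABLE IF EXISTS your_table_name")
spark.read.parquet("Files/recovery/your_table_name") \
    .write.format("delta").mode("overwrite").saveAsTable("your_table_name")

Note that reading the Parquet files directly can also pick up files Delta had logically removed, so validate row counts after the rebuild.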

 

Best regards!

PS: If you find this helpful, consider leaving kudos or marking it as a solution.

TGG360
Frequent Visitor

Thanks @Mauro89 for your input. The code isn't failing; I was able to see the versions and the timestamps of the processes.
