We have a pipeline containing a Copy activity that reads from a lakehouse table and sinks into a warehouse table.
If the lakehouse table contains rows, the pipeline works like a charm. If on a following run the lakehouse table is empty, the Copy activity fails with the following error:
ErrorCode=UserErrorFileNotFound,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ADLS Gen2 operation failed for: Operation returned an invalid status code 'NotFound'. Source=Microsoft.DataTransfer.Connectors.MSSQLImport,''Type=Microsoft.Azure.Storage.Data.Models.ErrorSchemaException,Message=Operation returned an invalid status code 'NotFound',Source=Microsoft.DataTransfer.ClientLibrary,'
Querying the lakehouse table through the SQL endpoint returns no rows, but the table does exist. I was expecting similar behavior from the Copy activity: simply copying no rows.
Is there a way to avoid this failure?
Hi @KarinaCossa
Here is a supplementary idea:
Must you use a Copy data activity to copy the empty table to the destination warehouse? If not, and if this is just a one-time task, you can try using Dataflow Gen2 to copy the empty table. In my test, Dataflow Gen2 could identify the schema of an empty source table and create an empty table with the same schema in its destination warehouse. This would be much easier.
Of course, if there are many empty tables, or we are not sure which tables will be empty, Dataflow Gen2 may not be a good fit, because it does not currently support fetching parameters from the outside. That means data source connections and new table names would have to be set up manually.
Best Regards,
Jing
Unfortunately we do not know which tables will be empty, or when. This happens during incremental loads, where it is possible that nothing comes from the source.
Hi @KarinaCossa
I think this error may be caused by not being able to get the schema of the empty table on the source. I did some testing: on the destination I chose the Auto create table option and tried to import schemas for an empty source table I wanted to copy. However, it didn't return any schema for the table.
When I ran the pipeline, I got the following error message, which is somewhat different from yours. When I ran it for a table that is not empty, it succeeded without any error.
ErrorCode=MissingSchemaForAutoCreateTable,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to auto create table for no schema found in Source.,Source=Microsoft.DataTransfer.TransferTask,'
May I know how you set up the destination?
My current idea is that we may need to do some pre-checking before copying a lakehouse table. If a lakehouse table is not empty, execute the Copy activity; otherwise, create the table in the destination directly. If you copy data into existing tables in the destination warehouse, the Copy activity can simply be skipped for an empty table.
Best Regards,
Jing
Thanks for your reply.
On the sink we have set the Auto create table option.
However, the sink table already exists because it was created on previous successful runs, so I am not sure whether this has any impact on the issue.
Hi @KarinaCossa
Glad to hear that the sink table already exists. Based on this, you can add a few activities before the Copy data activity to pre-check each table to be copied. Here are the steps:
1. Check if the table is empty: add a Lookup activity that runs a query such as `SELECT COUNT(*) AS RowCount FROM YourLakehouseTable`. The result is available as `@activity('LookupActivity').output.firstRow.RowCount`.
2. Create an If Condition activity: check whether the RowCount returned by the Lookup activity is greater than 0, e.g. `@greater(int(activity('LookupActivity').output.firstRow.RowCount), 0)`.
3. Configure the True and False branches: run the Copy activity in the True branch; in the False branch, skip the copy (or handle the empty table however you need).
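The steps above can be sketched as a pipeline definition fragment. This is only an illustration: the activity names, the table name, and the source type/property names are placeholders, not taken from the thread, so copy the JSON view of activities you configure in the Fabric UI rather than hand-writing it.

```json
{
  "activities": [
    {
      "name": "LookupActivity",
      "type": "Lookup",
      "typeProperties": {
        "source": {
          "type": "DataWarehouseSource",
          "sqlReaderQuery": "SELECT COUNT(*) AS RowCount FROM YourLakehouseTable"
        },
        "firstRowOnly": true
      }
    },
    {
      "name": "IfTableNotEmpty",
      "type": "IfCondition",
      "dependsOn": [
        { "activity": "LookupActivity", "dependencyConditions": [ "Succeeded" ] }
      ],
      "typeProperties": {
        "expression": {
          "value": "@greater(int(activity('LookupActivity').output.firstRow.RowCount), 0)",
          "type": "Expression"
        },
        "ifTrueActivities": [
          { "name": "CopyToWarehouse", "type": "Copy" }
        ],
        "ifFalseActivities": []
      }
    }
  ]
}
```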
You may give this a try.
Best Regards,
Jing
Thanks for your help. Yes, I think the only way is to do some pre-checks and, based on the number of rows, proceed with the Copy activity or not.
Unless there is a way to capture the error that the Copy activity produces and exit gracefully.
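One option along those lines: pipeline activities support dependency conditions (Succeeded, Failed, Completed, Skipped), so a follow-up activity can be wired to run only on the Copy activity's failure path. A minimal sketch, with placeholder names, and assuming a pipeline variable `copyStatus` already exists; note that whether the overall run still reports as failed depends on how the failure branch terminates:

```json
{
  "activities": [
    { "name": "CopyToWarehouse", "type": "Copy" },
    {
      "name": "HandleEmptySource",
      "type": "SetVariable",
      "dependsOn": [
        { "activity": "CopyToWarehouse", "dependencyConditions": [ "Failed" ] }
      ],
      "typeProperties": {
        "variableName": "copyStatus",
        "value": "copy skipped: source table was empty"
      }
    }
  ]
}
```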