In legacy Data Factory there was an option to explicitly allow schema drift. I do not see that in Fabric. Am I missing it?
For example, consider this scenario: copying multiple CSV files from OneLake to a Data Warehouse within a Fabric capacity. The CSVs have different schemas, and the tables in the Data Warehouse have to be auto-created (which requires schema drift to be on). Since the schemas differ, one cannot provide them in the "Mapping" section of the Copy activity, yet the current pipelines in Fabric still ask for that. I don't see a straightforward way to handle this.
For more context: a Get Metadata activity gets all the child items in the OneLake folder, and a ForEach runs the Copy activity on each child item.
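A minimal sketch of that pattern as pipeline JSON, assuming ADF-style activity definitions (Fabric's pipeline JSON closely mirrors ADF's, but may differ in detail; activity names are placeholders, and the source/sink dataset settings are omitted):

```json
{
  "activities": [
    {
      "name": "GetFileList",
      "type": "GetMetadata",
      "typeProperties": {
        "fieldList": [ "childItems" ]
      }
    },
    {
      "name": "ForEachFile",
      "type": "ForEach",
      "dependsOn": [
        { "activity": "GetFileList", "dependencyConditions": [ "Succeeded" ] }
      ],
      "typeProperties": {
        "items": {
          "value": "@activity('GetFileList').output.childItems",
          "type": "Expression"
        },
        "activities": [
          { "name": "CopyCsvToWarehouse", "type": "Copy" }
        ]
      }
    }
  ]
}
```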
Any update on whether schema drift will be added in the future? Especially since it looks like that functionality exists in ADF.
I suppose the ask is for the explicit schema mapping in the Copy activity that exists in ADF today. Editing the mapping for a Lakehouse destination will be coming in 1-2 months. @dhorrall, which destination are you looking for?
Probably all of the above. The current Data Factory has a checkbox to handle this; I see nothing like it in Fabric. This was the basis of my question.
What you are referring to is possible through Azure Data Factory Mapping Dataflows, but those are not available in Fabric. Perhaps you want to try out Fabric Dataflows and see whether they cover your scenario as-is?
In terms of what the Copy activity allows: if your destination table already exists and the data you are writing is missing a column, that column is defaulted to null (or the column's default value) when writing to the destination. If there is a new column, or a column is not type-castable to the destination type, the row is treated as a bad row, and you can either skip writing it (and log it to temporary storage to be processed later) or fail the operation (the default).
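For reference, the skip-bad-rows behavior described above maps to the Copy activity's fault-tolerance settings. A minimal sketch in ADF-style JSON (the log path is a placeholder, and the storage connection details under `logLocationSettings` are omitted; in Fabric the equivalent options are exposed under the Copy activity's Settings tab):

```json
{
  "name": "CopyCsvToWarehouse",
  "type": "Copy",
  "typeProperties": {
    "enableSkipIncompatibleRow": true,
    "logSettings": {
      "enableCopyActivityLog": true,
      "copyActivityLogSettings": {
        "logLevel": "Warning",
        "enableReliableLogging": false
      },
      "logLocationSettings": {
        "path": "copyactivitylogs"
      }
    }
  }
}
```

With `enableSkipIncompatibleRow` set to false (the default), the first incompatible row fails the whole copy instead.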
Is the ADF Mapping Dataflow coming to Fabric?
We have the same kind of requirement with JSON files as a source, evolving with new attributes; we need schema drift to be available.
Any answer on this? We have the same requirements and need pipelines to be able to handle schema drift as they can under ADF.
In what way do you expect the schema variation to have taken effect?
There are many possible situations: a column added, a column dropped, or a column's type changed.
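For example, two records from a hypothetical evolving JSON Lines source (field names invented for illustration) that hit all three cases: the second record drops `region`, adds `currency`, and changes `amount` from a number to a string:

```json
{"orderId": 1001, "amount": 25.50, "region": "EU"}
{"orderId": 1002, "amount": "18.75", "currency": "USD"}
```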