dhorrall
Helper I

How to handle schema drift?

In legacy Data Factory there were options to explicitly allow schema drift. I do not see that in Fabric. Am I missing it?

For example,

  1. I was just doing some random testing, loading historical blob text files into a 'table' in Fabric to kick the tires
  2. I had created some intermediary Parquet files from the text files to get practice with that
  3. I then attempted to load those Parquet files to a 'table' and got the error: 'Source column is not defined in delta meta data'
  4. Obviously this is because the files have columns that evolved over time

I don't see a straightforward way to handle this.
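For anyone hitting the same error, here is a minimal PySpark sketch of one workaround in a Fabric notebook, assuming the files sit under the default Lakehouse (the paths and table name below are placeholders, not anything from this thread): mergeSchema on the read unions columns across Parquet files with different schemas, and mergeSchema on the write lets the Delta table pick up columns it has not seen before instead of failing.

```python
# Sketch only: assumes a Fabric notebook where `spark` is the ambient
# SparkSession and "Files/..." resolves to the default Lakehouse.
df = (spark.read
          .option("mergeSchema", "true")   # union columns across Parquet files
          .parquet("Files/staging/history/"))

(df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")         # let the table schema evolve instead of
    .saveAsTable("historical_data"))       # raising "Source column is not defined..."
```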

8 REPLIES
eldpbi
Frequent Visitor

Consider this scenario: copying multiple CSV files from OneLake to a Data Warehouse within a Fabric capacity. The CSVs have different schemas, and the tables in the Data Warehouse have to be auto-created (which requires schema drift to be on). Since the schemas are different, one cannot provide them in the "Mapping" section of the "Copy" activity, but the current pipelines in Fabric still ask for that.

For more context: GetMetadata gets all the child items in the OneLake folder, and ForEach runs the CopyData activity on each child item.



@ajarora @GraceGu @haha 
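As a rough workaround for this scenario (not the Copy activity itself), a notebook can enumerate the folder and auto-create one Delta table per CSV, inferring each file's schema instead of supplying a mapping. Note this writes Lakehouse tables rather than Warehouse tables; the folder path and table-naming rule below are assumptions.

```python
import os

# Assumes the default Lakehouse is mounted at /lakehouse/default in the notebook.
folder = "/lakehouse/default/Files/incoming"
for name in os.listdir(folder):
    if not name.endswith(".csv"):
        continue
    df = (spark.read
              .option("header", "true")
              .option("inferSchema", "true")    # per-file schema, no mapping needed
              .csv(f"Files/incoming/{name}"))
    table = name[:-4].lower().replace("-", "_")  # hypothetical naming rule
    df.write.format("delta").mode("overwrite").saveAsTable(table)
```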



Jreed_7474
New Member

Any update on whether schema drift will be added in the future? Especially since it looks like that functionality exists in ADF.

GraceGu
Microsoft Employee

I suppose the ask is for the explicit schema mapping in the copy activity that exists in ADF today. Editing the mapping for the Lakehouse destination will be coming in 1-2 months. @dhorrall, what destination are you looking for?

dhorrall
Helper I

Probably all of the above. The current 'Data Factory' has a checkbox to handle this; I see nothing like it in Fabric. That was the basis of my question.

ajarora
Microsoft Employee

What you are referring to is possible through Azure Data Factory Mapping Data Flows, but those are not available in Fabric. Perhaps you want to try out Fabric Dataflows and see whether they cover your scenario as-is?

In terms of what the copy activity allows: if your destination table already exists and the data you are writing is missing a column, that column will be defaulted to null (the default value) when writing to the destination. If there is a new column, or if a column is not type-castable to the destination type, the row is treated as a bad row, and you can either skip writing it (and log it to temporary storage to be processed later) or fail the operation (the default).
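Those semantics can also be reproduced in a notebook when the copy activity's defaults don't fit. A rough sketch with placeholder table and path names: missing columns are filled with typed nulls, and extra source columns either fail the run (the default described above) or are dropped to "skip" them.

```python
from pyspark.sql import functions as F

target = spark.table("dest_table")                 # existing destination table
src = spark.read.parquet("Files/staging/batch/")   # incoming data (placeholder path)

extra = set(src.columns) - set(target.columns)
if extra:
    # Fail by default, like the copy activity; or drop the columns to "skip":
    # src = src.drop(*extra)
    raise ValueError(f"Source has columns not in destination: {extra}")

# Columns missing from the source become typed nulls, mirroring the
# null-default behaviour described above.
aligned = src.select([
    F.col(c) if c in src.columns
    else F.lit(None).cast(target.schema[c].dataType).alias(c)
    for c in target.columns
])
aligned.write.format("delta").mode("append").saveAsTable("dest_table")
```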

Anonymous
Not applicable

Is the ADF Mapping Data Flow coming to Fabric?
We have the same kind of requirement with JSON files as the source, evolving with new attributes; we need schema drift to be available.

Any answer on this? We have the same requirements and need pipelines to be able to handle schema drift as they do under ADF.
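Until there is a built-in option, one possible workaround for evolving JSON (paths and the table name below are assumptions): Spark's JSON reader infers a single schema across all files in a folder, so attributes that appear only in newer files become nullable columns, and writing with mergeSchema then evolves the Delta table to match.

```python
# Sketch only: `spark` is the notebook's SparkSession; paths are placeholders.
df = spark.read.json("Files/incoming/json/")   # schema inferred across all files

(df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")             # accept newly appeared attributes
    .saveAsTable("json_landing"))
```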

ajarora
Microsoft Employee

How do you expect the schema variation to have taken effect?

There are several possible situations: a column added, a column dropped, or a column type changed.
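For what it's worth, against a Delta table those three cases behave differently, as this rough sketch illustrates (the table, path, and column names are made up): an added column needs mergeSchema, a dropped column simply yields nulls for new rows on append, and a type change has to be cast back explicitly before writing.

```python
from pyspark.sql import functions as F

df = spark.read.parquet("Files/staging/latest/")  # placeholder source path

# Column type changed upstream: cast back to the destination type up front,
# since Delta will not silently change an existing column's type.
df = df.withColumn("amount", F.col("amount").cast("decimal(18,2)"))

(df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")   # added columns are appended to the table;
    .saveAsTable("target_table"))    # dropped columns appear as null in new rows
```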
