My scenario is pretty simple:
* csv files that have been manually uploaded to Lakehouse Files
* Dataflow Gen 2 queries sourcing from these Lakehouse csv files
* Dataflow Gen 2 Data destination to Lakehouse tables
I am finding that changes made after creating a Dataflow Gen 2, such as adding or renaming query columns or adding or renaming data destination Lakehouse tables, can cause errors or failures, and the only apparent way to get past this is to delete the entire workspace and start from scratch.
I am guessing that something in the staging artifacts cannot adapt to these changes, or that the new query and table artifacts do not connect to new staging artifacts. This might be because the data pipeline artifacts that appear to be created automatically behind the scenes are not updated when new dataflows or tables are created.
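To make the failure mode concrete, here is a minimal sketch, in plain Python rather than any Fabric API, of the kind of schema-drift check that appears to go stale: the destination mapping remembers the columns that existed when it was saved, and a renamed or added CSV column no longer lines up with it. The column names and the CSV content are invented for illustration.

```python
import csv
import io

def diff_schemas(source_columns, destination_columns):
    """Compare the source query's columns against the columns the
    destination mapping was saved with, and report the drift that
    would break a refresh."""
    src, dst = set(source_columns), set(destination_columns)
    return {
        "added": sorted(src - dst),    # new in the source, unmapped in the destination
        "removed": sorted(dst - src),  # still mapped, but gone from the source
    }

# Simulate a CSV whose header gained a column after the mapping was saved.
csv_text = "OrderId,OrderDate,Region,DiscountPct\n1,2024-11-01,West,0.05\n"
header = next(csv.reader(io.StringIO(csv_text)))

saved_mapping = ["OrderId", "OrderDate", "Region"]  # columns at mapping time
print(diff_schemas(header, saved_mapping))
# {'added': ['DiscountPct'], 'removed': []}
```

Any non-empty "added" or "removed" list here corresponds to the situation where the saved mapping no longer matches the query, which is where the refresh errors show up.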
Solved! Go to Solution.
If I'm understanding your scenario correctly:
- You're setting the output destination of queries to the Lakehouse (tables)
- You're modifying the queries in ways that change the schema
- After those changes, refresh is failing
If so, the way to address this is to reconfigure the output destination after the schema changes. In the Query Settings pane of the Query Editor, you will see the Output Destination section at the bottom. Clicking on the "X" will remove the current settings and then you can reconfigure the destination (including specifying a new column mapping).
We are looking into a future mode that does not require explicit remapping for disruptive schema changes.
Note: you will also have to delete the previous Lakehouse table (if you'd like to use the same table for the new schema).
If you're using the "Existing Table" flow (instead of "New Table"), the schema of the existing table will not be altered.
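The note above is the same behavior you see in most table stores: an existing table keeps its saved schema, so loading rows with a new column fails until the table is dropped and recreated. A small sketch using SQLite as a stand-in (Lakehouse tables are Delta tables, not SQLite, so this is only an analogy; the table and column names are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# The destination table as it existed when the mapping was first saved.
con.execute("CREATE TABLE sales (OrderId INTEGER, Region TEXT)")

# After the source schema gains DiscountPct, loading into the existing
# table fails: the existing schema is not altered automatically.
try:
    con.execute(
        "INSERT INTO sales (OrderId, Region, DiscountPct) VALUES (1, 'West', 0.05)"
    )
except sqlite3.OperationalError as exc:
    print(exc)  # table sales has no column named DiscountPct

# The workaround from this thread: drop the stale table, then let the
# dataflow recreate it with the new schema and reload.
con.execute("DROP TABLE IF EXISTS sales")
con.execute("CREATE TABLE sales (OrderId INTEGER, Region TEXT, DiscountPct REAL)")
con.execute("INSERT INTO sales VALUES (1, 'West', 0.05)")
print(con.execute("SELECT * FROM sales").fetchall())  # [(1, 'West', 0.05)]
```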
Thanks
Hey Sidjay,
Thanks for the reply.
I will explicitly try this, though I'm pretty sure I tried it as one of my troubleshooting steps.
I've noticed that the Dataflow Gen 2 destination process has a step that presents the incoming and outgoing schema, which seems to confirm that the schema is OK. I have even seen it flag new columns with an unchecked checkbox beside them (which I then checked).
But it appears that this apparent confirmation doesn't actually update the schema mapping behind the scenes, hence your recommendation to delete and recreate the output destination by clicking the "X".