In the online service, I have a simple Dataflow Gen2 with a single 2-column, 3-row query (image below) that publishes (update method: replace) to my lakehouse. Updates to the dataflow (i.e. modifying data in existing rows, or adding new rows) show up in the lakehouse table just fine when the dataflow and lakehouse are refreshed. All good so far.
However, if I add a new column to my dataflow query (the table structure/schema is changed; a simple text column), taking care to ensure a successful publish and lakehouse refresh, the lakehouse table doesn't show the new 3rd column.
Refreshing the individual table in the lakehouse doesn't change anything either.
The relevant refresh-timestamped Parquet and JSON files for the lakehouse table don't reflect the new table schema either.
In the dataflow query editor, the schema view shows the newly added column, so on the dataflow side at least everything looks normal. Also in the dataflow query editor, going through the lakehouse data destination settings and the 'refresh destination schema' dialogue makes no difference, despite the process acknowledging, quote, "Schema changed since you last set the output settings. Column mappings have been reset to their default". Strangely, if I select "Append" mode it adds the 3rd column's data to the bottom of the 1st and 2nd columns.
The only way to get the changed table to update correctly in the lakehouse is to delete the lakehouse table itself and then refresh the dataflow. Not ideal if you've got upstream dependencies and model relationships to consider.
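(For what it's worth, the delete step itself is quick if you have a Fabric notebook attached to the lakehouse. A minimal sketch; the table name is hypothetical:)

```python
# Minimal sketch: drop the stale lakehouse table so the next dataflow
# refresh recreates it with the new schema. "my_table" is a hypothetical name.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already predefined as `spark` in Fabric notebooks
spark.sql("DROP TABLE IF EXISTS my_table")
```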
The behaviour is the same no matter which of my many dataflow schemas changes: the lakehouse always retains the original schema.
Any help greatly appreciated.
Solution: As of 13 Dec 2023, this fix works. When your query schema changes (e.g. a column is added or deleted), before clicking 'Publish' to the lakehouse, go into the data destination settings (cog, bottom right) and click 'Next' into 'Choose destination target'. This is the critical bit: ensure 'New table' is selected (even though you know this table already exists in your lakehouse! I know, so intuitive, right?!). Ensure your destination lakehouse and table name are unchanged, then click 'Next', which should display a message to the effect of 'your schema has changed' (you might need to check the box next to your new column(s) to include them in the schema), then save and publish. Please note this method does not work on my historical dataflows/tables, only new ones. To get it to work on your older tables, you will need to delete them from the lakehouse and re-publish them from your dataflow; from that point on you should be good for future schema changes.
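If you want to double-check that the new column actually landed after publishing, a quick look from a notebook works. A sketch, assuming a Fabric notebook attached to the lakehouse; the table name is hypothetical:

```python
# Sketch: print the lakehouse table's current schema to confirm the
# newly added column is present. "my_table" is a hypothetical name.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already predefined as `spark` in Fabric notebooks
spark.read.table("my_table").printSchema()
```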
I tried this method but it doesn't work. Selecting "New table" doesn't enable the "Next" button, so I'm unable to proceed further. Any workarounds besides deleting the old table, as stated by the author?
Have you opened the proxy to the required destinations?
If the dataflow fails to read from the lakehouse, the solution is to update the firewall rules on the gateway server and/or your proxy servers to allow outbound traffic from the gateway server to the following (a quick reachability check is sketched after this list):
Protocol: TCP
Endpoints: *.datawarehouse.pbidedicated.windows.net, *.datawarehouse.fabric.microsoft.com, *.dfs.fabric.microsoft.com
Port: 1433
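To verify the rule from the gateway machine, here's a minimal sketch. The wildcard endpoints can't be tested directly, so the hostname below is a hypothetical placeholder; substitute your workspace's actual SQL endpoint:

```python
# Sketch: test outbound TCP 1433 connectivity from the gateway machine.
# Replace the hypothetical hostname with your workspace's real SQL endpoint.
import socket

ENDPOINTS = [
    "example.datawarehouse.fabric.microsoft.com",  # hypothetical host
]

for host in ENDPOINTS:
    try:
        with socket.create_connection((host, 1433), timeout=5):
            print(f"{host}:1433 reachable")
    except OSError as exc:
        print(f"{host}:1433 blocked or unreachable: {exc}")
```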
I'm not sure I can do that, as that would be managed by my IT department, but I'm also not sure it would determine whether the "Next" button is enabled for the solution stated above. There doesn't seem to be a "dataflow fails to read from the lakehouse" issue here; it's just that the new column isn't being displayed in the destination editor.
I have the same problem. Are we missing something?