First time posting to the Fabric Community, but this community has been a HUGE resource for me in all my work, so thank you everyone.
I have a dozen or so Gen2 Dataflows that pull Excel files from a SharePoint location and write to Fabric Warehouse tables. Occasionally, I get refresh errors caused by new raw data issues, and up until last week I could go in and correct the query step (or add a step to handle the data outlier, e.g. a new divide-by-zero error), republish the dataflow, and move on.
Now, when I correct the query issue (typically a mashup error) and republish the dataflow, each one fails to publish with the following error:
“A source column (see name in the details below) is nullable but the destination column (see name in the details below) is not (see query name in the details below).”
These all seem to reference the query/table ID column, which contains no null values in the dataflow. None of the query modifications I've made affect the ID attribute in the query. In the Fabric warehouse, the only column in the table that is set to NOT NULL is the ID column.
Again, up to last week there were zero issues with nullable values, but now if I try to republish any Gen2 Dataflow that writes to a warehouse table with an ID column set to NOT NULL, I get this error.
Anyone have any insight or know about new issues?
Thanks,
Hi @Mrizzo21 ,
You can try the following:
Best Regards,
Adamk Kong
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
I may have worked out what the issue is and have come up with a shoddy workaround.
It appears to be a data typing issue. Until recently, Fabric was weakly typed. If you are pulling in data from a data source, Power Query M seems to use its own data types, e.g. number, which is not nullable unless specifically declared otherwise, e.g.
type nullable number
But writing a number to an integer field (which ID fields often are) requires that the data be converted appropriately, e.g. to Int64.Type. The kicker here is that Int64.Type *is* nullable, and so cannot be written to a NOT NULL field.
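To illustrate the distinction (a minimal M sketch with made-up names, not taken from the dataflows above), a plain number ascription rejects nulls while a nullable one allows them:

let
    // a non-nullable and a nullable number type ascription
    NonNullableNumber = type number,
    NullableNumber = type nullable number,
    // Type.IsNullable reports whether a given type allows null values
    Checks = [
        NonNullableAllowsNull = Type.IsNullable(NonNullableNumber), // false
        NullableAllowsNull = Type.IsNullable(NullableNumber)        // true
    ]
in
    Checks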
My workaround is to change integer not null fields to decimal(10,0) not null. Has anybody got anything better?
Coming back to this again after some distractions, and the issue seems to have simply gone away. I have reverted my decimal(10,0) NOT NULL columns to integer NOT NULL and there is no issue.
This is one of the significant issues I have with cloud implementations of anything: we have no control over, and no knowledge of, what changes are being implemented. Fabric release notes, anyone?
I've figured out the solution that lets you keep your destination table's field as INT NOT NULL.
1. In the Power Query editor, set the data type to Int64.Type
2. Select Transform > Mark as key
3. Change the last parameter from false to true
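In case it helps, here is a minimal M sketch of roughly what those steps produce (the source table and column names are hypothetical stand-ins for your own query). Mark as key generates a Table.AddKey step, and its last argument (isPrimary) is what step 3 changes to true:

let
    // hypothetical stand-in for the real source query
    Source = Table.FromRecords({[ID = 1, Name = "A"], [ID = 2, Name = "B"]}),
    // step 1: explicitly type the ID column as a 64-bit integer
    TypedId = Table.TransformColumnTypes(Source, {{"ID", Int64.Type}}),
    // steps 2-3: mark ID as a key, with the last argument (isPrimary) set to true
    MarkedKey = Table.AddKey(TypedId, {"ID"}, true)
in
    MarkedKey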
I don't see this option in Fabric. For future reference, can you explain where it is, please? This is what I can see:
Thanks.
It is in the Transform ribbon (not the shortcut in the Home ribbon):
Thanks for that suggestion. Do you also have a workaround for VARCHAR(??) NOT NULL fields?
I don't seem to be getting the same issue in this case. What error are you seeing when you try to run the dataflow? Have you tried the steps listed by @v-kongfanf-msft in their last reply?
Hi @Mrizzo21,
Welcome to the Fabric Community! It’s great to hear that you’ve found it helpful.🙂
The issue you're encountering with your Gen2 Dataflows sounds frustrating, especially since it seems to have arisen suddenly. The error message indicating that a source column is nullable while the destination column is not can often be linked to changes in the data itself or in the schema of the tables you're writing to.
Here are a few suggestions to troubleshoot and resolve the issue:
If the problem persists, it might be worth reaching out to the support team for additional insights, especially since it seems to have changed unexpectedly.
Good luck, and I hope you get this resolved soon!
Best regards,
Sai
Data Engineer | 3x Microsoft Certified
If this helps, kindly mark this as the accepted solution! 👍
This happened to me today as well. I tried to drop the primary key from the Warehouse table and republish the dataflow, but it still threw the same error! I have filtered out null rows from these non-nullable columns. I suspect I need to alter the column from NOT NULL to nullable, and since Warehouse STILL does not support ALTER COLUMN, I would need to drop and recreate the table. This is annoying.
Hi @Mrizzo21,
It seems to me that a null value in the ID column has entered your input data and is causing the error when you refresh the data. The table in the destination is fixed with NOT NULL for the ID column. If there always needs to be a value in the ID column (which makes sense for an ID column), you could remove the rows where the ID column is null, as in the sketch below. If you would like to allow null values, you should delete the destination table and recreate it.
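A minimal sketch of that filter step in M, assuming the column is literally named ID (adjust the names to your own query):

let
    // hypothetical source standing in for your Excel query
    Source = Table.FromRecords({[ID = 1, Value = "a"], [ID = null, Value = "b"]}),
    // keep only the rows where the ID column is not null
    RemovedNullIds = Table.SelectRows(Source, each [ID] <> null)
in
    RemovedNullIds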
I'm seeing the same issue.
Sounds like a bug introduced in the dataflows: they seem to compare the nullability of the source and destination columns instead of actually checking for the occurrence of null values in the source.
Anyone had any luck resolving it? My destination table needs the column to be not null.
Also just started getting this issue when trying to re-publish an already functioning Dataflow Gen2. The source field (SQL Server accessed via recently upgraded Data Gateway) is a Primary Key and thus not null. So it's not clear what the system is checking.
Hi FabianSchut, the problem is that I've checked the output queries, line by line, and there are no NULL values being generated anywhere. I've tried looking through the JSON code as well.
This happened to me as well for the first time today. I have no null values and have also filtered to remove empty values just in case. It is the column that will be used as the primary key field in SQL.