Mrizzo21
New Member

Gen2 Dataflow Started Failing - Nullable Issue

First time posting to the Fabric Community, but this community has been a HUGE resource for me in all my work, so thank you everyone.

I have a dozen or so Gen2 Dataflows that pull Excel files from a SharePoint location and write to Fabric Warehouse tables. Occasionally I get refresh errors caused by new issues in the raw data, and up until last week I could go in and correct the query step, or create a query step to handle the data outlier (e.g., a new divide-by-zero error), republish the dataflow, and move on.

Now, when I correct the query issue (typically a mashup error) and republish the dataflow, each one fails to publish with the following error:

 

“A source column (see name in the details below) is nullable but the destination column (see name in the details below) is not (see query name in the details below).”

 

These all seem to reference the query/table's ID column, which has no nullable values in the dataflow. None of the query modifications I've made affect the ID attribute in the query. In the Fabric Warehouse, the only column in the table that is set to NOT NULL is the ID column.

Again, up until last week there had been zero issues with nullable values, but now if I try to re-publish any Gen2 Dataflow that writes to a Warehouse table with an ID column set to NOT NULL, I get this error.

Anyone have any insight or know about new issues?

Thanks,

15 REPLIES
v-kongfanf-msft
Community Support

Hi @Mrizzo21 ,

 

You can try the following:

  • First, make sure that the source data does not contain any null values for the VARCHAR fields. You can set default values for these fields in the source data to avoid null values.
  • Second, before loading the data into the target table, you can transform the data to replace null values with default strings. This can be done with a data conversion tool or in a dataflow query (see the sketch after this list).
  • Finally, implement error handling in the dataflow to catch and handle null values during data loading.
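
As a minimal sketch of the second point, assuming a hypothetical text column named Description (the column name and sample rows are illustrative, not from the original dataflow):

    let
        // Inline sample standing in for the SharePoint/Excel source
        Source = #table(
            type table [ID = number, Description = text],
            {{1, "ok"}, {2, null}}
        ),
        // Replace null values in the text column with an empty-string default
        // before the data reaches the Warehouse destination
        ReplacedNulls = Table.ReplaceValue(
            Source,
            null,
            "",
            Replacer.ReplaceValue,
            {"Description"}
        )
    in
        ReplacedNulls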

 

Best Regards,
Adamk Kong

If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.

 

 

mcwires
Regular Visitor

I may have worked out what the issue is and have come up with a shoddy workaround.

 

It appears to be a data typing issue. Until recently, Fabric was weakly typed. If you are pulling in data from a data source, Power Query M seems to use its own data types, e.g. number, which is not nullable unless specifically declared otherwise, e.g.

 

type nullable number

 

But writing a number to an integer field (which ID fields often are) requires that the data be converted appropriately, e.g. to Int64.Type. The kicker here is that Int64.Type *is* nullable, and so cannot be written to a NOT NULL field.
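
As an illustration of that distinction, here is a minimal sketch (the ID column and sample rows are hypothetical, not taken from the thread). Ascribing Int64.Type with Table.TransformColumnTypes still leaves the column nullable, as described above:

    let
        // Inline sample data; ID starts out as a plain M number
        Source = #table(
            type table [ID = number, Name = text],
            {{1, "a"}, {2, "b"}}
        ),
        // Ascribe a 64-bit integer type to ID. Per the explanation above,
        // Int64.Type is itself nullable, so the destination mapping still
        // sees a nullable source column.
        ChangedType = Table.TransformColumnTypes(Source, {{"ID", Int64.Type}})
    in
        ChangedType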

 

My workaround is to change integer not null fields to decimal(10,0) not null. Has anybody got anything better?

Coming back to this again after some distractions, and the issue seems to have simply gone away. I have reverted my decimal (10,0) not null to integer not null and there is no issue.

This is one of the significant issues I have with Cloud implementations of anything: we have no control and no knowledge of what changes are being implemented. Fabric release notes, anyone?

I've figured out the solution that lets you keep your destination table's field as INT NOT NULL.

 

1. In the Power Query editor, set the data type to Int64.Type

2. Select Transform > Mark as key

3. Change the last parameter from false to true
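
For reference, a rough sketch of the M that those three steps appear to generate (the ID column name and sample rows are placeholders): "Mark as key" adds a Table.AddKey step, and changing its last argument from false to true marks the key as primary.

    let
        Source = #table(
            type table [ID = number, Name = text],
            {{1, "a"}, {2, "b"}}
        ),
        // Step 1: set the data type to Int64.Type
        #"Changed Type" = Table.TransformColumnTypes(Source, {{"ID", Int64.Type}}),
        // Steps 2-3: "Mark as key" adds Table.AddKey; changing the last
        // parameter from false to true marks ID as a primary key
        #"Marked Key" = Table.AddKey(#"Changed Type", {"ID"}, true)
    in
        #"Marked Key"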

I don't see this option in Fabric. For future reference, can you explain where it is, please? This is what I can see:

(screenshot: mcwires_0-1731329701467.png)

Thanks.

It is in the Transform ribbon (not the shortcut in the Home ribbon):

 

(screenshot: lukemiller_0-1731330026665.png)

 

Thanks for that suggestion. Do you also have a workaround for VARCHAR(??) NOT NULL fields?

I don't seem to be getting the same issue in this case. What error are you seeing when you try to run the dataflow? Have you tried the steps listed by @v-kongfanf-msft in their last reply?

 
nsprudhvi
Frequent Visitor

Hi @Mrizzo21,

 

Welcome to the Fabric Community! It’s great to hear that you’ve found it helpful.🙂

 

The issue you're encountering with your Gen2 Dataflows sounds frustrating, especially since it seems to have arisen suddenly. The error message indicating that a source column is nullable while the destination column is not can often be linked to changes in the data itself or in the schema of the tables you're writing to.

 

Here are a few suggestions to troubleshoot and resolve the issue:

  • Check Source Data: Even if you believe there are no nullable values in the source data, it’s worth double-checking for any new rows or changes in the Excel files that could introduce null values.
  • Review Schema: Verify that the schema of your destination Fabric Warehouse tables has not changed. Ensure that the ID column and any other relevant columns are set correctly to handle the incoming data.
  • Modify Query Steps: If there are any transformations or query steps that might inadvertently introduce null values, consider adding checks or filters to handle those cases explicitly before writing to the destination (a sketch follows this list).
  • Temporary Changes: As a test, you might try allowing nulls in the destination table for a specific run to see if that resolves the issue. This could help identify whether the problem is indeed related to nullability.
  • Check for Recent Updates: Since the issue started recently, check if there have been any updates or changes to the Fabric platform or dataflow functionalities that could have introduced this behavior.
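
As a minimal sketch of the kind of explicit filter mentioned above (the ID column and sample rows are hypothetical, not from the original dataflow):

    let
        Source = #table(
            type table [ID = number, Name = text],
            {{1, "a"}, {null, "b"}}
        ),
        // Keep only rows where ID is present, so the NOT NULL destination
        // column never receives a null
        RemovedNullIDs = Table.SelectRows(Source, each [ID] <> null)
    in
        RemovedNullIDs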

If the problem persists, it might be worth reaching out to the support team for additional insights, especially since it seems to have changed unexpectedly.

 

Good luck, and I hope you get this resolved soon!

 

Best regards,
Sai 
Data Engineer | 3x Microsoft Certified

 

If this helps, kindly mark this as the accepted solution! 👍

savvyt
Regular Visitor

This happened to me today as well. I tried to drop the primary key from the Warehouse table and re-publish the dataflow, but it still threw the same error! I have filtered out null rows from these non-nullable columns. I suspect I need to alter the column from non-null to nullable, and since Warehouse STILL does not support ALTER COLUMN, I would need to drop and create a new table. This is annoying.

FabianSchut
Solution Sage

Hi @Mrizzo21,

 

It seems to me that a null value in the ID column has entered your input data and is giving the error when you refresh the data. The table in the destination is fixed with NOT NULL for the ID column. If there always needs to be a value for the ID column (which makes sense for an ID column), you could remove the rows where the ID column is null. If you would like to allow null values, you should delete the destination table and recreate it.

I'm seeing the same issue.

Sounds like a bug introduced in the dataflows: they compare the nullability of the source column against the destination column instead of actually checking for the occurrence of null values in the source.

Anyone had any luck resolving it? My destination table needs the column to be not null.

Also just started getting this issue when trying to re-publish an already functioning Dataflow Gen2. The source field (SQL Server accessed via a recently upgraded Data Gateway) is a Primary Key and thus not null, so it's not clear what the system is checking.

Hi FabianSchut, the problem is that I've checked the output queries, line by line, and there are no NULL values being generated anywhere. I've tried looking through the JSON code as well.

This has happened to me as well for the first time today. I have no values that are null and have also filtered to remove empty values just in case. It is the column that will be used as the primary key field in SQL.
