Re: Error using data imported through DataFlow whe...

m_cherriman · ‎11-13-2023

Since Sunday (13-11-2023) I have been getting this error message on data imported through DataFlows. Previously they have been working. Has anyone else experienced this issue?

SparkRuntimeException: Error while decoding: java.lang.IllegalArgumentException: requirement failed: Mismatched minReaderVersion and readerFeatures. newInstance(class scala.Tuple3).

The Data Flows connect to an on-prem SQL Server database. I'm able to query the table in the SQL endpoint, but I cannot read the table into a dataframe. I've tried restricting the data to a couple of rows to see if there is an unusual item in any of the queries, but found nothing obvious, and they previously worked. To get round this I've had to extract the data into CSV files (using a Python script in VS Code), upload the CSVs to the Files folder, and then import those files into a new table, which can be read using PySpark.

I've had a couple of issues with Data Flows over the last week which makes me wonder how reliable they are.

souldish · ‎12-27-2023

Still seeing issues with this. I'm not connected to a gateway and am using DFG2 to connect to a REST API and copy the data to a table in the lakehouse.

Any updates on this issue? It makes DFG2 completely unusable.

souldish · ‎12-28-2023

I think I got the issue resolved. I'm not sure what exactly fixed it, but I removed a previous connection to an on-prem gateway in my DFG2, even though the current flow wasn't using a gateway at all. Then I recreated the dataflow in a new DFG2. Something about the connection to the gateway was still being written to the files even though I wasn't using the on-prem gateway connection. Weird.

naanii · ‎12-12-2023

Hello,

I am still experiencing the same issue.

Any update about this case?

EsteraKot · ‎03-04-2024

If you are using the gateway, make sure that you are using the latest version. The Dataflows Gen2 has been updated in all regions by 2/23. Let me know if that works.

KA78 · ‎12-12-2023

Hi naanii, you'll also have to update your Power BI gateway.

Reidy · ‎12-27-2023

Hi,

We are still having this issue. What do you mean by "update your PowerBI gateway"? Any help would be greatly appreciated.

miguel · ‎12-28-2023

Hey! I'd suggest starting a new thread so we can talk about your particular scenario in more detail. If your dataflow is not using a gateway then the suggestion made by other users in this thread won't apply to your scenario.

You can also raise a support ticket so you can engage directly with our support team to troubleshoot the issue.

naanii · ‎12-12-2023

Thanks for the solution!

KA78 · ‎12-01-2023

UPDATE:

I also created a ticket for this bug. I just received an email from support telling me it will be fixed this week ! 😀

"Would like to inform you that I have received an update from engineering team. The update says it was a dataflow bug whose fix will be done and rolled out to production by the end of this week.

For Dataflows Gen2, the fix should reach all production regions by the end of this week."

Thanks @Microsoft for the great support!

It would be nice if we could see this kind of major issue in the known issues list in the future : Microsoft Fabric Known Issues

KA78 · ‎12-01-2023

@jcvega I'm just tagging you here so you'll get a notice that you don't have to put effort in the workaround. Should be fixed this week.

KA78 · ‎11-28-2023

So, I found the root of the bug and also a temporary workaround.

The bug is that Dataflows Gen2 writes this to the JSON Delta log files:

{"protocol":{"minReaderVersion":1,"minWriterVersion":2,"readerFeatures":[],"writerFeatures":[]}}

That's the reason for this error message: requirement failed: Mismatched minReaderVersion and readerFeatures
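For anyone wondering why that line trips the reader: as far as I understand the Delta protocol, the readerFeatures and writerFeatures keys are only expected once a table opts into table features (reader version 3, writer version 7), and an empty list still counts as "present". Below is a minimal Python sketch of that rule. The function name and the two thresholds are my own assumptions, not something taken from the Fabric or Delta source code.

```python
from typing import List

# Assumption: feature lists are only valid from these protocol versions on.
READER_FEATURES_VERSION = 3
WRITER_FEATURES_VERSION = 7


def protocol_problems(protocol: dict) -> List[str]:
    """Return the mismatches found in a Delta 'protocol' action (hypothetical helper)."""
    problems = []
    reader_supports = protocol.get("minReaderVersion", 1) >= READER_FEATURES_VERSION
    writer_supports = protocol.get("minWriterVersion", 1) >= WRITER_FEATURES_VERSION
    # A key that is present with an empty list still counts as present.
    if reader_supports != ("readerFeatures" in protocol):
        problems.append("Mismatched minReaderVersion and readerFeatures")
    if writer_supports != ("writerFeatures" in protocol):
        problems.append("Mismatched minWriterVersion and writerFeatures")
    return problems
```

With the protocol line quoted above (versions 1 and 2 plus two empty lists) this returns both messages. With only minReaderVersion and minWriterVersion present it returns an empty list.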

A temporary fix is to change this string in all the _delta_log json files for the table to:

{"protocol":{"minReaderVersion":1,"minWriterVersion":2}}

You can change the log files within OneLake in your file browser.
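The manual edit described above could be scripted roughly as follows. This is only a sketch under a few assumptions: the table's _delta_log folder is reachable as a local path (for example through a OneLake sync), the function names and the backup folder are hypothetical, and empty feature lists are only removed when the versions are below 3 (reader) and 7 (writer), which is my reading of the Delta protocol rather than anything confirmed in this thread.

```python
import json
import shutil
from pathlib import Path
from typing import List

# Assumption: feature lists are only expected from these versions on.
READER_FEATURES_VERSION = 3
WRITER_FEATURES_VERSION = 7


def strip_empty_features(line: str) -> str:
    """Remove empty feature lists from a protocol action; return other lines unchanged."""
    if not line.strip():
        return line
    action = json.loads(line)
    protocol = action.get("protocol") if isinstance(action, dict) else None
    if not isinstance(protocol, dict):
        return line
    rules = (
        ("readerFeatures", "minReaderVersion", READER_FEATURES_VERSION),
        ("writerFeatures", "minWriterVersion", WRITER_FEATURES_VERSION),
    )
    changed = False
    for feature_key, version_key, threshold in rules:
        if protocol.get(feature_key) == [] and protocol.get(version_key, 1) < threshold:
            del protocol[feature_key]
            changed = True
    return json.dumps(action, separators=(",", ":")) if changed else line


def patch_delta_log(delta_log_dir: Path, backup_dir: Path) -> List[str]:
    """Patch every commit file in delta_log_dir; return the names of changed files."""
    patched_names = []
    for commit in sorted(delta_log_dir.glob("*.json")):
        # Work on bytes so existing line endings are preserved exactly.
        original = commit.read_bytes().decode("utf-8")
        pieces = []
        for line in original.split("\n"):
            body = line.rstrip("\r")
            pieces.append(strip_empty_features(body) + line[len(body):])
        patched = "\n".join(pieces)
        if patched == original:
            continue
        # Keep an untouched copy outside _delta_log before overwriting.
        backup_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(commit, backup_dir / commit.name)
        commit.write_bytes(patched.encode("utf-8"))
        patched_names.append(commit.name)
    return patched_names
```

Calling patch_delta_log with the path to the table's _delta_log folder and a backup folder of your choice returns the names of the files it changed. Files that are already clean are left alone, so it should be safe to run it again after a new log file appears.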

jcvega · ‎11-30-2023

Thanks for the workaround. I had the same issue. How can I edit these files?

KA78 · ‎12-01-2023

You can change the files with the OneLake plugin that you can install in your Windows File Explorer: https://www.microsoft.com/en-us/download/details.aspx?id=105222

Keep in mind that every time you execute the dataflow, you'll have to modify the new log file that has been created for this workaround. Hopefully they'll quickly fix this bug.

O_Tahmas · ‎04-16-2024

I'm still having this issue.

I load data from my on-prem SQL Server using the `copy data` option, then read it in Fabric notebooks.

I frequently get this error:
SparkRuntimeException: Error while decoding: java.lang.IllegalArgumentException: requirement failed: Mismatched minReaderVersion and readerFeatures.

When running a pipeline with the:

copy data (which reads from on-prem and copies to LH Table with overwrite) --> Notebook --> (anything else)

This will break when running the 2nd or 3rd time. Super frustrating, and feels like this shouldn't really be happening at this point.

rowrowrow2 · ‎04-26-2024

I just started having the same issue.

Reidy · ‎04-26-2024

Hi,

We managed to resolve this issue by changing the Spark runtime version. You'll need to restart your notebook after changing the environment.

jcvega · ‎12-01-2023

Thanks very much. I suppose that Microsoft is working on this. It doesn't make sense to repeat this process every day (if the DF runs each day).

KA78 · ‎11-27-2023

Hi,

@v-cboorla-msft is there perhaps an update you can give us on this issue?

Also, if anyone has a workaround, we would appreciate it greatly. This is a real showstopper now.

Our ERP data resides in an on-premises database that we can only access with ODBC. So the only way to get this data into Fabric (when using only Fabric, as we want to stay with SaaS) is by using Dataflow Gen2. And now the tables created by Dataflow Gen2 can't be used in a notebook. So we can't use a notebook / PySpark to transform our data...

Does anyone know of a workaround, inside Fabric, so that we can combine ODBC on prem data + notebooks? Any help is greatly appreciated.

@m_cherriman , did you perhaps find a solution?

Thanks in advance for any update or insight into this issue.

v-cboorla-msft · ‎11-27-2023

Hi @KA78

Thanks for using Fabric Community and posting your question.

Can you please create a new post as the initial ask is different from your issue? We will definitely look into the issue and help.

Thanks for understanding.

KA78 · ‎11-28-2023

Thanks @v-cboorla-msft for your reply. I'll create a new post for my question regarding the workaround.

However, the root of the problem is the same as the initial ask. Is there perhaps something you can share about the status of the initial ask: "Error using data imported through DataFlow when using pyspark"? Is this a known issue?

(related to this error : "SparkRuntimeException: Error while decoding: java.lang.IllegalArgumentException: requirement failed: Mismatched minReaderVersion and readerFeatures. newInstance(class scala.Tuple3)." )
