JayJay11
Resolver II

Dataflow Gen2 Error: Out of memory: realloc of size [...] failed

Hello all,

 

I have the following case: a Gen2 Dataflow picks up data from a Gen1 Dataflow and should load it into a Lakehouse. After the dataflow has run for a while, I get the following error:

 

104100 Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataFormat.Error: Error in replacing table's content with new data in a version: #{0}., Underlying error: Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 156590976 failed') Details: Reason = DataFormat.Error;Message = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 156590976 failed');Message.Format = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 156590976 failed');Microsoft.Data.Mashup.Error.Context = User

 

I don't get what the error is trying to tell me. I can load this table into a semantic model without problems within a couple of minutes.
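For readers trying to decode the message: the number after "realloc of size" is the size in bytes of the single allocation that failed, not the total memory in use. The following is a minimal Python sketch for pulling that number out of the error text and converting it to MiB. The helper names (`failed_realloc_bytes`, `to_mib`) are made up for illustration and are not part of any Fabric or Power Query tooling.

```python
import re

# Pattern for the allocation size reported in the Parquet error text.
REALLOC_PATTERN = re.compile(r"realloc of size (\d+) failed")


def failed_realloc_bytes(error_text):
    """Return the distinct allocation sizes (in bytes) named in the error text."""
    sizes = {int(match) for match in REALLOC_PATTERN.findall(error_text)}
    return sorted(sizes)


def to_mib(num_bytes):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


# Excerpt of the error message quoted in this thread.
error_text = (
    "Parquet: class parquet::ParquetStatusException "
    "(message: 'Out of memory: realloc of size 156590976 failed')"
)

sizes = failed_realloc_bytes(error_text)
for size in sizes:
    print(f"{size} bytes is about {to_mib(size):.1f} MiB")
```

For the message above this prints a size of about 149.3 MiB, so the failing request itself was fairly small. That suggests, though it does not prove, that the process was already close to its memory limit when the Parquet writer asked for more.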

1 ACCEPTED SOLUTION

To anyone wondering what the solution is: (1) accepting that there are limitations when mixing Gen1 and Gen2 Dataflows. And (2) not mixing Gen1 and Gen2 dataflows as a consequence.


6 REPLIES
pqian_MSFT
Microsoft Employee

Thanks. It looks like the engine wants to write out 100k rows and needs 4 GB of commit to do so. The cloud engine currently allows 6 GB of commit, and the Parquet write and the CSV buffering together consumed all of it.

 

We'll need to investigate what it would take to make this work in the cloud. It's a valid scenario and should just work out of the box. Unfortunately, our current limitations make this a little challenging, but I'm pretty confident that this can be solved.

 

Meanwhile, trying this scenario on a Gateway, where there is no commit limit, should get you unblocked. But I guess you are having difficulties with your GW server as well?

pqian_MSFT
Microsoft Employee

Since a Gen1 dataflow writes out CSV and the Lakehouse expects Parquet, the dataflow engine buffers all of that data and converts it to Parquet, which is quite memory intensive.

 

Can you share the request ID so I can see how much memory it's taking?
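To illustrate why buffering matters, here is a small Python sketch of the general alternative: reading CSV rows lazily and handing them on in fixed-size batches, so that only one batch is held in memory at a time. This is not how the dataflow engine is implemented, and the function name `iter_batches`, the batch size, and the sample data are all assumptions made up for the example. It uses only the standard library and stops short of writing Parquet; each batch is simply what a Parquet writer could consume.

```python
import csv
import io
from itertools import islice


def iter_batches(csv_file, batch_size):
    """Yield lists of at most batch_size rows (as dicts) from a CSV file object."""
    reader = csv.DictReader(csv_file)
    while True:
        # islice pulls only the next batch_size rows from the reader.
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Tiny in-memory CSV standing in for a Gen1 dataflow's CSV output (made-up data).
sample = io.StringIO("id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n")

batch_sizes = [len(batch) for batch in iter_batches(sample, batch_size=2)]
print(batch_sizes)
```

With five data rows and a batch size of two, the batches hold two, two, and one rows. Peak memory in a pipeline built this way depends on the batch size rather than on the size of the whole table, which is the difference from buffering everything before converting.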

Hi @pqian_MSFT, thank you. The full error message is this:

 

104100 Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataFormat.Error: Error in replacing table's content with new data in a version: #{0}., Underlying error: Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 156590976 failed') Details: Reason = DataFormat.Error;Message = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 156590976 failed');Message.Format = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 156590976 failed');Microsoft.Data.Mashup.Error.Context = User

 

Request ID = b55b0d6d-89b8-47d3-b698-636ba5791825

 

Just as a note: I understand that this is not an ideal ETL setup, but I was surprised to get an error here as well.


Hi @JayJay11
We haven't heard from you since the last response and are just checking back to see whether your query has been resolved. If not, please reply with more details and we will try to help.
Thanks

