bluelemon64
Regular Visitor

Dataflow Gen 2 Error: ParquetStatusException (message: 'Out of memory: realloc of size 33554432 fai

Hi,

 

I am trying to use a Dataflow Gen2 to merge three tables, all of which come from the same lakehouse. Each of the three tables refreshes successfully, but the Append_WriteToDataDestination action fails:

 

TableNameAppend_WriteToDataDestination: Mashup Exception Data Format Error
Couldn't refresh the entity because of an issue with the mashup document
MashupException.Error: DataFormat.Error: Error in replacing table's content with new data in a version: #{0}., Underlying error: Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 33554432 failed')
Details:
Reason = DataFormat.Error;
Message = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 33554432 failed');
Message.Format = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: realloc of size 33554432 failed');
Microsoft.Data.Mashup.Error.Context = User

1 ACCEPTED SOLUTION

Hi @pqian_MSFT, I appreciate your support. In the end I wrote a PyScript and did the operation in a notebook using my Fabric capacity. Although it took a bit of manual interpretation of the schemas, it's working well. It does, however, use up most of my capacity because of the table sizes, so I think I need to optimize my tables before running the operation.

Thanks!
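
For anyone taking the same notebook route: the script itself was not posted, so the following is only a minimal sketch of how the append might look in a Fabric notebook, assuming PySpark and Delta tables in the lakehouse. The helper names (unified_columns, select_exprs, append_tables) and any table names you pass in are hypothetical. The "manual interpretation of the schemas" step is approximated here by building one target schema from all tables and null-filling columns a table lacks.

```python
from typing import Dict, List


def unified_columns(schemas: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Combine per-table {column: spark_type} maps into one target schema.

    Column order follows first appearance. A column that has different
    types in different tables raises ValueError so it can be resolved by hand.
    """
    target: Dict[str, str] = {}
    for table, cols in schemas.items():
        for name, dtype in cols.items():
            if name in target and target[name] != dtype:
                raise ValueError(
                    f"{table}.{name} is {dtype}, expected {target[name]}"
                )
            target.setdefault(name, dtype)
    return target


def select_exprs(cols: Dict[str, str], target: Dict[str, str]) -> List[str]:
    """Spark SQL expressions that project one table onto the target schema.

    Columns missing from the table are filled with typed NULLs.
    """
    return [
        f"CAST(`{name}` AS {dtype}) AS `{name}`"
        if name in cols
        else f"CAST(NULL AS {dtype}) AS `{name}`"
        for name, dtype in target.items()
    ]


def append_tables(spark, table_names: List[str], destination: str) -> None:
    """Append the given lakehouse tables and write the result as a Delta table.

    `spark` is the notebook's SparkSession; the table names are assumptions.
    """
    schemas = {t: dict(spark.table(t).dtypes) for t in table_names}
    target = unified_columns(schemas)
    frames = [
        spark.table(t).selectExpr(*select_exprs(schemas[t], target))
        for t in table_names
    ]
    combined = frames[0]
    for frame in frames[1:]:
        combined = combined.unionByName(frame)
    combined.write.format("delta").mode("overwrite").saveAsTable(destination)
```

In a notebook this would be called along the lines of append_tables(spark, ["table_a", "table_b", "table_c"], "merged_table"), with your own table names. Failing on type conflicts instead of casting silently is a deliberate choice in this sketch, so that mismatched schemas are noticed before the write.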


5 REPLIES
v-nuoc-msft
Community Support

Hi @bluelemon64 

 

@pqian_MSFT Thank you very much for your reply.

 

Can you tell me whether your problem is solved? If so, please accept the reply as the solution.

 

Regards,

Nono Chen

pqian_MSFT
Microsoft Employee

@bluelemon64 what is the input source for your 3 tables? Did you mark these 3 tables as "enable staging" and then build a merge on top of the entities? Can you share your dataflow.json (from export) as well as the dataflow ID?

 

I think the way to go about this is to ensure the three entities are loaded straight to the LH, and then merge them from the LH itself as a separate entity\dataflow.

bluelemon64
Regular Visitor

Hi @v-nuoc-msft,

 

I read that thread; however, the solution there is not to use Dataflow Gen1 and Gen2. I'm not using any Gen1 here, as far as I understand.

 

It looks as if @pqian_MSFT was investigating a solution, but there is no resolution that I can see in the thread.

 

Thanks,

Charlie

v-nuoc-msft
Community Support

Hi @bluelemon64 

 

Someone else seems to have had the same problem as you. Here are some solutions:

 

Solved: Dataflow Gen2 Error: Out of memory: realloc of siz... - Microsoft Fabric Community

 

Regards,

Nono Chen

