sodapepper
Frequent Visitor

Dataflow no longer working due to memory issue

I have a dataflow that last worked on 9/17.  According to the refresh history it processed 8M rows.  Yesterday I tried to run the same dataflow and received this error.  Nothing has changed with the dataflow.

 

Append: Error Code: Mashup Exception Data Format Error, Error Details: Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataFormat.Error: Failed to insert a table., Underlying error: Parquet: class parquet::ParquetStatusException (message: 'Out of memory: malloc of size 1610612736 failed') Details: Reason = DataFormat.Error;Message = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: malloc of size 1610612736 failed');Message.Format = Parquet: class parquet::ParquetStatusException (message: 'Out of memory: malloc of size 1610612736 failed');Microsoft.Data.Mashup.Error.Context = System (Request ID: 127d0336-cc7a-406c-9256-160c05fe40b6).
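For scale, the failed allocation reported in the message above is a single request of 1,610,612,736 bytes. A minimal Python sketch of the unit conversion follows; the helper name `bytes_to_gib` is hypothetical and used only for illustration.

```python
def bytes_to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 1024**3 bytes)."""
    return num_bytes / (1024 ** 3)


# Size taken verbatim from the error message: "malloc of size 1610612736 failed"
failed_allocation_bytes = 1610612736

print(f"{bytes_to_gib(failed_allocation_bytes):.2f} GiB")  # prints "1.50 GiB"
```

So the engine failed while requesting one contiguous 1.5 GiB buffer, not while accumulating many small allocations.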

 

The dataflow is taking one column from two tables, appending them together, removing duplicates, and then adding an index.  How come I'm getting this error when I changed nothing with my dataflow?
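The steps described above (append two single-column tables, remove duplicates, add an index) can be sketched roughly as below. This is an illustration in Python under stated assumptions, not the actual dataflow query: the function name `append_distinct_index` is hypothetical, the values are assumed to be strings, and the index is assumed to start at 0.

```python
from itertools import chain
from typing import Iterable, Iterator, Tuple


def append_distinct_index(
    first: Iterable[str], second: Iterable[str]
) -> Iterator[Tuple[int, str]]:
    """Append two columns, drop duplicates, and number the surviving rows."""
    seen = set()  # every distinct value seen so far must be kept in memory
    index = 0     # assumption: the index column starts at 0
    for value in chain(first, second):  # the "append" step
        if value in seen:               # the "remove duplicates" step
            continue
        seen.add(value)
        yield index, value              # the "add index" step
        index += 1


rows = list(append_distinct_index(["a", "b"], ["b", "c"]))
print(rows)  # [(0, 'a'), (1, 'b'), (2, 'c')]
```

Note that the `seen` set grows with the number of distinct values, so in a sketch like this the memory needed depends on the input data even when the query itself is unchanged.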

2 REPLIES
Anonymous
Not applicable

Hi @sodapepper ,

 

Regarding the memory issue: if the configured data destination is a lakehouse, the data must be written as Parquet, so the dataflow engine buffers all of the data and converts it to Parquet, which is quite memory intensive.

 

Monitor the CPU, memory, and network usage of the dataflow job to identify any potential bottlenecks. This can help you determine whether the dataflow is running out of memory because of resource constraints.

 

In addition, proper use of staging can improve processing performance. Refer to the following documentation:

Dataflow Gen2 data destinations and managed settings - Microsoft Fabric | Microsoft Learn

An overview of refresh history and monitoring for dataflows. - Microsoft Fabric | Microsoft Learn

 

Best Regards,
Adamk Kong

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

 

Hi, the data destination is not lakehouse.  I do not have a data destination configured.

 

How can I monitor the CPU, memory, and network usage of the dataflow job? The Monitoring hub provides none of these details.
