YCastano
Frequent Visitor

Optimize saving to a Lakehouse table

I have a notebook that processes DataFrames and dictionaries in PySpark and then saves them to a Lakehouse table, but the save takes a long time. I have tried writeTo().append() and write.csv(), but both are slower than I need. How can I optimize loading into the Lakehouse table?

 

[Attached screenshots: YCastano_0-1721976805502.png, YCastano_1-1721976947609.png, YCastano_2-1721977010581.png]

 

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @YCastano ,

 

I have some suggestions to offer here:

 

Move anything that does not change between iterations, such as static fields in the query, outside the for loop. Extract the logic that the iterations share into a helper method, and use a condition inside it to handle the cases that need a specific statement.
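A minimal sketch of that restructuring, assuming the loop currently rebuilds the same values and writes on every iteration (the screenshots are not available here, so this is a guess at the shape of the code). The names build_rows, append_once, COLUMNS, source_name and the column names are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical column layout for the target table.
COLUMNS = ["id", "value", "status", "source", "loaded_at"]


def build_rows(records, source_name):
    """Turn a list of dicts into tuples, computing loop-invariant values once."""
    # Static values: computed a single time, outside the loop.
    loaded_at = datetime.now(timezone.utc).isoformat()

    rows = []
    for rec in records:
        # Only per-record work stays inside the loop, including the
        # condition that picks the record-specific value.
        status = "missing" if rec.get("value") is None else "ok"
        rows.append((rec["id"], rec.get("value"), status, source_name, loaded_at))
    return rows


def append_once(spark, rows, table_name):
    """Create one DataFrame from all rows and append it in a single write."""
    df = spark.createDataFrame(rows, schema=COLUMNS)
    df.write.mode("append").saveAsTable(table_name)
```

In a notebook this might be called as append_once(spark, build_rows(records, "api"), "my_table"), so the table is written once per batch rather than once per record.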

 

Split the code into multiple cells instead of running everything in one. For example, save the processing results to a temporary file first, then continue processing that file in the next cell. Once processing is complete, you can delete the temporary file.
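One way the staged approach could look, as a sketch only. The function names, the staging path and the choice of Parquet for the temporary files are assumptions, and each function stands in for a separate notebook cell. Deletion is passed in as a callable because the right call depends on the environment; in Fabric notebooks, mssparkutils.fs.rm(path, True) is a likely candidate, but check it against your runtime.

```python
def stage_intermediate(df, staging_path):
    """Cell 1: persist the processed DataFrame so later cells do not recompute it."""
    df.write.mode("overwrite").parquet(staging_path)


def load_and_append(spark, staging_path, table_name):
    """Cell 2: read the staged files back and append them to the target table."""
    staged = spark.read.parquet(staging_path)
    staged.write.mode("append").saveAsTable(table_name)
    return staged


def cleanup(staging_path, remove_path):
    """Cell 3: delete the staged files once the table write has succeeded.

    remove_path is any callable that deletes a path recursively
    (an assumption; supply the one your environment provides).
    """
    remove_path(staging_path)
```

Reading the staged files back cuts the lineage, so a failure in the final write does not force the earlier processing to run again.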

 

Avoid transformations such as groupBy and join unless they are necessary, because they shuffle data across the cluster.

 

If an intermediate result is reused multiple times, keep it (for example, by caching it) rather than recomputing it each time.
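A small sketch of keeping an intermediate result, assuming the same DataFrame feeds more than one action (here a row count and the table write). The function name append_with_row_count is hypothetical; cache() and unpersist() are standard DataFrame methods.

```python
def append_with_row_count(df, table_name):
    """Append df to a table and return how many rows were written."""
    # cache() only marks the DataFrame for reuse; it is materialised
    # by the first action that runs on it.
    cached = df.cache()
    try:
        row_count = cached.count()  # first action: fills the cache
        cached.write.mode("append").saveAsTable(table_name)  # second action: reuses it
        return row_count
    finally:
        # Release the cached data whether or not the write succeeded.
        cached.unpersist()
```

Without the cache, each action would re-run the whole chain of transformations that produced df. Caching only pays off when the result really is used more than once.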

 

Best Regards,
Yang
Community Support Team

 

If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!

