

Anonymous

Number of partitions created when I output a parquet file

How can I control the number of partitions created when I output a parquet file?

1 ACCEPTED SOLUTION
chetnachaudhari
Advocate I

Hi @Anonymous,

  If you are using PySpark, you can control this by calling the repartition method or the coalesce method on your DataFrame before writing it to Parquet. Both set the number of partitions in the DataFrame, and Spark typically writes each non-empty partition as a separate Parquet file, so the partition count determines how many files are generated. repartition(n) performs a full shuffle and can either increase or decrease the partition count; coalesce(n) avoids a full shuffle but can only decrease it.

Thanks,

Chetna
