coolie
Helper I

Partitioning Strategy

I've partitioned my Datalake table (<10 million records) by product (10 values), year, month, and day. Data gets updated every day (about 1,000-40,000 records per product per day), so this seemed like a good strategy, but I recently saw a warning message in the SQL endpoint pointing out that query performance may be slow because I have exceeded a guardrail. The guardrail I am exceeding seems to be the number of files per table, which for my SKU is about 1,000. This seems an impossible guardrail to meet, as every update creates a new Parquet file. Even if I remove the day partitions and rewrite the table, I am still going to get a new Parquet file per day per product, so I will still exceed the guardrail after a few months. What am I missing here?

1 ACCEPTED SOLUTION
v-achippa
Community Support

Hi @coolie,

 

Thank you for reaching out to Microsoft Fabric Community.

 

The issue here is that the table exceeds the files-per-table guardrail, which can degrade query performance. This happens because the daily updates generate an excessive number of small Parquet files.

  • Run OPTIMIZE regularly to compact small files into larger ones. Enable Auto-Compaction, and use MERGE to update existing records instead of appending new files daily.
  • The current partitioning strategy creates too many partitions and small files. If possible, remove the day partition to reduce the file count, so the structure becomes: Product → Year → Month
  • Implement data retention policies to delete or archive older data.

These steps will reduce the file count and improve performance.
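As a rough illustration of the maintenance steps above, here is a minimal Python sketch that builds the Spark SQL statements for a Delta table. The table name `sales_by_product` and the helper names are hypothetical, the `delta.autoOptimize.autoCompact` table property is assumed to be honoured by your Spark runtime, and `VACUUM` is an extra step not mentioned in the reply (it physically removes files the table no longer references once the retention window has passed). Treat it as a starting point, not a definitive implementation.

```python
def maintenance_statements(table: str, retain_hours: int = 168) -> list[str]:
    """Build Spark SQL statements for routine Delta table maintenance.

    `table` is a hypothetical table name; `retain_hours` defaults to 7 days.
    """
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    return [
        # Assumption: the runtime supports this Delta table property.
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        "('delta.autoOptimize.autoCompact' = 'true')",
        # Compact many small Parquet files into fewer, larger ones.
        f"OPTIMIZE {table}",
        # Delete data files no longer referenced by the table.
        f"VACUUM {table} RETAIN {retain_hours} HOURS",
    ]


def run_maintenance(spark, table: str) -> None:
    """Execute each statement through an existing Spark session."""
    for statement in maintenance_statements(table):
        spark.sql(statement)
```

In a notebook that already provides a `spark` session, this could be scheduled after the daily load, for example `run_maintenance(spark, "sales_by_product")`.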

 

If this post helps, please consider accepting it as the solution so other members can find it more quickly. A "Kudos" would also be truly appreciated!

 

Thanks and regards,

Anjan Kumar Chippa


6 REPLIES
ObungiNiels
Resolver III

Hi @coolie ,

I agree with what has been said. There are a lot of neat automatic features that help you optimize your Parquet files; running OPTIMIZE when you load the data is one of them.

However, OPTIMIZE can only reduce the number of files to a minimum of one per partition. Since you specified the partition columns yourself, you can check the size of the individual files to see whether you chose them too granular. 1-2 GB per file is a good size and not too much. If your files are much smaller than that, you might want to rethink the partitioning.
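To make the "one file per partition" floor concrete, here is a small back-of-the-envelope sketch. It uses the numbers from the question (10 products, a guardrail of about 1,000 files) and assumes a 365-day year and a perfectly compacted table; the function names are made up for illustration.

```python
# Number of time partitions created per year at each granularity.
PERIODS_PER_YEAR = {"day": 365, "month": 12, "year": 1}


def min_files(products: int, years: int, granularity: str) -> int:
    """Smallest file count OPTIMIZE can reach: one file per partition."""
    return products * years * PERIODS_PER_YEAR[granularity]


def years_until_limit(products: int, granularity: str, limit: int = 1000) -> float:
    """Years of data before even a fully compacted table exceeds `limit` files."""
    return limit / (products * PERIODS_PER_YEAR[granularity])


for granularity in ("day", "month", "year"):
    print(granularity, min_files(10, 1, granularity))
# day 3650
# month 120
# year 10
```

With daily partitions, a single year of data already needs at least 3,650 files, so no amount of compaction gets under the guardrail. With product and month partitions the floor is 120 files per year, which stays under 1,000 for roughly eight years, provided the table is compacted regularly.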

Good luck! 🙂  

Niels 

v-achippa
Community Support


Hi @coolie,

 

As we haven’t heard back from you, we wanted to follow up and check whether the solution I provided resolved the issue. Please let us know if you need any further assistance.
If my response addressed your question, please mark it as "Accept as solution" and click "Yes" if you found it helpful.

 

Thanks and regards,

Anjan Kumar Chippa


lbendlin
Super User

Your partitions are way too detailed. A typical Parquet file should be around 2GB in size.
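A quick estimate shows how far the current layout is from that target. This is only a sketch: the figure of 100 bytes per row on disk is an assumption made up for illustration (measure your own table), and the helper name is hypothetical. The row count is the upper bound from the question.

```python
def avg_file_mb(total_rows: int, bytes_per_row: int, partitions: int) -> float:
    """Average size in MB of one file per partition, assuming even distribution."""
    return total_rows * bytes_per_row / partitions / 1_000_000


TOTAL_ROWS = 10_000_000   # upper bound stated in the question
BYTES_PER_ROW = 100       # assumption: compressed on-disk size per row

# Partition counts for one year of data across 10 products.
for label, partitions in (("product/day", 3650), ("product/month", 120), ("product", 10)):
    print(label, round(avg_file_mb(TOTAL_ROWS, BYTES_PER_ROW, partitions), 2))
# product/day 0.27
# product/month 8.33
# product 100.0
```

Under these assumptions the whole table is about 1 GB, so even partitioning by product alone yields files far below 2 GB, and a table this small may not need partitioning at all.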
