huzi-m
New Member

Does it matter how granular we make our partitions?

So I'm looking to set up incremental refresh for a fairly large dataset, about 6 GB in size. I was wondering whether the number of partitions we create for the 'archived' data affects performance. For example, if we're looking to archive 5 years' worth of data, this could be done as years, quarters, months or even days. Choosing years we'd have 5 partitions, whereas choosing months we'd have 60.

 

Choosing months gives the advantage of being able to refresh a particular archived month if needed, rather than the whole year. But I'm not sure whether there are any disadvantages. Perhaps someone has done some testing? I'd be interested in seeing the results if so.
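
For reference, the partition counts in the question can be checked with a short sketch. The date range below (2020-01-01 up to, but not including, 2025-01-01) is an assumption chosen only to represent a 5-year archive, and `partition_counts` is a hypothetical helper, not part of Power BI.

```python
from datetime import date


def partition_counts(start: date, end: date) -> dict[str, int]:
    """Count partitions covering [start, end) at each granularity.

    Assumes both dates fall on a year boundary, so every partition is whole.
    """
    years = end.year - start.year
    months = years * 12 + (end.month - start.month)
    return {
        "year": years,
        "quarter": months // 3,
        "month": months,
        "day": (end - start).days,
    }


# Assumed 5-year archive window (illustrative dates only).
counts = partition_counts(date(2020, 1, 1), date(2025, 1, 1))
for granularity, n in counts.items():
    print(f"{granularity}: {n} partitions")
```

With these assumed dates the sketch gives 5 yearly, 20 quarterly and 60 monthly partitions; daily granularity runs to well over a thousand.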

1 ACCEPTED SOLUTION
lbendlin
Super User

Make your partitions as big as you can, but not bigger.  I have scenarios where the partition size is dictated by the source system performance.  For example the source system conks out after 500M rows, which covers about 2.5 months.  So - monthly partitions with about 200M rows each it is (to be on the safe side).  Yes, it's 60 partitions, but they are guaranteed to work.  Quarterly partitions would have been risky/pointless.

 

Your situation may vary, but it most likely will also be dictated by the capabilities of the source system. If that can easily handle yearly queries then use year partitions.


5 REPLIES
  • Right, but do you have any evidence of that being so? For example, say my source can handle yearly partitions. Why should I choose yearly over monthly? Perhaps there's a blog post I can refer to.

You should choose the minimum possible number of partitions.

Could you please provide some evidence for this?

Occam's Razor.
