jwryu
Helper II

OPTIMIZE and ZORDER command resource consumption

OPTIMIZE table
OPTIMIZE table ZORDER BY column

 

I ran each of the queries above in its own notebook. In Fabric Capacity Metrics, the second query consumed fewer CU, and I do not understand why.

 

[OPTIMIZE]
CU(s): 24,000
Duration (s): 2,680

[OPTIMIZE + ZORDER]
CU(s): 13,900
Duration (s): 1,550

Does anyone know if this is normal? Please help!

 

Thanks

Regards.

 

 

1 ACCEPTED SOLUTION
Anonymous

Hi @jwryu 
Thanks for using Fabric Community.

The OPTIMIZE command in Delta Lake compacts small files into larger ones, which reduces the number of files and can improve query performance. OPTIMIZE with ZORDER BY does more than compact files.

ZORDER is a technique that reorders the data by the column named in the ZORDER BY clause, so that rows with similar values end up in the same files. This can significantly improve the performance of queries that filter on that column.

So, when you run OPTIMIZE table ZORDER BY column, it not only compacts the small files into larger ones but also reorders the data by the specified column. As a result, queries that filter on that column can read less data, which means fewer capacity units (CU) consumed and a shorter duration.
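To make the "less data read" point concrete, here is a minimal pure-Python sketch of file skipping. It is only an illustration and does not call Delta Lake: the "files" are plain lists, a simple sort stands in for Z-ordering on a single column, and the names `split_into_files` and `files_to_read` are made up for this example. The idea is that each file keeps the min and max of the column, and a filter can skip any file whose range cannot contain the value.

```python
import random

def split_into_files(values, rows_per_file):
    """Chunk values into 'files' and keep each file's (min, max) statistics."""
    files = []
    for start in range(0, len(values), rows_per_file):
        chunk = values[start:start + rows_per_file]
        files.append((min(chunk), max(chunk)))
    return files

def files_to_read(files, target):
    """Count files whose [min, max] range could contain the target value."""
    return sum(1 for lo, hi in files if lo <= target <= hi)

random.seed(0)
values = [random.randrange(1000) for _ in range(10_000)]

# Same rows, same file size; only the row order differs.
unordered = split_into_files(values, 1000)
clustered = split_into_files(sorted(values), 1000)  # sort stands in for ZORDER

target = 500
print("files read without clustering:", files_to_read(unordered, target))
print("files read with clustering:   ", files_to_read(clustered, target))
```

Without clustering, nearly every file's range covers the filter value, so all of them have to be opened. With clustering, only the file or two around that value remain candidates.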

In your case, the OPTIMIZE + ZORDER command consumed fewer capacity units (13,900 CU(s)) and took less time (1,550 seconds) than the OPTIMIZE command alone (24,000 CU(s) and 2,680 seconds). This suggests that the ZORDER optimization was effective for your particular workload and data distribution.
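As a quick sanity check on those numbers, the short calculation below divides the reported CU(s) by the duration of each run. It assumes that "cu(s)" in the metrics means CU-seconds; the dictionary layout and the `cu_rate` helper are just for this example.

```python
# Figures reported in the question (assumed to be CU-seconds and seconds).
runs = {
    "OPTIMIZE": {"cu_seconds": 24_000, "duration_s": 2_680},
    "OPTIMIZE + ZORDER": {"cu_seconds": 13_900, "duration_s": 1_550},
}

def cu_rate(run):
    """Average CU(s) consumed per second of runtime."""
    return run["cu_seconds"] / run["duration_s"]

rates = {name: cu_rate(run) for name, run in runs.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} CU(s) per second of runtime")
```

Both runs come out at roughly 9 CU(s) per second, so under that assumption the lower CU total tracks the shorter duration; neither run consumed capacity at a lower rate than the other.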

I hope this helps! Let me know if you have any other questions.
