Solved: OPTIMIZE and ZORDER command resource consumption

jwryu · ‎06-18-2024

OPTIMIZE table

OPTIMIZE table ZORDER BY column

I've executed each of the queries above in its own notebook, and in Fabric Capacity Metrics the second query consumed fewer CU... and I do not understand why.

[OPTIMIZE]

cu(s) : 24,000

duration(s) : 2,680

[OPTIMIZE + ZORDER]

cu(s) : 13,900

duration(s) : 1,550

Does anyone know if this is normal? Please help!

Thanks

Regards.

Anonymous · ‎06-18-2024

Hi @jwryu
Thanks for using Fabric Community.

The OPTIMIZE command in Delta Lake compacts small files into larger ones, which can help reduce the number of files and improve query performance. However, when you use OPTIMIZE with ZORDER BY column, it does more than just file compaction.

ZORDER is a technique that reorders the data based on the column specified in the ZORDER BY clause. This reordering of data can significantly improve the performance of queries that filter on the ZORDER column.

So, when you run OPTIMIZE table ZORDER BY column, it not only compacts the small files into larger ones but also reorders the data by the specified column. As a result, it can reduce the amount of data that needs to be read, leading to fewer compute units (CU) being used and a shorter query duration.
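To illustrate why reordering can reduce the data read, here is a minimal sketch in plain Python. It is a toy model, not Delta Lake itself: it assumes each file keeps the min and max of the column, and that a filter can skip any file whose range cannot contain the value. The function name files_to_read, the file size and the random values are all hypothetical. It models reads by later queries that filter on the column, not the cost of running OPTIMIZE.

```python
import random

def files_to_read(rows, rows_per_file, target):
    """Count files whose [min, max] range could contain `target`.

    Toy model of min/max data skipping: a file is read only if the
    filter value falls inside the range of values stored in it.
    """
    files = [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]
    return sum(1 for f in files if min(f) <= target <= max(f))

random.seed(0)
# Hypothetical column values, written in arrival (unordered) order.
values = [random.randrange(1000) for _ in range(10_000)]

# Same rows, same file size; only the ordering differs.
unordered = files_to_read(values, 1_000, target=500)
ordered = files_to_read(sorted(values), 1_000, target=500)

print(f"files read without ordering: {unordered}")
print(f"files read after ordering:   {ordered}")
```

With unordered data almost every file spans the full value range, so almost every file has to be read; after ordering, the matching rows sit in one or two adjacent files.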

In your case, the OPTIMIZE + ZORDER command used fewer compute units (13,900 CU(s)) and took less time (1,550 seconds) than the OPTIMIZE command alone (24,000 CU(s) and 2,680 seconds). This indicates that the ZORDER optimization was effective for your particular workload and data distribution.
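As a quick check on those figures, the arithmetic can be sketched in Python. The numbers are the ones posted in the question; the names runs and cu_rate are only for illustration.

```python
# Figures copied from the question (Fabric Capacity Metrics).
runs = {
    "OPTIMIZE": {"cu_s": 24_000, "duration_s": 2_680},
    "OPTIMIZE + ZORDER": {"cu_s": 13_900, "duration_s": 1_550},
}

def cu_rate(run):
    """CU(s) consumed per second of wall-clock duration."""
    return run["cu_s"] / run["duration_s"]

base = runs["OPTIMIZE"]
zorder = runs["OPTIMIZE + ZORDER"]

# How the second run compares with the first, as a fraction.
cu_ratio = zorder["cu_s"] / base["cu_s"]
duration_ratio = zorder["duration_s"] / base["duration_s"]

for name, run in runs.items():
    print(f"{name}: {cu_rate(run):.2f} CU(s) per second")
print(f"CU ratio: {cu_ratio:.2f}, duration ratio: {duration_ratio:.2f}")
```

Both runs consumed CU at almost the same rate (about 8.96 and 8.97 CU(s) per second), and both ratios come out at about 0.58, so the lower CU figure tracks the shorter duration rather than a cheaper rate.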

I hope this helps! Let me know if you have any other questions.
