
Data Pipeline Optimize Activity

For low-code users, it would be awesome if there were a Data Pipeline activity that could run OPTIMIZE on a table in a Lakehouse (or on all tables in a Lakehouse).

 

For example, when using Copy Activity or Dataflow Gen2 to append data to a Lakehouse table, the table needs to be optimized (compacted) after every x runs. But there is no automated, low-code way to do this, so users forget to optimize the tables in the destination Lakehouse.
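Until such an activity exists, the usual workaround is a Notebook activity in the pipeline that issues the maintenance command via Spark SQL. A minimal sketch of what that notebook would run (pure Python here, just building the statements; the table names are hypothetical examples):

```python
# Sketch: build the OPTIMIZE statements a Fabric notebook would pass to
# spark.sql(...) after the pipeline's Copy Activity or Dataflow Gen2 run.
# Table names are hypothetical examples, not from the original post.

def optimize_statements(tables):
    """Return one OPTIMIZE statement per Lakehouse table."""
    return [f"OPTIMIZE {table}" for table in tables]

for stmt in optimize_statements(["sales.orders", "sales.customers"]):
    print(stmt)
    # In a Fabric notebook: spark.sql(stmt)
```

This still requires writing and scheduling a notebook, which is exactly the friction a built-in pipeline activity would remove.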

Status: New
Comments
KevinFeit
Regular Visitor
I agree with this. In addition to OPTIMIZE, it should also have the option to VACUUM. A big selling point of Fabric is that it makes things easy compared to other platforms, so this would be a great step in that direction. SQL Server has had maintenance plans going back at least to SQL Server 2005, I think, so giving Fabric a similar capability to schedule regular maintenance should be a no-brainer.
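For reference, VACUUM is the same kind of one-line Spark SQL call as OPTIMIZE, so it would fit naturally in the same activity. A sketch of the statement a maintenance notebook would issue (table name is hypothetical; 168 hours matches Delta Lake's default 7-day retention):

```python
# Sketch: build a VACUUM statement for a Lakehouse table. The table name
# is a hypothetical example; 168 hours (7 days) mirrors Delta Lake's
# default retention period for removed files.

def vacuum_statement(table, retain_hours=168):
    return f"VACUUM {table} RETAIN {retain_hours} HOURS"

print(vacuum_statement("sales.orders"))
# In a Fabric notebook: spark.sql(vacuum_statement("sales.orders"))
```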
frithjof_v
Super User
It could also be beneficial if this mechanism had an auto-compaction option, so that we could set a threshold and only execute compaction when the number of small files exceeds x, where x is a threshold value we can adjust. This would be similar to the minNumFiles setting in Delta Lake's auto compaction: https://docs.delta.io/3.1.0/optimizations-oss.html#auto-compaction
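The threshold idea above can be sketched as a simple guard: count the table's small files and only trigger OPTIMIZE once that count passes the configured minimum. A pure-Python sketch (the file sizes would come from listing the table's Parquet files; the function name and default thresholds are illustrative, not an actual Fabric or Delta API):

```python
# Sketch of a minNumFiles-style guard, in the spirit of Delta Lake
# auto compaction: only compact once enough small files accumulate.
# Thresholds and names here are illustrative assumptions.

def should_compact(file_sizes_bytes, min_num_files=50,
                   small_file_max_bytes=128 * 1024 * 1024):
    """Return True when the number of small files exceeds the threshold."""
    small = sum(1 for size in file_sizes_bytes
                if size < small_file_max_bytes)
    return small > min_num_files

# 60 files of 1 MB each: above the threshold of 50, so compact.
print(should_compact([1_000_000] * 60))  # True
# Only 10 small files: below the threshold, skip the OPTIMIZE run.
print(should_compact([1_000_000] * 10))  # False
```

A guard like this would let the proposed activity run on every pipeline execution while only paying the compaction cost when it is actually needed.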