
Data Pipeline Optimize Activity

For low-code users, it would be awesome if there were a Data Pipeline activity that could run OPTIMIZE on a table in a Lakehouse (or on all tables in a Lakehouse).

 

For example, when using Copy Activity or Dataflow Gen2 to append data to a Lakehouse table, the table needs to be optimized (compacted) after every x runs. However, there is currently no automated, low-code way to do this, so users forget to optimize the tables in the destination Lakehouse.
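Today the workaround is to schedule a Spark notebook after the pipeline run. A minimal sketch of that notebook cell (the table name `sales` is hypothetical, and `spark` is the session Fabric provides in a notebook):

```python
# Sketch of today's manual workaround: run OPTIMIZE from a Fabric notebook
# after the Copy Activity / Dataflow Gen2 has appended data.

def optimize_statement(table: str) -> str:
    """Build the Spark SQL OPTIMIZE statement for a Lakehouse table."""
    return f"OPTIMIZE {table}"

# In a Fabric notebook (not runnable outside Spark):
# spark.sql(optimize_statement("sales"))
```

A built-in pipeline activity would replace exactly this glue code.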

Status: New
Comments
KevinFeit
Regular Visitor
I agree with this. In addition to OPTIMIZE, it should also offer the option to VACUUM. A big selling point of Fabric is that it makes things easy compared to other platforms, so this would be a great step in that direction. SQL Server has had maintenance plans going back to at least SQL Server 2005, I believe, so giving Fabric a similar capability to schedule regular maintenance should be a no-brainer.
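What such a maintenance plan would do can be sketched as a loop over the Lakehouse tables, issuing OPTIMIZE and then VACUUM for each. This is an assumption about how an activity might be implemented, not an existing Fabric feature; `spark.catalog.listTables()` is the standard PySpark way to enumerate tables:

```python
# Sketch of a scheduled maintenance routine (what a built-in activity could do):
# OPTIMIZE then VACUUM every table. `spark` is the Fabric notebook session.

def maintenance_statements(tables):
    """Yield the OPTIMIZE and VACUUM statements for each table name."""
    for table in tables:
        yield f"OPTIMIZE {table}"
        yield f"VACUUM {table}"  # uses Delta Lake's default 7-day retention

# In a Fabric notebook (not runnable outside Spark):
# names = [t.name for t in spark.catalog.listTables()]
# for stmt in maintenance_statements(names):
#     spark.sql(stmt)
```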
frithjof_v
Community Champion
It could also be beneficial if this mechanism had auto-compaction features, so that we could set a threshold and only execute the compaction when the number of small files exceeds x, where x is an adjustable threshold value, similar to the minNumFiles setting in Delta Lake's Auto Compaction: https://docs.delta.io/3.1.0/optimizations-oss.html#auto-compaction
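The threshold check itself is simple to express. A minimal sketch, assuming the table's files sit in one folder and using a 128 MB cutoff to define a "small" file (both the cutoff and the flat folder layout are assumptions for illustration):

```python
from pathlib import Path

# Sketch of the requested minNumFiles-style check: only trigger compaction
# when the count of small Parquet files exceeds an adjustable threshold.
SMALL_FILE_BYTES = 128 * 1024 * 1024  # assumed cutoff for a "small" file

def should_compact(table_dir: str, min_num_files: int) -> bool:
    """Return True when table_dir holds more than min_num_files small files."""
    small = [p for p in Path(table_dir).glob("*.parquet")
             if p.stat().st_size < SMALL_FILE_BYTES]
    return len(small) > min_num_files
```

An activity with this knob would skip the (potentially expensive) OPTIMIZE run when the table is already reasonably compacted.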