Element115
Super User

Databricks delta files Spark SQL DML not supported in Fabric

When will the following DML be supported in a Fabric Spark SQL Notebook?

 

-- Tell Delta to target ~128 MB data files and to tune file sizes
-- automatically when rewriting during compaction:
ALTER TABLE TABLE_NAME
  SET TBLPROPERTIES (
    delta.targetFileSize = '128MB',
    delta.tuneFileSizesForRewrites = true
  );

 

Granted, you could use PySpark like so:

 

spark.conf.set("spark.databricks.delta.optimize.maxFileSize", "10GB")
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "5GB")

 


7 REPLIES
v-sgandrathi
Community Support

Hi @Element115,

Thank you for using the Microsoft Fabric Community Forum.

 

Thank you @nilendraFabric for the insightful points; I completely agree.
As mentioned, Microsoft Fabric currently does not support DML operations like ALTER TABLE ... SET TBLPROPERTIES in Spark SQL Notebooks, and there is no official documentation or public roadmap yet indicating when this might be enabled.

The recommended workaround is to use spark.conf.set(...) within PySpark, which allows session-level configuration for Delta Lake optimizations.
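
For new tables there may also be a session-level alternative; a minimal sketch, assuming the Delta Lake default-table-properties mechanism (spark.databricks.delta.properties.defaults.<property>) is honored by the Fabric runtime:

# Assumption: the spark.databricks.delta.properties.defaults.* prefix sets
# default table properties for tables CREATED in this session, which would
# sidestep the unsupported ALTER TABLE ... SET TBLPROPERTIES.
spark.conf.set("spark.databricks.delta.properties.defaults.targetFileSize", "128mb")
spark.conf.set("spark.databricks.delta.properties.defaults.tuneFileSizesForRewrites", "true")

Note this only affects tables created after the configs are set; it does not retrofit properties onto existing tables.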

 

Fabric already applies several optimizations by default, including:

Optimized Write – enabled automatically to manage small file sizes during writes

V-Order – applied behind the scenes to improve data compression and query performance

So while direct DML is not yet supported, Fabric does apply intelligent defaults for performance tuning, reducing the need for manual configurations in many cases.
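
To see what a given session is actually applying, the relevant configs can be read back; a sketch, assuming Fabric exposes Optimized Write under spark.microsoft.delta.optimizeWrite.enabled and V-Order under spark.sql.parquet.vorder.enabled (verify both key names against the current Fabric docs):

# Read-only check; the second argument to conf.get is the fallback shown
# when the key is not set in this session.
print(spark.conf.get("spark.microsoft.delta.optimizeWrite.enabled", "<not set>"))
print(spark.conf.get("spark.sql.parquet.vorder.enabled", "<not set>"))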

 

If this solution worked for you, kindly mark it as the Accepted Solution and feel free to give Kudos; it would be much appreciated!

 

Regards,
Sahasra
Community Support Team.

Upon plunging more deeply into the documentation, and with the help of AI, I adjusted my max and bin sizes down to just above the upper bound of the source file sizes, and relaunched the process at EOD, once a day. Now it works. There was also a UI glitch whereby the page showing the folder contents would not auto-refresh, so it kept displaying the previous contents without the snappy files. Force refresh, et voilà! All the snappy files are there.
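
For illustration only, since the post does not give the actual numbers: a hypothetical re-tuning along those lines, assuming source files that top out around 100 MB, so the bins are small enough that OPTIMIZE actually rewrites the small files instead of waiting to fill 5-10 GB bins:

# Hypothetical values -- not the poster's actual settings
spark.conf.set("spark.databricks.delta.optimize.maxFileSize", "256mb")
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "128mb")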

Hi @Element115,

 

Thank you for your detailed follow-up and for sharing your findings. We're glad to hear the issue was resolved and that the optimization steps worked as expected.
Please go ahead and mark this as the accepted solution if it addressed your query.

It helps other users who are searching for the same information.

 

Thank you.

Ok, but then, if this default behavior you mention is active, why do I see no snappy files show up in the folders, and why is the total number of small files not reduced unless I run the following commands?

 

And lately, the number of small files is not reduced even after I run the following commands from a Notebook (the sequence below matches the cell order in the Notebook):

 

 

spark.conf.set("spark.databricks.delta.optimize.maxFileSize", "10GB")
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "5GB")

 

%%sql

-- Cell 2: compact each table and co-locate rows on the Z-ORDER columns
OPTIMIZE RAW_SUMMARY ZORDER BY ([YEAR], [MONTH]);
OPTIMIZE RAW_TRANSACTION ZORDER BY ([YEAR], [MONTH], Direction, VehicleClass, Lane);

OPTIMIZE FACT_SUMMARY ZORDER BY ([YEAR], [MONTH], [DATE]);
OPTIMIZE FACT_TRANSACTION ZORDER BY ([YEAR], [MONTH], [DATE], [HOUR]);

 

spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")

 

%%sql

-- Cell 4: delete all unreferenced files immediately; note that RETAIN 0 HOURS
-- removes time travel to earlier versions and can break concurrent readers
VACUUM RAW_SUMMARY RETAIN 0 HOURS;
VACUUM RAW_TRANSACTION RETAIN 0 HOURS;

VACUUM FACT_SUMMARY RETAIN 0 HOURS;
VACUUM FACT_TRANSACTION RETAIN 0 HOURS;

 

Hi @Element115,

 

Currently, Fabric Spark Notebooks do not fully support Delta Lake operations like OPTIMIZE and ZORDER, or manual Spark config settings for file compaction. These settings are either ignored or have limited impact in Fabric.

Fabric handles file optimization automatically using Optimized Write and V-Order, but these are triggered only under specific conditions, and manual control is not available yet. That’s why small files may still appear, and .snappy.parquet files may not be generated.
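
Independent of what the folder view shows, one way to check whether compaction took effect is to ask the Delta log directly; a sketch using standard Delta SQL, with RAW_SUMMARY taken from the thread:

# DESCRIBE DETAIL returns one row per table, including numFiles and
# sizeInBytes; numFiles should drop after a successful OPTIMIZE while
# sizeInBytes stays roughly constant.
spark.sql("DESCRIBE DETAIL RAW_SUMMARY").select("numFiles", "sizeInBytes").show()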

 

We suggest submitting this request on the official Microsoft Fabric Ideas forum to help prioritize support for Delta DML and manual optimization controls:

Fabric Ideas - Microsoft Fabric Community

 

If my response was helpful, consider clicking "Accept as Solution" and giving us "Kudos" so that other community members can find it easily. Let me know if you need any more assistance!

 

Regards,

Sahasra.

 

@v-sgandrathi That is strange, because after the whole notebook ran again yesterday, I got the snappy files in all folders, and the folders were cleared of all the small files, except the current folder for May, since the I/O process runs against that folder every hour. So the OPTIMIZE and VACUUM commands do seem to work after all.

 

Or am I missing something here?
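
One way to settle it is the table history; a sketch using standard Delta SQL (DESCRIBE HISTORY lists each operation with its timestamp and metrics), again on the thread's RAW_SUMMARY table:

# For OPTIMIZE entries, operationMetrics includes numRemovedFiles and
# numAddedFiles, which shows directly whether small files were compacted.
spark.sql("DESCRIBE HISTORY RAW_SUMMARY") \
    .select("version", "timestamp", "operation", "operationMetrics") \
    .show(truncate=False)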

nilendraFabric
Community Champion

Hi @Element115 

 

There is no official documentation or roadmap around when it will be supported.

As you have mentioned, spark.conf.set is the only workaround as of now.

But if we look deeper, Fabric uses its own configurations for similar optimizations:

• Optimized Write: Enabled by default to consolidate files during writes.
• V-Order: Automatically applied for compression and read performance.
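
For completeness, a sketch of toggling those two Fabric-native settings at session level; the key names (spark.microsoft.delta.optimizeWrite.enabled for Optimized Write, spark.sql.parquet.vorder.enabled for V-Order) are assumptions based on Fabric documentation and should be confirmed before relying on them:

# Session-level toggles; key names are assumptions, as noted above.
spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")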
