When will the following DDL be supported in a Fabric Spark SQL notebook?
ALTER TABLE TABLE_NAME
SET TBLPROPERTIES (
delta.targetFileSize = '128MB',
delta.tuneFileSizesForRewrites = true
);
Granted, you could use PySpark like so:
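# Session-level Delta tuning; note these are Databricks-prefixed config
# names, which the Fabric runtime may not honor: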
spark.conf.set("spark.databricks.delta.optimize.maxFileSize", "10GB")
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "5GB")
Hi @Element115,
Thank you for using the Microsoft Fabric Community Forum.
Thank you @nilendraFabric for the insightful points, completely agree with your points.
As mentioned, Microsoft Fabric currently does not support DDL operations like ALTER TABLE ... SET TBLPROPERTIES in Spark SQL notebooks, and there is no official documentation or public roadmap yet indicating when this might be enabled.
The recommended workaround is to use spark.conf.set(...) within PySpark, which allows session-level configuration for Delta Lake optimizations.
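For example, Delta Lake's session-defaults mechanism can stamp properties onto newly created tables (a minimal sketch; whether the Fabric runtime honors delta.targetFileSize and delta.tuneFileSizesForRewrites is an assumption to verify):
# Sketch: Delta's properties.defaults prefix applies these table properties
# to tables created later in this session (it does not alter existing tables).
# Whether Fabric's Delta version honors these particular properties is an
# assumption to verify on your runtime.
spark.conf.set("spark.databricks.delta.properties.defaults.targetFileSize", "128mb")
spark.conf.set("spark.databricks.delta.properties.defaults.tuneFileSizesForRewrites", "true")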
Fabric already applies several optimizations by default, including:
Optimized Write – enabled automatically to manage small file sizes during writes
V-Order – applied behind the scenes to improve data compression and query performance
So while this DDL is not yet supported directly, Fabric does apply intelligent defaults for performance tuning, reducing the need for manual configuration in many cases.
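As a quick sanity check, you can read these settings from a notebook cell (a sketch; the configuration names below come from the Fabric Spark documentation, so confirm them against your runtime version):
# Read the session values of Fabric's write optimizations; the config names
# are taken from Fabric docs and should be confirmed on your runtime.
print(spark.conf.get("spark.sql.parquet.vorder.enabled", "<not set>"))            # V-Order
print(spark.conf.get("spark.microsoft.delta.optimizeWrite.enabled", "<not set>")) # Optimized Write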
If this solution worked for you, kindly mark it as the Accepted Solution and feel free to give Kudos; it would be much appreciated!
Regards,
Sahasra
Community Support Team.
After plunging more deeply into the documentation, and with the help of AI, I adjusted my max and bin sizes down, close to the upper bound of the source file sizes, and relaunched the process at EOD, once a day. Now it works. There was also a UI glitch: the page showing the folder contents would not auto-refresh, so it kept displaying the previous contents without the snappy files. Force refresh, et voilà! All the snappy files are there.
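Roughly, the adjustment was along these lines (an illustrative sketch; 256mb is a placeholder standing in for the actual values, which were chosen from the size of the source files):
# Illustrative sketch: pull the max/bin sizes down near the source-file size
# instead of the 10GB/5GB used earlier; 256mb is a placeholder, not the
# exact value used.
spark.conf.set("spark.databricks.delta.optimize.maxFileSize", "256mb")
spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "256mb")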
Hi @Element115,
Thank you for your detailed follow-up and for sharing your findings. We're glad to hear the issue was resolved and that the optimization steps worked as expected.
Please go ahead and mark this as the accepted solution if it addressed your query; it helps other users who are searching for the same information.
Thank you.
Ok, but then, if this default behavior you mention is active, why do I see no snappy files show up in the folders, and why is the total number of small files not reduced, if I do not run the following commands?
And lately, why is the number of small files also not reduced after I run the following commands from a notebook? (The sequence below is the same cell sequence as in the notebook.)
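# Cell 1: session-scoped Delta tuning (Databricks-prefixed names, as discussed above)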
spark.conf.set("spark.databricks.delta.optimize.maxFileSize", "10GB")
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "5GB")
%%sql
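-- Cell 2: compact each table and cluster its data by the listed columns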
OPTIMIZE RAW_SUMMARY ZORDER BY ([YEAR], [MONTH]);
OPTIMIZE RAW_TRANSACTION ZORDER BY ([YEAR], [MONTH], Direction, VehicleClass, Lane);
OPTIMIZE FACT_SUMMARY ZORDER BY ([YEAR], [MONTH], [DATE]);
OPTIMIZE FACT_TRANSACTION ZORDER BY ([YEAR], [MONTH], [DATE], [HOUR]);
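# Cell 3: disable the retention safety check so RETAIN 0 HOURS is accepted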
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")
%%sql
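-- Cell 4: delete data files no longer referenced by each table's Delta log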
VACUUM RAW_SUMMARY RETAIN 0 HOURS;
VACUUM RAW_TRANSACTION RETAIN 0 HOURS;
VACUUM FACT_SUMMARY RETAIN 0 HOURS;
VACUUM FACT_TRANSACTION RETAIN 0 HOURS;
Hi @Element115,
Currently, Fabric Spark notebooks do not fully support Delta Lake maintenance operations like OPTIMIZE and ZORDER, or manual Spark config settings for file compaction. These settings are either ignored or have limited impact in Fabric.
Fabric handles file optimization automatically using Optimized Write and V-Order, but these are triggered only under specific conditions, and manual control is not available yet. That’s why small files may still appear, and .snappy.parquet files may not be generated.
We suggest submitting this request on the official Microsoft Fabric Ideas forum to help prioritize support for Delta DDL/maintenance commands and manual optimization controls:
Fabric Ideas - Microsoft Fabric Community
If my response was helpful, consider clicking "Accept as Solution" and give us "Kudos" so that other community members can find it easily. Let me know if you need any more assistance!
Regards,
Sahasra.
@v-sgandrathi That is strange, because after the whole notebook ran again yesterday, I got the snappy files in all folders, and the folders were cleared of all the small files, except the current folder for May, since the I/O process runs against that folder every hour. So the OPTIMIZE and VACUUM commands do seem to work after all.
Or am I missing something here?
Hi @Element115,
There is no official documentation or roadmap around when this will be supported.
As you have mentioned, spark.conf.set is the only workaround as of now.
But if we look deeper, Fabric uses its own configurations for similar optimizations:
• Optimized Write: Enabled by default to consolidate files during writes.
• V-Order: Automatically applied for compression and read performance.
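For instance, V-Order can also be toggled for an individual write (a minimal sketch; the parquet.vorder.enabled writer option is the one shown in Fabric's documentation, so verify it on your runtime, and df stands for any existing Spark DataFrame):
# Sketch: request V-Order on one specific Delta write; option name per
# Fabric docs, verify on your runtime. df is any existing Spark DataFrame.
df.write.format("delta") \
    .option("parquet.vorder.enabled", "true") \
    .mode("append") \
    .saveAsTable("RAW_SUMMARY")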