Large string and binary values in Fabric Data Ware...

jovanpop-msft · ‎11-24-2025

Co-authored with: Ancy Philip

Fabric Data Warehouse and SQL analytics endpoints for mirrored items now support large string and binary data using VARCHAR(MAX) and VARBINARY(MAX) types.

Large string or binary values in Data Warehouse

In Fabric Data Warehouse, this enhancement allows you to ingest, store, process, and analyze large descriptive text, logs, JSON, or spatial data—up to 16 MB per cell—without hitting size limits in the common warehouse scenarios.

To define columns for large data types:

Use VARCHAR(MAX) for large strings (e.g., descriptive text, comments, JSON data, log entries, spatial data in Well-Known Text format).
Use VARBINARY(MAX) for large binary objects (e.g., spatial data in Well-Known Binary format).

Large_string_and_binary_values_in_Fabric_Data_Warehouse_and_SQL_analytics_endpoi

You can ingest large text or binary columns using the COPY INTO command or OPENROWSET() function. The maximum size for large string or binary columns depends on the file format:

Parquet – up to 16 MB per cell, 1 GB total row size.
CSV – up to 1 MB per cell, 16 MB total row size.

Once ingested, you can apply any string operations on your data—such as comparisons with LIKE or extracting information using JSON functions—just as you would on VARCHAR(N) or VARBINARY(N) columns.

Performance is influenced by the amount of data processed, not by the declared column type.

Large string or binary values in SQL analytics endpoint for mirrored items

The SQL analytics endpoint for mirrored items (such as Azure SQL Database or Cosmos DB) and Fabric databases (such as Fabric SQL Database and Cosmos DB) now ensures that large values from source systems are read without the previous 8 KB truncation, enabling scenarios like storing JSON docs, logs and long descriptive data. This is especially critical for Cosmos DB, as it prevents JSON corruption caused by truncation.

The mapping for the source Delta types to the SQL column types in SQL analytics endpoint tables is as follows:

Large columns (String or VARCHAR(N) where N ≥ 2000) → represented as VARCHAR(MAX).
Smaller columns (VARCHAR(N) where N < 2000) → represented as VARCHAR(4*N).

Refer to the Autogenerated data types in the SQL analytics endpoint documentation for mapping details.

The maximum size of string or binary values in SQL analytics endpoints depends on the mirrored item type:

Mirrored SQL server, Azure SQL Database, Azure SQL Managed instance and SQL database in Fabric → up to 1 MB.
Mirrored Azure Cosmos DB and Cosmos DB in Fabric → up to 2 MB.

Refer to the documentation for the limits per mirrored item type.

For newly created tables, columns are automatically mapped to VARCHAR(MAX) and VARBINARY(MAX) in the SQL analytics endpoints. For existing tables with columns storing large objects, you can recreate the table to adopt the new data type, or it will be automatically upgraded to VARCHAR(MAX) / VARBINARY(MAX) during the next schema change.

Refer to the Limitations of Fabric Data Warehouse documentation to learn more.

Conclusion

With support for VARCHAR(MAX) and VARBINARY(MAX), Fabric Data Warehouse and SQL analytics endpoints for mirrored items remove previous limitations on large text and binary data. This enhancement ensures accurate ingestion, storage, and querying of complex data types—such as JSON, logs, and spatial objects—without truncation or corruption.

Whether you’re working with mirrored sources like Azure SQL Database and Cosmos DB, or creating new tables in Fabric Data Warehouse, you now have the flexibility to handle large objects efficiently and reliably.

Join us in Barcelona for FabCon and SQLCon, September 28 - October 1, 2026.