Hi community,
I am migrating data from a SQL Server database to a Lakehouse in Microsoft Fabric. In my source, I have a table with a column defined as varchar(20). However, after loading the data into the Lakehouse and querying it through the SQL Analytics Endpoint, this column appears as varchar(8000).
This also occurs in other tables, and only with columns of type varchar; other data types are mapped as expected. Is this the expected behavior? Is there any way to preserve the original column length or define a maximum size for varchar columns in the Lakehouse?
I would appreciate any guidance or experiences you can share on this topic.
Thank you!
Hi @erickf2303, thank you for reaching out to the Microsoft Community Forum.
Yes, this is expected behaviour in Microsoft Fabric. When you migrate data from SQL Server to a Lakehouse, columns defined as varchar(n) are exposed as varchar(8000) through the SQL Analytics Endpoint. This happens because Fabric stores data in Delta Lake format using Parquet files, which do not support enforcing maximum string lengths. To ensure compatibility across different workloads, the SQL endpoint abstracts string fields as varchar(8000) regardless of their original definition.
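You can confirm the mapping yourself by querying the endpoint's metadata. The sketch below assumes a Lakehouse table named dbo.Customer (a placeholder name) and that the standard INFORMATION_SCHEMA.COLUMNS view is available through the SQL Analytics Endpoint, as it is on most T-SQL surfaces:

-- Check how string columns are surfaced by the SQL Analytics Endpoint.
-- dbo.Customer is an illustrative table name; substitute your own.
SELECT
    COLUMN_NAME,
    DATA_TYPE,
    CHARACTER_MAXIMUM_LENGTH   -- expected to show 8000 for varchar columns
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'Customer'
ORDER BY ORDINAL_POSITION;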
There is currently no supported way to retain the original varchar(n) length during ingestion, whether you're using pipelines, notebooks or Dataflows Gen2. Even explicitly defining schemas in PySpark or SQL won't affect how the SQL Analytics Endpoint interprets string fields. This behaviour reflects Fabric’s focus on flexibility in analytical scenarios, rather than enforcing strict relational constraints like SQL Server.
If you require length constraints for validation or downstream use, they must be applied manually during transformation, either in ingestion notebooks, through SQL views on top of the Lakehouse or within Power BI using DAX or Power Query.
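As a minimal sketch of the SQL-view option mentioned above (the view, table, and column names are hypothetical, and the lengths should match your original SQL Server definitions), you can re-apply the intended lengths on top of the Lakehouse table:

-- Re-apply the original varchar lengths for downstream consumers.
-- dbo.vw_Customer, dbo.Customer and the column names are illustrative only.
CREATE VIEW dbo.vw_Customer
AS
SELECT
    CustomerId,
    CAST(CustomerCode AS varchar(20)) AS CustomerCode,  -- original length in SQL Server
    CAST(CustomerName AS varchar(50)) AS CustomerName
FROM dbo.Customer;

Note that CAST silently truncates values longer than the target length rather than raising an error, so this acts as a presentation-layer constraint, not as validation of the underlying data.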
What is a lakehouse? - Microsoft Fabric | Microsoft Learn
What is the SQL analytics endpoint for a lakehouse? - Microsoft Fabric | Microsoft Learn
Yes, this is currently expected behavior in Microsoft Fabric when loading data into a Lakehouse using methods like Dataflow Gen2, pipelines, or notebooks. The SQL Analytics Endpoint often defaults VARCHAR columns to VARCHAR(8000) because the underlying Delta Lake format used in Lakehouse doesn't enforce a strict maximum length on string columns like traditional SQL Server does.
Delta Lake (used under the hood by Fabric Lakehouse) treats string types more flexibly and doesn't store VARCHAR(n) metadata the way SQL Server does.
When exposed through the SQL Analytics Endpoint, Fabric maps all variable-length string types as VARCHAR(8000) for compatibility.
At the moment, there's no direct way to preserve the original VARCHAR(n) size when data is ingested. However:
You can manually cast or create views with VARCHAR(n) if you want to enforce size for downstream consumers.
Alternatively, enforce size validations at the application or query layer if needed.
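For the query-layer validation mentioned in the second point, a simple check like the following (table and column names are assumptions) can flag rows that exceed the intended length before they reach downstream consumers:

-- Flag rows whose value exceeds the original varchar(20) limit.
-- dbo.Customer and CustomerCode are placeholder names.
SELECT CustomerId, CustomerCode, LEN(CustomerCode) AS ActualLength
FROM dbo.Customer
WHERE LEN(CustomerCode) > 20;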
Hopefully, future updates will allow more control over schema enforcement during ingestion or provide better support for string length preservation.
Hope this helps!