Why? If I have Chinese characters, I need NVARCHAR; without it, VARCHAR is not well suited to that data.
What are the workarounds?
Hi @tan_thiamhuat , Thank you for reaching out to the Microsoft Community Forum.
You need Unicode to store Chinese characters. If your engine supports UTF-8 collations, the most space-efficient option is VARCHAR with a _UTF8 collation. This lets you store full Unicode while keeping ASCII at 1 byte per character and Chinese at about 3 bytes. If UTF-8 collations aren't supported in your environment, use NVARCHAR for any column that might store Chinese: it is fully Unicode-compliant, based on UTF-16 encoding, and safely handles all major language scripts. To support emoji or rare characters outside the Basic Multilingual Plane, such as extended CJK ideographs, use an _SC (supplementary characters) collation, which enables proper storage, sorting and comparison of those symbols.
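A minimal sketch of both options side by side. The table and column names here are made up for illustration; the _UTF8 variant assumes an engine where UTF-8 collations exist (SQL Server 2019+ or Azure SQL):

```sql
-- Illustrative only: dbo.Products and its columns are hypothetical names.
CREATE TABLE dbo.Products
(
    ProductId  INT IDENTITY PRIMARY KEY,
    -- UTF-8 option: ASCII stays 1 byte/char, Chinese ~3 bytes/char.
    -- Note VARCHAR(n) counts BYTES, so size multi-byte text accordingly.
    NameUtf8   VARCHAR(300) COLLATE Latin1_General_100_CI_AS_SC_UTF8,
    -- UTF-16 fallback: 2 bytes per BMP character, 4 for supplementary.
    NameUtf16  NVARCHAR(100)
);

INSERT INTO dbo.Products (NameUtf8, NameUtf16)
VALUES (N'笔记本电脑', N'笔记本电脑');  -- "laptop" in both encodings
```

Note the N'...' literal: it keeps the value Unicode on the way in, and the implicit conversion into the _UTF8 VARCHAR column is lossless.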
When using UTF-8, watch index key byte limits, since character size varies, and make sure your client drivers fully support UTF-8 collations. If you stick with NVARCHAR, limit it to columns that truly require it, and consider applying row compression to reduce storage overhead. Avoid storing text as bytes (VARBINARY) or relying on legacy code pages; both approaches break sorting, searching and proper collation behaviour.
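The byte-versus-character distinction above can be checked directly with LEN (characters) and DATALENGTH (bytes), and row compression is a single REBUILD. A self-contained sketch, using the same five-character string as an example:

```sql
-- LEN counts characters; DATALENGTH counts bytes.
-- For NVARCHAR (UTF-16), each BMP character takes 2 bytes.
SELECT
    LEN(N'笔记本电脑')        AS Chars,       -- 5
    DATALENGTH(N'笔记本电脑') AS Utf16Bytes;  -- 10

-- Sketch: enable row compression on an existing table (hypothetical name)
-- so Unicode compression can shrink NVARCHAR storage for mostly-ASCII rows.
ALTER TABLE dbo.Products
REBUILD WITH (DATA_COMPRESSION = ROW);
```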
In Microsoft Fabric, Lakehouse (Spark) strings are UTF-8 by default, so mixed-language text works out of the box and there is no need to choose between VARCHAR and NVARCHAR; just make sure your ingested files are actually UTF-8 encoded. In Fabric Warehouse, treat it like Azure SQL: use a _UTF8 collation at the database or column level if it's available in your tenant; otherwise, fall back to NVARCHAR. In Power BI, Import and Direct Lake models use Unicode and dictionary encoding, so the source type doesn't materially affect model size. However, in DirectQuery, sorting and filtering follow the source collation, so it's critical to configure the correct Unicode or UTF-8 collation at the source.
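To find out which case you are in, you can inspect the collation actually in effect. A quick sketch (the dbo.Products table name is a hypothetical placeholder; substitute one of your own columns):

```sql
-- Database default collation: look for a _UTF8 suffix to confirm
-- UTF-8 collations are in effect at the database level.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation;

-- Per-column collations, which can override the database default.
SELECT name, collation_name
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.Products');
```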
nchar and nvarchar (Transact-SQL) - SQL Server | Microsoft Learn
char and varchar (Transact-SQL) - SQL Server | Microsoft Learn
COLLATE (Transact-SQL) - SQL Server | Microsoft Learn
Data compression - SQL Server | Microsoft Learn
Data types (Transact-SQL) - SQL Server | Microsoft Learn
A note on platform differences: NVARCHAR has long been supported in on-premises SQL Server on Windows, but with its newer data platforms Microsoft has moved toward Linux-based systems. In Fabric Warehouse, NVARCHAR is not supported; the underlying Delta Lake format instead uses the STRING data type, which represents character string values. Another option is to move this workload to Notebooks, where Spark strings handle Unicode natively.