Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Learn more

Reply
tan_thiamhuat
Post Patron
Post Patron

NVARCHAR not supported in Fabric Warehouse?

why? if I have Chinese characters, I need that, else VARCHAR is very not optimized.

What are the workarounds?

1 ACCEPTED SOLUTION
v-hashadapu
Community Support
Community Support

Hi @tan_thiamhuat , Thank you for reaching out to the Microsoft Community Forum.

 

You need Unicode to store Chinese characters. If your engine supports UTF-8 collations, the most space-efficient option is to use VARCHAR with a _UTF8 collation. This lets you store full Unicode while keeping ASCII at 1 byte per character and Chinese at about 3 bytes. If UTF-8 collations aren’t supported in your environment, use NVARCHAR for any column that might store Chinese. It’s fully Unicode-compliant, based on UTF-16 encoding and safely handles all major language scripts. To support emoji or rare characters outside the basic Unicode range such as extended CJK ideographs, use an _SC collation, which enables proper storage, sorting and comparison of those symbols.

 

When using UTF-8, watch index byte limits since character size varies and ensure your client drivers fully support UTF-8 collations. If you stick with NVARCHAR, limit it to columns that truly require it, consider applying row compression to reduce storage overhead and avoid storing text as bytes (VARBINARY) or relying on legacy code pages, as these approaches break sorting, searching and proper collation behaviour.

 

In Microsoft Fabric, Lakehouse (Spark) strings are UTF-8 by default, so mixed-language text works out of the box, no need to choose between VARCHAR and NVARCHAR. Just make sure your ingested files are properly UTF-8 encoded. In Fabric Warehouse, treat it like Azure SQL, use a _UTF8 collation at the database or column level if it’s available in your tenant, otherwise, fall back to NVARCHAR. In Power BI, Import and Direct Lake models use Unicode and dictionary encoding, so the source type doesn’t materially affect model size. However, in DirectQuery, sorting and filtering follow the source collation, so it's critical to configure the correct Unicode or UTF-8 collation at the source.

 

nchar and nvarchar (Transact-SQL) - SQL Server | Microsoft Learn

char and varchar (Transact-SQL) - SQL Server | Microsoft Learn

COLLATE (Transact-SQL) - SQL Server | Microsoft Learn

Data compression - SQL Server | Microsoft Learn

Data types (Transact-SQL) - SQL Server | Microsoft Learn

 

View solution in original post

2 REPLIES 2
v-hashadapu
Community Support
Community Support

Hi @tan_thiamhuat , Thank you for reaching out to the Microsoft Community Forum.

 

You need Unicode to store Chinese characters. If your engine supports UTF-8 collations, the most space-efficient option is to use VARCHAR with a _UTF8 collation. This lets you store full Unicode while keeping ASCII at 1 byte per character and Chinese at about 3 bytes. If UTF-8 collations aren’t supported in your environment, use NVARCHAR for any column that might store Chinese. It’s fully Unicode-compliant, based on UTF-16 encoding and safely handles all major language scripts. To support emoji or rare characters outside the basic Unicode range such as extended CJK ideographs, use an _SC collation, which enables proper storage, sorting and comparison of those symbols.

 

When using UTF-8, watch index byte limits since character size varies and ensure your client drivers fully support UTF-8 collations. If you stick with NVARCHAR, limit it to columns that truly require it, consider applying row compression to reduce storage overhead and avoid storing text as bytes (VARBINARY) or relying on legacy code pages, as these approaches break sorting, searching and proper collation behaviour.

 

In Microsoft Fabric, Lakehouse (Spark) strings are UTF-8 by default, so mixed-language text works out of the box, no need to choose between VARCHAR and NVARCHAR. Just make sure your ingested files are properly UTF-8 encoded. In Fabric Warehouse, treat it like Azure SQL, use a _UTF8 collation at the database or column level if it’s available in your tenant, otherwise, fall back to NVARCHAR. In Power BI, Import and Direct Lake models use Unicode and dictionary encoding, so the source type doesn’t materially affect model size. However, in DirectQuery, sorting and filtering follow the source collation, so it's critical to configure the correct Unicode or UTF-8 collation at the source.

 

nchar and nvarchar (Transact-SQL) - SQL Server | Microsoft Learn

char and varchar (Transact-SQL) - SQL Server | Microsoft Learn

COLLATE (Transact-SQL) - SQL Server | Microsoft Learn

Data compression - SQL Server | Microsoft Learn

Data types (Transact-SQL) - SQL Server | Microsoft Learn

 

BhaveshPatel
Community Champion
Community Champion

Hi @tan_thiamhuat 

 

We have to understand the difference between Linux Operating System and Windows System. Previously, NVARCHAR supported in SQL Server on premises data. With the new data technology, Microsoft has already moved to Linux Operating Systems and there are some of the NVARCHAR is not supported in Fabric Warehouse, they are instead called STRING data type (Represents character string values) in Delta Lake ( ABC ). Instead you should ask everyone to move to Notebooks. 

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Fabric Update Carousel

Fabric Monthly Update - October 2025

Check out the October 2025 Fabric update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.