When uploading CSV files to Microsoft Fabric's Open Mirroring landing zone with a correctly configured _metadata.json (specifying types such as datetime2 and decimal(18,2)), why are columns consistently created as int or varchar in the mirrored database, even when the source CSV data strictly conforms to the declared types?
Are there specific, unstated requirements or known limitations for type inference and conversion from delimited text files in Fabric's Open Mirroring that go beyond the _metadata.json specification, or are there additional properties we should be using within _metadata.json to force these non-string, non-integer data types?
Hi @fabricpoweruser,
You are seeing varchar columns in the mirrored table, even though you specified datetime2 and decimal(18,2) in _metadata.json, because Open Mirroring for CSV uses basic type inference and does not enforce complex SQL types. This is known, by-design behaviour, as acknowledged in Microsoft's documentation: open-mirroring-landing-zone-format
To avoid this, when working with Microsoft Fabric's Open Mirroring and you require exact schema mapping, particularly for precision data types like decimal(18,2) or datetime2, the most reliable approach is to use the Parquet or Delta file formats. These formats support embedded schema definitions, allowing the Fabric ingestion engine to preserve the declared types accurately without depending on type inference. A sketch of producing such a file follows.
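If you generate the landing-zone files yourself, a library such as pyarrow is one common way to write Parquet with the types declared up front. This is a minimal sketch under that assumption; the file, table, and column names are illustrative, not from this thread:

```python
from datetime import datetime
from decimal import Decimal

import pyarrow as pa
import pyarrow.parquet as pq

# Declare the schema explicitly so the precision types travel with the
# file instead of being inferred at ingestion time.
schema = pa.schema([
    ("order_id", pa.int64()),
    ("order_date", pa.timestamp("us")),    # intended to land as datetime2
    ("amount", pa.decimal128(18, 2)),      # intended to land as decimal(18,2)
])

# Hypothetical sample rows; replace with your actual data.
table = pa.table(
    {
        "order_id": [1, 2],
        "order_date": [datetime(2025, 6, 1, 12, 0), datetime(2025, 6, 2, 9, 30)],
        "amount": [Decimal("19.99"), Decimal("250.00")],
    },
    schema=schema,
)

pq.write_table(table, "orders.parquet")
```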
When working with CSV files, you can still define your schema using the _metadata.json file, but it’s important to understand that these definitions are interpreted as best-effort suggestions rather than strict enforcement. Due to this, you should be prepared to post-process the ingested tables by casting the columns to their correct data types using SQL, or by creating views that apply the appropriate type conversions after the data is loaded.
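For example, a view over the ingested table can apply the intended types. This is a minimal sketch assuming a mirrored table dbo.orders whose columns landed as varchar; the table and column names are hypothetical:

```sql
-- Hypothetical table/column names; adjust to your mirrored schema.
CREATE VIEW dbo.orders_typed AS
SELECT
    order_id,
    TRY_CAST(order_date AS datetime2)      AS order_date,
    TRY_CAST(amount     AS decimal(18, 2)) AS amount
FROM dbo.orders;
```

TRY_CAST is used here instead of CAST so that any value that fails conversion surfaces as NULL rather than breaking queries against the view.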
To align more closely with how Fabric handles delimited text, it’s advisable to use only the core set of supported data types in your schema definitions. For example, you should use Double instead of decimal(18,2) and DateTime instead of datetime2. These types are more consistently interpreted during ingestion, reducing the likelihood of columns defaulting to generic types like varchar.
We understand this might not meet every use case. If schema strictness for CSVs is critical to your workflow, we encourage submitting feedback through the Microsoft Fabric Ideas portal so it can be prioritized by the product team.
Thanks,
Prashanth
MS Fabric community support