timahenning2
Helper II

Data mirror - Exception reading the data file.

Hi,

 

I received this error from the Data Mirror UI.

 

"We encountered an error while parsing data file being used, please fix the file and try again. The error is: Exception reading the data file. Exception: class parquet::ParquetStatusException (message: 'Unknown error: System.OverflowException: Array dimensions exceeded supported range, at ParquetSharp.IO.ManagedRandomAccessFile.ReadBuffer(Int64 nbytes, IntPtr& buffer_handle, String& exception)'), ErrorCode: InternalError, ErrorCode: InputValidationError ArtifactId: XXXX"

 

Here are the columns:

 

[jobOrderID] [int] NULL,
[externalID] [nvarchar](100) NULL,
[bhTimeStamp] [nvarchar](max) NULL,
[dateAdded] [datetime2](3) NULL,
[dateLastModified] [datetime2](3) NULL,
[isDeleted] [bit] NULL,
[action] [nvarchar](30) NULL,
[migrateGUID] [uniqueidentifier] NULL,
[truestDateAdded] [datetime2](3) NULL,
[noteID] [int] NOT NULL,
[personReferenceID] [int] NULL,
[comments] [nvarchar](max) NULL,
[minutesSpent] [int] NULL,
[primaryDepartmentName] [nvarchar](max) NULL,
[commentingPersonID] [int] NULL,
[linkedInID] [nvarchar](200) NULL,
[dateLastSync] [datetime2](3) NULL
 
The largest value in the 'comments' column is 522932 bytes.
 
Other tables with the same data types in the same database replicate without issue.  I dumped the data out to a CSV file and loaded it into a lakehouse without errors.

 

Where can I find the data mirror error log details? I would like to know the row and column that failed during import.

Why did the data mirror replication fail on this table?

 

Tim



1 ACCEPTED SOLUTION
v-karpurapud
Community Support

Hi @timahenning2 
Thank you for reaching out to the Microsoft Fabric community forum.
 

The replication failure is due to very large values in the NVARCHAR(MAX) columns, particularly in the comments column where some rows are over 500 KB. When Data Mirror syncs, Microsoft Fabric converts each SQL table into Delta Parquet files in OneLake. Parquet is a columnar storage format, and extremely large string values can require significant memory during encoding. If a string exceeds the processing limits during Parquet serialization, the process can fail with an error like: System.OverflowException: Array dimensions exceeded supported range.
This error is not due to a SQL schema issue or data corruption, but happens when encoding large variable-length text values into Parquet.
 

Currently, Data Mirror does not display row-level diagnostics in the user interface. Fabric monitoring only shows overall replication status and table-level information: whether a table is running, warning, failed, or stopped, plus row counts and last completed time. More details can be found in the Monitor Mirrored Database Replication documentation: [Monitor Mi...soft Learn | Learn.Microsoft.com]
 

Additional logs are available through Mirrored Database Operation Logs (workspace monitoring), which provide execution-level details, processed rows and bytes, operation names, and failure messages, but not errors for individual rows.[Mirrored d...soft Learn | Learn.Microsoft.com]
 

To fix the issue, it is recommended to mirror a view instead of the base table and cast or truncate NVARCHAR(MAX) columns to a fixed size like NVARCHAR(4000). Limiting string length ensures values stay within the processing limits for Parquet encoding, allowing successful Data Mirror replication.
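
A minimal sketch of such a view, using the column names from the posted schema (the view and table names `vw_Note_ForMirroring` and `dbo.Note` are hypothetical; adjust to your environment):

```sql
-- Hypothetical view that caps the NVARCHAR(MAX) columns at 4,000
-- characters so every value stays within Parquet encoding limits.
CREATE VIEW dbo.vw_Note_ForMirroring
AS
SELECT
    noteID,
    jobOrderID,
    externalID,
    CAST(LEFT(bhTimeStamp, 4000) AS NVARCHAR(4000))            AS bhTimeStamp,
    dateAdded,
    dateLastModified,
    isDeleted,
    [action],
    migrateGUID,
    truestDateAdded,
    personReferenceID,
    CAST(LEFT(comments, 4000) AS NVARCHAR(4000))               AS comments,
    minutesSpent,
    CAST(LEFT(primaryDepartmentName, 4000) AS NVARCHAR(4000))  AS primaryDepartmentName,
    commentingPersonID,
    linkedInID,
    dateLastSync
FROM dbo.Note;
```

Note that `LEFT` truncates by character count, so 4,000 NVARCHAR characters occupy at most 8,000 bytes; oversized comments are cut off rather than rejected.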

 

If you have any further questions, feel free to reach out and we'll be glad to assist.
 

Regards,

Microsoft Fabric Community Support Team.


3 REPLIES

Thank you for the response. It is helpful.

 

You mentioned, "To fix the issue, it is recommended to mirror a view instead ...".  I don't see the ability to mirror views in the current Fabric configuration UI.

 


 

What are the steps to mirror a SQL database source view?

 

NVARCHAR(MAX): is the limit 4K because of double-byte characters? VARCHAR(MAX) is 8K. Is a limit increase coming?

 


Working with large data types in Fabric Data Warehouse | Microsoft Fabric Blog | Microsoft Fabric

 

For anyone struggling with figuring out the largest value in your database table column, this query was helpful:

Find the size of the largest value in a column:
```sql
SELECT TOP (10)
    <YOUR COLUMN>,
    DATALENGTH(<YOUR COLUMN>) AS RowSizeBytes
FROM dbo.<YOUR TABLE>
ORDER BY DATALENGTH(<YOUR COLUMN>) DESC;
```



 

Thanks for your help.

Tim

Hi @timahenning2 


Previously, I suggested mirroring a view, which is a common Fabric ingestion practice for pipelines, dataflows, and other methods that allow views as sources. However, this does not apply to SQL Database Mirroring, which currently only supports selecting tables in the configuration UI.

 

For SQL Database Mirroring, a practical solution is to create a staging table in the source database that casts or truncates NVARCHAR(MAX) columns to a fixed size, then mirror this staging table instead of the original. Since mirroring works at the table level, this approach lets you manage large string columns within the supported features of SQL Database Mirroring.
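
One way to sketch that staging step (the table name `dbo.Note_Staging` and the reduced column list are hypothetical; mirroring requires a primary key, so one is included):

```sql
-- Hypothetical staging table with the large text column capped at
-- NVARCHAR(4000), refreshed from the base table on a schedule.
CREATE TABLE dbo.Note_Staging
(
    noteID           INT NOT NULL PRIMARY KEY,
    comments         NVARCHAR(4000) NULL,
    dateLastModified DATETIME2(3) NULL
);

-- Repopulate: truncate oversized values so Parquet encoding succeeds,
-- then mirror dbo.Note_Staging instead of the original table.
TRUNCATE TABLE dbo.Note_Staging;

INSERT INTO dbo.Note_Staging (noteID, comments, dateLastModified)
SELECT
    noteID,
    CAST(LEFT(comments, 4000) AS NVARCHAR(4000)),
    dateLastModified
FROM dbo.Note;
```

The refresh would need to run before or alongside each sync window (for example, from a SQL Agent job), since mirroring will only see what is in the staging table.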

 

Regarding data types, the 4,000-character limit is not a Fabric restriction and does not apply to NVARCHAR(MAX). In SQL Server, NVARCHAR(n) supports up to 4,000 characters and VARCHAR(n) up to 8,000, because NVARCHAR stores two bytes per character while VARCHAR stores one. Both NVARCHAR(MAX) and VARCHAR(MAX) can store values up to 2 GB, so there is no 4K limit on NVARCHAR(MAX) itself.
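
A quick illustration of the double-byte point, runnable in any SQL Server database (the literal `'abc'` is an arbitrary example value):

```sql
-- NVARCHAR stores two bytes per character (UCS-2/UTF-16);
-- VARCHAR stores one byte per character under single-byte collations.
SELECT
    DATALENGTH(N'abc') AS NvarcharBytes,  -- 6
    DATALENGTH('abc')  AS VarcharBytes,   -- 3
    LEN(N'abc')        AS CharCount;      -- 3
```

This is also why `DATALENGTH` (bytes), not `LEN` (characters), is the right function for checking values against byte-based size limits.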

 

Currently, there is no public Microsoft documentation about increasing the NVARCHAR(n) limit beyond 4,000 characters. Any issues with large string values come from limits in specific workloads, such as Parquet encoding during mirroring, not from a fixed 4K cap on NVARCHAR(MAX).

Regards,

Microsoft Fabric Community Support Team

 


