Solved: Mirrored cosmos db deeply nested json is getting t...

PBILover · ‎09-23-2025

Hi,

I’ve created a mirrored database in Fabric that contains deeply nested JSON documents. While querying, some documents are returned successfully, but others—especially those with more complex nesting—are not.

WHat are the reasons ?

Thanks

tayloramy · ‎09-23-2025

Hi @PBILover,

When you mirror Azure Cosmos DB into Fabric and query it via the SQL analytics endpoint, any deeply nested JSON is surfaced as a single text column. Today that endpoint typically exposes long strings as VARCHAR(8000), so large JSON blobs (often those “more complex” nested docs) get truncated or fail JSON parsing, which looks like “some documents aren’t returned” or come back cut off. This is expected behavior right now and tied to current limits in preview. Troubleshoot Fabric mirrored databases Data types in Fabric Data Warehouse Mirroring Azure Cosmos DB (Preview).

Troubleshooting steps:

Confirm it’s an endpoint limit (not your data). Check one of the “missing” docs in OneLake or via a notebook; if it’s intact there but cut in the SQL endpoint, you’re hitting the 8 KB-ish limit for JSON text in that endpoint. Microsoft documents truncation of larger JSON in this scenario. Limitations in Cosmos DB in Microsoft Fabric (preview).
Work around the limit for full-fidelity JSON:
- Warehouse path: Copy the mirrored data into a Warehouse table and store the JSON as VARCHAR(MAX) (currently up to ~1 MB). Then use T-SQL JSON functions against that warehouse table. Troubleshoot Fabric mirrored databases.
- Spark path: Create a Lakehouse shortcut to the mirrored data and shred the JSON with Spark (PySpark/SQL) where the strings aren’t capped the same way.
If you must stay on the SQL analytics endpoint:
- Use OPENJSON with OUTER APPLY to avoid dropping rows that don’t have the array/path you’re expanding. How to: Query nested data in Microsoft Fabric mirrored databases from Azure Cosmos DB (Preview).
- Prefer JSON_QUERY() (or OPENJSON) over JSON_VALUE() for long text because JSON_VALUE() returns only NVARCHAR(4000) and will return NULL or error for longer scalars. JSON_VALUE (Transact-SQL). :
- Note casting to NVARCHAR(MAX) won’t recover data that was already truncated by the endpoint’s VARCHAR(8000) surface type.
Seeing inconsistent results? Sometimes it’s just the endpoint’s metadata/scan state-hit Refresh on the SQL analytics endpoint to force a metadata rescan before re-testing.

If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.

View solution in original post

v-echaithra · ‎09-23-2025

Hi @PBILover ,

Server or Azure SQL Database, which truncates strings longer than 8000 characters. In the case of deeply nested JSON, the serialized JSON string can easily exceed this limit, leading to truncation of the data you're trying to retrieve.
SQL Server and its variants like Azure SQL use VARCHAR(8000) by default for string columns in queries. If a JSON document exceeds this size, it gets truncated. Since you're querying the mirrored database in Fabric, if the nested JSON is returned as a string field, it might be hitting the truncation limit when queried via SQL, especially when it contains large or deeply nested structures.
Instead of using VARCHAR(8000), which limits the size of your data, you can define the column with VARCHAR(MAX). This allows you to store larger strings without truncation.
If you can’t change the underlying schema but still need to query the JSON data without truncation, consider using the FOR XML PATH method in SQL to handle larger string data, which avoids truncation.
If the JSON is very large and you have strict size constraints in SQL, another option is to store the JSON in a separate blob storage like Azure Blob Storage and store only a reference (e.g., a URI or file path) in the database. This way, you avoid having to handle large strings in the database, and you can load the full JSON data into Power BI from the Blob Storage when needed.
If you're still facing issues, make sure the query timeout is not causing the truncation of results. Complex queries against deeply nested JSON might require more processing time, and Power BI or SQL might terminate the query early.
Increase the query timeout both in SQL Server and Power BI under Advanced settings.

Hope this helps,
Chaithra E.

View solution in original post

PBILover · ‎09-29-2025

Thank you @tayloramy @v-echaithra for the suggestions, these are really helpful.

PBILover · ‎09-24-2025

@v-echaithra
how can i change underlying schema?

v-echaithra · ‎09-24-2025

Hi @PBILover ,

If you're able to modify the schema of the database or table, the simplest solution would be to change the column type where the JSON is stored from VARCHAR(8000) (or any fixed size type) to VARCHAR(MAX). This allows much larger data to be stored and avoids truncation.
To Change the Column Type in SQL Server/Fabric, This will store the entire JSON document, no matter how large, as long as it fits within the 1 MB limit of VARCHAR(MAX).

ALTER TABLE your_table_name
ALTER COLUMN your_column_name VARCHAR(MAX)

Hope this helps.
Chaithra E

tayloramy · ‎09-23-2025

Hi @PBILover,

When you mirror Azure Cosmos DB into Fabric and query it via the SQL analytics endpoint, any deeply nested JSON is surfaced as a single text column. Today that endpoint typically exposes long strings as VARCHAR(8000), so large JSON blobs (often those “more complex” nested docs) get truncated or fail JSON parsing, which looks like “some documents aren’t returned” or come back cut off. This is expected behavior right now and tied to current limits in preview. Troubleshoot Fabric mirrored databases Data types in Fabric Data Warehouse Mirroring Azure Cosmos DB (Preview).

Troubleshooting steps:

Confirm it’s an endpoint limit (not your data). Check one of the “missing” docs in OneLake or via a notebook; if it’s intact there but cut in the SQL endpoint, you’re hitting the 8 KB-ish limit for JSON text in that endpoint. Microsoft documents truncation of larger JSON in this scenario. Limitations in Cosmos DB in Microsoft Fabric (preview).
Work around the limit for full-fidelity JSON:
- Warehouse path: Copy the mirrored data into a Warehouse table and store the JSON as VARCHAR(MAX) (currently up to ~1 MB). Then use T-SQL JSON functions against that warehouse table. Troubleshoot Fabric mirrored databases.
- Spark path: Create a Lakehouse shortcut to the mirrored data and shred the JSON with Spark (PySpark/SQL) where the strings aren’t capped the same way.
If you must stay on the SQL analytics endpoint:
- Use OPENJSON with OUTER APPLY to avoid dropping rows that don’t have the array/path you’re expanding. How to: Query nested data in Microsoft Fabric mirrored databases from Azure Cosmos DB (Preview).
- Prefer JSON_QUERY() (or OPENJSON) over JSON_VALUE() for long text because JSON_VALUE() returns only NVARCHAR(4000) and will return NULL or error for longer scalars. JSON_VALUE (Transact-SQL). :
- Note casting to NVARCHAR(MAX) won’t recover data that was already truncated by the endpoint’s VARCHAR(8000) surface type.
Seeing inconsistent results? Sometimes it’s just the endpoint’s metadata/scan state-hit Refresh on the SQL analytics endpoint to force a metadata rescan before re-testing.

If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.

PBILover · ‎09-24-2025

@tayloramy @v-echaithra
I still get the same error even if i try to use the openJson ,below is the query for your reference

SELECT

f.[id] AS ,

f.[documentType],

ft.id

FROM xyz AS f

Cross APply OPenJson (f.[package])

with(

id varchar(100) '$.id'

)AS ft
where f.id = "xyz"

Msg 13609, Level 16, State 4, Line 1
	JSON text is not properly formatted. Unexpected character '"' is found at position 7994.

tayloramy · ‎09-24-2025

Hi @PBILover,

This means that your JSON is not valid.

Before working with json data, you need to ensure it is valid json.

can you look at your json and see what it looks like at position 7994? Maybe there's some non standard escape characters, or maybe a string is not properly quoted and contains a double quote, breaking the string?

If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.

PBILover · ‎09-24-2025

Thanks @tayloramy
My json is valid this is because the truncation the json is cut.

Why am I getting a "JSON text isn't properly formatted. Unexpected character '"' is found at position" error message when running T-SQL queries against my SQL analytics endpoint?

Data warehouse can't handle JSON string columns greater than 8 KB in size. Nested data represented as a JSON string in SQL analytics endpoint or warehouse tables can commonly cause the column to increase to more than 8 KB in size. Monitoring levels of nesting and the amount of data if you receive this error message. For more information, see data warehouse limitations.
so i am not able to use the Openjson in this case.

tayloramy · ‎09-24-2025

Hi @PBILover,
You won't be able to parse the json if it is not complete.

If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.

PBILover · ‎09-23-2025

Thanks for your reply @tayloramy . I have confirmed that my data is getting truncated due to VRACHAR(8000) limitation. Can you please help me in understaning the workaround to mirror that deeply nested Json without truncating the data.
I want to use this data in my power BI report.

Thanks Again,

v-echaithra · ‎09-23-2025

Hi @PBILover ,

Server or Azure SQL Database, which truncates strings longer than 8000 characters. In the case of deeply nested JSON, the serialized JSON string can easily exceed this limit, leading to truncation of the data you're trying to retrieve.
SQL Server and its variants like Azure SQL use VARCHAR(8000) by default for string columns in queries. If a JSON document exceeds this size, it gets truncated. Since you're querying the mirrored database in Fabric, if the nested JSON is returned as a string field, it might be hitting the truncation limit when queried via SQL, especially when it contains large or deeply nested structures.
Instead of using VARCHAR(8000), which limits the size of your data, you can define the column with VARCHAR(MAX). This allows you to store larger strings without truncation.
If you can’t change the underlying schema but still need to query the JSON data without truncation, consider using the FOR XML PATH method in SQL to handle larger string data, which avoids truncation.
If the JSON is very large and you have strict size constraints in SQL, another option is to store the JSON in a separate blob storage like Azure Blob Storage and store only a reference (e.g., a URI or file path) in the database. This way, you avoid having to handle large strings in the database, and you can load the full JSON data into Power BI from the Blob Storage when needed.
If you're still facing issues, make sure the query timeout is not causing the truncation of results. Complex queries against deeply nested JSON might require more processing time, and Power BI or SQL might terminate the query early.
Increase the query timeout both in SQL Server and Power BI under Advanced settings.

Hope this helps,
Chaithra E.

PBILover · ‎09-24-2025

Thanks @v-echaithra for the detailed reply. DO you have any example on how to use XML path method in Fabric?

Mirrored cosmos db deeply nested json is getting truncated in a sql endpoint query

Troubleshooting steps:

Troubleshooting steps:

Why am I getting a "JSON text isn't properly formatted. Unexpected character '"' is found at position" error message when running T-SQL queries against my SQL analytics endpoint?

Helpful resources

Fabric Monthly Update - November 2025

Fabric Data Days

FabCon Atlanta 2026

FabCon is coming to Atlanta

Mirrored cosmos db deeply nested json is getting truncated in a sql endpoint query

Troubleshooting steps:

Troubleshooting steps:

Why am I getting a "JSON text isn't properly formatted. Unexpected character '"' is found at position" error message when running T-SQL queries against my SQL analytics endpoint?

Helpful resources

Fabric Monthly Update - November 2025

Fabric Data Days

FabCon Atlanta 2026