I'm integrating a third party OData API hosted on an on-premises server accessible only via on-premises data gateway.
Current Situation:
The Problem: The Dataflow Gen2 file destination currently supports only delimited/CSV format. CSV cannot preserve the nested column structures returned by OData; nested columns are either dropped or corrupted.
What I need: JSON or Parquet file format support in Dataflow Gen2 file destination. Both formats natively support nested structures and would solve this completely.
Questions:
What I tried so far:
Root Cause: OData navigation properties mean each nested table requires sequential OData requests per parent record through the gateway: 400 parent records × multiple nested tables means thousands of sequential gateway requests, which causes timeouts. A JSON/Parquet file destination would allow streaming pages directly to file without needing full expansion in memory.
Any help appreciated. Thank you.
Hi @pwr20 ,
Currently, Microsoft Fabric Dataflow Gen2 supports only CSV/delimited formats for file-based destinations such as Lakehouse Files and ADLS Gen2. These formats do not support nested data structures, which makes them unsuitable for landing raw nested OData data without flattening.
When working with nested OData navigation properties, Dataflow Gen2 treats nested records as unsupported for tabular or file destinations. As a result, nested columns are either dropped or replaced with null values when writing to Lakehouse tables, staging destinations, or files. Fully expanding nested data within Dataflow can work for small volumes but does not scale reliably, especially when using an on‑premises data gateway, due to sequential API calls and timeout limitations.
At this time, JSON or Parquet file destinations are not available in Dataflow Gen2, and there is no built‑in, scalable solution to preserve nested OData structures end‑to‑end. This is a current product limitation, not a configuration issue.
I’d encourage you to submit your detailed feedback and ideas via Microsoft's official feedback channels, such as the Microsoft Fabric Ideas.
Feedback submitted here is often reviewed by the product teams and can lead to meaningful improvement.
Thanks,
Prashanth
Thank you for the detailed explanation. This really clears things up and confirms what I have been experiencing. It is reassuring to know this is a product limitation rather than a configuration issue on our end.
Based on your feedback and our own investigation, we are now looking at two alternative approaches for the nested OData entities:
1. Using the Fabric Pipeline Copy Data Activity with the on-premises data gateway to land raw JSON or Parquet files into Lakehouse Files, and then using a Fabric Notebook to flatten the nested structures from there. We are currently investigating whether the Fabric Pipeline Copy Data Activity supports our on-premises OData scenario and will update the thread with our findings.
2. Azure Data Factory with a self-hosted integration runtime as a fallback option if the Copy Data Activity approach does not work out. For the flat/scalar entities, we will continue using Dataflow Gen2 as it works well for those.
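For the Notebook flattening step in option 1, a minimal sketch might look like the following. This assumes the Copy activity lands one JSON object per parent record; the entity and field names (`Id`, `Name`, `Items`) are hypothetical placeholders, not the actual API's schema.

```python
import json


def flatten(parent: dict, nested_key: str) -> list[dict]:
    """Emit one flat row per nested record, copying the parent's scalar
    columns onto each row. Field names here are hypothetical."""
    # Keep only scalar (non-nested) parent columns.
    scalars = {k: v for k, v in parent.items() if not isinstance(v, (list, dict))}
    rows = []
    for child in parent.get(nested_key, []):
        row = dict(scalars)
        # Prefix nested fields to avoid column-name collisions.
        row.update({f"{nested_key}_{k}": v for k, v in child.items()})
        rows.append(row)
    return rows


# Example: one parent record with two nested Items.
parent = {"Id": 1, "Name": "A", "Items": [{"Sku": "x"}, {"Sku": "y"}]}
flat = flatten(parent, "Items")
```

In a real Notebook you would read the landed JSON files from Lakehouse Files, apply a flattening step like this per parent, and write the result to a Delta table.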
I will also submit feedback via Microsoft Fabric Ideas as you suggested, as JSON and Parquet file destinations in Dataflow Gen2 would be a very valuable addition for scenarios like this.
Thanks again for taking the time to respond.
Thank you for the detailed response!
I tried the Text.FromBinary(Json.FromValue()) approach, but unfortunately it triggers a 404 error in our case. It appears that when Dataflow applies this transformation, the OData connector attempts to resolve the nested table via a navigation URL rather than operating on the already loaded data. The URL it constructs appends the nested entity path and that endpoint returns a 404 from our API.
We worked around this by using $select in the OData URL to return the nested data inline, combined with Table.ExpandTableColumn. This avoids the navigation request and works cleanly for smaller loads.
However, we are now facing a scale challenge. We have a large number of parent records, each containing thousands of nested records, and the $filter parameter only applies at the parent level, not the nested level. The standalone endpoint for the nested entity is also restricted in our API.
Do you have any suggestions for handling incremental or filtered loads of nested OData data at scale within Fabric Dataflow Gen2?
Hi @pwr20
Give this a try:
Use Dataflow Gen2 to pull just the parent records (scalar columns only, fast and reliable). Write those to a Lakehouse table. Then trigger a Notebook that reads the parent IDs and calls the OData API directly over HTTP (for example with the `requests` library) for the nested data. The Notebook can handle retries, parallelism, and write JSON/Parquet directly to Lakehouse Files.
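A minimal sketch of the retry/parallelism pattern described above. The fetch function is injected so any HTTP client could be plugged in; the base URL and entity path are hypothetical, and this only illustrates the orchestration, not a working connection to the API.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical endpoint template; replace with your actual OData URL.
BASE_URL = "https://example.local/odata/Parents({id})/Items"


def fetch_with_retry(fetch, parent_id, retries=3, backoff=1.0):
    """Call fetch(parent_id), retrying on failure with exponential backoff."""
    for attempt in range(retries):
        try:
            return fetch(parent_id)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)


def pull_nested(parent_ids, fetch, max_workers=8):
    """Fetch nested payloads for each parent in parallel.

    Returns a dict mapping parent id -> payload. `fetch` is any callable
    that takes a parent id and returns the nested data for it.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda pid: (pid, fetch_with_retry(fetch, pid)), parent_ids)
        return dict(results)
```

In a Notebook, `fetch` would wrap an HTTP call per parent ID, and the collected payloads would be written as JSON or Parquet to Lakehouse Files.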
Thank you for the suggestion.
Unfortunately this approach won't work in our case because the OData API is hosted on-premises and is only accessible through an on-premises data gateway. Fabric Notebooks cannot route requests through the on-premises gateway, so a direct HTTP call using requests from a Notebook would not be able to reach the API.
This is why we are constrained to Dataflow Gen2 for the data extraction part, as it is the only Fabric component that can connect through the on-premises gateway to our OData source.
Let me know your thoughts.
Hi @pwr20,
We would like to confirm if our community members answer resolves your query or if you need further help. If you still have any questions or need more support, please feel free to let us know. We are happy to help you.
@nilendraFabric, thanks for your prompt response.
Thank you for your patience and look forward to hearing from you.
Best Regards,
Prashanth Are
MS Fabric community support
Hi @pwr20
The Lakehouse Files destination (currently in Preview) only supports writing CSV files. The ADLS Gen2 destination is also in Preview but limited to delimited format. You should submit this as a feature request at aka.ms/FabricIdeas
Your diagnosis is accurate. Dataflows treat nested records in columns as errors and replace them with null when writing to tabular destinations.
There is no clean solution, but try this:
Serialize nested columns to JSON text in M before writing to a table destination. In Power Query, use Text.FromBinary(Json.FromValue([NestedColumn])) to convert each nested column into a JSON string. Write those string columns to a Lakehouse table. Then parse them back in a Notebook downstream. This preserves the structure without needing to expand navigation properties.