Hello everyone,
I’m currently building an ingestion pipeline in Microsoft Fabric and I’d like to know if anyone has already implemented something similar successfully.
I need a solid, production-ready structure for:
Incremental loading from a REST API (using epoch timestamps)
Handling dynamic pagination (nextPage from API response)
Loading raw data into Lakehouse
Automatically handling schema changes
Performing MERGE (upsert) into a Warehouse
Updating a watermark table after success
The API:
Requires startDate and endDate for incremental filtering
Uses curPage and returns nextPage
Returns nested JSON that must be flattened
I currently:
Use a Web activity to read nextPage
Use an Until loop for pagination
Copy data into Lakehouse with OverwriteSchema
Run schema sync script before MERGE
Then update watermark
It works, but I want to confirm:
Has anyone built a stable architecture like this in Fabric?
Is this the recommended approach?
Is there a cleaner way to handle pagination and incremental logic?
I’m looking for a proven, solid structure before moving this into production.
Any guidance, examples, or best practices would be greatly appreciated.
Thank you.
Hey @Om_Bhartiya,
What you’re building is absolutely valid, and yes, this pattern is commonly implemented in Microsoft Fabric. Your current approach is technically correct, but for production you can make it cleaner, more scalable, and easier to maintain.
Below is a solid production-ready architecture pattern that many teams use successfully.
Recommended Production Architecture (Medallion Style)
This keeps everything restartable and idempotent.
Ingestion Layer (Bronze – Raw Landing in Lakehouse)
For REST pagination + incremental:
Recommended Structure - Instead of complex Web + Until orchestration:
Stop condition (the pipeline expression language has no null literal, so test for an empty value instead):
@empty(variables('nextPage'))
✔ Keep pagination logic entirely inside the loop
✔ Do NOT mix schema handling here
✔ Always land raw JSON unchanged
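As a sketch of the loop logic above, here is the pagination contract in plain Python with the HTTP call stubbed out. The response shape {"data": [...], "nextPage": ...} is an assumption based on the question (curPage in, nextPage out); swap in your real API call for fetch_page.

```python
from typing import Callable, Iterator

def fetch_all_pages(fetch_page: Callable[[int], dict],
                    start_page: int = 1) -> Iterator[dict]:
    """Yield every page of an API response, following nextPage.

    fetch_page(page) is assumed to return a dict shaped like
    {"data": [...], "nextPage": <int or None>} -- an assumption
    about the API described in the question.
    """
    page = start_page
    while page is not None:
        body = fetch_page(page)
        yield body
        page = body.get("nextPage")  # None (absent nextPage) ends the loop

# Demo with a stubbed 3-page API:
def fake_api(page: int) -> dict:
    return {"data": [page], "nextPage": page + 1 if page < 3 else None}

pages = list(fetch_all_pages(fake_api))
rows = [r for p in pages for r in p["data"]]
```

Keeping the termination check in one place (the nextPage read) is what makes the Until loop easy to reason about and restart.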
Load the raw pages into a raw (Bronze) area of the Lakehouse, e.g. partitioned by load date. This gives you replay capability.
Schema Handling (Do NOT rely only on OverwriteSchema)
OverwriteSchema works, but in production it can break downstream models.
Better pattern:
This allows controlled schema evolution in Delta tables.
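A minimal, hypothetical sketch of a pre-MERGE schema check in pure Python. In a real notebook you would read the known columns from the Delta table's schema and then either ALTER TABLE explicitly or rely on Delta's mergeSchema write option for additive changes; the function and column names here are illustrative.

```python
def new_columns(incoming_rows: list, known_columns: set) -> set:
    """Return columns seen in the incoming batch that the target
    table does not yet have, so schema evolution is a deliberate
    step instead of a silent overwrite."""
    seen = set()
    for row in incoming_rows:
        seen.update(row.keys())
    return seen - known_columns

# Demo: one record in the batch carries a new "region" field.
batch = [{"id": 1, "name": "a"},
         {"id": 2, "name": "b", "region": "EU"}]
added = new_columns(batch, {"id", "name"})
# added == {"region"}
```

Logging `added` before applying it gives you an audit trail of every schema change that reached production.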
Silver Layer (Flatten + Normalize)
Use a Notebook or Dataflow Gen2 to flatten the nested JSON and normalize types, then store the cleaned table in the Lakehouse (Silver).
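A minimal Python sketch of the flattening step, assuming nested objects should become underscore-separated columns. Lists are deliberately left alone here; in practice you would explode them into child tables in the Silver layer.

```python
def flatten(obj: dict, parent: str = "", sep: str = "_") -> dict:
    """Recursively flatten nested dicts: {"a": {"b": 1}} -> {"a_b": 1}."""
    out = {}
    for key, value in obj.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            out.update(flatten(value, name, sep))
        else:
            out[name] = value
    return out

# Demo with a nested record (field names are illustrative):
record = {"id": 7, "customer": {"name": "Ada", "address": {"city": "Oslo"}}}
flat = flatten(record)
# {"id": 7, "customer_name": "Ada", "customer_address_city": "Oslo"}
```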
Warehouse MERGE (Upsert Pattern)
Now perform MERGE from Silver → Warehouse.
Use pattern:
MERGE INTO target t
USING source s
ON t.id = s.id
WHEN MATCHED AND s.updated_at > t.updated_at THEN
    UPDATE SET t.updated_at = s.updated_at /* plus the remaining columns */
WHEN NOT MATCHED THEN
    INSERT (id, updated_at /* ... */) VALUES (s.id, s.updated_at /* ... */);
Key best practices:
Watermark Update (ONLY After Success)
Very important:
Update the watermark only if the MERGE completed successfully.
Wrap the merge + watermark update in a single success-dependent chain, so the watermark activity runs only on the merge activity’s on-success path.
If failure occurs → watermark must NOT update.
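The same guarantee, sketched in plain Python with the Warehouse calls stubbed out (merge_step and write_watermark are hypothetical stand-ins for your MERGE and control-table update):

```python
def run_increment(merge_step, write_watermark, batch_end):
    """Advance the watermark only after the merge commits.

    The ordering is the guarantee: if merge_step raises, execution
    stops before the watermark moves, so the next run replays the
    same window and the idempotent MERGE absorbs the duplicates.
    """
    merge_step()
    write_watermark(batch_end)

# Demo with an in-memory control table:
state = {"watermark": 100}

run_increment(lambda: None, lambda ts: state.update(watermark=ts), 200)
# success: watermark advances to 200

def failing_merge():
    raise RuntimeError("merge failed")

try:
    run_increment(failing_merge, lambda ts: state.update(watermark=ts), 300)
except RuntimeError:
    pass
# failure: watermark stays at 200, so the window is replayed
```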
Final Production-Ready Structure
Pipeline Flow:
This is a very common enterprise pattern in Fabric.
If this explanation helped, please mark it as the solution so others can find it easily.
If it helped, a quick Kudos is always appreciated; it highlights useful answers for the community.
Thanks for being part of the discussion!
Thank you so much for the detailed explanation
The way you broke down the production‑ready architecture, especially the separation into watermark control, Bronze/Silver layers, and the clean MERGE pattern, really clarified the best practices for me. I also appreciate the emphasis on keeping pagination logic isolated and handling schema evolution safely through notebooks instead of relying only on OverwriteSchema.
This gives me a much clearer and more reliable structure to move forward with before taking the pipeline to production. I truly appreciate the time and depth you put into the explanation — it helped a lot!
Thanks again!
Hi @Om_Bhartiya,
we haven't heard back from you regarding our last response and wanted to check if your issue has been resolved.
Should you have any further questions, feel free to reach out.
Thank you for being a part of the Microsoft Fabric Community Forum!
Hi @Om_Bhartiya,
Your current approach aligns well with established best practices for building REST API ingestion pipelines in Microsoft Fabric. In production settings, it's common to use a pipeline that reads the watermark (last successful load timestamp) from a control table, constructs API requests with startDate and endDate parameters for incremental extraction, and handles pagination with a Web activity and an Until loop. This loop continues until the API no longer provides a nextPage value, and each page is ingested into the Lakehouse as the raw/bronze layer, often using schema drift or overwrite options to manage schema changes.
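A small Python sketch of the first step above: building the incremental request window from the stored epoch watermark. The startDate/endDate/curPage parameter names come from the question, so adjust them to your API.

```python
import time

def incremental_window(last_watermark_epoch, now_epoch=None):
    """Build the API's incremental-filter parameters from the stored
    watermark. Parameter names follow the API described in the
    question; adjust to your own."""
    end = now_epoch if now_epoch is not None else int(time.time())
    return {"startDate": last_watermark_epoch, "endDate": end, "curPage": 1}

# Demo with a pinned end time (one hour after the watermark):
params = incremental_window(1_700_000_000, now_epoch=1_700_003_600)
# {"startDate": 1700000000, "endDate": 1700003600, "curPage": 1}
```

On the next successful run, the stored watermark becomes this run's endDate, so consecutive windows tile without gaps or overlap.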
For nested JSON, it's best to land the raw data first and then use notebooks or Dataflows Gen2 for flattening and transformations before moving it to curated layers. Incremental updates are typically managed with a MERGE (upsert) operation into the Warehouse or Delta tables, and the watermark table is updated after successful pipeline completion to support ongoing incremental loads.
This architecture (watermark control → incremental API call → pagination loop → raw Lakehouse ingestion → schema handling → MERGE into Warehouse → watermark update) is widely used and effective in Fabric. Separating the pipeline into Bronze, Silver, and Gold layers can enhance scalability and maintainability. Adding retry policies, logging, and error handling can also improve production readiness.
Thank you.
If it's okay, could I share my current pipeline structure here?
Since I’m a fresher and still learning, I’d love to get your feedback on whether I’m implementing things correctly or if there are improvements I should make before taking it to production.