I have a working Dataflow that does the following:
1. Pulls a Customers list from an API call (https://api.somecrm.com/v1/customers).
2. For EACH Customer, it calls another API to get their purchase history (https://api.somecrm.com/v1/purchasehist/{customerid}).
3. I have 3 other nested queries, but I'll skip them for simplicity; a rough sketch of this call pattern follows below.
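In pseudo-Python, the pattern the Dataflow implements is roughly this (endpoint paths as above; the `customerid` field name and the lack of auth are my assumptions, adjust to the real API):

```python
import requests

BASE = "https://api.somecrm.com/v1"

# 1. Pull the full customer list (auth omitted for brevity).
customers = requests.get(f"{BASE}/customers").json()

# 2. One follow-up call per customer for their purchase history.
#    "customerid" is an assumption about the response's field name.
histories = {}
for cust in customers:
    cust_id = cust["customerid"]
    histories[cust_id] = requests.get(f"{BASE}/purchasehist/{cust_id}").json()
```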
I am trying to build the same solution with the new Pipelines in Fabric, for two reasons:
1. Although the Dataflow works fine, it sometimes hits a "429 Too Many Requests" error, so I feel that Pipelines will give me more control, such as a "Wait" step (see the retry sketch after this list).
2. I would like to save some outputs as JSON files in a Lakehouse, not just tables.
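For context on point 1: the usual way to soften 429s, whether via a pipeline Wait step or in code, is to back off and retry, honoring the `Retry-After` header if the API sends one. A minimal, Fabric-agnostic Python sketch of that idea:

```python
import time
import requests

def get_with_retry(url, max_retries=5):
    """GET a URL, backing off and retrying on 429 responses."""
    delay = 1.0
    for attempt in range(max_retries):
        resp = requests.get(url)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Prefer the server's hint when it's a plain number of seconds;
        # otherwise double the wait on each attempt.
        retry_after = resp.headers.get("Retry-After")
        wait = float(retry_after) if retry_after and retry_after.isdigit() else delay
        time.sleep(wait)
        delay *= 2
    raise RuntimeError(f"Still throttled after {max_retries} retries: {url}")
```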
I am new to this environment, so I'd like to learn how I can implement this scenario.
I've seen several docs and videos, but I am looking for this specific scenario - several API calls with an iteration...
I tried to build it but got stuck on the ForEach activity syntax - how do I tell it how to get a specific customerID...
Please help
Thanks
This method has extra steps for saving JSON files along the way:
Pipeline 1: (get list of customerids, store in a JSON file, create a Lakehouse table)
Pipeline 2: (loop through customerids in the LH table somecrm_customers [generated by Pipeline 1], store a JSON file for each customer)
2.1 [Inside ForEach activity] Copy Data activity (REST): source = purchasehist API endpoint (insert customerid as a variable), destination = PurchaseHist<customerid>.json files (expression examples below)
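For reference, the dynamic-content expressions inside Pipeline 2 would look roughly like this. The activity name 'LookupCustomerIds' and the customerid column are placeholders for whatever your Lookup activity and table actually use (Lookup must have "First row only" disabled so it returns an array):

```
ForEach -> Items:
    @activity('LookupCustomerIds').output.value

Copy Data (REST source, inside ForEach) -> relative URL:
    @concat('v1/purchasehist/', string(item().customerid))

Copy Data (Lakehouse sink) -> file name:
    @concat('PurchaseHist_', string(item().customerid), '.json')
```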
If you don't want to use Dataflows at all, you can replace the Dataflow in Pipeline 1 with a Notebook activity instead (use PySpark to read the JSON file and save it as a Lakehouse table); a sketch follows.
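A minimal PySpark sketch of that Notebook step, assuming the customers JSON landed in the Lakehouse Files area at a path like Files/somecrm/customers.json (the path and table name are assumptions):

```python
# Runs inside a Fabric Notebook attached to the Lakehouse; `spark` is provided.
df = spark.read.json("Files/somecrm/customers.json")

# Save as a managed (Delta) Lakehouse table for Pipeline 2 to loop over.
df.write.mode("overwrite").saveAsTable("somecrm_customers")
```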
Thanks a lot for the detailed explanation, @Nero
As for the Lookup activity - what if I have more than 5,000 rows?
hey @iBusinessBI,
To get past the 5,000-row limit, we added an outer "loop" pipeline that iterated over the customerids in batches of 5,000, as suggested here.
We actually used a Dataflow to copy all the customerids from the Lakehouse table into a Fabric Data Warehouse table (the Warehouse table let us use T-SQL functionality like ROW_NUMBER(), which helped to separate the customerids into batches of 5,000; we also used that same Warehouse table as the source for the Lookup activity).
I can write up more detailed instructions if that helps (but it will take me a while!). I am hopeful that there will be new Fabric functionality soon that negates the need to use the Warehouse for this. Or maybe someone else knows a Notebook solution.
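For what it's worth, here is a rough sketch of what that Notebook alternative could look like, using a Spark window function in place of the Warehouse's ROW_NUMBER() (untested; assumes a Lakehouse table somecrm_customers with a customerid column):

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

BATCH_SIZE = 5000

# `spark` is provided by the Fabric Notebook runtime.
df = spark.table("somecrm_customers")

# Number every row, then derive a batch id so each batch holds at most
# 5,000 ids - mirroring what ROW_NUMBER() did in the Warehouse.
w = Window.orderBy("customerid")
batched = (
    df.withColumn("rn", F.row_number().over(w))
      .withColumn("batch_id", F.floor((F.col("rn") - 1) / BATCH_SIZE))
)

# An outer pipeline could then iterate distinct batch_id values and
# look up one batch at a time.
batched.write.mode("overwrite").saveAsTable("somecrm_customers_batched")
```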
1. Web activity, outputs an array.
2. ForEach activity, use @activity('webactivity').output.value to iterate the array.
3. Copy activity inside the ForEach activity, use @item() to access each element (see below). You can refer to azure - how to store web activity result in a variable? - Stack Overflow.
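To make step 3 concrete: if the Web activity returns a body like {"value": [{"customerid": 123}, ...]}, each @item() is one of those objects, and a field is reached with dot notation (the customerid field name is an assumption about your API):

```
ForEach -> Items:   @activity('webactivity').output.value
Whole element:      @item()
Single field:       @item().customerid
```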