DennesTorres
Impactful Individual

RunMultiple DAG and tricks

Hi,

In summary, my main question is:

If I need to use RunMultiple to call an external API multiple times and get the JSON result, are there specific tricks for extracting the API result from inside the DAG result?

Let me explain better why I'm asking:

I have a notebook which makes these calls sequentially. I want to improve performance by switching to parallel execution.

On the original notebook, when I call the API, this is the piece of code which starts processing the result:

    # Fragment from inside the sequential loop over accounts
    result = requests.post(function_url, data=body, headers=headers)

    if result.status_code != 200:
        result.raise_for_status()

    data = json.loads(result.text)

    if data is None:
        continue

    for serviceLine in data:

 

From this point forward, everything works well.

When using RunMultiple, I created a parameterized notebook to make one call and return the result. This is how I'm returning the result:

result = requests.post(function_url, data=body, headers=headers)

if result.status_code != 200:
    result.raise_for_status()

data = json.loads(result.text)

mssparkutils.notebook.exit(data)

 

The problem is: everything is arriving differently, and I'm having to make more changes to the processing code (not visible in the code blocks above) than I expected.

First, to extract the JSON from the DAG result, I had to apply some transformations to it:


def prepareJSON(jsonValue):
    jsonValue = jsonValue.replace("'", '"')
    jsonValue = jsonValue.replace('None', '[]')
    jsonValue = jsonValue.replace('True', 'true')
    jsonValue = jsonValue.replace('False', 'false')
    return jsonValue


# Fragment from inside the loop over the DAG results
exitVal = result[account_number]['exitVal']

if exitVal is None:
    continue

exitVal = prepareJSON(exitVal)
row = json.loads(exitVal)

 

In the code above, "account_number" is the value used as the name of the activity in the DAG.
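As a side note, this string-replacement workaround is fragile: the replacements also rewrite matching substrings inside data values. A minimal, self-contained demonstration (the payload is hypothetical):

```python
import json

# Hypothetical exit value: the Python repr of a dict, not valid JSON.
as_repr = "{'comment': 'None of the True values', 'flag': True}"

patched = (as_repr.replace("'", '"')
                  .replace('None', '[]')
                  .replace('True', 'true')
                  .replace('False', 'false'))

# The replacements also hit text inside the string value,
# so the parsed comment is silently corrupted.
comment = json.loads(patched)["comment"]  # '[] of the true values'
```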

 

If it had stopped there, I would not be so concerned. But at the end of the processing it generated errors because it can't join the results: some columns have mixed data types.

Back to the question: are there any special tricks or suggestions for extracting a value from inside the DAG result without having to change my original processing code so much?

Kind Regards,

 

Dennes

2 ACCEPTED SOLUTIONS
Anonymous
Not applicable

Hi @DennesTorres
The internal team replied as follows:

Just a suggestion, but it seems like it's simply a problem with the json -> string -> json conversion. Perhaps try json.dumps at notebook exit to safely convert the JSON to a string, and then json.loads once you've combined the results together. Also, remember that the new soft limit for runMultiple is 50 by default, so it may be better to switch to Python multithreading; note, however, that this will only run on the driver node, so use a single-node pool.
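The suggested round trip can be sketched as follows. The serialization behavior itself is plain Python and runs anywhere; the mssparkutils call is shown only in a comment, and the payload is hypothetical:

```python
import json

# Hypothetical payload returned by one API call.
data = {"serviceLine": "A", "active": True, "note": None}

# Returning the dict directly via mssparkutils.notebook.exit(data) stringifies
# it with Python repr (single quotes, True/None literals), which is not valid JSON.
as_repr = str(data)

# Safer round trip: serialize explicitly at exit ...
exit_value = json.dumps(data)  # i.e. mssparkutils.notebook.exit(json.dumps(data))

# ... and parse explicitly after collecting the DAG results.
restored = json.loads(exit_value)
assert restored == data
```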

Hope this helps. Please let me know if you have any further questions.


Hi,

Thank you. Yes, I found a similar solution. 

The problem was the load. Calling json.loads before returning the data converts the string into a dictionary. When the dictionary is returned through the DAG, the dictionary -> string conversion doesn't reproduce the original JSON string.

The solution was to not use json.loads inside the parallel execution. I returned the original string, retrieved it outside, and applied json.loads outside the parallel execution. It works perfectly.

It is the same as the recommendation.

About the runMultiple limit, I'm aware of it. Python multithreading is something I need to explore.
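For the multithreading route, a minimal sketch using concurrent.futures; call_api is a hypothetical stand-in for the real requests.post call, and as the internal team noted, the threads all run on the driver node:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import json

def call_api(account_number):
    # Stand-in for the real call:
    #   result = requests.post(function_url, data=body, headers=headers)
    #   result.raise_for_status()
    #   return result.text
    return json.dumps({"account": account_number, "serviceLines": []})

accounts = ["1001", "1002", "1003"]
results = {}

with ThreadPoolExecutor(max_workers=8) as pool:
    # Submit one task per account and map each future back to its account.
    futures = {pool.submit(call_api, acc): acc for acc in accounts}
    for fut in as_completed(futures):
        # Parse once, in the main thread, mirroring the accepted fix of
        # keeping json.loads outside the parallel execution.
        results[futures[fut]] = json.loads(fut.result())
```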

Kind Regards,

Dennes


3 REPLIES
Anonymous
Not applicable

Hi @DennesTorres 
Thanks for using Fabric Community.
We are reaching out to the internal team for help on this and will update you once we hear back from them.
Thanks 

