tr5610
New Member

Livy Error with runMultiple() DAG Chunking - InvalidHttpRequestToLivy: from cannot be less than 0 (HTTP 400)

I’m running into the Livy pagination bug that returns:

InvalidHttpRequestToLivy: from cannot be less than 0 HTTP status code: 400.

Setup:
- Orchestrator notebook uses runMultiple() to start worker notebooks in parallel. The job covers approximately 1,000 tables, so the work is chunked into one DAG per smaller table batch; the orchestrator calls runMultiple() multiple times (once per chunk of 100 tables). I also tried reducing to chunks of 20 tables.
- Each chunk launches 20 worker notebooks in parallel. A similar post suggested increasing concurrency, which I tried up to 40.
- Worker (child) notebooks return audit and watermark data via mssparkutils.notebook.exit(json_data). If I turn off all output from the worker notebooks, the issue stops.
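For readers unfamiliar with the pattern, the setup above could be sketched roughly as follows. This is a hypothetical reconstruction, not the poster's actual code: `chunked()`, `build_dag()`, the `worker_notebook` path, and the `table` argument are assumed names; only `runMultiple()` itself is a real Fabric API, and it is only reachable inside a Fabric notebook.

```python
# Illustrative sketch of the orchestration pattern described above.
# Assumes a Fabric notebook runtime; helper names are hypothetical.

def chunked(items, size):
    """Split the table list into fixed-size batches (one DAG per batch)."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def build_dag(tables, concurrency=20):
    """Build a runMultiple() DAG with one worker activity per table.
    'worker_notebook' and the 'table' arg are assumed names."""
    return {
        "activities": [
            {"name": f"load_{t}", "path": "worker_notebook",
             "args": {"table": t}}
            for t in tables
        ],
        "concurrency": concurrency,
    }

def orchestrate(tables, chunk_size=100):
    """Run one runMultiple() call per chunk (Fabric notebook only)."""
    # Imported lazily so the pure helpers above work outside Fabric.
    from notebookutils import mssparkutils
    results = []
    for batch in chunked(tables, chunk_size):
        # Each worker returns audit/watermark JSON via notebook.exit(),
        # which is what accumulates in the Livy log buffer.
        results.append(mssparkutils.notebook.runMultiple(build_dag(batch)))
    return results
```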

Error:
After approximately 200 tables have been processed over 15-20 minutes, regardless of how parallelism or chunk size is set:
InvalidHttpRequestToLivy: from cannot be less than 0 HTTP status code: 400

Root Cause Analysis:
- The Livy session persists across multiple runMultiple() calls and appears to keep a single log buffer for the entire Spark session, with a fixed size limit and no trimming.
- Each chunk's worker output accumulates in Livy's log buffer as it is passed back to the orchestrator notebook.
- Multiple runMultiple() calls appear to corrupt Livy's cursor state for log reading.
- The error occurs when Livy tries to read logs from an invalid cursor position.
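For context on the error text itself: Apache Livy's REST API pages through session logs via `from` and `size` query parameters on GET /sessions/{id}/log, and a negative `from` is rejected with exactly this "from cannot be less than 0" message. A minimal sketch of that request shape (the host, port, and session id are placeholders, and the helper only builds the request rather than calling a real Livy server):

```python
# Illustrative only: the Livy log endpoint the error message points at.
# Livy paginates session logs with 'from' (offset) and 'size' parameters.

def livy_log_request(session_id, offset, size=100):
    """Build the URL and query params for GET /sessions/{id}/log."""
    if offset < 0:
        # This is the condition Livy itself rejects with HTTP 400:
        # "from cannot be less than 0".
        raise ValueError("from cannot be less than 0")
    url = f"http://livy-host:8998/sessions/{session_id}/log"
    return url, {"from": offset, "size": size}
```

The symptom described above is consistent with a client-side cursor going negative: whatever tracks the log offset between runMultiple() calls ends up computing a `from` below zero, and Livy rejects the request.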

Workarounds Attempted:
- Reduced worker stdout logging (spark.sparkContext.setLogLevel("ERROR"))
- Added delays between chunks
- Attempted Livy session "resets" (ineffective)
- Increased the size of the Spark session with additional nodes and executors
- Varied batch size both down and up, and decreased/increased the number of parallel workers (concurrency)

Question:
Is this a known issue? Are there proper ways to clear/reset Livy session state between runMultiple() calls, or alternative approaches for large-scale parallel notebook orchestration?

Impact: Blocking ETL processing of 750K+ tables across 500+ customers.

1 ACCEPTED SOLUTION
tr5610
New Member

The solution provided by the product team was to place this line before the runMultiple() call to prevent the DAG orchestration from displaying its graphics. It seems to be working, and they said a full fix would be implemented soon.

spark.conf.set("spark.notebookutils.run.progressbar.enabled", "false")


7 REPLIES

v-prasare
Community Support

Hi @tr5610 , 

In this scenario i suggest you to raise a support ticket here. so, that they can assit you in addressing the issue you are facing. please follow below link on how to raise a support ticket:

How to create a Fabric and Power BI Support ticket - Power BI | Microsoft Learn

 

 

 

Thanks,

Prashanth Are

MS Fabric community support

v-prasare
Community Support

Hi @tr5610 ,

We would like to follow up to see if the solution provided by the community member resolved your issue. Please let us know if you need any further assistance.


@J-Mo & @vrsanaidu2025, thanks for your prompt response.

Thanks,

Prashanth Are

MS Fabric community support


If our super user response resolved your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.

No, there has been no solution provided.

vrsanaidu2025
New Member

Instead of reusing the same session for every runMultiple() batch, try creating a new Livy session per DAG chunk. This avoids log-cursor conflicts and buffer overflow across chunks.

Do you have any specific detail on this? I believe the Livy session and the notebook session can't be separated. When I forced a restart of the Livy session, the notebook stopped.

When you say “create a new Livy session per DAG chunk,” do you mean there’s a PySpark or Fabric-native command I can run inside the same notebook to reset or reinitialize the Livy session before calling runMultiple() again?

 

Just to clarify, my current setup is entirely notebook-driven:

- One orchestration notebook.
- It uses nbutils.runMultiple() inside a for loop to submit batches (chunks).
- Each runMultiple() call executes a batch of worker notebooks in parallel.

There’s no pipeline orchestration around it, so the notebook session itself stays open while runMultiple() runs multiple times.

 

 
