avinandac

High Concurrency Support for the Fabric Livy API: Scalable Spark Automation (Preview)

Author: Avinanda Chattapadday, Senior Product Manager


If you've been automating Spark workloads in Microsoft Fabric with the Livy API, you can now run more work in parallel with less overhead. 

The Fabric Livy API already lets you submit and manage Spark jobs programmatically—no Notebooks or Spark Job Definitions required. With the addition of High Concurrency (HC) sessions, you can run multiple workloads in parallel, reuse sessions, and isolate execution without managing any of the complexity yourself. 

 

Why this matters 

Managing concurrency in Spark has typically meant creating and coordinating multiple Livy sessions on the client side. This adds overhead, increases cost, and makes it harder to monitor what’s running. 

High Concurrency changes that equation. 

Instead of creating a separate session for each job, Fabric allows multiple workloads to share a single Spark session, with each running in its own isolated REPL. Fabric manages session packing, reuse, monitoring, and billing so you focus on your application logic. 

 

New capabilities 

  • Parallel Execution — Run multiple Spark statements at the same time within a managed HC session instead of queuing them sequentially.
  • Session Reuse with sessionTag — Tag your requests so Fabric can route them to existing sessions when capacity is available, reducing startup and compute overhead.
  • Workload Isolation — Each request runs in its own REPL, so failures or cancellations don't affect others in the same session.
  • Built-in Monitoring — HC jobs appear in the Fabric Monitoring Hub as HC_<LakehouseName>_<SessionId>, giving you top-level visibility without extra instrumentation.
  • Cost Efficiency — Fewer sessions means less idle compute and better resource utilization. 

Use cases 

  • Pipeline Orchestration — Execute multiple Spark notebooks in parallel within a single pipeline, sharing the same Spark session for maximum throughput.
  • CI/CD Automation — Trigger test and production Spark jobs from your DevOps workflows with predictable resource usage.
  • Multi-tenant Workloads — Serve concurrent requests from different processes or users with isolated execution contexts.
  • Cost-Sensitive Batch Processing — Pack related ETL jobs into shared sessions to minimize compute allocation overhead.


How it works 

Acquire an HC session against your Lakehouse's Livy endpoint: 

POST https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/lakehouses/{lakehouse_id}/livyapi/versions/2023-12-01/sessions

 
Include a sessionTag in your request body to enable server-side session packing:

{ 
  "artifactName": "my-lakehouse", 
  "sessionTag": "nightly-etl-run", 
  "conf": { "spark.some.config": "value" }, 
  "executorMemory": "4g", 
  "executorCores": 2, 
  "numExecutors": 4 
} 
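
As a rough illustration, the request above could be assembled in Python with only the standard library. This is a minimal sketch, not an official client: the function name `build_hc_session_request` and its parameters are hypothetical, and the bearer-token `Authorization` header is an assumption about how you authenticate. The URL and body fields are taken from the example above.

```python
import json
import urllib.request

FABRIC_API_BASE = "https://api.fabric.microsoft.com/v1"
LIVY_API_VERSION = "2023-12-01"


def build_hc_session_request(workspace_id, lakehouse_id, access_token,
                             session_tag, artifact_name, conf=None):
    """Build (but do not send) the POST request that acquires an HC session.

    Hypothetical helper: the name and signature are illustrative only.
    """
    url = (
        f"{FABRIC_API_BASE}/workspaces/{workspace_id}"
        f"/lakehouses/{lakehouse_id}"
        f"/livyapi/versions/{LIVY_API_VERSION}/sessions"
    )
    body = {
        "artifactName": artifact_name,
        # Requests sharing a sessionTag are candidates for session packing.
        "sessionTag": session_tag,
        "conf": conf or {},
        # Sizing values copied from the sample body above.
        "executorMemory": "4g",
        "executorCores": 2,
        "numExecutors": 4,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: a bearer token obtained separately by the caller.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the URL and payload easy to inspect or log first.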

Fabric handles creating or reusing an underlying Spark session, spinning up an isolated REPL, and surfacing the job in monitoring. Poll the HC session endpoint until the state is Idle, then start executing statements. 
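
The polling step might look like the sketch below. It assumes you supply a `get_state` callable that fetches the session and returns its state string; that callable, the function name `wait_until_idle`, and the timeout defaults are all hypothetical. The failure states listed come from Apache Livy's session state machine and are assumed, not confirmed, to match what Fabric reports.

```python
import time

# Failure states from Apache Livy's session state machine; assumed here to
# match the states Fabric reports for an HC session.
TERMINAL_FAILURE_STATES = {"error", "dead", "killed", "shutting_down"}


def wait_until_idle(get_state, timeout_s=600.0, interval_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_state() repeatedly until it reports Idle.

    get_state is a caller-supplied function returning the current session
    state as a string. Raises RuntimeError on a failure state and
    TimeoutError if Idle is not reached within timeout_s seconds.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        normalized = state.lower()  # tolerate "Idle" as well as "idle"
        if normalized == "idle":
            return state
        if normalized in TERMINAL_FAILURE_STATES:
            raise RuntimeError(f"session entered state {state!r}")
        if clock() >= deadline:
            raise TimeoutError(f"session still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays. In practice `get_state` would wrap a GET against the session resource returned when the session was acquired.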

Up to five REPLs can run per underlying Livy session, and requests that share the same sessionTag are routed to a session with available capacity. 
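
For capacity planning, the five-REPL limit gives a simple lower bound on how many underlying sessions a batch of concurrent workloads needs. The sketch below assumes ideal packing under a single sessionTag; actual placement is decided by Fabric and may differ. The function name is hypothetical.

```python
MAX_REPLS_PER_SESSION = 5  # limit stated in this post


def min_sessions_needed(concurrent_workloads):
    """Lower bound on underlying Livy sessions, assuming ideal packing."""
    if concurrent_workloads < 0:
        raise ValueError("concurrent_workloads must be non-negative")
    # Ceiling division using integer arithmetic only.
    return -(-concurrent_workloads // MAX_REPLS_PER_SESSION)
```

For example, 12 concurrent workloads would need at least 3 underlying sessions rather than the 12 that one session per job would require.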

 

Get started  

High Concurrency support for the Fabric Livy API is available now in preview. Whether you're building automated Spark pipelines or optimizing Notebook orchestration, HC sessions give you the control and efficiency you've been asking for. 

 

We look forward to seeing what you create! Try it out and share your feedback via Reddit, LinkedIn or MS ideas, and let us know how Livy Endpoint HC is transforming your Spark automation workflows.