Fabric Spark Pool Optimiser — right-size your Spark pools in 3 minutes
Every workspace in Microsoft Fabric gets the same default Spark pool: Medium node size, up to 10 nodes. Hardly anyone changes it, even in production, even when the actual workload is a 5,000-row dimension table or a monitoring notebook reading 0.002 GB.
This notebook analyses 7 days of real Spark session history across all your workspaces and tells you exactly which pools are oversized, undersized, or correctly sized — with a step-by-step configuration guide for each one.
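To make the oversized / undersized / correctly sized verdict concrete, here is a minimal Python sketch of one way such a rule could work. It is not the notebook's actual logic: the function name `classify_pool`, the `GB_PER_NODE` figure, and the `HEADROOM` factor are hypothetical values chosen purely for illustration.

```python
import math

# Hypothetical assumption: one node comfortably handles about 10 GB per session.
GB_PER_NODE = 10.0
# Hypothetical assumption: up to 2x the needed node count is acceptable headroom.
HEADROOM = 2


def classify_pool(peak_gb: float, max_nodes: int) -> str:
    """Compare a pool's node ceiling with what its busiest session needed."""
    needed = max(1, math.ceil(peak_gb / GB_PER_NODE))
    if max_nodes > HEADROOM * needed:
        return "oversized"
    if max_nodes < needed:
        return "undersized"
    return "correctly sized"


print(classify_pool(0.002, 10))  # oversized
print(classify_pool(60, 10))     # correctly sized
print(classify_pool(500, 10))    # undersized
```

With these made-up thresholds, a default 10-node pool whose busiest session read 0.002 GB comes out as oversized.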
What it does:
- Auto-discovers all workspaces you have access to
- Detects orchestrator workspaces automatically (runMultiple / Data Factory)
- Separates automated pipeline sessions from interactive dev sessions — dev sessions skew duration data and are excluded from the CU calculation
- Analyses GB read/written/shuffled via the Spark History stages API
- Estimates monthly CU savings based on real usage
- Renders an interactive dashboard directly in the notebook output
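As a rough illustration of the session split and the GB totals listed above, here is a hedged Python sketch. The session field `is_interactive` is hypothetical, since the real session payload may flag this differently. The stage fields `inputBytes`, `outputBytes` and `shuffleWriteBytes` follow the open-source Spark history server stage payload, and it is an assumption that the Fabric endpoint returns the same names.

```python
BYTES_PER_GB = 1024 ** 3


def split_sessions(sessions):
    """Separate automated pipeline sessions from interactive dev sessions.

    'is_interactive' is a hypothetical flag used only for this sketch.
    """
    automated = [s for s in sessions if not s["is_interactive"]]
    interactive = [s for s in sessions if s["is_interactive"]]
    return automated, interactive


def stage_gb(stages):
    """Sum bytes read, written and shuffled across stages, returned in GB."""
    totals = {"read": 0, "written": 0, "shuffled": 0}
    for stage in stages:
        totals["read"] += stage.get("inputBytes", 0)
        totals["written"] += stage.get("outputBytes", 0)
        # Assumption: shuffle write bytes stand in for "GB shuffled".
        totals["shuffled"] += stage.get("shuffleWriteBytes", 0)
    return {name: value / BYTES_PER_GB for name, value in totals.items()}


sessions = [
    {"id": "a", "is_interactive": False},
    {"id": "b", "is_interactive": True},
]
automated, interactive = split_sessions(sessions)
print([s["id"] for s in automated])  # ['a']

stages = [{"inputBytes": BYTES_PER_GB, "outputBytes": 0,
           "shuffleWriteBytes": 2 * BYTES_PER_GB}]
print(stage_gb(stages))  # {'read': 1.0, 'written': 0.0, 'shuffled': 2.0}
```

Only the `automated` list would then feed the CU calculation, matching the exclusion of dev sessions described in the list.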
No lakehouse needed. No configuration. Just import and Run All.
Tested across two organisations. One run analysed 50 workspaces, flagged 8 pools to change, and estimated a monthly saving of 1,441 CU.
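For a sense of how a monthly saving figure could be derived from 7 days of history, here is a simplified Python sketch. It relies on Fabric's published ratio of 1 CU to 2 Spark vCores and on node sizes of 8 vCores (Medium) and 4 vCores (Small). Everything else is an assumption for illustration: the function names, the worst-case premise that a pool runs at its maximum node count for the whole session, and the 30/7 scaling from a week to a month. The notebook's own calculation may differ.

```python
VCORES_PER_CU = 2  # Fabric: one capacity unit corresponds to two Spark vCores


def cu_hours(node_vcores: int, nodes: int, hours: float) -> float:
    """CU-hours used if every node is allocated for the full duration."""
    return node_vcores / VCORES_PER_CU * nodes * hours


def monthly_saving(current, proposed, automated_hours_7d: float) -> float:
    """Scale the 7-day difference between two pool configs to 30 days.

    'current' and 'proposed' are (node_vcores, nodes) tuples.
    """
    weekly = (cu_hours(*current, automated_hours_7d)
              - cu_hours(*proposed, automated_hours_7d))
    return weekly * 30 / 7


# Medium x 10 nodes replaced by Small x 2 nodes, 7 automated hours in 7 days.
print(monthly_saving((8, 10), (4, 2), 7.0))  # 1080.0
```

Because it assumes the maximum node count throughout, this gives an upper bound rather than a measured saving.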
Feedback welcome — especially if you find API behaviour that differs in your environment.
https://github.com/enekoegiguren/fabricsparkpooloptimiser