dbeavon3
Memorable Member

Any Thoughts About Manually Killing Executors in a Fabric Notebook?

I have a series of cells that are executed in a Fabric notebook.  At the very end there are some cells that need to use a lot more memory.  I frequently find that my executors are being killed by YARN like so (exit code 137):

 

dbeavon3_0-1764785313589.png

 

... upgrading from small to medium nodes didn't quite do the trick either.

 

One thing that works consistently is to set "spark.task.maxFailures" to a very elevated number like 10 or 100.  That basically allows an entire executor to die and a new one to take its place (once, if that is enough to succeed, or as many times as it takes to reach the maximum).
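For reference, one way to apply that setting in a Fabric notebook is a %%configure cell at the top of the notebook. This is just a sketch (the value 10 is illustrative), and %%configure has to run before the Spark session starts:

```
%%configure -f
{
    "conf": {
        "spark.task.maxFailures": "10"
    }
}
```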

 

I am grateful to have "spark.task.maxFailures" for troubleshooting.  However, I really don't like this type of trial-and-error programming approach.  It feels dirty to me (even for a Python-based or notebook-based solution).

 

I've tried various things to free up memory, like calling "unpersist" on DataFrames and what-not, but they don't consistently work.  The only thing that consistently works is for the entire executor to get killed and replaced.  (Another possible solution would be to split the notebook into two separate notebooks, each with its own distinct Spark session.)

 

Since I already know the stage of the notebook where I need to recover all the executor memory, I'm evaluating the use of sparkContext.killExecutors.  My notebook uses dynamically allocated executors, and I'd just as soon recycle the bad ones myself rather than rely on "spark.task.maxFailures" to do it for me.

 

Has anyone else tried to use the "killExecutors" method to avoid a persistent error in an executor that won't release its memory?  I found the idea here:

dbeavon3_1-1764785851068.png

 

 

1 ACCEPTED SOLUTION
dbeavon3
Memorable Member

The code I posted earlier was sort of a dead end, so I'll post the solution I found.

 

I discovered you cannot kill executors that were created automatically (i.e., when spark.dynamicAllocation.enabled is true).  So the first step is to configure the notebook to run with manual allocation of executors: set spark.dynamicAllocation.enabled to false and give spark.executor.instances an explicit count:

 

dbeavon3_0-1764794393711.png
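In case the screenshot doesn't render, the equivalent %%configure cell would look roughly like this (the executor count of 2 is illustrative; this must run before the Spark session starts):

```
%%configure -f
{
    "conf": {
        "spark.dynamicAllocation.enabled": "false",
        "spark.executor.instances": "2"
    }
}
```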

 

 

When you get to the part of the code where you want a fresh executor, you need to use the JVM gateway to pass commands from Python to the real underlying JVM.  Then get the current executors like so:

 

# Confirm the manual executor count took effect
spark.conf.get("spark.executor.instances")

# Reach through the Py4J gateway to the JVM SparkContext
my_jsc = spark._jsc.sc()

# Returns a Scala Seq of executor ID strings
v003 = my_jsc.getExecutorIds()

 

The IDs are pretty simple ('1', '2', '3', or some combination of those).  After you have the list of IDs, you can kill them like so:

# Convert the Scala Seq into a Python list via its apply()/length() methods
executorIds = [v003.apply(i) for i in range(v003.length())]

for executor_id in executorIds:
    my_jsc.killExecutor(executor_id)
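(Aside: apply() and length() here are just the Py4J view of the Scala Seq that getExecutorIds returns.  If you want to sanity-check the conversion pattern without a cluster, a stand-in object behaves the same way:)

```python
class FakeSeq:
    """Stand-in for the Py4J proxy of a Scala Seq[String] (hypothetical, for illustration only)."""

    def __init__(self, items):
        self._items = list(items)

    def length(self):
        # Scala Seq exposes length() through Py4J
        return len(self._items)

    def apply(self, i):
        # Scala's seq.apply(i) is positional indexing
        return self._items[i]


v003 = FakeSeq(["1", "2", "3"])
executorIds = [v003.apply(i) for i in range(v003.length())]
print(executorIds)  # → ['1', '2', '3']
```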


Finally, you can request fresh ones, which will (hopefully) have the same amount of free memory as a freshly started notebook:

 

my_jsc.requestExecutors(1)

 

Hopefully this is useful.  I think it is unfortunate that it was even necessary in the first place, but I discovered that it is hard to get an executor to release memory.  (It is hard enough just to inspect the memory being used in executors once we start getting 137 errors.)  The approach here may be preferable to trial-and-error approaches like increasing max failures or bumping node sizes from small to medium to large.

