<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Any Thoughts about Manually Killing Executors in a Fabric Notebook. in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Any-Thoughts-about-Manually-Killing-Executors-in-a-Fabric/m-p/4892163#M13838</link>
    <description>&lt;P&gt;The code I posted earlier was something of a dead end, so I'll post the solution I found.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I discovered that you cannot kill executors that were created automatically through dynamic allocation (spark.dynamicAllocation.enabled).&amp;nbsp; So the first step is to configure the notebook to allocate executors manually.&lt;BR /&gt;Set&amp;nbsp;spark.dynamicAllocation.enabled to false and set spark.executor.instances explicitly:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dbeavon3_0-1764794393711.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1314466iA1105FE7850D61DD/image-size/medium?v=v2&amp;amp;px=400" role="button" title="dbeavon3_0-1764794393711.png" alt="dbeavon3_0-1764794393711.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When you reach the point in the code where you want a fresh executor, use the Py4J gateway to pass commands from Python to the underlying JVM SparkContext.&amp;nbsp; Then retrieve the executor IDs like so:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Confirm the configured executor count
spark.conf.get("spark.executor.instances")

# Reach through the Py4J gateway to the JVM SparkContext
my_jsc = spark._jsc.sc()

# Returns a Scala Seq of executor ID strings
v003 = my_jsc.getExecutorIds()&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The IDs are simple strings ('1', '2', '3', or some combination of those).&amp;nbsp; Once you have the list of IDs, you can kill the executors like so:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Convert the Scala Seq into a Python list
executorIds = [v003.apply(i) for i in range(v003.length())]

for executor_id in executorIds:
    my_jsc.killExecutor(executor_id)&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;Finally, you can request fresh executors, which will (hopefully) start with as much free memory as a freshly started notebook:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;my_jsc.requestExecutors(1)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hopefully this is useful.&amp;nbsp; It is unfortunate that it was necessary in the first place, but I found it genuinely hard to get an executor to release memory.&amp;nbsp; (It is hard enough just to inspect the memory used by executors once the 137 errors start.)&amp;nbsp; This approach may be preferable to trial-and-error alternatives like increasing spark.task.maxFailures or bumping node sizes from small to medium to large.&lt;/P&gt;</description>
    <pubDate>Wed, 03 Dec 2025 20:51:49 GMT</pubDate>
    <dc:creator>dbeavon3</dc:creator>
    <dc:date>2025-12-03T20:51:49Z</dc:date>
    <item>
      <title>Any Thoughts about Manually Killing Executors in a Fabric Notebook.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Any-Thoughts-about-Manually-Killing-Executors-in-a-Fabric/m-p/4892083#M13835</link>
      <description>&lt;P&gt;I have a series of cells that are executed in a Fabric notebook.&amp;nbsp; At the very end there are some cells that need to use a lot more memory.&amp;nbsp; I frequently find that my executors are being killed by YARN with exit code 137, like so:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dbeavon3_0-1764785313589.png" style="width: 773px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1314447iCD07E3701F7DDE54/image-dimensions/773x134?v=v2" width="773" height="134" role="button" title="dbeavon3_0-1764785313589.png" alt="dbeavon3_0-1764785313589.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;... upgrading from small to medium nodes didn't quite do the trick either.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One thing that works consistently is to set&amp;nbsp;"&lt;STRONG&gt;spark.task.maxFailures&lt;/STRONG&gt;" to a very elevated number like 10 or 100.&amp;nbsp; That basically allows an entire executor to die and a new one to take its place (once, if that is enough to succeed, or as many times as it takes to reach the maximum).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am grateful to have "spark.task.maxFailures" for troubleshooting.&amp;nbsp; However, I really don't like this type of trial-and-error programming.&amp;nbsp; It feels dirty to me (even for a Python-based or notebook-based solution).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've tried various things to free up memory, like calling "unpersist" on DataFrames, but they don't work consistently.&amp;nbsp; The only thing that consistently works is for the entire executor to get killed and replaced.&amp;nbsp; (Another possible solution would be to split the notebook into two separate notebooks, each with its own distinct Spark session.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Since I already know the stage of the notebook where I need to recover all the executor memory, I'm evaluating the use of&amp;nbsp;&lt;STRONG&gt;sparkContext.killExecutors&lt;/STRONG&gt;.&amp;nbsp; My notebook uses dynamically allocated executors, and I'd just as soon recycle the bad ones myself rather than rely on "spark.task.maxFailures" to do it for me.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Has anyone else tried using the "killExecutors" method to avoid a persistent error in an executor that won't release its memory?&amp;nbsp; I found the idea here:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dbeavon3_1-1764785851068.png" style="width: 738px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1314449i2E69DCF183FFB530/image-dimensions/738x318?v=v2" width="738" height="318" role="button" title="dbeavon3_1-1764785851068.png" alt="dbeavon3_1-1764785851068.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 03 Dec 2025 18:20:53 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Any-Thoughts-about-Manually-Killing-Executors-in-a-Fabric/m-p/4892083#M13835</guid>
      <dc:creator>dbeavon3</dc:creator>
      <dc:date>2025-12-03T18:20:53Z</dc:date>
    </item>
    <item>
      <title>Re: Any Thoughts about Manually Killing Executors in a Fabric Notebook.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Any-Thoughts-about-Manually-Killing-Executors-in-a-Fabric/m-p/4892163#M13838</link>
      <description>&lt;P&gt;The code I posted earlier was something of a dead end, so I'll post the solution I found.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I discovered that you cannot kill executors that were created automatically through dynamic allocation (spark.dynamicAllocation.enabled).&amp;nbsp; So the first step is to configure the notebook to allocate executors manually.&lt;BR /&gt;Set&amp;nbsp;spark.dynamicAllocation.enabled to false and set spark.executor.instances explicitly:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dbeavon3_0-1764794393711.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1314466iA1105FE7850D61DD/image-size/medium?v=v2&amp;amp;px=400" role="button" title="dbeavon3_0-1764794393711.png" alt="dbeavon3_0-1764794393711.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When you reach the point in the code where you want a fresh executor, use the Py4J gateway to pass commands from Python to the underlying JVM SparkContext.&amp;nbsp; Then retrieve the executor IDs like so:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Confirm the configured executor count
spark.conf.get("spark.executor.instances")

# Reach through the Py4J gateway to the JVM SparkContext
my_jsc = spark._jsc.sc()

# Returns a Scala Seq of executor ID strings
v003 = my_jsc.getExecutorIds()&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The IDs are simple strings ('1', '2', '3', or some combination of those).&amp;nbsp; Once you have the list of IDs, you can kill the executors like so:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Convert the Scala Seq into a Python list
executorIds = [v003.apply(i) for i in range(v003.length())]

for executor_id in executorIds:
    my_jsc.killExecutor(executor_id)&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;Finally, you can request fresh executors, which will (hopefully) start with as much free memory as a freshly started notebook:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;my_jsc.requestExecutors(1)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hopefully this is useful.&amp;nbsp; It is unfortunate that it was necessary in the first place, but I found it genuinely hard to get an executor to release memory.&amp;nbsp; (It is hard enough just to inspect the memory used by executors once the 137 errors start.)&amp;nbsp; This approach may be preferable to trial-and-error alternatives like increasing spark.task.maxFailures or bumping node sizes from small to medium to large.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Dec 2025 20:51:49 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Any-Thoughts-about-Manually-Killing-Executors-in-a-Fabric/m-p/4892163#M13838</guid>
      <dc:creator>dbeavon3</dc:creator>
      <dc:date>2025-12-03T20:51:49Z</dc:date>
    </item>
  </channel>
</rss>

