delete_duplicates_sql = f"""
MERGE INTO delta.`{target_table_path}` AS target
USING (
SELECT * FROM RankedRowsToDelete
) AS source
ON source.{target_id} = target.{target_id} AND {watermark_join_on_expression} AND COALESCE(CAST(source.{layer}_pipeline_insert_date AS TIMESTAMP), '1970-01-01 00:00:00') = COALESCE(CAST(target.{layer}_pipeline_insert_date AS TIMESTAMP), '1970-01-01 00:00:00') AND (target.year = year AND target.month = month)
WHEN MATCHED THEN DELETE
"""
After about 20 minutes I get "Container exited with a non-zero exit code 137". Exit code 137 is 128 + 9 (SIGKILL), which typically means the container was killed for exceeding its memory limit.
Py4JJavaError: An error occurred while calling o358.sql.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 44.0 failed 4 times, most recent failure: Lost task 5.3 in stage 44.0 (TID 5891) (vm executor ExecutorLostFailure (executor 7 exited caused by one of the running tasks) Reason: Container from a bad node: container on host: vm-. Exit status: 137. Diagnostics: [2024-12-29 23:55:12.171]Container killed on request. Exit code is 137
[2024-12-29 23:55:12.203]Container exited with a non-zero exit code 137.
[2024-12-29 23:55:12.212]Killed by external signal
I've tried modifying the workspace environment, going from 4 small executor nodes to 10 medium executor nodes, but that does not solve the issue either. Does anyone have any recommendations?
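For reference, the standard Spark SQL settings that influence per-task memory pressure can at least be changed from inside the session; a minimal sketch with illustrative values follows (executor memory and memory overhead themselves have to be set in the environment or pool configuration before the session starts):

# Minimal sketch: runtime-settable Spark SQL options that reduce per-task memory pressure.
# The values are illustrative assumptions, not tuned recommendations.
spark.conf.set("spark.sql.shuffle.partitions", "400")         # more, smaller shuffle partitions
spark.conf.set("spark.sql.files.maxPartitionBytes", "64MB")   # smaller input splits when scanning the Delta table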
Were you able to get this issue fixed? I am experiencing the exact same issue: a lot of executors failing with error code 137. I am migrating jobs that currently run fine (daily, never once crashing) from Azure Synapse into Fabric. I am using identical pool sizes, but even so, the Fabric jobs are crashing left and right.
Even when doubling the Spark pool size (going from 3x small nodes to 3x medium nodes) I see similar executor failures. Sometimes the jobs manage to finish; sometimes they pull the Livy session down with them and the entire application fails.
Monitoring the Spark application's memory usage during execution, the executors are only saturated to around 50% memory usage when they die. They do, however, almost always die when fully utilized on CPU.
I have tried everything: disabling persisting of dataframes, increasing overhead memory on the executors, and so on. But no change; Fabric just can't keep my simple jobs alive. And they are simple: reading from Delta, saving to Delta, working with 100 MB to 4 GB Delta tables. This could run on a potato, but apparently not in Fabric.
Note: I am not using the Native Execution Engine, because that brings its own set of issues to the party, with Gluten exploding in my face at every turn. So this should be as close to 1:1 with Azure Synapse as it gets, I would think.
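To give a sense of the workload, the failing jobs are roughly the shape below; the table paths, columns, and partition count are made up for illustration, not the actual pipelines:

# Illustrative shape of the jobs only; paths, columns, and partition count are hypothetical.
from pyspark.sql import functions as F

# Read a 100 MB - 4 GB Delta table
df = spark.read.format("delta").load("Tables/source_table")

# A handful of simple transformations
result = (
    df.filter(F.col("is_active"))
      .withColumn("load_date", F.current_date())
)

# Write back to Delta, repartitioning so no single task handles too much data
(result
    .repartition(48)
    .write.format("delta")
    .mode("overwrite")
    .save("Tables/target_table"))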