```
Py4JJavaError: An error occurred while calling o32067.csv. : org.apache.spark.SparkException: Job aborted due to stage failure: Serialized task 182:0 was 197125878 bytes, which exceeds max allowed: spark.rpc.message.maxSize (134217728 bytes). Consider increasing spark.rpc.message.maxSize or using broadcast variables for large values.
```
What settings can I change in the Python notebook to increase this size limit?
Thanks.
Hi @tan_thiamhuat ,
This error usually comes up in Spark when the data or objects you’re sending between nodes are too large for the default configuration. The key setting here is spark.rpc.message.maxSize, and you can bump it up directly in your notebook.
In your Python notebook, you can increase this limit by adding a cell at the top with the following:
```
%%configure -f
{
    "conf": {
        "spark.rpc.message.maxSize": "512"
    }
}
```
You can adjust the value (it's in MB) to something higher than 512 if needed, depending on your data size; note that Spark caps this setting at 2047 MB.
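Once the session has restarted with the new configuration, you can sanity-check that it took effect (this assumes the setting was applied to the session conf; `spark.conf.get` raises an error if the key was never set):

```python
# Should print "512" once the %%configure cell has taken effect
print(spark.conf.get("spark.rpc.message.maxSize"))
```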
If you're still hitting limits after increasing this, it's often better to refactor your code so huge objects aren't shipped between nodes at all, for example by using broadcast variables (sketched below) or by splitting up the data.
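If the oversized task comes from a big driver-side object (a lookup table, reference data, a model) being captured in your task closures, a broadcast variable is usually the cleaner fix. A minimal sketch, with hypothetical names and sizes:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# A large driver-side object that would otherwise be serialized into every task
lookup = {i: f"label_{i}" for i in range(500_000)}

# Ship it to the executors once; tasks read it via .value
bc_lookup = spark.sparkContext.broadcast(lookup)

df = spark.range(0, 1_000_000)
mapped = df.rdd.map(lambda row: (row.id, bc_lookup.value.get(row.id % 500_000)))
print(mapped.take(5))
```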
Hope this helps! Let us know if you run into any more issues.
Hi @tan_thiamhuat,
As we haven’t heard back from you, we would like to follow up to see if the solution provided by the super user resolved your issue. Please let us know if you need any further assistance.
If the super user's response resolved your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.
Regards,
Vinay Pabbu
Hello @tan_thiamhuat
This means a Spark job tried to send a message (such as a serialized task or data) that is larger than the configured maximum (`spark.rpc.message.maxSize`, default 128 MiB).
```
%%configure -f
{
    "conf": {
        "spark.rpc.message.maxSize": "512"
    }
}
```
You can also try repartitioning the DataFrame so each task carries a smaller slice of the data:

```python
df = df.repartition(100)  # increase the partition count as needed
```
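For context, here is a minimal end-to-end sketch of that approach, assuming the DataFrame comes from a large in-driver collection and is written out as CSV (the sample data and the `Files/output_csv` path are hypothetical, and whether this resolves the error depends on where the oversized payload originates):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical large driver-side dataset; collections like this are a common
# source of oversized serialized tasks when spread over too few partitions
rows = [(i, f"row_{i}") for i in range(1_000_000)]
df = spark.createDataFrame(rows, schema="id long, value string")

# More partitions -> the same data is split into more, smaller tasks
df = df.repartition(100)

# Hypothetical Lakehouse path; adjust to your workspace
df.write.mode("overwrite").csv("Files/output_csv")
```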