Py4JJavaError: An error occurred while calling o4364.load. : org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find the data source: cloudFiles. Please find packages at `https://spark.apache.org/third-party-projects.html`.
Hi @sunilmaghanuru ,
Please follow the steps below:
1. Rule out a mismatch between the Python versions in the worker and driver environments. Both environments should use the same minor Python version. Check that the environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are set correctly.
2. Java heap space limitations can also cause load failures. If the driver is running out of memory, consider increasing the driver memory configuration (`spark.driver.memory`) for your Spark session.
3. Ensure that your Spark runtime supports `cloudFiles`. The DATA_SOURCE_NOT_FOUND error means Spark could not find a data source registered under that name. `cloudFiles` is the Databricks Auto Loader source and is not part of open-source Apache Spark. If you're using a Spark build that doesn't include `cloudFiles` by default, you might need to include the appropriate package when starting your Spark session.
4. Some data sources require specific configuration settings. Review the documentation to ensure you have configured everything correctly for `cloudFiles`.
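The checks in steps 1 and 3 can be sketched in Python as below. This is only a rough illustration, not an official diagnostic: the helper names (`python_env_report`, `same_minor_version`, `is_missing_data_source`, `probe_source`) are made up for this post, and `probe_source` assumes that any failure other than DATA_SOURCE_NOT_FOUND means the source name was at least resolved.

```python
import os
import sys


def python_env_report(environ=None):
    """Collect the interpreter settings mentioned in step 1."""
    environ = os.environ if environ is None else environ
    return {
        "driver_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        "PYSPARK_PYTHON": environ.get("PYSPARK_PYTHON"),
        "PYSPARK_DRIVER_PYTHON": environ.get("PYSPARK_DRIVER_PYTHON"),
    }


def same_minor_version(driver, worker):
    """True when two 'X.Y[.Z]' version strings share major and minor parts."""
    return driver.split(".")[:2] == worker.split(".")[:2]


def is_missing_data_source(error_text, source="cloudFiles"):
    """True when the error text is a DATA_SOURCE_NOT_FOUND error for `source`."""
    return "DATA_SOURCE_NOT_FOUND" in error_text and source in error_text


def probe_source(spark, fmt, path):
    """Try to open a streaming read with `fmt`; False if Spark cannot find it.

    `spark` is an existing SparkSession. Assumption: errors other than
    DATA_SOURCE_NOT_FOUND (for example, missing options) mean the source
    name itself was found.
    """
    try:
        spark.readStream.format(fmt).load(path)
        return True
    except Exception as exc:
        return not is_missing_data_source(str(exc), fmt)


if __name__ == "__main__":
    print(python_env_report())
```

In a notebook, calling `probe_source(spark, "cloudFiles", "<your path>")` and getting `False` would suggest the runtime has no `cloudFiles` source registered, which points to step 3 rather than steps 1 or 2.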
Best Regards,
Neeko Tang
If this post helps, please consider accepting it as the solution to help other members find it more quickly.