Hi,
We are using MS Fabric PySpark notebooks. How can I get the Spark session configuration in the notebook?
Specifically, I need to know if Parquet predicate pushdown is enabled.
Please advise how I can retrieve all the session configurations.
Thanks
Hi @msprog ,
Thank you for reaching out to the Microsoft Fabric Community.
In Microsoft Fabric PySpark notebooks, you can list the current Spark session configuration with spark.sql("SET").collect(), which returns the properties that have been explicitly set in the active session; spark.sql("SET -v") additionally includes documented SQL properties at their default values. (The Scala API also exposes spark.conf.getAll for the same purpose.)
To specifically check whether Parquet predicate pushdown is enabled (an optimization that improves performance by applying filters at the file-scan level, so less data is read from Parquet files), query the configuration key spark.sql.parquet.filterPushdown with spark.conf.get("spark.sql.parquet.filterPushdown").
If this returns true, pushdown is enabled; if it returns false, it is disabled. If you find it disabled but need it for performance reasons, you can update the setting directly within the notebook using spark.conf.set("spark.sql.parquet.filterPushdown", "true").
With pushdown enabled, Parquet reads skip data that cannot match your query filters, improving both performance and resource efficiency.
I hope these suggestions give you a good idea; if you need any further assistance, feel free to reach out.
If this post helps, then please give us Kudos and consider accepting it as a solution to help other members find it more quickly.
Thank you.
@v-tsaipranay thanks for this, very helpful. One observation when I ran the scripts:
When I run spark.conf.get("spark.sql.parquet.filterPushdown") I get the value true.
However, when I run spark.sparkContext.getConf().getAll(), I get around 300 session configurations, but spark.sql.parquet.filterPushdown is not present among those key-value pairs.
I was expecting it to be present. Am I missing something here, please?