I have data in a Lakehouse, and I have deleted some of it. I am trying to load the data from a Fabric notebook.
When I use spark.sql("SELECT * FROM parquet.`<abfs_path>/Tables/<table_name>`"), I get the old data I deleted from the Lakehouse.
When I use spark.read.load("<abfs_path>/Tables/<table_name>"), I don't get this deleted data.
I have to use the abfs path because I am not setting a default lakehouse and can't set one to solve this.
Why does this old data come up when I use spark.sql, when the paths are exactly the same?
@Zoe_Guest Thanks for being part of the Fabric community and making it grow.
@wardy912 Thanks for your prompt response.
Thanks,
Prashanth Are
MS Fabric community support
I can't set it as the default lakehouse, which is why I want to use the abfs path. How do you do this with the abfs path?
Solved by changing it to delta:
spark.sql("SELECT * FROM delta.`<abfs_path>/Tables/<table_name>`")
The paths are the same, but you're using a different method to query them:

spark.sql("SELECT * FROM parquet.`<abfs_path>/Tables/<table_name>`")
Spark SQL: may be using cached metadata.

spark.read.load("<abfs_path>/Tables/<table_name>")
DataFrame API: reads the current state of the files.

You could add a cell to your notebook that clears the cache if you want to use the Spark SQL code:

spark.catalog.clearCache()
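For example, a cell along these lines (a sketch only; same path placeholder as above):

spark.catalog.clearCache()  # drop cached tables and metadata for this session
df = spark.sql("SELECT * FROM parquet.`<abfs_path>/Tables/<table_name>`")
df.show()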
Please give a thumbs up if this helps. Thanks!
Unfortunately, clearing the cache doesn't work.
However, the following also returns the deleted data, so I think the issue is in specifying parquet:
spark.read.format("parquet").load(_table_abfs)