I need to read a DW (data warehouse) table in a notebook.
When I try to read it through a shortcut (to the data warehouse table in a lakehouse), I run into the following:
#1 - did not work
df_one = spark.read.format('delta').load("abfss://abcd@onelake.dfs.fabric.microsoft.com/StagingLakehouse.Lakehouse/Tables/dw_tbl")
df_two = df_one.collect()
print(df_two)
#2 - did not work
df_one = spark.read.format('delta').load("abfss://abcd@onelake.dfs.fabric.microsoft.com/StagingLakehouse.Lakehouse/Tables/dw_tbl")
df_one.createOrReplaceTempView('x1')
df_two = spark.table('x1')
print(df_two.collect())
#3 - did not work
from delta.tables import DeltaTable
etl_df = DeltaTable.forName(spark, "dw_tbl").toDF()
print(etl_df.collect())
#4 - did not work
df_one = spark.read.format('delta').load("abfss://abcd@onelake.dfs.fabric.microsoft.com/StagingLakehouse.Lakehouse/Tables/dw_tbl")
df_two = df_one.toPandas()
I consistently get the following error:
It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
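For reference, the remedy the error message points to can be expressed as below (a minimal sketch; the table name dw_tbl and the abfss path are the ones from the snippets above, and the refresh only applies if the shortcut is registered as a table in the lakehouse attached to the notebook):
# Invalidate Spark's cached metadata for the shortcut table, as the error suggests,
# then recreate the DataFrame. Assumes the shortcut appears as table 'dw_tbl'
# in the lakehouse attached to this notebook.
spark.sql("REFRESH TABLE dw_tbl")
# or, equivalently, via the catalog API:
spark.catalog.refreshTable("dw_tbl")
# re-read after refreshing
df_refreshed = spark.read.format('delta').load("abfss://abcd@onelake.dfs.fabric.microsoft.com/StagingLakehouse.Lakehouse/Tables/dw_tbl")
print(df_refreshed.count())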
I can also read the table directly from the DW instead of through the shortcut, but I am not sure whether I will still run into this error:
spark.read.format('delta').load('abfss://grpId@onelake.dfs.fabric.microsoft.com/dwID/Tables/dbo/info_etl')
Which method would result in error-free reading and not cause errors in succeeding code? (I am not trying to write back to the DW table from the notebook; the DW table serves more as a lookup/source table for lakehouse table manipulation.)
Thank you in advance.
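For context, the lookup pattern described in the question would look roughly like the sketch below; fact_tbl and the join column id are hypothetical placeholders, and the DW path is the direct one shown above.
# Read the DW table directly from OneLake and use it purely as a lookup source
# (no writes back to the DW). 'fact_tbl' and the join column 'id' are hypothetical.
lookup_df = spark.read.format('delta').load('abfss://grpId@onelake.dfs.fabric.microsoft.com/dwID/Tables/dbo/info_etl')
fact_df = spark.read.table('fact_tbl')  # assumes a default lakehouse is attached
enriched_df = fact_df.join(lookup_df, on='id', how='left')
print(enriched_df.count())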
Hi @smpa01,
I tested this on my lakehouse by creating a shortcut from a data warehouse and running your code against the shortcut path. These functions work well.
Have you set a default Lakehouse for the notebook? The notebook code uses some functions from libraries that are otherwise not initialized, and they will return this error. (Setting a default Lakehouse attaches some libraries and configuration, such as the Spark session.)
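If no default Lakehouse is attached yet, you can pin one from the notebook's Lakehouse explorer, or set it for the Spark session in the first cell with the %%configure magic. A minimal sketch, where the name and the GUID placeholders are values you would replace with your own:
%%configure
{
    "defaultLakehouse": {
        "name": "StagingLakehouse",
        "id": "<lakehouse-guid>",
        "workspaceId": "<workspace-guid>"
    }
}
With a default Lakehouse attached, name-based access such as DeltaTable.forName(spark, "dw_tbl") should be able to resolve the shortcut table without the full abfss path.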
BTW, it seems like you are working with source data from a staging table, right? AFAIK, that is a temporary table used in the Extract, Transform, and Load (ETL) process within a data warehouse.
Regards,
Xiaoxin Sheng