JosueMolina
Helper III

Reading a OneLake Shortcut - Getting frequent errors

We have a medallion architecture with one Lakehouse per layer: Bronze - Silver - Gold.

 

For the Gold layer, we use a Notebook to move and filter data from our Silver layer. We mostly filter data by Region (think sales for the whole country, where some tables are only needed for one region).

 

Now, some tables we load directly to Bronze and then create a shortcut to them in Silver, so we can reference them in the same Lakehouse and also filter them from the Gold-layer Notebook, since we just set the Silver layer abfs path and read each table one by one.
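A minimal sketch of the cross-Lakehouse read pattern described above, assuming hypothetical workspace, Lakehouse, and table names (the OneLake path format follows the standard abfss layout for Lakehouse tables):

```python
# Sketch of the cross-Lakehouse read pattern described above.
# "MyWorkspace", "Silver", and "sales" are hypothetical placeholders.

def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build the OneLake abfss path for a table inside a Lakehouse item."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )

silver_sales = onelake_table_path("MyWorkspace", "Silver", "sales")
print(silver_sales)

# In a Fabric notebook you would then read the Delta table with Spark, e.g.:
# df = spark.read.format("delta").load(silver_sales)
```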

The error: roughly 7 out of 10 times we try to load a shortcut table from the Silver Lakehouse, Spark tells us that the underlying files must have changed and that we should do a refresh for the table.

 

This happens ALL the time. And if I loop over a list of 10 tables and get this error on the 5th table, I will usually get the same error for the following tables, even if they aren't shortcuts.

Seems like some sort of bug in the shortcut metadata between the two Lakehouses, where the file reference is not being properly passed to the Spark session.

10 REPLIES
annhwallinger
Frequent Visitor

Any help from Microsoft on this one? It is plaguing our development.

Anonymous
Not applicable

Hi @JosueMolina,

Did this issue occur on a specific shortcut and table, or does it appear at random on different tables? Can you please share some more detail about this?

BTW, I would not recommend looping references between shortcuts. AFAIK, each shortcut loads and resolves its data, which may cause issues if the referenced object and its operations have not fully completed.

Regards,

Xiaoxin Sheng

This happens with different tables and does not happen every single time we try to load the Delta Table in our Notebook, but it will happen 7 out of 10 times. 

For clarity on the looping part: we just have a list of table names (all strings) and loop over them, going through a read operation, a filter, and then a save to the default Lakehouse. While I am looking into the option of a parameterized Notebook, my theory is that it will not solve the shortcut issue.
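The loop described above can be sketched like this. To keep it runnable outside a Fabric notebook, the reader, filter, and writer are injected as callables standing in for the `spark.read` / `df.filter` / `df.write` calls (the lambdas in the comments are hypothetical, not the poster's actual code):

```python
# Sketch of the table loop described above: read each Silver table,
# filter by region, write to the default (Gold) Lakehouse.
from typing import Any, Callable

def process_tables(
    tables: list[str],
    read: Callable[[str], Any],           # e.g. lambda t: spark.read.format("delta").load(path(t))
    filter_region: Callable[[Any], Any],  # e.g. lambda df: df.filter(df.Region == "North")
    write: Callable[[Any, str], None],    # e.g. lambda df, t: df.write.mode("overwrite").saveAsTable(t)
) -> list[str]:
    """Run read -> filter -> write for each table name; return the names processed."""
    done = []
    for name in tables:
        df = read(name)
        df = filter_region(df)
        write(df, name)
        done.append(name)
    return done
```

Because the failure reported in this thread surfaces inside `read`, this shape also makes it easy to wrap just the read step in error handling.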

Here is an error I got today:

Py4JJavaError: An error occurred while calling o8318.count. 
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 8 in stage 1544.0 failed 4 times, most recent failure: Lost task 8.3 in stage 1544.0 (TID 15529) (vm-d5775677 executor 2): org.apache.spark.SparkFileNotFoundException: Operation failed: "Not Found", 404, HEAD, http://onelake.dfs.fabric.microsoft.com/<workspace>/<lakehouse>.Lakehouse/Tables/<shortcut_table>/pa...

It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
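One possible mitigation for this intermittent failure, sketched below, is to retry the read and refresh the table's cached metadata between attempts. The `read` and `refresh` callables are placeholders so the pattern runs without a live Spark session; in a notebook, `refresh` would be something like `lambda: spark.sql(f"REFRESH TABLE {name}")` and `read` a `spark.read` call. This is a workaround sketch, not a confirmed fix for the underlying shortcut issue:

```python
# Retry a read, invalidating cached file listings between attempts.
from typing import Callable, TypeVar

T = TypeVar("T")

def read_with_refresh_retry(
    read: Callable[[], T],
    refresh: Callable[[], None],
    attempts: int = 3,
) -> T:
    """Try `read` up to `attempts` times, calling `refresh` after each failure."""
    last_exc: Exception | None = None
    for _ in range(attempts):
        try:
            return read()
        except Exception as exc:  # in practice, narrow this to the Spark 404/FileNotFound error
            last_exc = exc
            refresh()  # e.g. REFRESH TABLE, to drop stale cached file references
    raise last_exc
```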

 

Anonymous
Not applicable

Hi @JosueMolina,

If this issue appears on random tables, I'd suggest you contact the dev team to check the backend processing and logs to confirm the root cause.

Support | Microsoft Power BI
Regards,

Xiaoxin Sheng

Thanks. The issue is difficult to replicate on a call, as it does not happen consistently. It seems to be related to refresh timing between the OneLake shortcut and the Spark session in the notebook. I'll see what they can help with.

FabianSchut
Super User
Super User

This topic seems to describe a similar problem: https://community.fabric.microsoft.com/t5/Data-Engineering/Underlying-files-have-been-updated-Explic.... It seems they resolved it by attaching a default lakehouse to the notebook, which brings in some libraries and configuration such as the Spark session. Can you try setting a default lakehouse?

Yes, the Lakehouse I am trying to read is already the default lakehouse for the Notebook.

Anonymous
Not applicable

Hi @JosueMolina,

You can refer to the following document, which covers data pipeline and activity limits, in case your scenario hits one of these limitations:

Data Factory limitations overview - Microsoft Fabric | Microsoft Learn

Regards,

Xiaoxin Sheng

Thanks, @Anonymous 

But this is happening when reading via Notebooks, not Pipelines. And this is not an error with activity limits; it is a problem with Spark caching and OneLake shortcuts.

We have the same issue.
