JosueMolina
Helper III

Reading a OneLake Shortcut - Getting frequent errors

We have a medallion architecture with one Lakehouse per layer: Bronze - Silver - Gold.

 

For the Gold layer, we use a Notebook to move and filter data from our Silver layer. We mostly filter data by Region (think sales for the whole country, where some tables are only needed for one region).

 

Now, some tables we load directly to Bronze and then create a shortcut to them in Silver, so we can reference them in the same Lakehouse and also filter them from the Gold-layer Notebook, since we just set the Silver layer abfs path and read each table one by one.
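A minimal sketch of the cross-Lakehouse read pattern described above, assuming hypothetical workspace, Lakehouse, and table names (the OneLake path format follows the standard abfss layout for Lakehouse tables):

```python
# Sketch of the cross-Lakehouse read pattern described above.
# "MyWorkspace", "Silver", and "sales" are hypothetical placeholders.

def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build the OneLake abfss path for a table inside a Lakehouse item."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )

silver_sales = onelake_table_path("MyWorkspace", "Silver", "sales")
print(silver_sales)

# In a Fabric notebook you would then read the Delta table with Spark, e.g.:
# df = spark.read.format("delta").load(silver_sales)
```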

The error: roughly 7 out of 10 times we try to load a shortcut table from the Silver Lakehouse, Spark tells us that the underlying files must have changed and that we should do a refresh for the table.

 

This happens ALL the time. And if I loop over a list of 10 tables and get this error on the 5th table, I will usually get the same error for the following tables, even if they aren't shortcuts.

Seems like some sort of bug in the shortcut metadata between the two Lakehouses, where the file reference is not being properly passed to the Spark session.

10 REPLIES
annhwallinger
Frequent Visitor

Any help from Microsoft on this one? It is plaguing our development.

Anonymous
Not applicable

Hi @JosueMolina,

Did this issue occur on a specific shortcut and table, or does it appear at random on different tables? Can you please share some more detail about this?

BTW, I would not recommend looping references between shortcuts. AFAIK, each shortcut loads and resolves its data, which may cause issues if the referenced object and its operations have not fully completed.

Regards,

Xiaoxin Sheng

This happens with different tables and does not happen every single time we try to load the Delta Table in our Notebook, but it will happen 7 out of 10 times. 

For clarity on the looping part: we just have a list of table names (all strings) and loop over them, going through a read operation, a filter, and then a save to the default Lakehouse. While I am looking into the option of a parameterized Notebook, my theory is that it will not solve the shortcut issue.
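The loop described above can be sketched like this. To keep it runnable outside a Fabric notebook, the reader, filter, and writer are injected as callables standing in for the `spark.read` / `df.filter` / `df.write` calls (the lambdas in the comments are hypothetical, not the poster's actual code):

```python
# Sketch of the table loop described above: read each Silver table,
# filter by region, write to the default (Gold) Lakehouse.
from typing import Any, Callable

def process_tables(
    tables: list[str],
    read: Callable[[str], Any],           # e.g. lambda t: spark.read.format("delta").load(path(t))
    filter_region: Callable[[Any], Any],  # e.g. lambda df: df.filter(df.Region == "North")
    write: Callable[[Any, str], None],    # e.g. lambda df, t: df.write.mode("overwrite").saveAsTable(t)
) -> list[str]:
    """Run read -> filter -> write for each table name; return the names processed."""
    done = []
    for name in tables:
        df = read(name)
        df = filter_region(df)
        write(df, name)
        done.append(name)
    return done
```

Because the failure reported in this thread surfaces inside `read`, this shape also makes it easy to wrap just the read step in error handling.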

Here is an error I got today:

Py4JJavaError: An error occurred while calling o8318.count. 
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 8 in stage 1544.0 failed 4 times, most recent failure: Lost task 8.3 in stage 1544.0 (TID 15529) (vm-d5775677 executor 2): org.apache.spark.SparkFileNotFoundException: Operation failed: "Not Found", 404, HEAD, http://onelake.dfs.fabric.microsoft.com/<workspace>/<lakehouse>.Lakehouse/Tables/<shortcut_table>/pa...

It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
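One possible mitigation for this intermittent failure, sketched below, is to retry the read and refresh the table's cached metadata between attempts. The `read` and `refresh` callables are placeholders so the pattern runs without a live Spark session; in a notebook, `refresh` would be something like `lambda: spark.sql(f"REFRESH TABLE {name}")` and `read` a `spark.read` call. This is a workaround sketch, not a confirmed fix for the underlying shortcut issue:

```python
# Retry a read, invalidating cached file listings between attempts.
from typing import Callable, TypeVar

T = TypeVar("T")

def read_with_refresh_retry(
    read: Callable[[], T],
    refresh: Callable[[], None],
    attempts: int = 3,
) -> T:
    """Try `read` up to `attempts` times, calling `refresh` after each failure."""
    last_exc: Exception | None = None
    for _ in range(attempts):
        try:
            return read()
        except Exception as exc:  # in practice, narrow this to the Spark 404/FileNotFound error
            last_exc = exc
            refresh()  # e.g. REFRESH TABLE, to drop stale cached file references
    raise last_exc
```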

 

Anonymous
Not applicable

Hi @JosueMolina,

If this issue appears on random tables, I'd suggest you contact the dev team to check the backend processing and logs to confirm the root cause.

Support | Microsoft Power BI
Regards,

Xiaoxin Sheng

Thanks. The issue is difficult to replicate on a call, as it does not happen consistently. It seems to be related to refresh timing between the OneLake shortcut and the Spark session in the notebook. I'll see what they can help with.

FabianSchut
Super User
Super User

This topic seems to describe a similar problem: https://community.fabric.microsoft.com/t5/Data-Engineering/Underlying-files-have-been-updated-Explic.... It seems they resolved it by attaching a default lakehouse to the notebook, which brings in some libraries and configuration such as the Spark session. Can you try setting a default lakehouse?

Yes, the Lakehouse I am trying to read is already the default lakehouse for the Notebook.

Anonymous
Not applicable

Hi @JosueMolina,

You can refer to the following document, which covers data pipeline and activity limits, in case your scenario hits one of these limitations:

Data Factory limitations overview - Microsoft Fabric | Microsoft Learn

Regards,

Xiaoxin Sheng

Thanks, @Anonymous 

But this is happening when reading via Notebooks, not Pipelines. And this is not an error with activity limits; it is a problem with Spark caching and OneLake shortcuts.

We have the same issue.
