smeetsh
Resolver II

Delay between lakehouse ingestion and the data showing up in the SQL endpoint of that lakehouse

Hi All,

 

A known issue is that when I ingest data into a lakehouse table, it can take some time for that data to show up in the SQL endpoint of that lakehouse table (known issue 1092). Until now I have been trying to work around this by adding a 5 or 10 minute delay in front of the step that queries the lakehouse SQL endpoint (we use this a lot to join raw data with our dim tables, do ETL etc. and get the data ready for the business analysts). This works with varying success and causes unnecessary delays: pipelines that should take only a minute now take 10 minutes or more, and sometimes the endpoint still hasn't caught up.

I found the article below and I am wondering if anyone has already tried it:
Known issue - Delayed data availability in SQL analytics endpoint when using a pipeline 

Do I understand correctly that the fix is as simple as adding a script activity with a script like:

SELECT TOP(1) 1
FROM [lakehouse].[dbo].[tablename]

 

Or is this not a fix for the delay between a Fabric lakehouse and its SQL analytics endpoint?

Cheers

Hans

1 ACCEPTED SOLUTION
v-lgarikapat
Community Support

Hi @smeetsh ,
Thank you for bringing this up.

This is a known issue on Microsoft's end, and the product team is actively working on a backend fix. We understand that the delays can be frustrating, especially when they impact overall performance, and we appreciate your patience in the meantime.

If you found this post helpful, please consider giving it Kudos and marking it as the accepted solution to assist other members in finding it more easily.

Thank you.
Best Regards,

LakshmiNarayana.


4 REPLIES
v-lgarikapat
Community Support


Hi @smeetsh ,

As we haven't heard back from you, we are closing this thread. If you are still experiencing the same issue, we kindly request that you create a new thread, and we'll be happy to assist you further.

Thank you for your patience and support.
Best Regards,
Lakshmi Narayana

v-lgarikapat
Community Support

Hi @smeetsh ,
Thank you for reaching out to the Microsoft Community Forum.

Running
SELECT TOP(1) 1 FROM [lakehouse].[dbo].[tablename]
does not fix the delay. While it confirms that the table exists and is accessible, it doesn't guarantee that the latest ingested data is visible yet. This is why your pipeline sometimes still fails or returns incomplete data even after a delay.
Loop with Condition Check
Instead of using a fixed delay, implement a loop in your pipeline that runs a query like:
SELECT COUNT(1)
FROM [lakehouse].[dbo].[tablename]
WHERE [LoadTimestamp] >= @ExpectedTimestamp
This checks if the new data is visible and only then proceeds. This avoids unnecessary delay and ensures consistency.
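In a pipeline this would typically be an Until activity wrapping a Script activity and a short Wait activity. As a rough sketch of the same polling logic (the connection string, authentication method, table name and [LoadTimestamp] column below are all placeholder assumptions, not something your lakehouse necessarily has), it could look like this in Python:

import time
import pyodbc

# Placeholder connection to the lakehouse's SQL analytics endpoint.
conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-sql-endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=<your_lakehouse>;"
    "Authentication=ActiveDirectoryInteractive;"
)

def wait_for_rows(expected_timestamp, timeout_s=600, poll_s=15):
    """Poll the SQL analytics endpoint until the newly ingested rows are visible."""
    query = ("SELECT COUNT(1) FROM [dbo].[tablename] "
             "WHERE [LoadTimestamp] >= ?")
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        with pyodbc.connect(conn_str) as conn:
            count = conn.cursor().execute(query, expected_timestamp).fetchval()
        if count and count > 0:
            return True  # data is visible, safe to run the downstream ETL
        time.sleep(poll_s)  # short poll instead of a fixed 10-minute delay
    raise TimeoutError("SQL analytics endpoint did not catch up in time")

The point is that the wait ends as soon as the endpoint has caught up, instead of always paying a fixed 5 or 10 minute delay.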
Use Spark or Notebooks for Direct Access
If you're working with notebooks or Spark pipelines, read the table directly using Delta format:
df = spark.read.format("delta").load("Tables/your_table_path")
This gives you immediate access to the latest data, bypassing the SQL delay entirely.
Use REFRESH TABLE (Optional)
In Spark notebooks, running:
REFRESH TABLE your_table_name
can sometimes force a metadata sync, but its effectiveness may vary.
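From a PySpark cell this would look roughly like the following (the table name is a placeholder):

# Refreshes Spark's own metadata/cache for the table; it does not guarantee
# that the SQL analytics endpoint has picked up the latest commits.
spark.sql("REFRESH TABLE your_table_name")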


If this solution helped resolve your query, kindly mark it as Solution Accepted and consider giving a Kudos so it can assist others in the community facing similar issues.

Let me know if you need further assistance!

Best Regards,
Lakshmi Narayana

Edit: I don't think the if condition will work, since I would have to use the SQL endpoint to query the lakehouse, which is exactly where the problem lies. My raw data comes from an API and does not include the date/time that I made the API call. I can read a commit from a parquet file with a notebook, but there is no similar functionality to read it from the SQL endpoint, so there is nothing to compare.
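For reference, reading the last commit on the lakehouse side from a notebook looks roughly like this (the table path is a placeholder):

from delta.tables import DeltaTable

# Latest Delta commit for the table: version number and commit timestamp.
dt = DeltaTable.forPath(spark, "Tables/your_table_path")
last_commit = dt.history(1).select("version", "timestamp").first()
print(last_commit["version"], last_commit["timestamp"])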

 

Thank you for this. I want to try the if condition, but I am unclear on how it is supposed to work.

I don't have a [LoadTimestamp] column, and @ExpectedTimestamp looks like a variable to me.

 

Could someone please explain in more detail?

 

I am still puzzled, though, why Microsoft published the script as a workaround to force a refresh. Should I maybe do just SELECT 1 and not SELECT TOP(1) 1?

 

I can't use a notebook, since the data needs to be written to a warehouse table for our analysts to do their work. A notebook can read from and write to a lakehouse, but can only read from a warehouse.

The basic architecture of lakehouse to warehouse is something I cannot change.

 
