Hi everyone,
I am experiencing a persistent latency issue in our Microsoft Fabric Eventhouse. A table that was previously performing perfectly has suddenly "downgraded" its ingestion path from Streaming to Batching and refuses to recover.
The Situation:
The Problem: In our PROD Eventhouse, ingestion latency is stuck at 12–15 seconds. The table is generating 45+ shards (extents) every 10 minutes, confirming it is in Batching Mode.
The Discrepancy: Our DEV Eventhouse (identical schema and higher data volume) is still Streaming perfectly with ~5-second latency and 0 shards created per 10 minutes.
The History: PROD was working fine (5s latency) and then switched to this slow Batching state on its own without any schema or policy changes.
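One way to check the shard-creation rate described above is to count new extents per 10-minute window. This is a minimal sketch, assuming a placeholder table name `LandingTable` (not a name from the original post); running it in both DEV and PROD would allow a side-by-side comparison.

```kusto
// LandingTable is a hypothetical name; substitute the affected table.
// Counts extents by creation time in 10-minute windows over the last hour.
.show table LandingTable extents
| where MinCreatedOn > ago(1h)
| summarize NewExtents = count(), Rows = sum(RowCount) by bin(MinCreatedOn, 10m)
| order by MinCreatedOn asc
```

Background merges can reduce the extent count for older windows, so the most recent windows are likely the most telling.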
What we have verified:
Streaming Policy: Both DEV and PROD have streamingingestion enabled.
Batching Policy: We tested various ingestionbatching settings in a separate environment. We confirmed that the 12s latency in PROD is a result of the Batching Path overhead.
Unable to restore streaming: Running .alter table ... streamingingestion enable in PROD does not trigger a return to the Streaming path.
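The policy checks listed above could be reproduced with the standard policy `.show` commands. This is a sketch only; `LandingTable` is a placeholder, and each command is run on its own in both environments so the outputs can be compared field by field.

```kusto
// LandingTable is a hypothetical table name.
// Streaming ingestion policy on the table.
.show table LandingTable policy streamingingestion

// Table-level batching policy (null if none is set on the table).
.show table LandingTable policy ingestionbatching

// Update policies across all tables, to find any whose Source is the landing table.
.show table * policy update
```

Comparing the full text of the update policy queries between DEV and PROD, rather than only checking that a policy exists, may surface small differences.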
Our Questions:
Why would an Eventhouse suddenly "blacklist" a table from the Streaming path and move it to permanent Batching if the Capacity (CU) is healthy (~50%)?
How do we reverse this process? Once a table is stuck in this "Permanent Batching" state, what is the specific command or workflow to force the Eventhouse to re-evaluate it for the Streaming (Fast) Lane?
We need to restore the 5-second latency to meet our real-time requirements.
Any insights into the internal health-check logic of the Eventhouse would be incredibly helpful.
Thank you !
Hi @lavginqo ,
Based on your findings, the root cause is the update policy on the PROD table using the ingestion_time() function, which is not supported with streaming ingestion. Update policy queries execute after ingestion and are subject to streaming ingestion restrictions. Ingestion-context functions such as ingestion_time() are evaluated only during ingestion, so they are not available in update policies. When such an incompatibility is detected, the engine automatically routes ingestion through the batching path to ensure correctness, and this persists regardless of re-enabling the streaming ingestion policy. This also explains why your DEV environment continues to work as expected, since it does not have the same update policy defined.
To restore low-latency streaming ingestion, the recommended approach is to modify or remove the update policy so that it no longer uses ingestion_time(). As a best practice, separate concerns: keep the landing table optimized for streaming, without update policies or unsupported functions, and apply any transformation logic, including time-based calculations, in downstream tables or processes. For validation, you can temporarily disable the update policy in PROD and re-enable streaming ingestion to confirm that latency returns to expected levels. Once confirmed, you can redesign the update logic in a way that is compatible with streaming ingestion.
Hope this helps.
Thank you.
Hi @lavginqo
For me, the strategy to work against PROD and see the impact directly without risk would be to follow these steps: basically clone the table and apply the changes on the test version. I understand that with cloning, all configuration is inherited, so we start from exactly the same state.
Okay, here is the full plan using your actual table:
.create table TableArabalca_test based-on TableArabalca
.alter table TableArabalca_test policy update
@'[{
"IsEnabled": true,
"Source": "TableArabalca_test",
"Query": "this is the arabalca query",
"IsTransactional": true
}]'
At this point, the test table is as broken as TableArabalca. That’s exactly what you want: start from the same state.
.alter table TableArabalca_test policy streamingingestion enable
.alter table TableArabalca_test policy update
@'[{
"IsEnabled": true,
"Source": "TableArabalca",
"Query": "TableArabalca",
"IsTransactional": false
}]'
TableArabalca_test
| summarize latency_seconds = datetime_diff('second', now(), max(ingestion_time()))
You should see latency around 12–15 seconds. If so, the clone is correct and you can continue.
.alter table TableArabalca_test policy update
@'[{
"IsEnabled": true,
"Source": "TableArabalca_test",
"Query": "this is the arabalca query",
"IsTransactional": false
}]'
Only change IsTransactional from true to false. The query remains unchanged.
.delete table TableArabalca_test policy streamingingestion
Wait ~2 minutes, then:
.alter table TableArabalca_test policy streamingingestion enable
TableArabalca_test
| summarize latency_seconds = datetime_diff('second', now(), max(ingestion_time()))
Once validated on TableArabalca_test, apply the same commands on TableArabalca:
.alter table TableArabalca policy update
@'[{
"IsEnabled": true,
"Source": "TableArabalca",
"Query": "this is the arabalca query",
"IsTransactional": false
}]'
.delete table TableArabalca policy streamingingestion
Wait ~2 minutes:
.alter table TableArabalca policy streamingingestion enable
If my comment helped solve your question, it would be great if you could mark it as the accepted solution. It helps others with the same issue and it also motivates me to keep contributing.
Thanks a lot. I really appreciate it.
Hi @arabalca,
We found a difference in the update policy in Prod.
There was an "order by ingestion_time() desc" clause in the update policy, which was apparently adding to the delay.
After removing that piece of code, the latency is gone.
Your response was very helpful in finding the problem using the ingestion failure query.
Thanks
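To illustrate the kind of change described above, here is a sketch of an update policy function before and after removing the sort. The original policy query was not posted, so the function name, table name, and columns are all hypothetical.

```kusto
// Hypothetical names throughout: TransformLanding, LandingTable, Timestamp, DeviceId, Value.
// Before: the policy query sorted its output by ingestion time.
// .create-or-alter function TransformLanding() {
//     LandingTable
//     | project Timestamp, DeviceId, Value
//     | order by ingestion_time() desc
// }

// After: the same projection without the order by clause.
.create-or-alter function TransformLanding() {
    LandingTable
    | project Timestamp, DeviceId, Value
}
```

If consumers need a particular row order, it can usually be applied at query time on the target table instead of inside the policy.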
Hi @lavginqo ,
Thanks for letting us know that the issue is resolved, glad to hear the latency has been addressed. Please feel free to reach out if you need any further assistance.
Hi @lavginqo3 ,
I am happy for you. Thanks a lot for liking my post, I really appreciate it.
If my answers helped you solve the issue, it would be great if you could like the different replies and mark the solution as accepted so other users can learn from it.
Thanks again!
Hi @lavginqo2 ,
Just following up to see whether the responses provided by community members were helpful in addressing the issue. If the issue still persists, feel free to reach out for any further clarification or assistance.
Best regards,
Chaithra E.
Hi @lavginqo ,
Thanks a lot for liking my post, I really appreciate it.
If my answers helped you solve the issue, it would be great if you could like the different replies and mark the solution as accepted so other users can learn from it.
Thanks again!
Hi @arabalca
Thank you for the detailed steps !
Unfortunately, at this point we can't do any testing in Prod directly as the system is operational.
But I have noted your suggestions and will check if and when we can try them. I will post the results as soon as I can.
Thank you so much !
Hi @v-echaithra
Thanks for your inputs !
Please note that the DEV environment has the same update policy as well.
Thanks
Hi @lavginqo ,
Before trying to recover it, try checking this to understand what’s actually happening:
The result of streamingingestion failures is key: if you see recurring errors (schema mismatch, resource pressure, etc.), it means the engine is having issues and may be falling back to batching because of that.
You can try recreating the policy and it might fix it, but it’s important to first check if there are failures to confirm that this is the root cause:
Also, check if there is a batching policy at table level:
Take a look and let me know what results you get.
If you are experiencing many failures, it’s possible that it switched to batching because of that.
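The checks described in this reply could be sketched with the following commands. This is an illustration rather than the exact queries from the reply; `LandingTable` is a placeholder, the 7-day window is an arbitrary choice, and each command is run separately.

```kusto
// LandingTable is a hypothetical table name.
// Streaming ingestion failures recorded for the table.
.show streamingingestion failures
| where Table == "LandingTable"

// Failures on the queued (batching) path, grouped by kind and by whether
// they originated from an update policy.
.show ingestion failures
| where Table == "LandingTable" and FailedOn > ago(7d)
| summarize Failures = count(), FirstFailure = min(FailedOn), LastFailure = max(FailedOn)
    by FailureKind, OriginatesFromUpdatePolicy

// Table-level batching policy, if any.
.show table LandingTable policy ingestionbatching
```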
If this helps solve your problem, I’d appreciate it if you could mark it as the accepted solution and give it a like 👍
Thanks!
Hi @arabalca
Thank you for these very useful suggestions!
Please find below my findings after running the queries in Prod:
The latency started on 2026-02-22 for an unknown reason. Since we have a database retention policy of 30 days, we are unable to check the FailureKind for that day.
The FistFailureDate=2026-03-15 indicates that the UpdatePolicy issue might be a consequence of Streaming Ingestion having been broken since 2026-02-22. We are not seeing any latency on downstream tables derived from the Update Policy on the landing table.
I also checked these statistics in the DEV environment, and it seems healthy and is ingesting successfully (except for very few transient failures: 8 out of 20k). It doesn't have any ingestion policy either.
Since we would not like to make changes in PROD until we are sure that the cause is the Batching policy, we have set up a parallel Eventhouse in DEV. The idea is to enable the batching policy on the landing table to force batching and monitor the impact on latency. If the latency goes up, we will delete the policy and monitor again.
Do you think this is the right approach?
I was also wondering: if the ingestion is Batching and we delete the Batching Policy on the table, will it start inheriting the DB batching policy?
Please let me know your thoughts and suggest next steps.
I really appreciate your help !
Thank you !
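On the inheritance question above: per the documented policy hierarchy, a table without its own ingestionbatching policy falls back to the database-level policy, and to the service defaults if none is set there either. A sketch for checking this in the parallel DEV Eventhouse, with `LandingTable` and `MyDatabase` as placeholder names:

```kusto
// LandingTable and MyDatabase are hypothetical names, not taken from the thread.
// Remove the table-level batching policy.
.delete table LandingTable policy ingestionbatching

// Inspect the database-level policy the table would now fall back to.
.show database MyDatabase policy ingestionbatching
```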
Hi @arabalca
I have some issues with my account (I am unable to use the lavginqo account), so I cannot mark your solution as accepted. But it was very helpful.
Thank you for all your help !