Hello!
I am testing Fabric real-time analytics with KQL databases and event streaming from Event Hub. Using Fabric Eventstream with direct ingestion into a KQL database, I get the latency I expect (a couple of seconds) and it seems to work fine.
However, when I add event processing before ingestion, with a simple filter operation and a column mapping operation, the ingestion latency into the KQL database goes up to about 30 seconds. In my test I have about 20 small events being sent every second. The watermark delay graph in the Eventstream still shows 3 seconds, but as I said, the actual latency is higher. The inserts into the KQL database seem to get batched together in larger groups and inserted with a much higher delay than with direct ingestion. Could it be using Kusto ingestion batching instead of streaming ingestion?
Is this something that is expected, and is there a way to alter settings to reduce the delay?
Best regards,
Emil
Hi @dataStreamer ,
Based on your description, the increased latency you see after adding event processing is expected behavior, and your suspicion about the cause is most likely correct.
With direct ingestion, Eventstream can push events into the KQL database with low, streaming-style latency, which matches the couple of seconds you measured. Once you add processing operations such as a filter and column mappings, the processed output typically goes through Kusto's queued (batched) ingestion path rather than streaming ingestion: events are aggregated into batches before being committed to the table, which improves throughput and reduces cost at the expense of latency. That would also explain why the watermark delay graph still shows about 3 seconds; the watermark tracks the event processing stage, while the extra delay is added afterwards, while the batched ingestion holds data before committing it.
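If you want to verify whether streaming ingestion is available on the target table, the standard Kusto management commands below are one way to check. This is a sketch: MyEventsTable is a placeholder, and it assumes the usual Azure Data Explorer management commands are available in your Fabric KQL database. Note that even with streaming ingestion enabled, the processed Eventstream path may still use queued ingestion, in which case the batching policy discussed below is the more relevant setting.
// Show the streaming ingestion policy on the table
// (an empty/null result means no table-level policy is set).
.show table MyEventsTable policy streamingingestion
// Enable streaming ingestion at the table level
// (it can also be enabled for the whole database).
.alter table MyEventsTable policy streamingingestion enable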
To potentially reduce the delay, you might consider the following approaches:
Review Event Processing Logic: Simplify the processing steps where possible. Complex operations increase processing time, although a simple filter plus a column mapping is unlikely to be the main contributor on its own.
Adjust Ingestion Batching Policy: Fabric Eventstream does not directly expose batching settings, but the ingestion batching policy on the target KQL table governs how long batches are held open before they are sealed; the defaults are tuned to balance latency against throughput. See the sketch after this list, and the documentation on Data Connectors and Ingestion Supported Formats for more on batching in general.
Monitor and Optimize: Continuously monitor performance and latency with the tools Fabric Eventstream provides. The Monitoring Event Streams documentation covers how to monitor streaming event data, ingestion status, and performance, and the query at the end of this reply shows one way to measure the delay from the KQL side.
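As a concrete illustration of the second point, the ingestion batching policy on the target table can be inspected and lowered with Kusto management commands. This is a sketch rather than a guaranteed fix: MyEventsTable is a placeholder, the values shown are only examples (10 seconds is the documented minimum for MaximumBatchingTimeSpan), and a batch is sealed as soon as any one of the three conditions is met.
// Show the current batching policy (an empty result means the
// database or cluster default applies, which can be up to 5 minutes).
.show table MyEventsTable policy ingestionbatching
// Seal batches after at most 10 seconds, 500 items, or 1024 MB of raw
// data, whichever comes first. Shorter times reduce latency but produce
// more, smaller extents, which adds overhead.
.alter table MyEventsTable policy ingestionbatching @'{"MaximumBatchingTimeSpan":"00:00:10","MaximumNumberOfItems":500,"MaximumRawDataSizeMB":1024}'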
It's also worth noting that some level of delay is expected when processing events before ingestion due to the additional computational overhead. Balancing the need for real-time processing with the inherent latency of such operations is key.
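To quantify that delay from the KQL side, one option is to compare each record's own timestamp to the time it was committed to the table. In this sketch, MyEventsTable and EventEnqueuedTime are placeholders for your table and whatever event timestamp column your stream carries:
// ingestion_time() returns when the record was committed to the table;
// subtracting the event's enqueue time approximates ingestion latency.
MyEventsTable
| where ingestion_time() > ago(15m)
| extend IngestLatency = ingestion_time() - EventEnqueuedTime
| summarize avg(IngestLatency), max(IngestLatency) by bin(ingestion_time(), 1m)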
Best Regards,
Neeko Tang
If this post helps, then please consider Accepting it as the solution to help other members find it more quickly.
No matter what I have done with Event Hub and direct mode, it's about 30 seconds. Batching is fine when you have huge loads, but the rate can be as low as one message every 10 seconds or so, and you still want it in real time. It may be better with a notebook.