rmbelda
New Member

Eventhouse SDK client throughput limited to ~270 QPS while Eventhouse executes queries in ~35ms

Environment:
- Microsoft Fabric Eventhouse
- Always On enabled
- Eventhouse query latency measured by EventhouseQueryLogs:
AvgDurationMs = 35 ms
P95 = 75 ms

Test:
- .NET 8 console application
- Microsoft.Azure.Kusto.Data SDK
- 1000 concurrent workers
- 32 SDK client instances
- Same service principal for all requests

Results:
- 8981 queries in 33.5 seconds
- Throughput = 267.8 QPS
- ExecuteQueryAsync average latency = 3416 ms
- Result reading latency = 1 ms

Eventhouse monitoring:
- Same time window shows query execution average of 35 ms
- P95 of 75 ms
- No throttling observed
- No ingestion activity
- MachinesTotal = 2

Observation:
Most latency appears before query execution reaches the Eventhouse engine.
Looking for possible gateway, SDK, authentication or service principal concurrency limits.

MJParikh
Super User

Hi @rmbelda,

The gap between 35ms server-side and 3416ms client-side points to client infrastructure, not Eventhouse. Here are the likely causes worth investigating.

ServicePointManager connection limit

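On .NET Framework, ServicePointManager.DefaultConnectionLimit defaults to 2 concurrent connections per endpoint in console applications, and with 1000 workers hitting the same Eventhouse URL you would serialize through 2 sockets. On .NET 8, however, HttpClient runs on SocketsHttpHandler, which does not read ServicePointManager, and its MaxConnectionsPerServer defaults to int.MaxValue, so setting DefaultConnectionLimit is unlikely to change anything there. What matters on .NET 8 is the effective MaxConnectionsPerServer on the underlying handler the Kusto SDK uses, so check whether the SDK or your own code sets a lower cap.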
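To illustrate why a cap on in-flight requests would produce this pattern (fast server, slow client calls), here is a minimal queueing sketch in Python rather than the .NET client itself. It models a closed loop in which each worker sends its next request as soon as the previous one returns, with at most `cap` requests in flight. The function name `simulate`, the cap of 10, and the 9 requests per worker are illustrative assumptions, not measured values.

```python
import heapq


def simulate(workers: int, cap: int, service_s: float, requests_per_worker: int):
    """Closed-loop FIFO model: returns (throughput in QPS, mean latency seen by callers)."""
    # Each heap entry is the time at which one of the `cap` slots becomes free.
    slots = [0.0] * cap
    heapq.heapify(slots)
    # Each heap entry is (time the worker submits its next request, worker id).
    ready = [(0.0, w) for w in range(workers)]
    heapq.heapify(ready)
    remaining = [requests_per_worker] * workers

    latencies = []
    finish = 0.0
    while ready:
        submit, w = heapq.heappop(ready)      # oldest waiting request first
        slot_free = heapq.heappop(slots)      # earliest available slot
        start = max(submit, slot_free)        # wait in the queue if all slots are busy
        end = start + service_s
        heapq.heappush(slots, end)
        latencies.append(end - submit)        # caller-visible latency includes queueing
        finish = max(finish, end)
        remaining[w] -= 1
        if remaining[w] > 0:
            heapq.heappush(ready, (end, w))   # worker immediately issues its next request

    total = workers * requests_per_worker
    return total / finish, sum(latencies) / len(latencies)


if __name__ == "__main__":
    for cap in (10, 1000):  # hypothetical caps: a small one, and effectively none
        qps, latency = simulate(workers=1000, cap=cap, service_s=0.035, requests_per_worker=9)
        print(f"cap={cap}: {qps:.0f} QPS, mean caller latency {latency * 1000:.0f} ms")
```

In this model the capped run stays near cap / service time no matter how many workers are added, and the latency each worker sees is dominated by queueing rather than execution.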

MSAL token acquisition serialization

A single service principal means a single token cache. MSAL serializes token requests through a per-authority semaphore. Under 1000-way concurrency, token acquisition queues even with cached tokens. Enable MSAL logging to confirm whether AcquireTokenForClient hits the in-memory cache or the network.

Token provider duplication across clients

Your 32 KustoConnectionStringBuilder instances likely instantiate separate token providers. On expiry, you get 32 parallel refresh attempts against AAD, with retry storms if AAD throttles. Share one TokenCredential or one ConfidentialClientApplication across all clients.

HttpClient pool fragmentation

The Kusto SDK creates HTTP infrastructure per client instance. 32 instances generate 32 connection pools, each governed by its own limit. Reduce to 1 to 4 ICslQueryProvider instances and tune the per-instance connection ceiling.

SNAT and ephemeral port pressure

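1000 concurrent outbound connections from a single VM are not a problem in themselves, but if connections are opened and closed per request instead of being reused, closed sockets pile up in TIME_WAIT and can exhaust the ephemeral port range (or the SNAT port allocation on Azure). Run netstat -ano during the test and look for thousands of connections in TIME_WAIT to the Eventhouse endpoint. Enable HTTP keep-alive and verify reuse.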
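Counting states by eye in a long netstat dump is error-prone, so a small helper can tally them. This is a sketch that assumes the Windows `netstat -ano` column layout (Proto, Local Address, Foreign Address, State, PID) and that the Eventhouse endpoint is reached over port 443; the function name `tcp_states_to` and the sample rows are made up for illustration.

```python
from collections import Counter


def tcp_states_to(netstat_text: str, remote_port: int = 443) -> Counter:
    """Count TCP connection states for rows whose foreign port equals remote_port."""
    states = Counter()
    for line in netstat_text.splitlines():
        parts = line.split()
        # Skip headers, blank lines, and non-TCP rows (UDP rows have no State column).
        if len(parts) < 4 or parts[0].upper() != "TCP":
            continue
        foreign_port = parts[2].rsplit(":", 1)[-1]  # rsplit also handles [ipv6]:port
        if foreign_port == str(remote_port):
            states[parts[3]] += 1
    return states


if __name__ == "__main__":
    # Made-up sample rows; in practice read a file saved with `netstat -ano > netstat.txt`.
    sample = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    10.0.0.4:50001         203.0.113.10:443       TIME_WAIT       0
  TCP    10.0.0.4:50002         203.0.113.10:443       ESTABLISHED     4321
  TCP    10.0.0.4:50003         203.0.113.10:443       TIME_WAIT       0
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       900
"""
    print(tcp_states_to(sample))
```

A TIME_WAIT count in the thousands next to a small ESTABLISHED count would suggest connections are being churned rather than reused.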

Gateway-side per-principal concurrency

The Eventhouse frontend applies per-principal concurrent request limits before queries reach the engine. EventhouseQueryLogs captures execution time only, so requests queued at the gateway are invisible in your 35ms measurement. Split the workload across two service principals as a test. If throughput roughly doubles, the bottleneck is per-principal at the gateway.

Diagnostic steps

  1. Capture client-side DiagnosticListener traces for HttpClient. Measure DNS resolution, TCP connect, TLS handshake, and request transmission separately to isolate where the 3381ms lives.
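  2. Re-run with the connection limit ruled out (ServicePointManager.DefaultConnectionLimit set to 1024 on .NET Framework; on .NET 8, confirm MaxConnectionsPerServer on the handler is not capped) and a single shared TokenCredential.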
  3. Reduce SDK clients from 32 to 4 and compare throughput.
  4. Test with two service principals splitting load 500/500.

Run .show queries during the test and reconcile submitted client_request_id values against executed counts.
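The reconciliation itself is a set comparison. The sketch below assumes you tag every request with a unique client request id on the client and have exported the ids that `.show queries` reports for the same time window; how the ids are exported is left out, and the name `reconcile` is hypothetical.

```python
def reconcile(submitted_ids, executed_ids):
    """Compare ids sent by the client with ids the engine reports having executed."""
    submitted = set(submitted_ids)
    executed = set(executed_ids)
    return {
        "submitted": len(submitted),
        "executed": len(submitted & executed),
        # Sent by the client but absent from the engine's list for this window.
        "never_seen": sorted(submitted - executed),
    }


if __name__ == "__main__":
    # Illustrative ids only.
    report = reconcile(["run1-001", "run1-002", "run1-003"], ["run1-001", "run1-003"])
    print(report)
```

Ids in `never_seen` were either still queued ahead of the engine when the snapshot was taken, rejected before execution, or fall outside the exported window, so compare against a window slightly wider than the test run.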

With 35 ms execution and 1000 workers, theoretical throughput approaches 28,500 QPS (1000 / 0.035 s). You observe about 270 QPS, so something is reducing your effective concurrency by roughly 100x. The two highest-probability culprits in my experience are a connection limit and shared token cache contention.
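The arithmetic behind this estimate can be checked with Little's law (requests in flight = throughput × latency). The sketch below uses only the figures reported in the question; the function name `closed_loop_numbers` is an illustrative assumption.

```python
def closed_loop_numbers(workers, engine_s, observed_qps, observed_latency_s):
    """Apply Little's law (in flight = throughput * latency) to the reported figures."""
    ceiling_qps = workers / engine_s  # every worker always busy at engine speed
    return {
        "ceiling_qps": ceiling_qps,
        "shortfall_factor": ceiling_qps / observed_qps,
        # Requests outstanding from the client's point of view.
        "in_flight_client": observed_qps * observed_latency_s,
        # Requests actually executing in the engine at any moment.
        "in_flight_engine": observed_qps * engine_s,
    }


if __name__ == "__main__":
    numbers = closed_loop_numbers(
        workers=1000, engine_s=0.035, observed_qps=267.8, observed_latency_s=3.416
    )
    for name, value in numbers.items():
        print(f"{name}: {value:.1f}")
```

Read this way, the client-side figures are self-consistent: roughly 915 requests are outstanding, close to the 1000 workers, while only about 9 are executing in the engine at any moment. The rest are waiting somewhere between the ExecuteQueryAsync call and the engine.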

 

I hope this will help to resolve your issue.


Thank you!

MJParikh, thank you so much for your reply. I'll make the suggested adjustments and come back here to share the results.

Hi @rmbelda ,

I hope the above details help you fix the issue. If you still have any questions or need more help, feel free to reach out. We’re always here to support you.

Best Regards, 
Community Support Team

Hi @rmbelda,
Thanks for reaching out to the Microsoft Fabric community forum.

I would also like to thank @MJParikh for actively participating in the community forum and for the solutions shared here. Your contributions make a real difference.

Best Regards, 
Community Support Team

 
