
Srisakthi

Eventhouse and its best practices

Eventhouse Overview

Eventhouse is designed for ingesting, storing, processing, and querying massive volumes of data, with a strong focus on real‑time analytics and interactive exploration. It is particularly effective for high‑velocity data scenarios where insights are needed instantly.

An Eventhouse acts as a container for multiple KQL databases, which can be shared across projects. This centralized approach makes it easy to manage and operate multiple KQL databases under a single umbrella, improving governance and operational efficiency.

Eventhouse is best suited for:

  • Event stream data
  • Time‑series data
  • IoT telemetry
  • Log and telemetry data

It also supports ingestion across multiple file formats, making it flexible for diverse data sources.

Data Ingestion

Eventhouse supports direct ingestion from a variety of sources, including:

  • Event Streams
  • Apache Kafka
  • REST APIs
  • Dataflows

Once ingested, data is automatically partitioned and indexed based on ingestion time, enabling efficient querying and fast analytical performance without additional configuration.

Data Storage and Caching

Eventhouse leverages a tiered storage model that balances performance and cost. Data can reside in:

  • RAM
  • Local SSDs
  • Cold storage (Azure Blob Storage)

This behavior is governed by caching policies. Mirroring the hot and cold access tiers of Azure Blob Storage, Eventhouse distinguishes two classes of data:

  • Hot data (frequently accessed) is stored on local SSDs for low‑latency queries
  • Cold data (infrequently accessed) is moved to Azure Blob Storage for cost‑effective, durable storage

By default, all ingested data is treated as hot data. Caching policies should be configured carefully to optimize both query performance and cost.
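As a sketch, a table's hot-cache window can be set with a caching policy management command (the table name `MyTable` is a placeholder):

```kusto
// Keep the most recent 7 days of ingested data in the hot cache (local SSD).
// Data older than that is served from cold storage (Azure Blob Storage).
.alter table MyTable policy caching hot = 7d
```

The same policy can also be applied at the database level, in which case all tables inherit it unless they override it.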

Best Practices for Cost, Performance, and Reliability

When working with Eventhouse, the following best practices can help you achieve optimal efficiency:

  1. Always‑On

     The Always‑On setting has a direct impact on cost. When it is disabled, Eventhouse can be suspended during periods of inactivity and resumed when needed.

  • Suspending the service eliminates uptime charges
  • Be aware that resuming may introduce some latency
  • Keeping the service always running incurs uptime costs, but avoids cold‑start delays
  2. Minimum Consumption

     Minimum Consumption is closely related to Always‑On and works best when combined with it.

  • Allows Eventhouse to scale up during spikes in workload
  • Prevents scaling down below a defined minimum capacity
  • Helps handle sudden ingestion bursts without performance degradation
  3. Caching Policy

     Caching policies play a critical role in query performance.

  • Define which data should remain hot and which can move to cold storage
  • Use the query_datascope setting to control how queries interact with hot and cold data
  • Proper configuration can significantly reduce storage and compute costs
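The `query_datascope` setting can be applied per query with a `set` statement. A minimal sketch (table and column names are hypothetical):

```kusto
// Restrict this query to data in the hot cache only, so it never pays
// the latency cost of reading from cold storage.
set query_datascope = "hotcache";
MyTable
| where Timestamp > ago(1d)
| summarize EventCount = count() by bin(Timestamp, 1h)
```

Valid values are `default`, `all`, and `hotcache`; `hotcache` is useful when predictable low latency matters more than completeness over older data.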
  4. Retention Policy

     Retention policies are commonly used alongside caching policies.

  • Automatically delete data beyond a defined ingestion period
  • Controlled using softdelete and recoverability settings
  • Helps manage storage growth and maintain compliance requirements
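A retention policy is set with a management command. For example, assuming a placeholder table `MyTable`:

```kusto
// Soft-delete data 30 days after ingestion; with recoverability disabled,
// the deleted extents are not kept around for later recovery.
.alter-merge table MyTable policy retention softdelete = 30d recoverability = disabled
```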
  5. Update Policies

     Update policies act as lightweight, automated ETL pipelines.

  • Transform ingested data and store results in destination tables
  • Multiple update policies can be created per table
  • Since they execute during ingestion, excessive or complex transformations can increase resource consumption and throttle ingestion throughput
  • Use with caution and monitor resource usage closely
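A typical pattern pairs a transformation function with an update policy on the destination table. The names below (`RawEvents`, `ParsedEvents`, `ParseRawEvents`) are placeholders for illustration:

```kusto
// Transformation function applied at ingestion time (names are placeholders).
.create-or-alter function ParseRawEvents() {
    RawEvents
    | extend Parsed = parse_json(Payload)
    | project Timestamp,
              DeviceId = tostring(Parsed.deviceId),
              Value = todouble(Parsed.value)
}

// Attach the update policy: rows ingested into RawEvents are transformed
// by ParseRawEvents() and the results land in ParsedEvents.
.alter table ParsedEvents policy update
@'[{"IsEnabled": true, "Source": "RawEvents", "Query": "ParseRawEvents()", "IsTransactional": false}]'
```

Because the function runs on every ingestion batch, keeping it simple (projections, parsing, light filtering) protects ingestion throughput.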
  6. Materialized Views

     For workloads involving frequent aggregations, Materialized Views are highly recommended.

  • Pre‑compute and store aggregated results
  • Always return up‑to‑date data
  • Consume fewer resources compared to running aggregation queries repeatedly on source tables
  • Significantly improve query performance for analytical scenarios
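As a sketch, a materialized view that maintains hourly per-device averages might look like this (table and column names are hypothetical):

```kusto
// Maintain hourly averages per device; the view is materialized incrementally
// as new data arrives, so queries against it stay both fast and current.
.create materialized-view HourlyDeviceStats on table ParsedEvents
{
    ParsedEvents
    | summarize AvgValue = avg(Value) by DeviceId, bin(Timestamp, 1h)
}
```

Queries then target `HourlyDeviceStats` directly instead of re-aggregating the source table on every run.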

 

Reference links:

https://learn.microsoft.com/en-us/kusto/management/cache-policy?view=microsoft-fabric

https://learn.microsoft.com/en-us/kusto/management/materialized-views/materialized-view-use-cases?vi...