Hi everyone,
I’m running into a challenging Real-Time Intelligence issue related to schema enforcement and message drift inside an Eventstream, and I’d appreciate help understanding whether this is expected behavior, a configuration limitation, or a bug.
🟦 Scenario
We have an IoT setup publishing data to Azure Event Hubs.
The messages follow a defined schema, but occasionally:
New fields appear
Field ordering changes
Some devices send null or missing fields
A small subset of devices sends string values where numbers are expected
The Event Hub input feeds a Fabric Eventstream, which then routes data to:
A KQL database (hot store)
A Lakehouse (cold store)
A Real-Time Dashboard
Everything works fine until schema drift happens.
🟥 Problem
When messages with drift arrive, the Eventstream intermittently:
Drops messages silently (only visible in diagnostic logs)
Fails the downstream KQL ingestion with:
Type mismatch: expected real but received string
Column 'temperature' missing in source payload
Applies schema incorrectly, creating unexpected fields such as temperature_1
Stops updating the Real-Time dashboard until the schema is reconciled
Fails the Lakehouse ingestion with Delta errors like:
Inconsistent column types detected across input batches
This results in partial data, out-of-order ingestion, or full pipeline stalls.
🟩 What I’ve tried
Enabled "Schema inference" in Eventstream → still fails on mixed types
Disabled schema enforcement → downstream Warehouse/KQL breaks
Built a mapping transformation in Eventstream → fails when fields are missing
Used a KQL update policy to coerce types → stops ingestion when undefined fields appear (a simplified sketch of such a policy follows this list)
Added a Dataflow Gen2 as transformer → introduces latency, defeating the RTI purpose
Created a custom pre-cleaning Azure Function → works but is costly and adds complexity
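For context, a simplified sketch of what the coercion update policy looks like (table and column names are hypothetical placeholders; each control command is run separately):

```kusto
// Staging table fed by the Eventstream, and the typed table it should populate
.create table StagingTelemetry (deviceId: string, timestamp: string, temperature: string)
.create table CleanTelemetry (DeviceId: string, Timestamp: datetime, Temperature: real)

// Coercion function: cast the string columns to proper types
.create-or-alter function CoerceTelemetry() {
    StagingTelemetry
    | project DeviceId    = deviceId,
              Timestamp   = todatetime(timestamp),
              Temperature = todouble(temperature)
}

// Transactional policy: if the policy query fails, the source ingestion fails with it
.alter table CleanTelemetry policy update
@'[{"IsEnabled": true, "Source": "StagingTelemetry", "Query": "CoerceTelemetry()", "IsTransactional": true}]'
```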
❓ Questions
Does Real-Time Intelligence currently support schema drift at scale, or does every upstream schema variation require manual adjustment?
Is there a recommended RTI pattern for handling IoT messages where fields may be missing, extra, or incorrectly typed?
Are Eventstream transformations supposed to guarantee schema consistency, or is type validation delegated to the downstream sinks?
Is this behavior expected with KQL ingestion, or is this a bug in RTI’s schema reconciliation?
Is the only robust workaround implementing an external cleansing function (Azure Function or Stream Analytics) before data reaches Eventstream?
This is impacting our ability to run true real-time analytics with thousands of devices, so any guidance or validation is extremely appreciated.
Thanks in advance!
As you mention, your schema and values change. It sounds like, the way your Eventstream is defined today, you are enforcing the schema.
But a question: if you are sending the data to an Event Hub first, why are you also routing it through an Eventstream? If the Event Hub is not behind a private endpoint, just send it directly to the Eventhouse; there is no need for the Eventstream. See Get data from Azure Event Hubs - Microsoft Fabric | Microsoft Learn.
If the Event Hub is behind a private endpoint, you will need the Eventstream though. Instead of sending it to 3 destinations (Eventhouse, Lakehouse, and Real-Time Dashboard), send the data just to the Eventhouse using Direct Ingestion. Do not put any transformations in the Eventstream. When you configure Direct Ingestion, on the screen where it asks you for mapping, change the nested JSON levels down to 0 so the entire JSON object lands in the row as a single column, and then do your casting/type checking and everything else through update policies. Mirror the data into the Lakehouse using Eventhouse OneLake availability instead of dual-writing. This saves you the overhead of having to transform the data in two places.
https://learn.microsoft.com/en-us/fabric/real-time-intelligence/media/get-data-eventstream/inspect-d...
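A minimal sketch of the roughly equivalent manual setup on the Eventhouse side (table and mapping names are placeholders); setting the nested JSON levels to 0 in the Direct Ingestion wizard produces a mapping along these lines:

```kusto
// Landing table: a single dynamic column that holds each event payload as-is
.create table RawEvents (Event: dynamic)

// JSON ingestion mapping that maps the whole document ($) into that column
.create table RawEvents ingestion json mapping "RawEventMapping"
'[{"Column": "Event", "Properties": {"Path": "$"}}]'
```

With the payload landing untouched, schema drift (new fields, reordering, missing values) no longer breaks ingestion; the typing happens later in update policies.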
Hello @SavioFerraz
The answer given by @kustortininja is correct.
Regarding the direct 'Data connection' without Eventstream, consider the Event Hubs option. I showed this in a blog post in the past.
I prefer to use the bronze-silver-gold medallion architecture, where we ingest the incoming (Event Hub/IoT Hub) messages as-is into a dynamic column via a table mapping in Direct Ingestion.
So it's ELT (extract, load, transform) instead of ETL.
So each message is ingested as-is in the bronze layer.
From the original message you can also extract separate values like device id, device type, timestamp, etc., if these help with the table update policies towards the silver layer, where you create typed columns.
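A minimal sketch of such an update policy, assuming a bronze table RawEvents with a single dynamic column Event (names are placeholders). The to*() conversion functions return null for missing or malformed values instead of erroring, which is what makes the silver layer tolerant of drift:

```kusto
// Silver table with typed columns
.create table Telemetry (DeviceId: string, DeviceType: string, Timestamp: datetime, Temperature: real)

// Transformation: missing fields and wrongly typed values become nulls rather than ingestion failures
.create-or-alter function ExpandTelemetry() {
    RawEvents
    | project DeviceId    = tostring(Event.deviceId),
              DeviceType  = tostring(Event.deviceType),
              Timestamp   = todatetime(Event.timestamp),
              Temperature = todouble(Event.temperature)
}

// Non-transactional policy: a failing transformation does not block ingestion into the bronze table
.alter table Telemetry policy update
@'[{"IsEnabled": true, "Source": "RawEvents", "Query": "ExpandTelemetry()", "IsTransactional": false}]'
```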
Using an Activator you could test for corrupt messages (e.g. comparing the number of incoming messages with the number of transformed messages per timespan).
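As a rough illustration, this is the kind of query an Activator alert (for example, set on a Real-Time Dashboard tile) could evaluate, assuming the RawEvents/Telemetry tables sketched above and the default IngestionTime policy:

```kusto
// Compare raw vs. transformed row counts per 5-minute bin to surface dropped or corrupt messages
let raw = RawEvents
    | summarize RawCount = count() by TimeBin = bin(ingestion_time(), 5m);
let typed = Telemetry
    | summarize TypedCount = count() by TimeBin = bin(ingestion_time(), 5m);
raw
| join kind=leftouter typed on TimeBin
| extend MissingRows = RawCount - coalesce(TypedCount, long(0))
| where MissingRows > 0
```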
Check out Eventhouse shortcuts in Lakehouse to make an Eventhouse table available as a Lakehouse table.
If this answer helps, please upvote it.