Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
chetanhiwale
Helper I
Helper I

Real time data transformation options

I am trying to understand what are the all possible transformation options available for real time data. After studying real time intelligence experience in fabric, I can pin point few of them

1. Transformation available in Eventstream

2. KQL update policy ( not possible, when using joins )

Apart from this , did I miss something, please suggest. 


At the core, I am looking how can I transform data using medallion architecture. For bronze we can leverage update policy, for silver to gold , we can leverage materzied views. Issue arises in silver layer, as join doesnt fully support update policy (streaming ingestion need to be disabled). This makes difficult to perform joins.  Is there any other approach to process real time data. Please suggest your thoughts folks, also please correct me if I am wrong. 

1 ACCEPTED SOLUTION
datacoffee
Most Valuable Professional
Most Valuable Professional

Hi

 

There is no rule to use update policies or mat.views at those places. So you could use the mat.views where you need joins.

 

You can also use joins in the Eventstream - even though they are a bit different.

 

When I wrote my blogpost and the YouTube video on the RTI medallion architecture, I also used update policies for silver and mat.view for gold. But that is not given. Those are just examples, and can be used there they fit in your process.

 

i hope you can find a way to succeed, or else let me know. I will gladly help 😊

 

If this is useful, please consider accepting it as a answer.

View solution in original post

7 REPLIES 7
aayushiharalal
Frequent Visitor

Thank you so much for clarifying my doubt! I truly appreciate the detailed explanation. The post was extremely helpful, and I will definitely look into this further. Your knowledge and insights are invaluable.

Thanks again for sharing your expertise!

nilendraFabric
Solution Supplier
Solution Supplier

Hello @chetanhiwale 

 

You’re on the right track with your understanding of real-time data transformation options in Microsoft Fabric’s Real-Time Intelligence experience. Let’s explore the available options and how they fit into the medallion architecture:
Real-Time Data Transformation Options
1. Eventstream Transformations
Eventstream in Microsoft Fabric offers several transformation options for real-time data processing:
• Filter: Allows you to filter out events based on specific conditions.
• Manage Fields: Enables selection, renaming, or removal of specific fields.
• Aggregate: Performs aggregate calculations on data, such as summing values or calculating averages.
• Stream: Chains streams together for advanced processing workflows.
• Activator: Triggers actions based on real-time data conditions or thresholds.
2. KQL Update Policies
KQL update policies are indeed a powerful tool for transforming data in real-time, especially within the bronze layer of the medallion architecture. However, as you correctly noted, they have limitations when it comes to joins, particularly when streaming ingestion is enabled.
3. Materialized Views
Materialized views are excellent for transforming data from the silver to gold layer, as you mentioned. They provide up-to-date results of aggregation queries and are more performant than running aggregations directly on source tables.
Medallion Architecture in Real-Time Processing
Let’s break down the transformation options for each layer of the medallion architecture:
Bronze Layer
• Use Eventstream for initial data ingestion and basic transformations.
• Apply KQL update policies for simple transformations without joins.
Silver Layer
This is where the challenge lies, especially for complex transformations involving joins. Here are some approaches to consider:
1. Batch Processing with Update Policies: Disable streaming ingestion temporarily to allow for more complex transformations, including joins. This approach sacrifices some real-time aspects but can be scheduled frequently to minimize latency.
2. Intermediate Tables: Create intermediate tables that aggregate or pre-process data, which can then be joined more efficiently.
3. Materialized Views: For some scenarios, you might be able to use materialized views to pre-aggregate data, reducing the need for complex joins in real-time.
4. Custom ETL Processes: Implement custom ETL processes using Spark or other data processing frameworks within Fabric to handle complex transformations.
Gold Layer
• Utilize materialized views for final aggregations and business-ready data.
• Implement KQL queries for ad-hoc analysis and reporting.

 

please accept the solution if this helps.

Thanks

datacoffee
Most Valuable Professional
Most Valuable Professional

Hi

 

There is no rule to use update policies or mat.views at those places. So you could use the mat.views where you need joins.

 

You can also use joins in the Eventstream - even though they are a bit different.

 

When I wrote my blogpost and the YouTube video on the RTI medallion architecture, I also used update policies for silver and mat.view for gold. But that is not given. Those are just examples, and can be used there they fit in your process.

 

i hope you can find a way to succeed, or else let me know. I will gladly help 😊

 

If this is useful, please consider accepting it as a answer.

Hello
I have same senerio where I need to work with real-time ETL with medallion architecture, but the catch here is I have to use LakeHouse as a dataset. Can you please suggest a flow where we can move with medallion arthitecture.

datacoffee
Most Valuable Professional
Most Valuable Professional

Hello
Sure - if data is stored in a Lakehouse, then I would look into creating a shortcut from the Lakehouse to the Eventhouse. Then you have data as external tables and can use the update policies and materialized views as normal


Thank you for the reply. I will look into it practically.
I would like to know one more thing from you. 
While I was working with the same sample data as metioned in the Blog Post, it indicates two Tables. Those are:
NYCTaxi
Vendors
but while I am loading it using the Event stream I can only see NYCTaxi. I am not sure if I am missing any step or some point for the same.
Please guide.

datacoffee
Most Valuable Professional
Most Valuable Professional

Surely

 

 The vendors table is a manual table I've made for the demo of the blogpost. Only to show how to make joins. You don't need it to use the Eventstream and get data all the way to the gold layer

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.

JanFabricDE_carousel

Fabric Monthly Update - January 2025

Explore the power of Python Notebooks in Fabric!

JanFabricDW_carousel

Fabric Monthly Update - January 2025

Unlock the latest Fabric Data Warehouse upgrades!

JanFabricDF_carousel

Fabric Monthly Update - January 2025

Take your data replication to the next level with Fabric's latest updates!

Top Solution Authors