Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

A new Data Days event is coming soon! This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. Don't miss out.

tylerc

Medallion Architecture in Fabric Real-Time Intelligence

Introduction

Building a multi-layer, medallion architecture using Fabric Real-Time Intelligence (RTI) requires a different approach compared to traditional data warehousing techniques. But even transactional source systems can be effectively processed in RTI. To demonstrate, we’ll look at how sales orders (created in a relational database) can be continuously ingested and transformed through a RTI bronze, silver, and gold layer.

Pre-Requisites

Code samples and scripts are available in the GitHub repository Medallion Architecture in Fabric Real-Time Intelligence (RTI). To implement this example in your own environment, get started by creating the AdventureWorksLT sample database in Azure. Refer to the Deploy to Azure SQL Database  documentation for information on how to install the sample database. Then, enable Change Data Capture (CDC). The 00_CDCSetup_AdWorksLT.sql script (located in the repository CDCSetup folder) can be used to enable and configure CDC.

You’ll also need access to a workspace assigned to a Fabric-enabled capacity. Depending on your environment, you may be able to use a trial if an existing capacity is not available.

Medallion Architecture Primer

As big data and data lakes became popular, architectural patterns arose to help organize, process, and improve the quality of the data. The Medallion architecture defines three layers (bronze, silver, gold) where data can land and progress to serve different needs. The bronze (or raw) layer is typically used for initial ingestion of the data; data cleansing, validation, and transformation occurs in the silver layer – and additional modeling/aggregation takes place in a gold layer.

In Fabric RTI, a medallion architecture also contains these three layers. Data, whether it arrives in near real-time or in batches, is continuously processed and transformed along the bronze, silver, and gold layers – as shown in the following figure.

Diagram_of_a_Medallion_Architecture_in_Fabric_Real-Time_Intelligence_including_bDiagram_of_a_Medallion_Architecture_in_Fabric_Real-Time_Intelligence_including_b

Figure - Medallion Architecture in Fabric Real-Time Intelligence

With RTI, data isn’t typically updated or deleted (as often occurs in a traditional data warehouse). Instead, all events are inserted. Different processes are used to reflect updates and/or filter out deleted records – but without overwriting the data.

To demonstrate, we start with a data source – the AdventureWorksLT database. Change Data Capture (CDC) is enabled for two tables – SalesOrderHeader and SalesOrderDetail. In Fabric, an eventstream ingests and pushes these CDC events into a bronze eventhouse table. Update policies populate (and maintain) silver tables – and materialized views remove any duplicates. Finally, a 'latest value' view joins header and details into a single, denormalized gold layer.

Create the Fabric Eventhouse

Prior to connecting to the source database, we create an eventhouse in our Fabric workspace. An eventhouse contains one or more KQL databases; similar to SQL Server, a KQL database can contain multiple tables, views, and functions. It supports its own query language (also referred to as KQL). For those unfamiliar with KQL, SQL queries are also supported – and can be converted to KQL as needed.

Bonus tip: a KQL Database in Fabric has an entity diagram view, which shows the relationships among tables, functions, policies, views, etc. Peeking ahead, the entity diagram in our example will eventually look like the following:

A_Fabric_KQL_Database_entity_diagram_viewA_Fabric_KQL_Database_entity_diagram_view

Figure - KQL Database entity diagram view

Implement the Bronze Layer: Data Ingestion with an Eventstream

An Eventstream can connect to and process a variety of data sources (it can also expose an EventHub compatible endpoint so applications can push events directly to the stream). In this example, we connect to a CDC-enabled Azure SQL DB.

While an eventstream can also filter, join, split, or aggregate incoming events, our eventstream simply pushes the event data into a single landing table in the eventhouse – rawCDCEvents.

Screenshot_of_the_Eventstream_designer_in_Fabric_showing_an_Azure_SQL_DB_as_a_soScreenshot_of_the_Eventstream_designer_in_Fabric_showing_an_Azure_SQL_DB_as_a_so

Figure - Eventstream designer

When configuring the Eventstream (notes and tips)–

  • When creating the source, and entering the initial list of Azure tables, do not include spaces between tables names (as of April 2025, there is a UI glitch that won’t let you save if spaces are present).
  • The designer can create the rawCDCEvents table on your behalf. Alternatively, you can create it ahead of time with a simple KQL Statement:

.create table rawCDCEvents (schema:dynamic,payload:dynamic);

The Silver Layer: Table Update Policies and Materialized Views in the Eventhouse

The following query shows an example of records that are pushed into the rawCDCEvents table (the KQL folder in the GitHub repository contains all KQL queries and commands).

A_view_of_records_in_the_rawCDCEvents_table._Shows_two_columns_schema_and_payloaA_view_of_records_in_the_rawCDCEvents_table._Shows_two_columns_schema_and_payloa

Figure – A view of records in the rawCDCEvents table

The payload column is what we’re interested in – a JSON representation of a CDC event. KQL has good JSON support, so we can parse each event into something suitable for light transformation. To do this, we define a table update policy. Specifically, a query is encapsulated into a function; this function is then referenced in an update policy. The policy also specifies a target table.

Think of an update policy like a trigger in a database table; the policy runs every time data is ingested; there isn’t anything needed in terms of scheduling the update. And, though it is called an update policy, we are only appending new rows to the target table; there are no updates to existing rows. The following figures show the function and update policy used to populate the silverSalesOrderHeader table.

KQL_Function_definition_used_in_a_Table_Update_Function_to_query_for_Sales_OrderKQL_Function_definition_used_in_a_Table_Update_Function_to_query_for_Sales_Order

Figure - Function used in Table Update Function.

KQL_Table_Update_Policy_DefinitionKQL_Table_Update_Policy_Definition

Figure - Table Update Function

Note that the silver tables can contain multiple rows for a given Order Header or Order Detail. For example, when a new Order Header is created, a row with an op (i.e. operation) value of c is ingested. Later, if the Order Header is updated, another row with an op value of u is ingested.

A Materialized View, using the summarize operation and arg_max() aggregation function, provides a persisted, deduplicated query that makes it easier for downstream users to analyze orders. Similar to a materialized view in a data warehouse, the view is maintained automatically (as new data arrives).

Materialized_View_Definition_references_the_silverSalesOrderHeader_tableMaterialized_View_Definition_references_the_silverSalesOrderHeader_table

Figure - Materialized View Definition

Gold Layer: The latest (and denormalized) view of orders

Depending on the use case, creation of a gold layer may be done outside the eventhouse; for example, additional transformations may be done in a Power BI semantic model. For this scenario, we make use of a view to do a bit of final clean-up. Specifically, we join the two materialized views together – and filter our any deleted records.

Latest_Value_View_definition_joins_together_the_Sales_Orders_Header_and_Sales_OrLatest_Value_View_definition_joins_together_the_Sales_Orders_Header_and_Sales_Or

Figure - Latest Value View

Generate Test Data

In addition to the scripts needed to configure CDC, the SQL script 01_NewOrders.sql (located in the CDCSetup folder) can be used to create a new Sales Order. Run this script (from a tool like SSMS or Azure Data Studio) to create new rows in the database – and then inspect the events being ingested into the eventhouse by running KQL queries. The SQL script itself is simple, but it covers all the basic operations – inserts, updates, and deletes.

New_Sales_Order_Header_and_Sales_Order_Detail_being_generated_in_the_Azure_SQL_DNew_Sales_Order_Header_and_Sales_Order_Detail_being_generated_in_the_Azure_SQL_D

Figure - New Sales Order generated (and then modified) with T-SQL script

Summary and Next Steps

Fabric RTI can continuously process transactional events - including updated and/or deleted records. While the approach differs from traditional data warehousing techniques, RTI can handle large volumes of data – while providing a full audit trail of changes for analysis and monitoring. I invite you to head over to the GitHub repository – and try recreating this sample in your environment. Submit your feedback on Fabric Ideas and join the conversation on the Fabric Community.