Reply
sgeheeb
Helper III
Helper III
Partially syndicated - Outbound

Model Design - Using a Surrogate Key

Hi,

 

I am trying to figure out which way of designen my data model is better.

 

Background:

I am building a data model based on Business Central (SaaS) data, so I cannot perform any SQL operations to prepare the data outside of Power Query.

I have a fact table (Sales) and a dimension table (Item).

 

A simplified version of the fact table:

Document No_  | Item No_ | Variant Code | Quantity

---------------------------------------------------------

D1                        I1              V1                     2

D2                        I2              V1                     3

 

 

A simplified version of the dimension table:

Item No_ | Variant Code | Description

----------------------------------------------

I1               V1                   Item 1 Variant 1

I2               V1                   Item 2 Variant 1

 

An Item is identified by its [Item No_] and [Variant Code]. I need to create a relationship the fact and the dimension table.

Now, two possiblities come to mind:

 

Version 1: Surrogate Key

---------------------------

This would be the approach I would use if I had SQL at my disposal. Create a surrogate key for every ([Item No_]; [Variant Code]) in the dimension table and set the correct foreign key in the fact table when merging from my staging tables into my DWH tables.

I have replicated this approach in Power Query:

Step 1: Build an Index in the [Item] table (called [Item Key])

Step 2: Merge [Item] to [Sales] with a left join

Step 3: Expand [Item].[Item Key]

Step 4: Delete [Item No_] and [Variant Code] from the [Sales] table

Step 5: Create the relationship in the Model view

 

This was the intuitive approach I took. As soon as a get to the Merge-part, everything becomes terribly slow. Preview refreshes take forever and so does refreshing the data in Power BI. The performance of the reports themselves seems fine so far.

 

Version 2: Composite Business Key

--------------------------------------

I am not sure if this is the correct term. But basically, I would not do any Merges. Instead, I would concentenate the [Item No_] and [Variant Code] fields in both the fact and the dimension table and then use these new colums to create the relationship.

Step 1: Concatenate [Item No_] and [Variant Code] in the [Sales] table

Step 2: Concatenate [Item No_] and [Variant Code] in the [Item] table

Step 3: Delete [Item No_] and [Variant Code] from the [Sales] table

Step 4: Create the relationship in the Model view

 

This provides much better performance when refreshing previews and data. Now, is there a downside to using Solution 2 over Solution 1? Initially, Solution 2 is much faster, but I am worried that having the concatenated relationship-columns (Strings) instead of the Surrogate Keys (Integers) might have a negative performance impact in the long run (probalby more on the report performance then on Power Query / Data refresh).

 

Any input / experience on this topic is much appreciated. If you have a third solution which I did not consider, please let me know.

 

Cheers

 

1 ACCEPTED SOLUTION
johnt75
Super User
Super User

Syndicated - Outbound

You should be fine using the concatenated values as strings, they will be compressed and stored the same way as integers. There's a really good article with benchmark tests at https://www.maxwikstrom.se/performance/power-bi-data-types-in-relationships-does-it-matter/ 

View solution in original post

2 REPLIES 2
johnt75
Super User
Super User

Syndicated - Outbound

You should be fine using the concatenated values as strings, they will be compressed and stored the same way as integers. There's a really good article with benchmark tests at https://www.maxwikstrom.se/performance/power-bi-data-types-in-relationships-does-it-matter/ 

Syndicated - Outbound

Thanks! That's a great artilcle indeed! I think I'll go with the concatenated string, it will also make the transformation process easier to understand

avatar user

Helpful resources

Announcements
March PBI video - carousel

Power BI Monthly Update - March 2025

Check out the March 2025 Power BI update to learn about new features.

March2025 Carousel

Fabric Community Update - March 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors (Last Month)
Top Kudoed Authors (Last Month)