Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
richard-powerbi
Post Patron
Post Patron

Merge hierarchical dim tables to improve model?

mdl2.png

 

  1. I am looking for a theoretical explanation whether it is a valid/smart technique to use model 3 instead of model 1 or 2. I am talking about the idea of merging hierarchical dim tables to simplify/improve the model.
  2. Also I'm wondering if for model 2 I should put the key table above the dim tables like drawn, or if I should put the key table between the dim tables and the fact tables.

You need to know:

  • There is a clear hierarchy between A, B and C. A can have multiple of B, and B can have multiple of C.
  • In reality my hierarchy is bigger, think of A to G.
  • In reality I have more fact tables that connect to different levels in the hierachy of the merged dim table: depending on the granularity of the fact table, the fact table connects to a different level in the hierachy.

The reasoning why I am thinking about doing this:

  • Let's say I start with model 1 because I want to separate dimensions from facts
  • Then I run into a problem: dimensions do not filer each other, this is a strong requirement for my reports
  • Then I decide to make model 2: the key table filters the dim tables
  • Then I realize this model has a lot of tables, a lot of relationships, and a lot of the same keys at several places
  • I get the idea to merge all dim tables to get model 3
  • Because of the cardinality between B and C, extra rows for B are created in the merged dim table 😞
  • Now there is a many to one relationship between the dim table and the fact table for B-B 😞
  • There still is a one to one relationship between the dim table and the fact table for C-C
  • To be able to filter the B-fact table I need to enable bidirectional filtering for B-B 😞

Advantages I see for model 3:

  • Decreases amount of tables
  • Decreases amount of relationships
  • Decreases occurrences of the same data at several places (keys)
  • Allows for creating hierarchies in Power BI
  • Allows my dimension to filter each other (because they are in 1 table)

Disadvantages:

  • Non typical star schema design??
  • The many to one relationship between the dim table and the fact table for B-B worries me: reversed from what it should be in star schema design
  • One has to be careful to use enough fields for filtering inside the merged dim table to filter the fact table(s) down to 1 row
  • ?? other ??

I would love to hear what experts have to say about this.
Personally I feel model 3 is the right one for me, but it being non typical star schema design worries me. I want to expand a lot on this model with extra tables in the future. I hope I don't get stuck with this approach. 

12 REPLIES 12
samfeinb
Frequent Visitor

@richard-powerbi 

Thank you for sharing this.  I am facing something similar.  I feel the hierarchical fact tables with dimensions cross filtering on all must be a fairly common scenario for an enterprise data model but i haven't found much on this subject.

 

My issue is with a hierarchy of fact tables relating to the following hierarchy (top to bottom): client, account, opportunity, support request, ticket.  I settled on an interim solution which was to create a (too wide fact table) on service ticket in interim.  I tried snowflaking, but the queries got very complex.  It created a hierarchy table with only the IDs and names so that this could be used to access all of the facts via inactive relationships.  The data quality of the service tickets is poor.  I'm wondering what you ultimately settled on?

 

 

 

parry2k
Super User
Super User

@richard-powerbi you wanna strive for the star schema which is preferred, Hope this post helps.

 

I would  Kudos if my solution helped. 👉 If you can spend time posting the question, you can also make efforts to give Kudos whoever helped to solve your problem. It is a token of appreciation!

 

 



Subscribe to the @PowerBIHowTo YT channel for an upcoming video on List and Record functions in Power Query!!

Learn Power BI and Fabric - subscribe to our YT channel - Click here: @PowerBIHowTo

If my solution proved useful, I'd be delighted to receive Kudos. When you put effort into asking a question, it's equally thoughtful to acknowledge and give Kudos to the individual who helped you solve the problem. It's a small gesture that shows appreciation and encouragement! ❤


Did I answer your question? Mark my post as a solution. Proud to be a Super User! Appreciate your Kudos 🙂
Feel free to email me with any of your BI needs.

@parry2k I've read that before. I've also read many other documentation. I hope someone can help me with my specific dilemma.

@richard-powerbi not sure what is the dilemma here, it is fundamental of data warehousing. Anyhow, good luck.



Subscribe to the @PowerBIHowTo YT channel for an upcoming video on List and Record functions in Power Query!!

Learn Power BI and Fabric - subscribe to our YT channel - Click here: @PowerBIHowTo

If my solution proved useful, I'd be delighted to receive Kudos. When you put effort into asking a question, it's equally thoughtful to acknowledge and give Kudos to the individual who helped you solve the problem. It's a small gesture that shows appreciation and encouragement! ❤


Did I answer your question? Mark my post as a solution. Proud to be a Super User! Appreciate your Kudos 🙂
Feel free to email me with any of your BI needs.

@parry2k Did you actually read my post? I find your answers a bit strange... I've read tons of things about modeling, Kimball, and so on. If I knew exactly what to do I wouldn't be asking these questions right? Isn't that what this forum is for?...

@richard-powerbi indeed forum is to help each other and share knowledge and I read your post in detail and the reason I pointed to the document as it was not mentioned that you read that post. Again, it is all about best practice, all 3 models will work, are scalable, perform well under large dataset etc. There are many other parameters before even someone can answer why choose 1 model over another model. Based on your last reply, you know enough about warehousing and I'm a bit lost what exactly you are looking for. Sorry I might not be the right person or understood your question. Cheers!! 



Subscribe to the @PowerBIHowTo YT channel for an upcoming video on List and Record functions in Power Query!!

Learn Power BI and Fabric - subscribe to our YT channel - Click here: @PowerBIHowTo

If my solution proved useful, I'd be delighted to receive Kudos. When you put effort into asking a question, it's equally thoughtful to acknowledge and give Kudos to the individual who helped you solve the problem. It's a small gesture that shows appreciation and encouragement! ❤


Did I answer your question? Mark my post as a solution. Proud to be a Super User! Appreciate your Kudos 🙂
Feel free to email me with any of your BI needs.

@parry2k I'm sorry if I didn't manage to explain it well enough. 

 

My dilemma is this:

How can it be that - to me - model 3 seems better, although I have never seen it anywhere. I've searched the Internet endlessly to find something similar, but I can't find documentation about it.

So even thought I think - with my current knowledge - that model 3 'works', the fact that I can't find anything about it makes me doubt it is good model design. So I'm worried I run into problems later when using this in Power BI.

 

Quoting this (following your link)...

There's no table property that modelers set to configure the table type as dimension or fact. It's in fact determined by the model relationships. A model relationship establishes a filter propagation path between two tables, and it's the Cardinality property of the relationship that determines the table type. A common relationship cardinality is one-to-many or its inverse many-to-one. The "one" side is always a dimension-type table while the "many" side is always a fact-type table.

... my conclusion would be that table B in model 3 cannot be a fact table. The cardinality (caused by the merged hierarchies and hereby the extra rows created for B) would define my dim table as a fact table. But I have only facts in table B... so I want to filter them, thus I make the relationship bidirectional so that I can filter my B-facts anyway. At the same time, at a lower granularity in the hierarchy, for example C, has a one to one cardinality. And another fact table (not drawn) with multiple rows for C could have a many to one (dim side) cardinality, making it a perfect star schema part of the model. But as becomes clear, the model seems 'mixed', as in being partly good star schema design, partly not.

 

Depending on the level the 'tables containing facts' connect to the hierarchy inside the 'big merged table containing dimensions', the cardinality is different for 'tables containing  facts' at different granularities. As a result the star schema has different cardinalities at the same time:

  • 'big merged table containing dimensions' -- many to 1-- 'tables containing facts at higher grain'
  • 'big merged table containing dimensions' -- 1 to 1-- 'tables containing facts at equal grain'
  • 'big merged table containing dimensions' -- 1 to many - 'tables containing facts at lower grain'

As far as my understanding goes, this whole concept blurs the lines between what is a dimension and what is a fact inside this model. Although I made clear distinct between what are dimensions (descriptive) and facts (numeric), because I put them in different tables, the changing cardinality defines them otherwise depending on the grain.

 

Whenever I read about star schema design I always read that facts must be one the many side, and dimensions on the one side. So can anyone explain me if this makes model 3 by definition faulty star schema design, or can someone explain me why it could be good design in some cases.  

 

Inside my head this is how I see model 3 'work' as good star schema design:

  • As long as I make sure, that I filter down to the correct level in the hierarchy, I know that the values for the fact tables will be filtered correctly.
  • With DAX I can use functions like ALLEXCEPT to foresee bad filtering.

Anyone?

hi  @richard-powerbi 

To my knoeledge, Model 3 is not a good choose in power bi for now, dim table in model 3 is like a date hierarchy, but in fact, A,B,C are not like it, they are individual dim table, so the star schema which is preferred.

 

Regards,

Lin

Community Support Team _ Lin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

What is the disadvantage of model 3?

And how would I best let dimensions filter each other in the star schema?

Hi Richard,

 

Have you figured out a way of doing it?

 

I have this pretty big model that I built since I was a beginner and didn't really understand, but I made it work somehow. A lot has been added to it since so I am looking to improve it.

 

Like you I find nothing other than very basic data models used as examples for best practices, which is seldom the case in real life.

 

For example I need to filter on system, but sometimes on the sub-system. Then there are units below there, and even sub-units.

 

They all have different fact tables that are on the different granularities, and I kind of need a main system DIM to tie all the other DIMS together. E.g. DimSystem ->DimUnit->FactUnitValues, DimSystem -> FactSystemValues, And then a DimDate connected to all fact tables.

 

Then I for example need some dimensions added to one fact table to describe a process. So then it becomes DIMSystem -> DimExtradimentions ->Factneedextradimentions. So i need to but in a DIM between the main DIM and the fact.

 

Any thoughts?

 

Best regards,

 

Ole

Hi!

 

Very good thoughts. After many different approach, I also decided to use model 3 for my current job, as it has very similar challenges. Someone posted this:

https://learn.microsoft.com/en-us/power-bi/guidance/star-schema
Please check again at least the 'Junk dimensions' part. Basically this is your model 3, so it is accepted by Microsoft as well. 
Relationship: you are able to keep 1 to many from dim to fact. How?
dim table: SELECT XXX, YYY, ZZZ FROM A UNION SELECT XXX, YYY, '' AS ZZZ FROM B (in case table B has no ZZZ column). Then remove duplicates, add index as a Key.
fact tables: SELECT XXX, YYY, ZZZ, KPI1 FROM A

SELECT XXX, YYY, '' AS ZZZ, KPI2 FROM B

Within Power Query Merge the dim table to factA based on XXX, YYY, ZZZ and expand out the dim Key then just delete XXX, YYY, ZZZ -> You only have Key and KPI1/2. Fact B need to be merged same way, just ZZZ will be null values, but that will bring the proper Key back to the fact table.

Then within Power BI you will able to join your only 1 dim table to both of your fact tables as 1 to many. 

Now I am testing to merge all of my dimensions into 1 table, only date need to be handled on a different date table.

I hope, this helps!

Helpful resources

Announcements
Sept PBI Carousel

Power BI Monthly Update - September 2024

Check out the September 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

Sept NL Carousel

Fabric Community Update - September 2024

Find out what's new and trending in the Fabric Community.

Top Kudoed Authors