Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
wilson_smyth
Post Patron
Post Patron

Best design choice for multiple granularity

I have a data model which i need to report on at multiple granularities.

Below is an example query that will populate the fact table with ID's from each dimension.

 

model.JPG

 


select o1.id, o2.id, o3.id
from o1
join o1o2Bridge on o1.id = o1o2Bridge.id1
join o2 on o1o2Bridge.id2 = o2.id
join o2o3Bridge on o2.id = o2o3Bridge.id2
join o3 on o3.id = o2o3Bridge.id3

 

This works fine if all tables have data and associated bridging data. But if each row in O1 does not have an associated row in O3, the row wont appear in the fact table.

e.g. O1.ID = 2 wont be in fact table as it is not associated with any row from O2, same for O1.ID = 3

 

Assume i want to get counts of O2 rows linked to O1 rows, noting that not all O2 rows will be linked to an O3 row

Id like to hear if i am neglecting to consider any options.

 

My options  as i see them are:

- use outer joins 

pull all data, add a row to O3 called "no linked O3" & link any O2 rows to this dimension entry that dont have an associated Row3. similar to the practice of adding "unknown" to a dimension to handle nulls.

Im not a fan of this as if im doing basic counts, they will appear incorrect.
e.g. For O1.ID = 2, a count of O2 will show a value of 1, even though the value should be zero.
e.g. if i do a count of O3 under O2 objects, i will still get a value of 1 for Obj2.id = 3.
I know i can correct this in dax but it seems a lot of work.

 

-Multiple fact tables.

Have a fact table showing O1 & O2 & a fact table showing O1, O2 & O3. This way two granularies are accounted for, but im now reusing almost the same query twice, and running it twice. Model is now also larger and more complex 

Im still leaning towards this.


Would appreciate any input on the best practices in this situation.

 

 

 

3 REPLIES 3
v-danhe-msft
Employee
Employee

Hi @wilson_smyth,

Based on my research, I suggest you to use the merge funciton in query editor:

Referenc:http://radacad.com/append-vs-merge-in-power-bi-and-power-query

 

Also you could add an index column in each table, then you could merge them by the index.

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thansk for the reply.


Merging/appending is not the issue. obtaining the data is easy via merge/append or just modifying my source queries. Im asking about design practices, specifically:

 

- when there are many shared dimensions, is it better to have seperate fact tables at the desired granularity so that i dont have to deal with nulls when a row does not exist at a higher granularity

OR

should i have 1 single fact table, and wherever there are NULLS due to data not available at a lower granularity, populate it with the "unknown" value that i populate into the dimensions & deal with it in dax code.

 

@wilson_smyth 

 

I find myseflf regularly in the same sitiuation. Therefor the subject interests me greatly


What was your end conclusion on this subject?

Where you able to find more info?

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.