Solved: Star Schema Question

Ritesh_Air · ‎08-05-2020

Hi,

I am starting to create a model. I have 2 approaches:

1. If I use the query route, with all the joins done in the query, I get about half a million rows (a fact table and a couple of dimension tables).

2. If I go the star schema route, the fact table has about 9 million rows to begin with. I will join the other dimension tables to get the desired result, but my question is: where does the processing happen? If I go the star schema route, won't the queries take a long time to run?

Thanks,

Ritesh

amitchandak · ‎08-05-2020

@Ritesh_Air, refer to:

https://radacad.com/power-bi-basics-of-modeling-star-schema-and-how-to-build-it

https://docs.microsoft.com/en-us/power-bi/guidance/star-schema


lbendlin · ‎08-05-2020

The answer is, as always: it depends.

Flat source: the VertiPaq engine likes it because it can be stored with good compression, but it creates processing cost at the source.

Dimensions and facts: not very compressible, depending on the level of normalization. This should generally (though not in your case) result in fewer bytes being transferred over the network. It can be converted directly into an in-memory data model, and should be both fast and use less memory.

You'll have to try it out and find the sweet spot between total denormalization and excessive normalization, taking into account the processing power at the source, the available memory on the desktop and in the service, as well as network performance.

And all of this is before you even start considering the ETL cost of Power Query and the cost of measures in DAX 🙂 ...
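
To make the trade-off above concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module as a stand-in for the source database. It is not Power BI code. The table and column names (`fact_sales`, `dim_product`, `category`, `amount`) and the toy data are hypothetical, and it assumes the "query route" reduces rows by aggregating at the source, which the original post does not confirm. It only illustrates where the join work happens and how many rows cross the wire in each approach.

```python
import sqlite3

# In-memory database standing in for the source system (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER,
        amount REAL
    );
""")
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?)",
    [(1, "Bikes"), (2, "Bikes"), (3, "Helmets")],
)
conn.executemany(
    "INSERT INTO fact_sales (product_id, amount) VALUES (?, ?)",
    [(1 + i % 3, 10.0 * (i % 5 + 1)) for i in range(900)],
)

# Approach 1 (query route): the source performs the join and aggregation,
# so only the reduced result set is transferred.
flat_rows = conn.execute("""
    SELECT p.category, SUM(f.amount) AS total
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
""").fetchall()

# Approach 2 (star schema route): fact and dimension tables are transferred
# as they are, and the relationship is resolved later on the client side.
fact_rows = conn.execute("SELECT product_id, amount FROM fact_sales").fetchall()
dim_rows = conn.execute("SELECT product_id, category FROM dim_product").fetchall()

# Client-side equivalent of following the relationship at query time.
category_of = dict(dim_rows)
totals = {}
for product_id, amount in fact_rows:
    category = category_of[product_id]
    totals[category] = totals.get(category, 0.0) + amount

print("rows transferred, query route:", len(flat_rows))
print("rows transferred, star schema:", len(fact_rows) + len(dim_rows))
print("same totals:", dict(flat_rows) == totals)
```

In the query route the source pays for the join and the model receives fewer rows, but it can only answer the questions that query was written for. In the star schema route more rows are loaded once, and the join cost moves to the model's engine at query time. Which one wins in practice depends on the factors listed above, so measuring both on your own data is the only reliable answer.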