Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
Anders_Olsen
New Member

Data modeling (star schema) to a big student survey

Hello

I am working on a big student survey with 60 000 participants. The survey has 70 questions in 7 themes like IT, the physical environment, friendship and so on. The result must be shown in Power BI. How do I manage the data modelling into a star schema?

The dimensions are degree, faculty, town, education and year.

The facts are the 70 questions.

This means the fact table are very broad and deep and the performance in power BI is very poor. I guess I have to cut the fact table into themes and unpivot the columns. Does it sound right?

But if I cut the big fact table into 7 smaller tables then I get 5 dimension tables and 7 fact tables. This does not sound as a regular star schema model.

I would greatly appreciate any help

Happy new year!

Anders 

4 REPLIES 4
v-rzhou-msft
Community Support
Community Support

Hi @Anders_Olsen 

I think you don't need to cut your fact table into 7 smaller fact tables.

For star schema you can refer to this blog: Understand star schema and the importance for Power BI

For Optimization guide for Power BI you can refer to this blog:  Optimization guide for Power BI

 

Best Regards,

Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly. 

parry2k
Super User
Super User

@Anders_Olsen you don't need to separate fact by themes but surely do unpivot. Fact tables are usually long and skinny in nature. 

 

Once it is unpivoted, everything should work fine. I work with very large tables, having a proper model (star schema) and well written DAX, you can get amazing performance.

 

Good luck!

 

Check my latest blog post Year-2020, Pandemic, Power BI and Beyond to get a summary of my favourite Power BI feature releases in 2020

I would  Kudos if my solution helped. 👉 If you can spend time posting the question, you can also make efforts to give Kudos to whoever helped to solve your problem. It is a token of appreciation!

Visit us at https://perytus.com, your one-stop-shop for Power BI-related projects/training/consultancy.



Subscribe to the @PowerBIHowTo YT channel for an upcoming video on List and Record functions in Power Query!!

Learn Power BI and Fabric - subscribe to our YT channel - Click here: @PowerBIHowTo

If my solution proved useful, I'd be delighted to receive Kudos. When you put effort into asking a question, it's equally thoughtful to acknowledge and give Kudos to the individual who helped you solve the problem. It's a small gesture that shows appreciation and encouragement! ❤


Did I answer your question? Mark my post as a solution. Proud to be a Super User! Appreciate your Kudos 🙂
Feel free to email me with any of your BI needs.

lbendlin
Super User
Super User

Why would the performance be poor? 60K rows is very small for the student dimension (and your other dimension tables sound like they have low granularity too). Even multiplying 60K by 70 still only gives you 4.2M rows - nothing to worry about.

 

Star/Snowflake schemas are recommended, but they are not the law.  Technically none of the BI solutions insist on them. It's more for the users to have a familiar concept to hold onto.

 

Do your design with a fact table that pretty much only has index columns pointing to the dimensions (plus the actual answers to the question) and create your dimensions as you see fit.  Again, use index/primary key columns (integer!) in those dimension tables if you want to feel good about the design.

I know this is old but in case someone comes across it....as @lbendlin pointed out, 60K is very small so if you're having performance problems it's your design, not the data volume that's the issue.  Few red flags jumped out at me were....why isn't the list of questions a dimension?  Theme is merely an attribute on the question dimension.  Degree is attribute of student.  You might also have a hierarchy that supports the question but that's just attributes of that dimension. 

 

This ties back to someone providing link to star schema design as there is clearly something fundamentally wrong.  You mentioned faculty as a dimension which if these are students then faculty as a dimension doesn't make any sense.  That tells me you're not providing everything involved here.  Focus on the fact grain - always start there. Is it question - student - geography (aka town), date/time?

 

FYI - here's a starter star schema where you could havest ideas for pieces of what you need...Outcomes Survey Star Schema (blackboard.com)

Helpful resources

Announcements
Sept PBI Carousel

Power BI Monthly Update - September 2024

Check out the September 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

Sept NL Carousel

Fabric Community Update - September 2024

Find out what's new and trending in the Fabric Community.