bexbissell
Frequent Visitor

Data Model Design

Hi folks,

 

Power BI novice at work here! I am looking for some feedback on how to design a data model / transform data to meet my requirements.

 

Background

Every year we produce data that is a snapshot of our industry: there are Accounts (companies), the roles they play (Administrator, Custodian, Legal Adviser, etc.) and the Groups they service. Here is a simplified data model:
Physical Data Model.png

So, this means that from year-to-year the data can look like:

Year 1

Physical Object Model - Y1.png

Year 2

...and next year:

Physical Object Model - Y2.png


Every year we will ingest this full dataset into the model so we can generate reports such as 'Top Administrators' or 'Largest Group' for that year. I would also like to see the evolution of a particular Account or Group over time - business won through new Groups, or business lost.

 

Solution?

I have followed the 'Star Schema' pattern and consolidated the imported tables to reduce the number of Fact tables (not shown):

Star Schema.png

 

Questions

1. Year-on-year I will be appending a new dataset to the Fact tables - is that a good approach? Any tips here? (There is a sketch of the append I have in mind after question 2.)

2. Assuming yes to the above, there will be duplication of the Account, Service Provider and Group records. In the Transform, should I create a composite key for each record each year - e.g. based on the AccountID, ServiceProviderID and GroupID combined with the Reporting Date - and use that in the Fact table relationships? Does that make sense, or is there a better way? (Sketch below.)
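
To make question 1 concrete, here is a minimal Power Query (M) sketch of the append I have in mind. Year2021 and Year2022 are placeholder names for annual staging queries, and stamping each extract with its reporting year before combining is an assumption on my part:

// Placeholder queries: one staging query per annual extract.
let
    // Stamp each extract with its reporting year so every fact row
    // records which snapshot it belongs to.
    Y2021 = Table.AddColumn(Year2021, "ReportingYear", each 2021, Int64.Type),
    Y2022 = Table.AddColumn(Year2022, "ReportingYear", each 2022, Int64.Type),
    // Append the yearly snapshots into a single Fact table.
    Combined = Table.Combine({Y2021, Y2022})
in
    Combined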
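
And for question 2, a minimal sketch of the composite key I'm describing, assuming the Reporting Date is carried as the ReportingYear column from the append above (FactServices and all column names are illustrative):

let
    // FactServices is a placeholder for the combined fact query above.
    Source = FactServices,
    // One text key per Account / Service Provider / Group / year combination.
    AddKey = Table.AddColumn(
        Source,
        "ServiceKey",
        each Text.From([AccountID]) & "|"
            & Text.From([ServiceProviderID]) & "|"
            & Text.From([GroupID]) & "|"
            & Text.From([ReportingYear]),
        type text
    )
in
    AddKey

I appreciate a text key like this is wide and high-cardinality, which is partly why I'm asking whether there's a better way.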

 

Your feedback is very much appreciated.

Many thanks

1 ACCEPTED SOLUTION

Hi @bexbissell ,

 

For the optimization of the model, I can also offer some suggestions.

If you are using DirectQuery connection mode, you can optimize your data model with the following tips:

  • Remove unused tables or columns, where possible.
  • Avoid distinct counts on fields with high cardinality - that is, millions of distinct values.
  • Take steps to avoid fields with unnecessary precision and high cardinality. For example, you could split highly unique datetime values into separate columns - month, year, date, and so on. Or, where possible, round high-precision fields to lower cardinality (for example, 13.29889 -> 13.3). A sketch of both follows after this list.
  • Use integers instead of strings, where possible.
  • Be wary of DAX functions that need to test every row in a table - for example, RANKX. In the worst case, these functions can exponentially increase run-time and memory requirements given linear increases in table size.
  • When connecting to data sources via DirectQuery, consider indexing columns that are commonly filtered or sliced; indexing greatly improves report responsiveness.
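
For example, here is a minimal M sketch of the datetime-splitting and rounding tips from the third bullet. SnapshotDateTime and Amount are placeholder column names, and FactServices is a placeholder query name, so please adapt them to your own model:

let
    // FactServices is a placeholder for your fact query.
    Source = FactServices,
    // Split a highly unique datetime into low-cardinality parts.
    AddYear = Table.AddColumn(Source, "SnapshotYear", each Date.Year([SnapshotDateTime]), Int64.Type),
    AddMonth = Table.AddColumn(AddYear, "SnapshotMonth", each Date.Month([SnapshotDateTime]), Int64.Type),
    // Round a high-precision value to one decimal place (13.29889 -> 13.3).
    Rounded = Table.TransformColumns(AddMonth, {{"Amount", each Number.Round(_, 1), type number}})
in
    Rounded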

 

For more model optimization guidelines, you can refer to this documentation: Optimization guide for Power BI - Power BI | Microsoft Docs


Looking forward to your feedback.

Best Regards,
Henry

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.


3 REPLIES
amitchandak
Super User

@bexbissell , the model looks good.

A numeric key in place of composite keys is good whenever needed, but if you generate it in Power BI it will add a lot of cost at load time.

 

 


Thanks for the feedback on the model schema @amitchandak - I've learnt a lot about model schemas, good modelling practice and model relationships; now I'm just trying to put it into practice.

The dataset is relatively small, so generating the composite keys on load/transform does not take long (about 20 seconds).

