Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers!
Enter the sweepstakes now!Preparing for a certification exam? Ask exam experts all your questions on May 15th. Register now.
Hi
I have a customer dimension with a few demographics columns that will filter my products fact table. I will also report the counts of the demographics in the dimension. The problem comes when a demographic can have more than one response. I want to count these values just like the single response demographics so I dont want to store them as delimited text values in a column within the customer dimsion.
Adding an extra dimension, for each multiple response demographic, to the customer dimension seems like a simple fix to this but I keep reading the we should try to avoid snowflaking, and so am now wondering if having these multiple response demographics as separate fact tables is a better.
I've tried to show how the snowflake might look below, where demographics 1 - 5 are single response and 6,7,8 are multiple response...
Is this a valid case for snowflaking or should I seek an alternative design?
Many Thanks
Luke
Solved! Go to Solution.
Hi @_luke_ ,
Yes, this is absolutely a valid case for snowflaking. In fact, it’s one of the few scenarios where snowflaking makes your model more efficient, not less. You have demographics that are single-response (Demog1 to Demog5), which fit nicely into your main dimCustomer table. But for Demog6, Demog7, and Demog8 — where a customer can have multiple values — modeling them as separate dimension tables linked to dimCustomer is the cleanest and most scalable approach.
Trying to cram multi-response values into delimited text columns in the customer dimension would create headaches for filtering and aggregation. Power BI doesn’t handle delimited fields natively — so filtering on "contains" or splitting them out dynamically for proper visual use becomes a nightmare. A better solution is to create bridge tables for these multi-response demographics.
For example, for dimDemog6, you can have a table structured like this:
CustomerId | Demog6_Value |
C001 | Vegan |
C001 | Gluten-Free |
C002 | Vegetarian |
C003 | Vegan |
Then, your dimDemog6 dimension might look like:
Demog6_Value |
Vegan |
Gluten-Free |
Vegetarian |
This allows customers to be associated with multiple Demog6 values without repeating rows in the customer table or using delimited strings.
Once you've linked it up:
Now in Power BI, if you want to count how many customers selected each Demog6 value, you can use:
Customer Count (Demog6) =
CALCULATE(
DISTINCTCOUNT(dimCustomer[CustomerId]),
TREATAS(VALUES(dimDemog6[Demog6_Value]), BridgeDemog6[Demog6_Value])
)
Or, if you're using bi-directional filtering between the bridge table and dimCustomer, you can simplify it to:
Customer Count (Demog6) = DISTINCTCOUNT(dimCustomer[CustomerId])
when filtered by a slicer from dimDemog6.
So yes, even if snowflaking has a reputation problem, in this context it's doing exactly what it should: allowing for multi-response attributes to be modeled cleanly and queried effectively without delimited strings, wide tables, or DAX acrobatics. You’re modeling reality more faithfully, and Power BI will thank you for it.
Best regards,
Hi @_luke_
Thank you for reaching out microsoft fabric community forum.
May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will be helpful for other community members who have similar problems to solve it faster.
Thank you.
Hi @_luke_ ,
Yes, this is absolutely a valid case for snowflaking. In fact, it’s one of the few scenarios where snowflaking makes your model more efficient, not less. You have demographics that are single-response (Demog1 to Demog5), which fit nicely into your main dimCustomer table. But for Demog6, Demog7, and Demog8 — where a customer can have multiple values — modeling them as separate dimension tables linked to dimCustomer is the cleanest and most scalable approach.
Trying to cram multi-response values into delimited text columns in the customer dimension would create headaches for filtering and aggregation. Power BI doesn’t handle delimited fields natively — so filtering on "contains" or splitting them out dynamically for proper visual use becomes a nightmare. A better solution is to create bridge tables for these multi-response demographics.
For example, for dimDemog6, you can have a table structured like this:
CustomerId | Demog6_Value |
C001 | Vegan |
C001 | Gluten-Free |
C002 | Vegetarian |
C003 | Vegan |
Then, your dimDemog6 dimension might look like:
Demog6_Value |
Vegan |
Gluten-Free |
Vegetarian |
This allows customers to be associated with multiple Demog6 values without repeating rows in the customer table or using delimited strings.
Once you've linked it up:
Now in Power BI, if you want to count how many customers selected each Demog6 value, you can use:
Customer Count (Demog6) =
CALCULATE(
DISTINCTCOUNT(dimCustomer[CustomerId]),
TREATAS(VALUES(dimDemog6[Demog6_Value]), BridgeDemog6[Demog6_Value])
)
Or, if you're using bi-directional filtering between the bridge table and dimCustomer, you can simplify it to:
Customer Count (Demog6) = DISTINCTCOUNT(dimCustomer[CustomerId])
when filtered by a slicer from dimDemog6.
So yes, even if snowflaking has a reputation problem, in this context it's doing exactly what it should: allowing for multi-response attributes to be modeled cleanly and queried effectively without delimited strings, wide tables, or DAX acrobatics. You’re modeling reality more faithfully, and Power BI will thank you for it.
Best regards,
Check out the April 2025 Power BI update to learn about new features.
Explore and share Fabric Notebooks to boost Power BI insights in the new community notebooks gallery.
User | Count |
---|---|
76 | |
72 | |
71 | |
48 | |
41 |
User | Count |
---|---|
54 | |
48 | |
33 | |
32 | |
28 |