Hello,
I have a bit of an issue I'm trying to solve...
I have data for all users of a platform, at daily granularity. The users have a GUID as their ID, which even on its own is large. Combining it with the date to make a primary key (which I need, as the user attributes can change on any day) makes an absolutely massive column, which then has to sit on every related table! Ouch!
My initial thought was to index all users and replace the GUID on other tables with their index, and add the date as a decimal of the index. This obviously keeps the cardinality high but reduces the dictionary size.
Is this a good plan, or am I missing a brilliant idea? How have others handled such an issue?
Thanks!
Charley
Why not just have user and date dimension tables and keep those columns separate in your fact table(s) with relationships to those dimensions?
Pat
Do you mean like,
It doesn't help with that massive GUID column... we have several fact tables (they don't just eat apples, lol!) so with that column on every table it ends up being huge.
There are also other attributes on the Person ID table we would need to filter by - e.g. person 6A7811FD-AF14-4A90-8026-A3FD92E69726 doesn't always like apples 🤔
Not exactly. Both the Date and Person tables would relate directly to the fact table. But you could also replace your GUID string with an integer index in both tables for the relationship.
Pat
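To illustrate the integer-index suggestion, here is a minimal sketch in plain Python of swapping the GUID for a small integer key while leaving the date as its own column. The sample rows, the second GUID, and the function names are hypothetical, and in a real model this step would more likely be done in Power Query or upstream in the source database.

```python
from datetime import date

# Hypothetical fact rows: (user GUID, activity date, apples eaten).
fact_rows = [
    ("6A7811FD-AF14-4A90-8026-A3FD92E69726", date(2024, 1, 1), 3),
    ("0F8FAD5B-D9CB-469F-A165-70867728950E", date(2024, 1, 1), 1),
    ("6A7811FD-AF14-4A90-8026-A3FD92E69726", date(2024, 1, 2), 0),
]


def build_user_keys(rows):
    """Assign each distinct GUID an integer key, in first-seen order."""
    user_keys = {}
    for guid, _day, _value in rows:
        if guid not in user_keys:
            user_keys[guid] = len(user_keys) + 1
    return user_keys


def to_integer_keys(rows, user_keys):
    """Replace the GUID in each fact row with its integer key."""
    return [(user_keys[guid], day, value) for guid, day, value in rows]


# The GUID-to-integer mapping plays the role of the Person dimension's key;
# the GUID string itself only needs to live on that one table.
user_keys = build_user_keys(fact_rows)
fact_with_keys = to_integer_keys(fact_rows, user_keys)
```

The fact table then carries one integer column and one date column, each relating to its own dimension, instead of a combined GUID-plus-date key.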