The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event! Join us in Stockholm, Sweden from September 24-27, 2024.
2-for-1 sale on June 20 only!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
One of the major trends in data warehousing right now is the avoidance of integer surrogate keys because the eliminate idempotence (i.e. you have to reload the tables to regeneate them). As an alternative, and especially in Snowflake, Redshift, and BigQuery, people are increasingly using a hash of the natural keys in order to join data warehouse tables.
Does anyone know how hashed keys perform on large tables (e.g. 1 billion row fact, 1 million row dimension) in Power BI?
I know that Power BI builds a hash-integer lookup table itself, which takes up memory. I'm wondering how much it takes up, and what effect this has on load and user performance?
Thanks!
Solved! Go to Solution.
You may want to read a couple of the articles that discuss Vertipaq compression techniques and their performance, like this one
Inside VertiPaq - Compress for success - Data Mozart (data-mozart.com)
What a helpful article, thanks! I spoke to a friend who shared that he's tried this and using hashed keys instead of integer keys doesn't make a huge difference. This seems to fit with what the article describes. Thanks!
You may want to read a couple of the articles that discuss Vertipaq compression techniques and their performance, like this one
Inside VertiPaq - Compress for success - Data Mozart (data-mozart.com)
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the June 2024 Power BI update to learn about new features.
User | Count |
---|---|
51 | |
42 | |
23 | |
23 | |
18 |