Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
sun-and-sea
Frequent Visitor

How to create dimension table for item without primary key

Hi everyone,

 

I'm trying to create a dimension table for MailingCity that will be linked to my fact table. The goal for my fact table is to have ContactID and all other dimensions indexed so they link to their respective dimension tables (i.e. MailingCityID in fact table will link with MailingCityID in the MailingCity dimension table, etc). 

 

MailingCity (and other dimensions associated with Contact) does not already have an ID, hence for each dimension, I'm creating a separate dimension table with that dimension being indexed. Only thing is I would have to merge the index tables back to the fact table using the qualitative dimension as the common denominator (i.e. MailingCity index table with Contact fact table using the MailingCity value). And repeat the merges for all other dimensions...which have been very costing on performance for me (Mashup container Out of Memory issues - cannot even open the old file now).

 

Hence, I'm trying to avoid merging dimension indexes back to the fact table to optimize model performance. So now I'm trying to create an index within dimension tables that contain both the ContactID and the MailingCity value itself.

 

In this screenshot below, many contacts can have a MailingCity (MC) value of London, Toronto, etc. So for each of these repeating MC values, I want to do a repeating index value for it as well. i.e. 1 for London, 2 for Toronto, etc.

 

sunandsea_0-1643657538507.png

 

Initially I followed RADACAD's tutorial but the result wasn't what I'm looking for: Create Row Number for Each Group in Power BI using Power Query - RADACAD

sunandsea_1-1643658546392.png

Result should be 1 for London, 2 for Toronto, etc. 

 

Ideally it will say 0 for null or invalid MC values (i.e. like "--", etc), and start in chronological order with MC values beginning with A as 1, etc.

 

Also, please feel free to give me feedback on my approach to creating the dimension tables & fact table this way.

 

Thank you!

 

4 REPLIES 4
edhans
Super User
Super User

Don't use IDs or indexes. your mailing city table should be on the MailingCity text value. That will filter the MailingCity field in your fact table.

In the background, the Vertipaq engine will substitute an index for you and create a data dictionary. All behind the scenes, all automatically, and smoking fast.

What you are talking about makes sense in a SQL database, but not a tabular model in Analysis Services. I see this all of the time and the first thing I do is get rid of all of that. Unnecessary, and make the model much larger.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Hi edhans,

 

Thank you for your quick reply. I really appreciate it!

 

If I'm trying to create a fact table with less columns (as per the star schema), I don't think it's good practice to leave all the contact details - including MailingCity in the fact table?

 

I had left all contact details in one table in the beginning and I still ended up with a very slow dashboard.

 

So do you suggest in my MailingCity dimension table I would just need the ContactID and MailingCity text value, same goes for the other dimensions with text values?

 

Based in your image I thought Mailing City was the dimension. If contact is the dimension, then you just need your Contact ID (contact number, name, whatever) that is your unique key in the Contact table also be the foreign key in the FACT table. Don't create arbitrary IDs. I cannot figure out why you have an ID next to the mailing city. Maybe post more info instead of tiny snippets that make it hard to see the overall goal.

But you are right, if contact is the DIM table, then nothing from contacts should be in the fact table except the contact key itself, which should be the key from your system, not a new key you create in Power BI - not an index, integer, anything. Just what comes from your system.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

So the Contacts table is a dimension, and MailingCity is one of the contact's snowflake dimensions. The goal of the initial index ID I created is a form of foreign key between both the dimension tables because except for ContactID, MailingCity and other snowflake attributes don't have their own key.

 

The bigger problem I'm trying to solve here is I have two other fact tables - newsletter & newsletter subscribers. Newsletter table has dimensions such as newsletterID,  newsletter URL, author, topic, date published, author department, # of subscribers and # of clicks. Subscribers have dimensions such as date clicked (can be null if not clicked), boolean value for clicked, contactID that subscribed and newsletterID that they subscribed to. The Contact dimension and its snowflake dimensions are used to identify for those that clicked, which company they're from, are they a client or prospect, etc. Came across this RADACAD article about combining dimensions... Not sure how big my contact dimension table should be? I'm avoiding doing a transactional SQL like structure since my data is CRM and trying to make it digestible for a reporting data model: Combining Dimension Tables in Power BI using Power Query: Basics and Foundations of Modeling - RADAC...

 

The complication is one author can write multiple articles, one articles can be co-authored by multiple authors. Hence, one article can fall under many topics and department (for multiple authors). So for these attributes, I followed RADACAD's recommendation to create their own surrogate key (for topic, department and date) instead of using the newsletter ID (Data Preparation; First and Foremost Important Task in Power BI - RADACAD). But this means I have to built-in these keys into the subscriber table as well. Because I want to see which newsletter URLs were clicked on and how many clicks did they get - So the subscriber fact table needs to be able to be filtered by the newsletter fact table's dimensions as well.

 

Here's the "star schema" I have right now - it is currently only applied to the newsletter fact table and its dimensions.

 

sunandsea_1-1645026278213.png

 

Here is what the current visualization look like:

sunandsea_3-1645026795820.png

 

Currently, the filters (newsletter dimensions) on the left are not connected correctly to the subscriber fact table (which is the above table widget on the bottom right). I would like to be able to fix this.

 

And also, please confirm if the surrogate key approach I did is better than using the newsletterID as the primary key...?

 

In addition, I want the above newsletter filters to be able to filter by each other. So say if I select 'Banking and Financial Services' in the Department filter, I should be able to see authors from that Department in the Author filter above, and only the Names of the articles that they wrote.

 

 

 

 

 

 

Helpful resources

Announcements
July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.