Join us for an expert-led overview of the tools and concepts you'll need to pass exam PL-300. The first session starts on June 11th. See you there!
Get registeredJoin us at FabCon Vienna from September 15-18, 2025, for the ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM. Get registered
Hi,
We are encountering duplicate entries in our Power BI report and would like guidance on how to load data into dimension and fact tables, as well as how to manage this in the Power BI report.
For example, consider the address type dimension, which has 20 different addresses, all of which are active. How can we effectively display these addresses in the fact table and the Power BI table report? we are getting corssjoin report in the report.
Here’s an example of our current data structure:
Customer Dimension (Parent Table):
cust_sk, cust_id, cust_name, start_date, end_date, current_flag
1, 10, 'ABC', 03/25/2025, null, 'Y'
2, 11, 'TDH', 03/25/2025, null, 'Y'
Product Dimension (Child Table):
prod_sk, cust_sk, prod_id, prod_name, start_date, end_date, current_flag
80, 1, 20, 'Dove', 03/25/2025, null, 'Y'
81, 2, 21, 'RAM', 03/25/2025, null, 'Y'
address Dimension (Child Table):
address_sk, cust_sk, address_id, address_name,address_type, start_date, end_date, current_flag
91, 1, 30, 'test1 street', 'P',03/25/2025, null, 'Y'
91, 1, 30, 'test2 street', 'T',03/25/2025, null, 'Y'
91, 1, 30, 'test3 street', 'S',03/25/2025, null, 'Y'
91, 1, 30, 'test4 street', 'M',03/25/2025, null, 'Y'
91, 1, 30, 'test5 street', 'O',03/25/2025, null, 'Y'
91, 2, 30, 'test street', 'P',03/25/2025, null, 'Y'
Sales Fact (Fact Table):
sales_sk, cust_sk, prod_sk, address_sk, start_date, end_date, current_flag
61, 1, 80, 91, 03/25/2025, null, 'Y'
62, 1, 80, 91, 03/25/2025, null, 'Y'
63, 1, 80, 91, 03/25/2025, null, 'Y'
64, 1, 80, 91, 03/25/2025, null, 'Y'
65, 1, 80, 91, 03/25/2025, null, 'Y'
66, 2, 81, 92, 03/25/2025, null, 'Y'
Thank you
Thank you
Solved! Go to Solution.
Hi @fabric_no1 ,
Thanks for using Fabric Community and providing the detailed follow-up - that really helps clarify the issue. From what you've described, it sounds like the duplicate entries you're seeing in your report are due to multiple active records in your dimension tables (like address or phone), combined with how those records are being joined to the fact table.
Here’s how you can handle this more effectively:
1. Make sure each row in your dimension tables is uniquely identifiable.
If you're dealing with multiple active addresses or phone numbers for the same customer (for example, both permanent and temporary addresses), then make sure each of those records has its own unique key - even if the address_id is the same.
You can do this by adding an Index column (or surrogate key) during ETL or using Power Query. That way, each unique combination (like address ID + type) will have its own distinct identifier, which you can then reference in the fact table. This avoids duplication and cross-joins in the visuals.
2. Use a proper star schema structure.
Make sure your model follows a star schema pattern - where the fact table sits at the center and connects to dimension tables using one-to-many relationships. Try to avoid making one dimension depend on another (e.g having the product dimension as a child of the customer as mentioned by @lbendlin ). Each dimension should relate directly to the fact table.
Here are some additional threads and documentation that align closely with your scenario:
1. Solved: Power BI is creating duplicate entries - Microsoft Fabric Community
2. Solved: Re: Star Schema cross-join - Microsoft Fabric Community
3. Understand star schema and the importance for Power BI - Power BI | Microsoft Learn: This guide explains the benefits of using a star schema and how surrogate keys help keep relationships clean and efficient.
Please reach out for further assistance.
If this post helps, then please consider to Accept it as the solution to help the other members find it more quickly and a kudos would be appreciated.
Best regards,
Vinay.
Fabric Community Support
Hi @fabric_no1 ,
Following up to see if your query has been resolved. If any of the responses helped, please consider marking the relevant reply as the 'Accepted Solution' to assist others with similar questions.
If you're still facing issues, feel free to reach out.
Thank you.
Hi, are you using a Fabric capacity? If so, you might want to consider implementing a medallion architecture so that you can remove duplicates in an earlier layer before it gets to the layer that you load your Dim and Fact tables in:
Implement medallion lakehouse architecture in Fabric - Microsoft Fabric | Microsoft Learn
Hi @fabric_no1 ,
Just checking in to see if you query is resolved and if any responses were helpful. If so, kindly consider marking the helpful reply as 'Accepted Solution' to help others with similar queries.
Otherwise, feel free to reach out for further assistance.
Thank you.
Hi @fabric_no1 ,
We haven’t heard from you since our last response and wanted to check if the provided solutions met your needs. If they did, please consider accepting the solution to help others benefit. Otherwise, feel free to reach out for further assistance.
Thank you.
Hi @fabric_no1 ,
Thanks for using Fabric Community and providing the detailed follow-up - that really helps clarify the issue. From what you've described, it sounds like the duplicate entries you're seeing in your report are due to multiple active records in your dimension tables (like address or phone), combined with how those records are being joined to the fact table.
Here’s how you can handle this more effectively:
1. Make sure each row in your dimension tables is uniquely identifiable.
If you're dealing with multiple active addresses or phone numbers for the same customer (for example, both permanent and temporary addresses), then make sure each of those records has its own unique key - even if the address_id is the same.
You can do this by adding an Index column (or surrogate key) during ETL or using Power Query. That way, each unique combination (like address ID + type) will have its own distinct identifier, which you can then reference in the fact table. This avoids duplication and cross-joins in the visuals.
2. Use a proper star schema structure.
Make sure your model follows a star schema pattern - where the fact table sits at the center and connects to dimension tables using one-to-many relationships. Try to avoid making one dimension depend on another (e.g having the product dimension as a child of the customer as mentioned by @lbendlin ). Each dimension should relate directly to the fact table.
Here are some additional threads and documentation that align closely with your scenario:
1. Solved: Power BI is creating duplicate entries - Microsoft Fabric Community
2. Solved: Re: Star Schema cross-join - Microsoft Fabric Community
3. Understand star schema and the importance for Power BI - Power BI | Microsoft Learn: This guide explains the benefits of using a star schema and how surrogate keys help keep relationships clean and efficient.
Please reach out for further assistance.
If this post helps, then please consider to Accept it as the solution to help the other members find it more quickly and a kudos would be appreciated.
Best regards,
Vinay.
Fabric Community Support
You will want to refactor your data model. Why would the product dimension be a child of the customer? What addresses are these? Wher is your calendar dimension?
Apologies for using random dimensions in the example earlier.
Here’s the actual issue: If the dimension table (e.g., address) contains duplicates based on address type (permanent and temporary), and both are active in the current business scenario, the address table's natural key is a combination of address_id and address_type. When loading data into the fact table, we are encountering duplicate records. I'm unsure how to handle multiple records in the dimension table when populating the fact table and what the appropriate data model should be. What is the best practice to handle this situation.
We have a similar situation in the customer phone table, where both work and cell phones are active.
Additionally, we also have a date dimension, which is not included in the example below.
Customer Dimension (Parent Table):
cust_sk, cust_id, cust_name, start_date, end_date, current_flag
1, 10, 'ABC', 03/25/2025, null, 'Y'
2, 11, 'TDH', 03/25/2025, null, 'Y'
Customer Phone Dimension (Child Table):
cust_ph_sk, cust_sk, cust_ph_id, cust_ph_no, start_date, end_date, current_flag
80, 1, 20, '111-111-1111', 03/25/2025, null, 'Y'
81, 2, 21, '222-2223-3333', 03/25/2025, null, 'Y'
address Dimension (Child Table):
address_sk, cust_sk, address_id, address_name,address_type, start_date, end_date, current_flag
91, 1, 30, 'test1 street', 'P',03/25/2025, null, 'Y'
91, 1, 30, 'test2 street', 'T',03/25/2025, null, 'Y'
91, 2, 30, 'test street', 'P',03/25/2025, null, 'Y'
Sales Fact (Fact Table):
sales_sk, cust_sk, cust_ph_sk, address_sk, start_date, end_date, current_flag
61, 1, 80, 91, 03/25/2025, null, 'Y'
62, 1, 80, 91, 03/25/2025, null, 'Y'
63, 2, 81, 92, 03/25/2025, null, 'Y'
Thank you
User | Count |
---|---|
2 | |
1 | |
1 | |
1 | |
1 |
User | Count |
---|---|
5 | |
3 | |
3 | |
3 | |
2 |