Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Learn more

Reply
piotr_pbi
Frequent Visitor

Duplicates in fact table

Hello,


please take a look into following fact table. The data is normally on Customer, Material, InvoiceNo, Date level.

 

screen1.PNG

 

After ETL processes there are SalesReps assigned to each line (basing on Customer, Product combination) which generates  duplicates in the fact table. In the real-life case there are some more hierarchy levels regarding SalesReps and therefore there might more than 1 duplicates for each combination - see R1 and R2).

 

screen2.PNG

 

An assumption has been made that for every original line every single SalesRep gets the same Quantity Sold assigned equally for his KPIs. The total flag indicates the unique row to be taken for the purpose of calculations on the level of detail upper than SalesRep and has been assigned in a RANDOM way. It is also important that the data can be filtered by all dimensions in the example as well as by single SalesRep.

 

My formula looks like this and does not need to use the TotalFlag at all:

 

VAR table1 =
ADDCOLUMNS (
SUMMARIZE (
data2,
data2[Customer],
data2[Product],
data2[InvoiceNo],
data2[Date]
),
"selectedvalue quantity", CALCULATE ( SELECTEDVALUE ( data2[QUANTITY SOLD] ) )
)

RETURN
SUMX ( table1, [selectedvalue quantity] )

 

which gives following reults:

 

screen3.PNG

but seems to have bad performance on the original dataset (around 100M rows in fact table).

Does anyone have any better solution?

1 ACCEPTED SOLUTION
ValtteriN
Super User
Super User

Hi,

As a general rule in this sort of case I would think how to improve the ETL process if that causes this issue. As for a Dax Approach one tip that I have noticed is that MAX often has better results than SELECTEDVALUE in this sort of calculation. 





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




View solution in original post

2 REPLIES 2
v-yadongf-msft
Community Support
Community Support

Hi @piotr_pbi ,

 

Has your problem solved? If solved, please consider Accept it as the solution to help the other members find it more quickly.

ValtteriN
Super User
Super User

Hi,

As a general rule in this sort of case I would think how to improve the ETL process if that causes this issue. As for a Dax Approach one tip that I have noticed is that MAX often has better results than SELECTEDVALUE in this sort of calculation. 





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.