Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredJoin us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM. Register now.
Hi everyone,
I have a problem finding a proper dax measure for a specific type of duplicates in a fact table
Uploading a snippet with few dummy data and the image with formatted background for duplicates
What is my goal:
I need to find duplicate Import batches for Semi Octa - but duplicate condition is met only when all [Lia] values for each [Freq] in a batch are the same with another batch
for instance - Serial number 190092PL has 2 duplicate batches :
ImportbatchID - 190092PL_1 and 190092PL_91 - same [lia] values for all frequencies (50,100,200) in a batch
190092PL_13 and 190092PL_1 - same lia for freq 50,100, there is no 200 freq in that import batch
| Serial_Number | ImportbatchID | Test | SelectedMeasurement | freq_Band | Freq | Lia |
| 190092PL | 190092PL_1 | 1 | wert1 | Semi Octa | 50 | 44,9888 |
| 190092PL | 190092PL_1 | 1 | wert1 | Semi Octa | 100 | 99,0903 |
| 190092PL | 190092PL_1 | 1 | wert1 | Semi Octa | 200 | 34,9983 |
| 190092PL | 190092PL_8 | 1 | Qa89 | Semi Octa | 50 | 43,90432 |
| 190092PL | 190092PL_9 | 2 | OP09e | Semi Octa | 50 | 21,434 |
| 190092PL | 190092PL_9 | 2 | OP09e | Semi Octa | 100 | 45,9034 |
| 190092PL | 190092PL_9 | 2 | OP09e | Semi Octa | 200 | 7483,8933 |
| 190092PL | 190092PL_13 | 2 | Xt56 | Semi Octa | 50 | 21,73773 |
| 190092PL | 190092PL_13 | 2 | Xt56 | Semi Octa | 100 | 56,98322 |
| 190092PL | 190092PL_91 | 2 | tr452w | Semi Octa | 200 | 34,9983 |
| 190092PL | 190092PL_17 | 1 | Semi Octa | 100 | 21,32 | |
| 190092PL | 190092PL_17 | 1 | Semi Octa | 200 | 31,123 | |
| 190092PL | 190092PL_17 | 1 | Semi Octa | 50 | 44,9888 | |
| 190092PL | 190092PL_91 | 1 | tr452w | Semi Octa | 100 | 99,0903 |
| 190092PL | 190092PL_92 | 1 | wsx73 | Semi Octa | 200 | 59,09815 |
| 190092PL | 190092PL_91 | 1 | tr452w | Semi Octa | 50 | 44,9888 |
| 190092PL | 190092PL_15 | 2 | cj9 | Semi Octa | 50 | 21,73773 |
| 190092PL | 190092PL_15 | 2 | cj9 | Semi Octa | 100 | 56,98322 |
| 9930PL32 | 9930PL32_34 | 1 | wrr23 | Semi Octa | 50 | 44,93 |
| 9930PL32 | 9930PL32_34 | 1 | wrr23 | Semi Octa | 100 | 45,932 |
| 9930PL32 | 9930PL32_34 | 1 | wrr23 | Semi Octa | 200 | 71,093 |
| 9930PL32 | 9930PL32_44 | 1 | fd42 | Semi Octa | 100 | 45,932 |
| 9930PL32 | 9930PL32_34 | 2 | 1 Octa | 100 | 54,32 | |
| 9930PL32 | 9930PL32_34 | 2 | 1 Octa | 50 | 22,23 | |
| 9930PL32 | 9930PL32_63 | 3 | Semi Octa | 50 | 45,65 | |
| 9930PL32 | 9930PL32_63 | 3 | Semi Octa | 100 | 32,43 | |
| 9930PL32 | 9930PL32_49 | 1 | Semi Octa | 200 | 59,09815 | |
| 9930PL32 | 9930PL32_20 | 4 | Semi Octa | 50 | 44,9821103 | |
| 9930PL32 | 9930PL32_20 | 4 | Semi Octa | 100 | 40,09921 | |
| 9930PL32 | 9930PL32_20 | 4 | Semi Octa | 200 | 22,01101 | |
| 9930PL32 | 9930PL32_19 | 5 | Semi Octa | 50 | 32,42332 | |
| 9930PL32 | 9930PL32_19 | 5 | Semi Octa | 100 | 2,90019013 | |
| 9930PL32 | 9930PL32_44 | 1 | fd42 | Semi Octa | 50 | 44,93 |
| 9930PL32 | 9930PL32_44 | 1 | fd42 | Semi Octa | 200 | 71,093 |
i used a measure with distinctcount[lia] - count[lia] to compare the values, but i dont know how to use it for group of frequencies
Does anyone have any idea?
Hi @MattReckoner ,
Is this correct understanding:
If Serial_Number and Frequency and Lia is the same and there is any second ImportbatchID for this combination, then this is a duplicate, right?
Any third, fourth, fifth, ... ImportbatchID is another duplicate?
Then this is what you need:
- Group by Serial_Number, Freq, Lia
- For each distinct combination of above, count occurences
- If there is more than one occurence, these are duplicates
Count of duplicate batches =
VAR _DuplicatesPerSerialNumberFrequencyLia =
ADDCOLUMNS (
// Group by Serial_Number, Freq, Lia
SUMMARIZECOLUMNS (
'Table'[Serial_Number],
'Table'[Freq],
'Table'[Lia]
),
"@CountOfDuplicates",
// For each distinct combination of above, count occurences
VAR _CountOfBatch = CALCULATE ( COUNTROWS ( 'Table' ) )
RETURN
// If there is more than one occurence, these are duplicates
IF (
_CountOfBatch > 1,
_CountOfBatch - 1 // each first instance is not a duplicate
)
)
RETURN
// Return total number of duplicates in current context
SUMX ( _DuplicatesPerSerialNumberFrequencyLia, [@CountOfDuplicates] )
This measure produces this result from your sample data:
Duplicate batches
BR
Martin
Hi Martin_D
Thanks for reply! its not exactly what i wanted
As your dax finds duplicates for each frequency separately.
I've created a measure with virtual table where i pivoted freq column. Made columns freq_50, freq_100, freq_200 and created uniqueKey as a combination of serial numer & freq_50 & freq_100 & freq_200 & lia rest is similar, countows and sumx.
| User | Count |
|---|---|
| 8 | |
| 6 | |
| 3 | |
| 3 | |
| 3 |
| User | Count |
|---|---|
| 11 | |
| 9 | |
| 8 | |
| 7 | |
| 6 |