March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
I'm having trouble grouping my data to meet the following requirement.
Source | FK | Note | Created |
---|---|---|---|
a | 1 | abc | 14-Mar |
a | 1 | abc | 14-Mar |
b | 1 | abc | 14-Mar |
b | 1 | xyz | 14-Mar |
c | 1 | abc | 14-Mar |
c | 1 | klm | 14-Mar |
I have data coming from various sources (.csv files), and records are not necessarily unique. In the table above, the 1st and 2nd records are separate records, even though they contain the same data, because they were retrieved from the same source "a". The 3rd and 4th rows are from last week's source "b", which may contain the same data as this week's source and may also contain data that has since been deleted.
Above, the 3rd row looks identical to the 1st or 2nd row, so it is a duplicate, and the 4th row is missing from the new source, so it is a deletion. The 5th row is also a duplicate of the 1st or 2nd row, while the 6th is, again, a unique record.
I have no reason to keep duplicate data, but I do want to retain apparent duplicates that come from the same source, along with any new data. So, how would I keep the 1st, 2nd, 4th and 6th rows?
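For anyone who wants to reproduce the sample, the table above could be recreated as a calculated table along these lines. This is only a sketch: the table name `Table` and the choice to store Created as text (the year is not given) are assumptions.

```dax
Table =
DATATABLE (
    "Source", STRING,
    "FK", INTEGER,
    "Note", STRING,
    "Created", STRING, -- assumption: kept as text because no year is given
    {
        { "a", 1, "abc", "14-Mar" },
        { "a", 1, "abc", "14-Mar" },
        { "b", 1, "abc", "14-Mar" },
        { "b", 1, "xyz", "14-Mar" },
        { "c", 1, "abc", "14-Mar" },
        { "c", 1, "klm", "14-Mar" }
    }
)
```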
Well, with that dearth of replies, I've decided to do the following
Hi @hansei ,
First, create an index column, which is used later to identify the latest source:
Next, add a column named filter. It returns 1 when the row's Note value has already appeared in an earlier row and the row's Source differs from the latest source; otherwise it returns 0. Finally, create a calculated table that keeps only the rows where filter equals 0:
Table 2 =
VAR f =
    ADDCOLUMNS (
        'Table',
        "filter",
            -- Index of the current row
            VAR a = 'Table'[Index]
            -- Note values that appear in any earlier row
            VAR b =
                CALCULATETABLE (
                    DISTINCT ( 'Table'[Note] ),
                    FILTER ( 'Table', 'Table'[Index] < a )
                )
            -- Latest source, taken from the row with Index 0
            VAR c =
                CALCULATE ( MAX ( 'Table'[Source] ), FILTER ( 'Table', 'Table'[Index] = 0 ) )
            RETURN
                IF ( 'Table'[Note] IN b && 'Table'[Source] <> c, 1, 0 )
    )
RETURN
    FILTER ( f, [filter] = 0 )
Please refer to the pbix file: https://qiuyunus-my.sharepoint.com/:u:/g/personal/pbipro_qiuyunus_onmicrosoft_com/ESq2wtkC4XFMhZcUOn...
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Best Regards,
Dedmon Dai
I cannot have a static solution based on a,b,c. There may be hundreds of sources.
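One possible way to avoid naming any particular source is to drop a row only when a newer source already holds the same data. The following is a minimal sketch, not a tested solution for the real files. It assumes that Source values sort newest-first (as a, b, c do in the sample), so with real data a load-date or sequence column would probably have to take the place of the `'Table'[Source] < CurSource` comparison. It also assumes that FK, Note and Created together define "the same data". The name `Table 3` is only illustrative.

```dax
Table 3 =
FILTER (
    'Table',
    -- Capture the current row's values before scanning the table again.
    VAR CurSource = 'Table'[Source]
    VAR CurFK = 'Table'[FK]
    VAR CurNote = 'Table'[Note]
    VAR CurCreated = 'Table'[Created]
    -- Rows with the same data that come from a newer source.
    -- Assumption: a "smaller" Source value means a newer source.
    VAR NewerCopies =
        FILTER (
            'Table',
            'Table'[Source] < CurSource
                && 'Table'[FK] = CurFK
                && 'Table'[Note] = CurNote
                && 'Table'[Created] = CurCreated
        )
    RETURN
        -- Keep the row only when no newer source already contains it.
        ISEMPTY ( NewerCopies )
)
```

On the sample table this keeps the 1st, 2nd, 4th and 6th rows. Because rows are only compared against rows from other sources, repeated rows within one source are all retained, however many sources there are.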