Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Hi All,
I have data from multiple sources of Excel files, which the data is updated manually for each column depending on the system cut-off time. some time complete files or only provide certain columns and I am using folder updates to merge and transform.
Its impact on the data that is duplicated for each column.
I am trying to create unique value for each event data from the latest recent data. I expect the most effective way to create the new table since the system is given that make me can not adjust the source 😞
Thank you in advance!
Here is the case
sales table :
event_date | sold1 | sold2 | paid1 | paid2 | update_date |
1-Jan-22 | 10 | 10 | 5 | 5 | 1-Jan-22 |
1-Jan-22 | 11 | 6 | 2-Jan-22 | ||
1-Jan-22 | 12 | 7 | 3-Jan-22 | ||
1-Jan-22 | 15 | 4-Jan-22 | |||
1-Jan-22 | 12 | 5-Jan-22 | |||
1-Jan-22 | 3 | 6-Jan-22 | |||
2-Jan-22 | 10 | 10 | 5 | 5 | 2-Jan-22 |
2-Jan-22 | 1 | 1 | 3-Jan-22 | ||
2-Jan-22 | 2 | 4-Jan-22 | |||
1-Jan-22 | 15 | 10 | 5-Jan-22 | ||
3-Jan-22 | 30 | 1 | 3-Jan-22 | ||
3-Jan-22 | 10 | 5 | 3-Jan-22 | ||
3-Jan-22 | 2 | 19 | 4-Jan-22 | ||
3-Jan-22 | 2 | 10 | 4-Jan-22 |
there are some rules applied to the output
1. The unique key is the event_date
2. If the latest update_date give the blank value
--> use the previous value from the previous update_date
--> case event_date 1 Jan 2022, latest_update date is 6 Jan. But Only has the paid2 value.
---> other column value fulfill from the previous update_date
3. if the latest update_date has duplicate rows then take the highest value.
new-table
Expected output:
event_date | sold1 | sold2 | paid1 | paid2 | update_date |
1-Jan-22 | 15 | 10 | 7 | 3 | 6-Jan-22 |
2-Jan-22 | 1 | 10 | 5 | 2 | 4-Jan-22 |
3-Jan-22 | 2 | 19 | 1 | 5 | 4-Jan-22 |
Solved! Go to Solution.
Hi @bacuy ,
I have two solutions here for your reference,please check.
Measure paid1=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [paid1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid1] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Measure paid2=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [paid2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid2] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Measure sold1=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [sold1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold1] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Measure sold2=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [sold2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold2] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Output:
New Table=
ADDCOLUMNS (
DISTINCT ( 'sales table'[event_date] ),
"sold1",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [sold1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold1] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"sold2",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [sold2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold2] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"paid1",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [paid1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid1] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"paid2",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [paid2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid2] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"update_date",
CALCULATE (
MAX ( 'sales table'[update_date] ),
ALLEXCEPT ( 'sales table', 'sales table'[event_date] )
)
)
Output:
Best Regards,
Jinwei Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @v-jinweili-msft,
I'm sorry... I Have a problem here...
When I try to make the total sold from the new table, the measurement is not what I expected.
kindly need your help
Hi @bacuy ,
I have two solutions here for your reference,please check.
Measure paid1=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [paid1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid1] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Measure paid2=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [paid2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid2] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Measure sold1=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [sold1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold1] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Measure sold2=
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = MAX ( 'sales table'[event_date] )
&& [sold2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold2] ),
FILTER (
'sales table',
'sales table'[update_date] = _date
&& 'sales table'[event_date] = MAX ( 'sales table'[event_date] )
)
)
Output:
New Table=
ADDCOLUMNS (
DISTINCT ( 'sales table'[event_date] ),
"sold1",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [sold1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold1] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"sold2",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [sold2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[sold2] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"paid1",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [paid1] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid1] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"paid2",
VAR _date =
CALCULATE (
LASTNONBLANK ( 'sales table'[update_date], TRUE () ),
FILTER (
'sales table',
[event_date] = EARLIER ( [event_date] )
&& [paid2] <> BLANK ()
)
)
RETURN
CALCULATE (
MAX ( 'sales table'[paid2] ),
FILTER (
'sales table',
'sales table'[event_date] = EARLIER ( [event_date] )
&& 'sales table'[update_date] = _date
)
),
"update_date",
CALCULATE (
MAX ( 'sales table'[update_date] ),
ALLEXCEPT ( 'sales table', 'sales table'[event_date] )
)
)
Output:
Best Regards,
Jinwei Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
User | Count |
---|---|
117 | |
73 | |
58 | |
49 | |
48 |
User | Count |
---|---|
171 | |
122 | |
60 | |
59 | |
56 |