The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi Power BI Community,
I need help with a power query logic.
I currently have a table loaded in power bi which looks something like this:
Candidate Creation Date | Candidate name | Platform id | Platform Access Date | |
abc@gmail.com | 3/5/2023 | abc | ID | 3/4/2023 |
abc@gmail.com | 3/5/2023 | abc | ID | 3/7/2023 |
abc@gmail.com | 3/5/2023 | abc | CB | 3/8/2023 |
abc@gmail.com | 3/5/2023 | abc | CB | 3/9/2023 |
xyz@gmail.com | 6/7/2023 | xyz | CB | 6/10/2023 |
xyz@gmail.com | 6/7/2023 | xyz | AF | 6/6/2023 |
tuv@gmail.com | 4/5/2023 | tuv | ID | 4/1/2023 |
tuv@gmail.com | 4/5/2023 | tuv | ID | 4/3/2023 |
tuv@gmail.com | 4/5/2023 | tuv | ID | 5/4/2023 |
Now I am trying to create this new max date column which would be based on the following logic:
Get me the max date from 'PLatform access date' on a group by over 'Email' and 'Platform Id' but this max date should be lower than 'Candidate Creation date'. Here if there is no 'Platform access date' lower than the 'Candidate creation date', then take the max date from 'Platform creation date'
over that group by.
The new table would look something like this:
Candidate Creation Date | Candidate name | Platform id | Platform Access Date | Max Date | |
abc@gmail.com | 3/5/2023 | abc | ID | 3/4/2023 | 3/4/2023 |
abc@gmail.com | 3/5/2023 | abc | ID | 3/7/2023 | 3/4/2023 |
abc@gmail.com | 3/5/2023 | abc | CB | 3/8/2023 | 3/9/2023 |
abc@gmail.com | 3/5/2023 | abc | CB | 3/9/2023 | 3/9/2023 |
xyz@gmail.com | 6/7/2023 | xyz | CB | 6/10/2023 | 6/10/2023 |
xyz@gmail.com | 6/7/2023 | xyz | AF | 6/6/2023 | 6/6/2023 |
tuv@gmail.com | 4/5/2023 | tuv | ID | 4/1/2023 | 4/3/2023 |
tuv@gmail.com | 4/5/2023 | tuv | ID | 4/3/2023 | 4/3/2023 |
tuv@gmail.com | 4/5/2023 | tuv | ID | 5/4/2023 | 4/3/2023 |
Let me explain this further case by case:
For Candidate abc: For the abc_ID group, there is only one platform access date that is lower than the candidate creation date so we take 3/4/2023. But then for the abc_CB group, there are no platform access dates that are lower than the candidate creation date so we take the max date possible so 3/9/2023 (max of 3/82023 and 3/9/2023).
For Candidate xyz: For the xyz_CB group, there is no platform access date before the candidate creation date so we get the max of the 'platform creation date' for this group which is 6/10/23. Now for the xyz_AF group there is a platform access date lower than the candidate creation date. So I would want the max of all dates before the candidate creation date and in this case since we have only one such date we take 6/6/2023.
For Candidate tuv: For the tuv_ID group, we have two platform access dates that are are lower than the candidate creation date and so we take the max between them which comes to be 4/3/2023. One thing to notice here is that we do in fact have a platform access date (5/4/2023) that is greater than this 4/3/2023 but we know our first check is to make sure that the platform access date is lower or equal to the candidate creation date so 4/3/2023 gets picked as compared to 5/4/2023 which is the max across all dates for this group.
I hope this explains my problem. Let me know if you have further questions.
Thanks!
Solved! Go to Solution.
HI @vedantsri,
You can try to use the following M query formula to create a new column with custom function to loop table records with condition:
#"Added Custom" = Table.AddColumn(
#"Changed Type",
"Custom",
each
let
email = [Email], cDate = [Candidate Creation Date], tb =Table.SelectRows(#"Changed Type", each [Email] = email)
in
let
filtered = Table.SelectRows(tb, each [Platform Access Date] < cDate),
result =
if Table.RowCount(filtered) > 0 then
List.Max(filtered[Platform Access Date])
else
List.Max(tb[Platform Access Date])
in
result
)
Full query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxKdkjPTczM0UvOz1XSUTLWN9U3MjAyBjKBUkDS0wUsagIRjdUhXos5CVqcncCiFqRrsURoqaisQtFiBnOCDkgKpsVM39CABD2ObmBRM4SWktIyFC0mCJcBpWD+N9E3JF2LMalaTOEREwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Email = _t, #"Candidate Creation Date" = _t, #"Candidate name" = _t, #"Platform id" = _t, #"Platform Access Date" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Email", type text}, {"Candidate Creation Date", type date}, {"Candidate name", type text}, {"Platform id", type text}, {"Platform Access Date", type date}}),
#"Added Custom" = Table.AddColumn(
#"Changed Type",
"Custom",
each
let
email = [Email] , cDate = [Candidate Creation Date], tb =Table.SelectRows(#"Changed Type", each [Email] = email)
in
let
filtered = Table.SelectRows(tb, each [Platform Access Date] < cDate),
result =
if Table.RowCount(filtered) > 0 then
List.Max(filtered[Platform Access Date])
else
List.Max(tb[Platform Access Date])
in
result
)
in
#"Added Custom"
BTW, I think Dax expression should more suitable for these type of calculations. If you create a calculated column in data mode table, they will simply than M query formulas and spend less resource.
Regards,
Xiaoxin Sheng
HI @vedantsri,
You can try to use the following M query formula to create a new column with custom function to loop table records with condition:
#"Added Custom" = Table.AddColumn(
#"Changed Type",
"Custom",
each
let
email = [Email], cDate = [Candidate Creation Date], tb =Table.SelectRows(#"Changed Type", each [Email] = email)
in
let
filtered = Table.SelectRows(tb, each [Platform Access Date] < cDate),
result =
if Table.RowCount(filtered) > 0 then
List.Max(filtered[Platform Access Date])
else
List.Max(tb[Platform Access Date])
in
result
)
Full query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxKdkjPTczM0UvOz1XSUTLWN9U3MjAyBjKBUkDS0wUsagIRjdUhXos5CVqcncCiFqRrsURoqaisQtFiBnOCDkgKpsVM39CABD2ObmBRM4SWktIyFC0mCJcBpWD+N9E3JF2LMalaTOEREwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Email = _t, #"Candidate Creation Date" = _t, #"Candidate name" = _t, #"Platform id" = _t, #"Platform Access Date" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Email", type text}, {"Candidate Creation Date", type date}, {"Candidate name", type text}, {"Platform id", type text}, {"Platform Access Date", type date}}),
#"Added Custom" = Table.AddColumn(
#"Changed Type",
"Custom",
each
let
email = [Email] , cDate = [Candidate Creation Date], tb =Table.SelectRows(#"Changed Type", each [Email] = email)
in
let
filtered = Table.SelectRows(tb, each [Platform Access Date] < cDate),
result =
if Table.RowCount(filtered) > 0 then
List.Max(filtered[Platform Access Date])
else
List.Max(tb[Platform Access Date])
in
result
)
in
#"Added Custom"
BTW, I think Dax expression should more suitable for these type of calculations. If you create a calculated column in data mode table, they will simply than M query formulas and spend less resource.
Regards,
Xiaoxin Sheng
Hi Please can someone help me with this?