The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
I want to use "Group By" in Power Query, but I don't want to aggregate anything. A loan number can have several rows of old data in a table before it gets to the current row with the most recent status, and if the most recent row shows "PAIDINFULL", I want ALL the rows associated with that loan number filtered out. How can I group the loan numbers together by the most recent status?
(I'm loosely following this approach: https://community.fabric.microsoft.com/t5/Desktop/Filter-and-remove-data-in-Power-Query/m-p/2429554#...)
Sample Data:
Loan | Status | Accounting Date |
1 | REPAYMENT | 4/30/2019 |
1 | REPAYMENT | 7/31/2020 |
1 | PAIDINFULL | 12/31/2024 |
2 | REPAYMENT | 6/30/2019 |
2 | REPAYMENT | 7/31/2019 |
2 | REPAYMENT | 1/31/2022 |
2 | PAIDINFULL | 7/31/2024 |
3 | REPAYMENT | 3/31/2020 |
3 | REPAYMENT | 8/31/2020 |
4 | REPAYMENT | 11/30/2021 |
4 | REPAYMENT | 6/30/2023 |
4 | REPAYMENT | 5/31/2024 |
Edit: for some reason my table looks like sh#t. here's a screenshot:
Thanks.
Solved! Go to Solution.
Try this Power Query solution. The original query is named Raw.
Create query MaxDate:
let
Source = Raw,
GroupRows = Table.Group(
Source,
{"Loan"},
{
{"MaxDate", each List.Max([Accounting Date]), type nullable date},
{
"All",
each _,
type table [Loan = nullable number, Status = nullable text, Accounting Date = nullable date]
}
}
),
ExpandTable = Table.ExpandTableColumn(GroupRows, "All", {"Status"}, {"Status"}),
FilterPaidInFull = Table.SelectRows(ExpandTable, each ([Status] = "PAIDINFULL"))
in
FilterPaidInFull
Create query below. This query uses a left anti-join to exclude any loans that are in MaxDate (these are the PAIDINFULL loans). You can disable the load of queries Raw and MaxDate so only the final query will appear in your model.
let
Source = Table.NestedJoin(Raw, {"Loan"}, MaxDate, {"Loan"}, "MaxDate", JoinKind.LeftAnti),
RemoveMaxDate = Table.RemoveColumns(Source, {"MaxDate"})
in
RemoveMaxDate
Proud to be a Super User!
Try this Power Query solution. The original query is named Raw.
Create query MaxDate:
let
Source = Raw,
GroupRows = Table.Group(
Source,
{"Loan"},
{
{"MaxDate", each List.Max([Accounting Date]), type nullable date},
{
"All",
each _,
type table [Loan = nullable number, Status = nullable text, Accounting Date = nullable date]
}
}
),
ExpandTable = Table.ExpandTableColumn(GroupRows, "All", {"Status"}, {"Status"}),
FilterPaidInFull = Table.SelectRows(ExpandTable, each ([Status] = "PAIDINFULL"))
in
FilterPaidInFull
Create query below. This query uses a left anti-join to exclude any loans that are in MaxDate (these are the PAIDINFULL loans). You can disable the load of queries Raw and MaxDate so only the final query will appear in your model.
let
Source = Table.NestedJoin(Raw, {"Loan"}, MaxDate, {"Loan"}, "MaxDate", JoinKind.LeftAnti),
RemoveMaxDate = Table.RemoveColumns(Source, {"MaxDate"})
in
RemoveMaxDate
Proud to be a Super User!
Thank-you! I got it to work! (although I had to change Loan to nullable text). Sweet!
EDIT: I just checked, and my table isn't dynamic, isn't hasn't updated with the 5 payouts done this morning, so instead of Duplicating LOANHIST to create MaxDate, I should probably Reference it, right?
Correct, use "Reference".
Proud to be a Super User!