Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount.
Register nowThe Power BI DataViz World Championships are on! With four chances to enter, you could win a spot in the LIVE Grand Finale in Las Vegas. Show off your skills.
Hi,
I am trying to get the min date, max date and last status per project. I would love to be able to do this with columns so that I should shrink my data set by removing duplicates but a measure will work . I can get the min and max dates but the status is not as straightforward. Is there a way to accomplish this with columns or at least with measures?
Project | Created | Status | |
CITEAM-434 | 1/1/2022 | Backlog | |
CITEAM-577 | 1/1/2023 | Backlog | |
CITEAM-434 | 6/1/2022 | In Progress | |
CITEAM-577 | 5/1/2022 | In Progress | |
CITEAM-434 | 1/1/2023 | Testing | |
CITEAM-577 | 9/1/2023 | QA | |
CITEAM-434 | 8/1/2023 | Done | |
Results | |||
Project | Min Date | Max Date | Last Status |
CITEAM-434 | 1/1/2022 | 8/1/2023 | Done |
CITEAM-577 | 5/1/2022 | 9/1/2023 | QA |
Solved! Go to Solution.
If your dataset has duplicate records (same Project, Date, and Status), you may want to remove them. In Power Query, you can do this by selecting Remove Duplicates on the ribbon.
Then you can group the data by Project:
- Click on the Project column.
- Go to the Home tab, and click Group By.
- In the Group By dialog:
- Group by: Project
- New column name: Min Date → Operation: Minimum → Column: Created
- New column name: Max Date → Operation: Maximum → Column: Created
- Click OK to apply the groupings. At this point, you will have the minimum and maximum dates for each project.
Now you need to merge the original table to get the last status:
- Now, you need to retrieve the status corresponding to the maximum date. To do this, merge the grouped table with the original table.
- Click on Home > Merge Queries.
- Merge the grouped table with the original table on:
- The Project column in both tables.
- The Max Date column (from the grouped table) and the Created column (from the original table).
- Choose a Left Join (default).
- After merging, expand the Status column from the original table to bring in the status corresponding to the maximum date.
After expanding, you'll have the Status for the max date. You can rename the columns appropriately, such as changing the new Status column to Last Status.
Don't forget to remove any unnecessary columns if needed.
let
Source = Excel.CurrentWorkbook(){[Name="Table"]}[Content],
RemovedDuplicates = Table.Distinct(Source),
GroupedRows = Table.Group(RemovedDuplicates, {"Project"},
{
{"Min Date", each List.Min([Created]), type date},
{"Max Date", each List.Max([Created]), type date}
}),
MergedTable = Table.NestedJoin(GroupedRows, {"Project", "Max Date"}, RemovedDuplicates, {"Project", "Created"}, "MergedData", JoinKind.LeftOuter),
ExpandedTable = Table.ExpandTableColumn(MergedTable, "MergedData", {"Status"}, {"Last Status"}),
CleanedTable = Table.SelectColumns(ExpandedTable, {"Project", "Min Date", "Max Date", "Last Status"})
in
CleanedTable
Wow. That is great. For my own benefit, is there a way to do that in Power Query? Removing duplicates after that would be ideal for me.
Thanks
If your dataset has duplicate records (same Project, Date, and Status), you may want to remove them. In Power Query, you can do this by selecting Remove Duplicates on the ribbon.
Then you can group the data by Project:
- Click on the Project column.
- Go to the Home tab, and click Group By.
- In the Group By dialog:
- Group by: Project
- New column name: Min Date → Operation: Minimum → Column: Created
- New column name: Max Date → Operation: Maximum → Column: Created
- Click OK to apply the groupings. At this point, you will have the minimum and maximum dates for each project.
Now you need to merge the original table to get the last status:
- Now, you need to retrieve the status corresponding to the maximum date. To do this, merge the grouped table with the original table.
- Click on Home > Merge Queries.
- Merge the grouped table with the original table on:
- The Project column in both tables.
- The Max Date column (from the grouped table) and the Created column (from the original table).
- Choose a Left Join (default).
- After merging, expand the Status column from the original table to bring in the status corresponding to the maximum date.
After expanding, you'll have the Status for the max date. You can rename the columns appropriately, such as changing the new Status column to Last Status.
Don't forget to remove any unnecessary columns if needed.
let
Source = Excel.CurrentWorkbook(){[Name="Table"]}[Content],
RemovedDuplicates = Table.Distinct(Source),
GroupedRows = Table.Group(RemovedDuplicates, {"Project"},
{
{"Min Date", each List.Min([Created]), type date},
{"Max Date", each List.Max([Created]), type date}
}),
MergedTable = Table.NestedJoin(GroupedRows, {"Project", "Max Date"}, RemovedDuplicates, {"Project", "Created"}, "MergedData", JoinKind.LeftOuter),
ExpandedTable = Table.ExpandTableColumn(MergedTable, "MergedData", {"Status"}, {"Last Status"}),
CleanedTable = Table.SelectColumns(ExpandedTable, {"Project", "Min Date", "Max Date", "Last Status"})
in
CleanedTable
Yes, there is a function in Power query called remove duplicates but I think what you truly want in Power query is a "group by". In Transformations if you select "group by" you can go to advanced and it will let you pick more than one column. (The basic just lets you pick one column) You'd want to pick project and status. Then in the bottom pick one aggregation, pick a max on Created date.
There is also a remove duplicates function in the power query transformation but that will not necessarily get you the latest date.
Start by creating a CC :
Min Date =
CALCULATE(
MIN('Table'[Created]),
ALLEXCEPT('Table', 'Table'[Project])
)
and one for the max :
Max Date =
CALCULATE(
MAX('Table'[Created]),
ALLEXCEPT('Table', 'Table'[Project])
)
and one for the last status :
Last Status =
CALCULATE(
LASTNONBLANK('Table'[Status], 1),
FILTER('Table', 'Table'[Created] = CALCULATE(MAX('Table'[Created]), ALLEXCEPT('Table', 'Table'[Project])))
)
and if you want a calculated table :
SummaryTable =
SUMMARIZE(
'Table',
'Table'[Project],
"Min Date", MINX(FILTER('Table', 'Table'[Project] = EARLIER('Table'[Project])), 'Table'[Created]),
"Max Date", MAXX(FILTER('Table', 'Table'[Project] = EARLIER('Table'[Project])), 'Table'[Created]),
"Last Status",
CALCULATE(
MAX('Table'[Status]),
FILTER('Table',
'Table'[Created] =
MAXX(FILTER('Table', 'Table'[Project] = EARLIER('Table'[Project])), 'Table'[Created]))
)
)
I have a clarifying question. There is a way to do this in DAX but if you're trying to shrink the dataset you would want to move the logic back either to the source or to the power query. Are you needing the other dates and statuses at all in the report or do you truly just want the latest for any and all visuals?
Truly the latest but I could see them wanting something for the last status of each quarter for those projects that hang out there for a time. Thanks
User | Count |
---|---|
120 | |
69 | |
68 | |
57 | |
50 |
User | Count |
---|---|
166 | |
82 | |
68 | |
65 | |
54 |