Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi there,
We use a lot of COUNTIFS() in Excel, because we have to extract unique occurrences of employees in order to make sure they don't appear multiple times.
For example, in this table:
Employee | Counter | Days |
A | 1 | 1 |
A | 2 | 5 |
B | 1 | 6 |
B | 2 | 1 |
C | 1 | 1 |
It should output for example:
Number of employees ==> 3
Number of different counters ==> 2
Sum of days == >14
So we use ye olde method of 1/COUNTIFS() and then sum on that column. I know, not ideal, but it works.
I found a method in M language on this forum and StackOverflow, based on this code:
List.Count(
Table.SelectRows(
#"PREVIOUS_STEP",
(C) => [Matricule] = C[VALUE_1] and [VALUE_2] = (C)[VALUE_2]
)[VALUE_1]
)
Problem is, it is very, very slow.
Isn't there any way to do it faster and in a simpler way in Power Query? We do not want to use DAX for this, as we need the data in Excel.
Thanks in advance for any help.
Solved! Go to Solution.
OK, so I started from scratch just to make sure I was working through the issue correctly. I've attached the working PBIX at the bottom so I won't post each individual bit of code here.
- Your Days table I didn't do anything with directly, but is used later in a crossjoin.
- Your Events table I did some pre-prep on to get it looking like this:
- Your Employees table I used as the base for the final process, crossjoining the Dates table and merging the prepped Events table, before double-grouping to generate the daily stats.
Here's the output:
I think you'd got most of the way here already, so it's probably just the final groupings in the Employees table that are relevant. The first grouping just gets you to a single record per employee/date so you're not double-counting. The second grouping uses custom List.Counts to pick out the stats you're after.
Pete
Proud to be a Datanaut!
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIE41gdCM8IiE3BPCeonBmcZwRX6YzQFwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Employee = _t, Counter = _t, Days = _t]),
ChangedType = Table.TransformColumnTypes(Source,{{"Days", Int64.Type}}),
DistEmployee = List.Count(List.Distinct(ChangedType[Employee])),
DistCounter = List.Count(List.Distinct(ChangedType[Counter])),
SumOfDays = List.Sum(ChangedType[Days]),
Result = Table.FromRecords({[NoEmployee = DistEmployee, NoCounter = DistCounter, SumOfDays = SumOfDays]})
in
Result
*EDIT* Ignore this post, check out my reply to this post below - much simpler and maintains query folding
------------------------------------------------------------------------------------------
Cool, thanks for that.
Try this out:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJUitWBsEzBLCcgywzOgsg6Q1ixAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Employee = _t, Days = _t]),
chgTypes = Table.TransformColumnTypes(Source,{{"Employee", type text}, {"Days", Int64.Type}}),
groupEmployeeData = Table.Group(chgTypes, {"Employee"}, {{"data", each _, type table [Employee=nullable text, Counter=nullable number, Days=nullable number]}}),
addNestedCounter = Table.TransformColumns(groupEmployeeData, {"data", each Table.AddIndexColumn(_, "Counter", 1, 1)}),
expandDataCol = Table.ExpandTableColumn(addNestedCounter, "data", {"Days", "Counter"}, {"Days", "Counter"}),
addGroupMe = Table.AddColumn(expandDataCol, "GroupMe", each "X"),
groupRows = Table.Group(addGroupMe, {"GroupMe"}, {{"NoEmployee", each List.Count(List.Distinct(_[Employee])), Int64.Type}, {"NoCounter", each List.Max([Counter]), type number}, {"SumOfDays", each List.Sum([Days]), type number}})
in
groupRows
Summary:
groupEmployeeData = Nest a table of data for each employee
addNestedCounter = Add an Index column to each employee table for the counter
expandDataCol = Reinstate nested tables/columns to main table
addGroupMe = Add a single-value column on which to group the whole table
groupRows = Group no single-value column using MAX of [Counter], SUM of [Days], and a custom group using List.Count(List.Distinct(... on [Employee]
It *should* be faster than a custom function, but too many variables to know for sure.
Pete
Proud to be a Datanaut!
Hello everyone,
Sorry for my lack of precise information.
You will find here an Excel file with what I want to do (the final query is the one named Employees).
In this very simple example, I have a list of employees who will be away during the week of the 26/12, and I need to know how many. In the events' list, there are some of them valued as 1/2 days, so when I merge everything I have multiple rows for a single employee, so I need a COUNTIFS() to bypass this issue and count them as 1 per day.
Thanks in advance!
OK, so a textbook example of an XY Problem.
On your shared file, which of the tables is our starting point i.e. which tables are the closest to your raw source?
On your Events table, what do the different [Types] mean? Do these signify multiple-day, single-day, and part-day absence?
Regarding the endpoint, I guess you want the date, number of employees (headcount), and number of days' worth of absence?
Pete
Proud to be a Datanaut!
Sorry, I did that in a rush.
1. All dates from the Days table are developed in the Employees table (with a simple Key = 1)
2. Because there is a start and end date, all the events from the table Events are developed (eg. if there is RTJ from 26/12 to 29/12, 4 rows are created). These events mean that employees are away from office
3. If there is M (morning) or A (afternoon) in the Type column of the event, it means it is valued as half a day of leave, so it's valued as 0.5
4. These days are merged with the days of each employee, in order to know if they are here or not on the aforementioned days. If there is an event without type then 1, with a type then 0.5, no event then 0
And yes, your endpoint is right. Thanks!!
OK, so I started from scratch just to make sure I was working through the issue correctly. I've attached the working PBIX at the bottom so I won't post each individual bit of code here.
- Your Days table I didn't do anything with directly, but is used later in a crossjoin.
- Your Events table I did some pre-prep on to get it looking like this:
- Your Employees table I used as the base for the final process, crossjoining the Dates table and merging the prepped Events table, before double-grouping to generate the daily stats.
Here's the output:
I think you'd got most of the way here already, so it's probably just the final groupings in the Employees table that are relevant. The first grouping just gets you to a single record per employee/date so you're not double-counting. The second grouping uses custom List.Counts to pick out the stats you're after.
Pete
Proud to be a Datanaut!
Ignore post above, you can do this with just a double-group, which will also maintain query folding:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJUitWBsEzBLCcgywzOgsg6Q1ixAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Employee = _t, Days = _t]),
chgTypes = Table.TransformColumnTypes(Source,{{"Employee", type text}, {"Days", Int64.Type}}),
groupEmployeeData = Table.Group(chgTypes, {"Employee"}, {{"NoCounter", each Table.RowCount(_), Int64.Type}, {"SumOfDays", each List.Sum([Days]), type nullable number}}),
addGroupMe = Table.AddColumn(groupEmployeeData, "GroupMe", each "X"),
groupTable = Table.Group(addGroupMe, {"GroupMe"}, {{"NoEmployee", each Table.RowCount(_), Int64.Type}, {"NoCounter", each List.Max([NoCounter]), type number}, {"SumOfDays", each List.Sum([SumOfDays]), type nullable number}})
in
groupTable
Pete
Proud to be a Datanaut!
Hi @Spigaw ,
The counter can be done quickly using a nested index, but what was it you wanted done with the days?
Can you give a small example of your original data, and an example of the expected output please?
Pete
Proud to be a Datanaut!
Check out the July 2025 Power BI update to learn about new features.