Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

v-rzhou-msft

How to do cumulative counting by group in Power Query

In this article, I will introduce two scenarios for cumulative counts in Power Query. Scenario 1 is to perform grouping and accumulative counting according to other columns. Scenario 2 is an advanced calculation based on scenario 1, which not only performs grouped cumulative counting, but also requires conditional judgement. Generally speaking, both scenarios are cumulatively counted in Power Query. However, the scenario 1 is for the overall situation, and scenario 2 is for specific parts that meet the conditions.

 

Sample Data:

RicoZhou_0-1655889829381.png

 

Scenario 1: 

To get the cumulative count in Power Query according to Id, SubId, Name and time order.

Expected output:

RicoZhou_1-1655889829384.png

 

In this scenario, let’s take Id=1 as an example,

For Id=1, SubId=A, Name=AAA, there are 4 rows, so Expected output returns cumulative counts in time order, 1,2,3,4 (in blue boxes)

For Id=1, SubId=A, Name=BBB, there are 3 rows, so Expected output returns cumulative counts in time order, 1,2,3 (in red boxes)

 

Operations:

Open Advanced Editor, then copy the code in the red box into it.

RicoZhou_2-1655889829387.png

 

The following is the code interpretation

Step 1

First, we filter the table by Id, SubId, Name and Date on the basis of #"Sorted Rows" step, and then get the partial table that meet the conditions.

 

 Table.SelectRows(

                        #"Sorted Rows",

                        each [Id] = r[Id] and [SubId]=r[SubId] and [Name]=r[Name] and [Date] <= r[Date]

                    )

 

 

Step 2

Then, get the [Name] column from the table

 

Table.SelectRows(

                        #"Sorted Rows",

                        each [Id] = r[Id] and [SubId]=r[SubId] and [Name]=r[Name] and [Date] <= r[Date]

                    )[Name]

 

 

Step 3

Then count the column obtained in step 2

 

List.Count(                   

                    Table.SelectRows(

                        #"Sorted Rows",

                        each [Id] = r[Id] and [SubId]=r[SubId] and [Name]=r[Name] and [Date] <= r[Date]

                    )[Name]

                )

 

 

Step 4

Finally, use the Table.AddColumn() function to add a column named "Count" to the table.

 

Table.AddColumn(#"Sorted Rows", "Count",  (r) =>

        List.Count(                   

                    Table.SelectRows(

                        #"Sorted Rows",

                        each [Id] = r[Id] and [SubId]=r[SubId] and [Name]=r[Name] and [Date] <= r[Date]

                    )[Name]

                )

            )

 

 

Step 5

Return the table by #"Add Column" step

 

in

    #"Add Column"

 

 

Output:

RicoZhou_3-1655889829388.png


Scenario 2: 

In Power Query, first determine whether Name is "AAA", if it is, then count cumulatively according to Id, SubId, Name and Date. Otherwise return 0.

Expected output:

RicoZhou_4-1655889829389.png

 

In this scenario, let’s take Id=1 as an example,

For Id=1, SubId=A, Name=AAA, there are 4 rows, so Expected output returns cumulative counts in time order, 1,2,3,4 (in red boxes).

 

Operations:

Open Advanced Editor, then copy the code in the red box into it.

RicoZhou_5-1655889829390.png

 

The following is the code interpretation

Step 1

First, we filter the table by Id, SubId, Name and Date on the basis of #"Sorted Rows" step, and then get the partial table that meet the conditions.

 

Table.SelectRows(

                            #"Sorted Rows",

                            each [Id] = r[Id] and [SubId]=r[SubId] and [Date] <= r[Date]

                        )

 

 

Step 2

Then, get the [Name] column from the table

 

Table.SelectRows(

                            #"Sorted Rows",

                            each [Id] = r[Id] and [SubId]=r[SubId] and [Date] <= r[Date]

                        )[Name]

 

 

Step 3

Then, filter [Name] column where value="AAA"

 

List.LastN(

                        Table.SelectRows(

                            #"Sorted Rows",

                            each [Id] = r[Id] and [SubId]=r[SubId] and [Date] <= r[Date]

                        )[Name],

                        each _ = "AAA"

                    )

 

 

Step 4

Then count the column obtained in step 3

 

List.Count(

                    List.LastN(

                        Table.SelectRows(

                            #"Sorted Rows",

                            each [Id] = r[Id] and [SubId]=r[SubId] and [Date] <= r[Date]

                        )[Name],

                        each _ = "AAA"

                    )

                )

 

 

Step 5

Finally, use the Table.AddColumn() function to add a column named "Count" to the table. If the value= "AAA", then it will accumulate the count, otherwise it will return 0.

 

Table.AddColumn(#"Sorted Rows", "Count",  (r) => if r[Name] = "AAA"

           then List.Count(

                    List.LastN(

                        Table.SelectRows(

                            #"Sorted Rows",

                            each [Id] = r[Id] and [SubId]=r[SubId] and [Date] <= r[Date]

                        )[Name],

                        each _ = "AAA"

                    )

                )

           else 0)

 

 

Step 6

Return the table by #"Add Column" step

 

in

    #"Add Column"

 

 

Output:

RicoZhou_6-1655889829391.png

 

Related links: 

Table.LastN(): Table.LastN - PowerQuery M | Microsoft Docs

Table.SelectRows(): Table.SelectRows - PowerQuery M | Microsoft Docs

List.Count(): List.Count - PowerQuery M | Microsoft Docs

Table.AddColumn(): Table.AddColumn - PowerQuery M | Microsoft Docs

Solved: Consecutive Count by CustomerId, SubscriptionId & ... - Microsoft Power BI Community

 

Author: Xiaojie Tang

Reviewer: Kerry Wang & Ula Huang

Comments