Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
JPNK
Frequent Visitor

Count each occurrence

Hello all, 

I've been trying for days to get my problem solved. I've looked at several tutorials, seen many YT videos, searched the Community Forum and nothing. There are similar things to what I need but they don't produce the result I need. I don't know if this is very specific, or if I just didn't use the correct key words in my search. Anyway, after days of searching and not finding anything, I'm posting here to see if anyone can help me.

 

I have this table in Power BI called login_list

 

   file_date        Option            email                                 sorted_list
10-Jan-2022        O1       aaa@mail.com          10-Jan-2022-O1-aaa@mail.com
10-Jan-2022        O2       aaa@mail.com          10-Jan-2022-O2-aaa@mail.com
10-Jan-2022        O1       bbb@mail.com         10-Jan-2022-O1-bbb@mail.com
10-Jan-2022        O1       ccc@mail.com          10-Jan-2022-O1-ccc@mail.com
02-Mar-2022       O1       bbb@mail.com         02-Mar-2022-O1-bbb@mail.com
02-Mar-2022       O2       bbb@mail.com         02-Mar-2022-O2-bbb@mail.com
02-Mar-2022       O2       ddd@mail.com         02-Mar-2022-O2-ddd@mail.com
02-Mar-2022       O1       eee@mail.com          02-Mar-2022-O1-eee@mail.com
06-Jul-2022         O1       aaa@mail.com          06-Jul-2022-O1-aaa@mail.com
06-Jul-2022         O1       bbb@mail.com         06-Jul-2022-O1-bbb@mail.com
06-Jul-2022         O1       ccc@mail.com          06-Jul-2022-O1-ccc@mail.com
06-Jul-2022         O2       aaa@mail.com          06-Jul-2022-O2-aaa@mail.com
06-Jul-2022         O2       bbb@mail.com         06-Jul-2022-O2-bbb@mail.com
06-Jul-2022         O2       ddd@mail.com         06-Jul-2022-O2-ddd@mail.com

 

login_list[file_date] = date of login

login_list[option] = system option of login

login_list[email] = user login name

login_list[sorted_list] = "just some concatenation of all the data in a single place"

 

Now the part that I'm trying to figure out is how to count the nth time a user has logged on, let me give you an example of what I mean, I want to create a Count column (see example below) that would have these type of results.

 

   file_date        Option            email                                  sorted_list                      Count
10-Jan-2022         O1       aaa@mail.com         10-Jan-2022-O1-aaa@mail.com         1
10-Jan-2022         O2       aaa@mail.com         10-Jan-2022-O2-aaa@mail.com         2
10-Jan-2022         O1       bbb@mail.com        10-Jan-2022-O1-bbb@mail.com        1
10-Jan-2022         O1       ccc@mail.com         10-Jan-2022-O1-ccc@mail.com          1
02-Mar-2022        O1       bbb@mail.com        02-Mar-2022-O1-bbb@mail.com       2
02-Mar-2022        O2       bbb@mail.com        02-Mar-2022-O2-bbb@mail.com       3
02-Mar-2022        O2       ddd@mail.com        02-Mar-2022-O2-ddd@mail.com       1
02-Mar-2022        O1       eee@mail.com         02-Mar-2022-O1-eee@mail.com        1
06-Jul-2022          O1       aaa@mail.com         06-Jul-2022-O1-aaa@mail.com          3
06-Jul-2022          O1       bbb@mail.com        06-Jul-2022-O1-bbb@mail.com         4
06-Jul-2022          O1       ccc@mail.com         06-Jul-2022-O1-ccc@mail.com           2
06-Jul-2022          O2       aaa@mail.com         06-Jul-2022-O2-aaa@mail.com          4
06-Jul-2022          O2       bbb@mail.com        06-Jul-2022-O2-bbb@mail.com         5
06-Jul-2022          O2       ddd@mail.com        06-Jul-2022-O2-ddd@mail.com         2

 

So as you can see from the login_list[Count] column, the value 1 means that this is the 1st time this user logged on, value 2 is 2nd time, 3 is 3rd, and so on and so forth.

At the end of the day I want to go to a specific line and say, for instance, that on July 6th the user aaa@mail.com logged in the application for the 4th time or that bbb logged in for the 5th time and both used authentication method 2.

 

What is the simplest way to achieve this? (to get the login_list[Count] column)

Thank you 

1 ACCEPTED SOLUTION
JPNK
Frequent Visitor

Actually just figured out another way based on what @MarkLaf just said. It's not a one step method as you explained but I believe it's simpler to wrap your head around it.

So after trying to sort it unsuccessfully, I figured that the first thing to do was to actually sort the table correctly.

1 - I split the login_list[file_date] using the "/" delimiter and renamed the outputs to "file_date.year","file_date.month","file_date.day"

2 - Deleted the current  login_list[sorted_list] 

3 - Created a new login_list[sorted_list] in PowerQuery editor by concatenating login_list[file_date.year]&login_list[file_date.month]&login_list[file_date.day]&login_list[option]&login_list[email]

4 - Sorted by login_list[sorted_list]

6 - Added an index to the table

7 - Closed PowerQuery editor and in the data view I created the new "count" field:

count = CALCULATE(COUNT(login_list[email]),FILTER(login_list,login_list[email] = EARLIER(login_list[email]) && login_list[Index] <= EARLIER(login_list[Index])))

 

 

And it appears to be working...

 

Thanks for your suggestions

 

NOTE: props to @v-deddai1-msft for his suggestion on https://community.powerbi.com/t5/Desktop/Nth-Occurrence-of-ID-by-date-or-Index/td-p/1169286

 

View solution in original post

6 REPLIES 6
JPNK
Frequent Visitor

Actually just figured out another way based on what @MarkLaf just said. It's not a one step method as you explained but I believe it's simpler to wrap your head around it.

So after trying to sort it unsuccessfully, I figured that the first thing to do was to actually sort the table correctly.

1 - I split the login_list[file_date] using the "/" delimiter and renamed the outputs to "file_date.year","file_date.month","file_date.day"

2 - Deleted the current  login_list[sorted_list] 

3 - Created a new login_list[sorted_list] in PowerQuery editor by concatenating login_list[file_date.year]&login_list[file_date.month]&login_list[file_date.day]&login_list[option]&login_list[email]

4 - Sorted by login_list[sorted_list]

6 - Added an index to the table

7 - Closed PowerQuery editor and in the data view I created the new "count" field:

count = CALCULATE(COUNT(login_list[email]),FILTER(login_list,login_list[email] = EARLIER(login_list[email]) && login_list[Index] <= EARLIER(login_list[Index])))

 

 

And it appears to be working...

 

Thanks for your suggestions

 

NOTE: props to @v-deddai1-msft for his suggestion on https://community.powerbi.com/t5/Desktop/Nth-Occurrence-of-ID-by-date-or-Index/td-p/1169286

 

MarkLaf
Solution Sage
Solution Sage

Nested join the tabe to itself, filter the nested tables based off criteria, then perform a row count on each filtered nested table:

let
    RelatedAll = Table.NestedJoin(login_list, "email", login_list, "email", "Count"),
    RelatedPrevious = 
    Table.ReplaceValue( 
        RelatedAll, 
        each [Count], 
        (row)=> Table.SelectRows(
            row[Count], 
            each [file_date] < row[file_date] 
            or [file_date] = row[file_date] and [Option] <= row[Option]
        ), 
        Replacer.ReplaceValue, 
        {"Count"} 
    ),
    Types = Value.ReplaceType( RelatedPrevious, Value.Type( RelatedAll ) ),
    CountRelatedPrevious = Table.TransformColumns(Types, {"Count", Table.RowCount, Int64.Type})
in
    CountRelatedPrevious

 

Output:

MarkLaf_0-1677519781233.png

 

mussaenda
Super User
Super User

Hi @JPNK ,

 

I just dont understand the sorting of your rows.

on Jan 10, 10-Jan-2022-O2-aaa@mail.com is on 2nd row while  10-Jan-2022-O1-bbb@mail.com is on 3rd row

 

but on 06 Jul, 06-Jul-2022-O1-ccc@mail.com  coame first before 06-Jul-2022-O2-aaa@mail.com  

 

I am able to create your request but I have the discrepancy on your count 

mussaenda_0-1677507040371.png

 

 

Hi @mussaenda thanks for your feedback but the table isn't sorted. The data is as ingested from the csv files, the reason why the "sorted_list" was created was to sort the table later on.

mussaenda
Super User
Super User

Hi @JPNK ,

 

I just dont understand the sorting of your rows.

on Jan 10, 10-Jan-2022-O2-aaa@mail.com is on 2nd row while  10-Jan-2022-O1-bbb@mail.com is on 3rd row

 

but on 06 Jul, 06-Jul-2022-O1-ccc@mail.com  coame first before 06-Jul-2022-O2-aaa@mail.com  

 

I am able to create your request but I have the discrepancy on your count 

mussaenda_0-1677507040371.png

 

 

Hi @mussaenda thanks for your feedback but the table isn't sorted. The data is as ingested from the csv files, the reason why the "sorted_list" was created was to sort the table later on.

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

Top Solution Authors
Top Kudoed Authors