Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
LindaJonesCT
Helper I
Helper I

Latest Status (for Candidate Tracking)

I am working on a candidate tracking report and am struggling with the logic of how to report on the number of candidates in each status  (as of a selected date)

 

The data looks like this  (simplified)

CandidateDateStatus
4501/15/23A
4501/15/23B
4501/31/23C
50012/31/22A

 

If the selected date on the report is 12/31/22  - then 1 Candidate is in Status A

on 1/15/23 1 is in Status A and 1  is in Status B

 

(Max status is used so for the status change on 1/15 "B" should be used

 

Thanks in advance.

1 ACCEPTED SOLUTION
Martin_D
Super User
Super User

Hi LindaJonesCT,

 

This scenario is a typical application of a snapshot fact table. That means, you would transform your status event table into a table that contains for all dates that you want to analyze, all candidates and all their states that were valid at that date. You could create these snapshots for every date, once a week, once a month, at every date when a new status was entered, whatever you prefer. Then you could draw a visual by status and just filter by date and show distinct count of candidates as values.

 

But you can also use your transactional table structure with a measure.

 

To do so, first, you need to make sure that selecting a date does not also implicitly select the states, otherwise you are not able to see values for states that have not been entered by any candidate at the selected date. You can achieve this by creating a date dimension table, or a status dimension table, or both. I'd start with a date table, because this also allows you to analyze dates at which no new states were assigned. The date table is just a disconnected Table containing all the dates that you want to analyze.

 

The simplest way to create the date table is to create a calculated table using the formular:

Dates = CALENDARAUTO()

Then change the datatype of the Date column in the new table from date/time to date.

 

Alternatively, you can also create a custom date table in PowerQuery.

 

Finally, doublecheck in data model view that there is no relationship between the new date table und your existing table.

 

Next you need to create a measure:

Count of Candidate with Date Table = 

    // Get selected date
    VAR _Date = MAX ( 'Date'[Date] )

    // Get selected states
    VAR _States = VALUES ( 'Candidate Status Changes'[Status] )

    // Per candidate, get the latest date of new stats change
    VAR _CandidatesAndDates = 
        CALCULATETABLE (
            ADDCOLUMNS (
                SUMMARIZECOLUMNS (
                    'Candidate Status Changes'[Candidate]
                ),
                "@LastDate",
                CALCULATE (
                    MAX ( 'Candidate Status Changes'[Date] )
                )
            ),
            'Candidate Status Changes'[Date] <= _Date,
            ALL ( 'Candidate Status Changes'[Status] )
        )

    // Per candidate, get max. status set at last date
    VAR _CandidatesAndStatus =
        ADDCOLUMNS (
            _CandidatesAndDates,
            "@MaxStatus",
            VAR _Date = [@LastDate]
            RETURN
            CALCULATE (
                MAX ( 'Candidate Status Changes'[Status] ),
                'Candidate Status Changes'[Date] = _Date,
                ALL ( 'Candidate Status Changes'[Status] )
            )
        )

    RETURN

    COUNTROWS ( FILTER ( _CandidatesAndStatus, [@MaxStatus] IN _States ) )

Now you can create a visual using status and this measure and add a date filter (from the date table) to your page and select your date of interest.

 

In this file you can explore both approaches, the one with the snapshot fact table and the one with the measure for your table as described above.

View solution in original post

4 REPLIES 4
Martin_D
Super User
Super User

Hi LindaJonesCT,

 

This scenario is a typical application of a snapshot fact table. That means, you would transform your status event table into a table that contains for all dates that you want to analyze, all candidates and all their states that were valid at that date. You could create these snapshots for every date, once a week, once a month, at every date when a new status was entered, whatever you prefer. Then you could draw a visual by status and just filter by date and show distinct count of candidates as values.

 

But you can also use your transactional table structure with a measure.

 

To do so, first, you need to make sure that selecting a date does not also implicitly select the states, otherwise you are not able to see values for states that have not been entered by any candidate at the selected date. You can achieve this by creating a date dimension table, or a status dimension table, or both. I'd start with a date table, because this also allows you to analyze dates at which no new states were assigned. The date table is just a disconnected Table containing all the dates that you want to analyze.

 

The simplest way to create the date table is to create a calculated table using the formular:

Dates = CALENDARAUTO()

Then change the datatype of the Date column in the new table from date/time to date.

 

Alternatively, you can also create a custom date table in PowerQuery.

 

Finally, doublecheck in data model view that there is no relationship between the new date table und your existing table.

 

Next you need to create a measure:

Count of Candidate with Date Table = 

    // Get selected date
    VAR _Date = MAX ( 'Date'[Date] )

    // Get selected states
    VAR _States = VALUES ( 'Candidate Status Changes'[Status] )

    // Per candidate, get the latest date of new stats change
    VAR _CandidatesAndDates = 
        CALCULATETABLE (
            ADDCOLUMNS (
                SUMMARIZECOLUMNS (
                    'Candidate Status Changes'[Candidate]
                ),
                "@LastDate",
                CALCULATE (
                    MAX ( 'Candidate Status Changes'[Date] )
                )
            ),
            'Candidate Status Changes'[Date] <= _Date,
            ALL ( 'Candidate Status Changes'[Status] )
        )

    // Per candidate, get max. status set at last date
    VAR _CandidatesAndStatus =
        ADDCOLUMNS (
            _CandidatesAndDates,
            "@MaxStatus",
            VAR _Date = [@LastDate]
            RETURN
            CALCULATE (
                MAX ( 'Candidate Status Changes'[Status] ),
                'Candidate Status Changes'[Date] = _Date,
                ALL ( 'Candidate Status Changes'[Status] )
            )
        )

    RETURN

    COUNTROWS ( FILTER ( _CandidatesAndStatus, [@MaxStatus] IN _States ) )

Now you can create a visual using status and this measure and add a date filter (from the date table) to your page and select your date of interest.

 

In this file you can explore both approaches, the one with the snapshot fact table and the one with the measure for your table as described above.

Thanks Martin - I had no idea you could do that kind of work in Power Query.  (I am used to doing the heavy lifting in SQL, but that's not an option for this task)  The data I am working on is already several million rows - so I'm going to try the measure version first.  I will let you know how I make out later today.

Yes, the size of the table can grow a lot with the snapshot structure. Therefore the DAX code is much simpler and faster. You have to expect number of candidates x number of dates to analyze rows. There is no upper row limit in Power BI, so you can always give it a try. It all depends on how well the data can be compressed.
The first step to reduce the size would usually be to reduce to snapshot frequency: Instead of once a day, maybe once a week or once a month is sufficient. Next consideration could be to limit the history: Maybe the past two or three years are sufficient. It all depends on your requirements.

 

I'd really be curious to hear from you whether the snapshot approach worked with your amount of data - loading duration wise as well as file size wise.

As time permits I will try the snapshot approach and post an update.  Thank you so much for your help.  

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.