Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
E_K_
Helper II
Helper II

How to calculate percentile values per month for a datetime column that spans several years

I am trying to get 75th, 90th, xth percentile vlues for the distinct count of [ID] in table 'Incidents' per month. Dates are stored in the [Start Time] column (date time) - these values span several years. I have another column in 'Incidents' [Month Year] that formats these as MMMM YYYY if needed.

 

I also have a connected table with a relationship for [Start Time] for Dates called 'Dates' that has a row per day of year and columns split up for month, year, etc. DISTINCTCOUNT needs to be used for [ID] as there are non-distinct values as [ID] is split up by [Client].

 

An [ID] can impact several values in [Client] as I want 2 outcomes:

 

1. Get the 75th, 90th, 95th, 99th, 99.5th percentiles of [ID] per month

2. Rank clients against these monthly percentile values (distinct count of [ID] for [Client] per month against these values to show if they fall in or around those percentile value categories.)

 

I've tried a few strategies eg summarize then percentile functions or percentilex.inc (see the below) however I can never get the measures to return anything other than 1, or the dinstinctcount of [ID] per month

Some attempts, they all return garbage:

75th Percentile =
PERCENTILEX.INC(
   KEEPFILTERS(VALUES('Incidents'[Month Year])),
        CALCULATE(DISTINCTCOUNT(Incidents[ID])
    ),
    0.75
)
 
75th Percentile Volume by Month =
VAR SummaryTable =
    ADDCOLUMNS(
        SUMMARIZE(
            'Incidents',
            'Incidents'[Month Year],
            'Incidents'[ID]
        ),
        "Volume", 1
    )
VAR Percentile = 0.75
RETURN
    PERCENTILEX.INC(
        FILTER(
            SUMMARIZE(
                SummaryTable,
                'Incidents'[Month Year],
                "Volume", SUM('SummaryTable'[Volume])
            ),
            'Volume' > 0
        ),
        'Volume',
        Percentile
    )
 
90th p = PERCENTILEX.EXC(SUMMARIZE('Incidents', 'Incidents'[ID],'Dates'[Year-Month Code]), 'Dates'[Year-Month Code], 0.9)
 
RANK_CLIENT =
VAR ClientIncidents = CALCULATE(DISTINCTCOUNT(('Incidents'[ID])), ALLSELECTED('Incidents'[Client]))
VAR TotalIncidents = CALCULATE([Sev 1-3])
VAR TotalClients = DISTINCTCOUNT('Incidents'[Client])
VAR Percentile = RANKX(ALL(Incidents), ClientIncidents) / TotalClients
RETURN
Percentile
 
 
3 REPLIES 3
E_K_
Helper II
Helper II

Thanks for your response - I could have sworn the first measure worked for a split second but then it stopped working after I played around the the visualisation type a bit, and am unsure how to get back to that stage where it worked. I'm also keen to get this as a lcombined bar and line graph with [ID] volume as a y axis and the month and year as the x axis.

 

Currently the measure provided only wants to provide either the calculation on a year or month level, but never both at one. (i.e. just the distinctcount of [ID]) .

 

Any ideas?

 

-----
Works either for Year:

E_K__0-1678969000938.png

Or for Month, but never both at once

E_K__1-1678969086272.png

 

 

 



@Sahir_Maharaj  are you able to help?

Sahir_Maharaj
Super User
Super User

Hello @E_K_,

 

Can you please try this:

 

1. 75th Percentile of [ID] per Month:

75th Percentile = 
VAR SummaryTable =
    SUMMARIZE (
        'Incidents',
        'Incidents'[Month Year],
        "Distinct ID Count", DISTINCTCOUNT ('Incidents'[ID])
    )
RETURN
    PERCENTILEX.INC (
        SummaryTable,
        [Distinct ID Count],
        0.75
    )

 

2. Ranking of Clients against 75th Percentile:

75th Percentile Rank = 
VAR ClientTable =
    SUMMARIZE (
        'Incidents',
        'Incidents'[Month Year],
        'Incidents'[Client],
        "Distinct ID Count", DISTINCTCOUNT ('Incidents'[ID])
    )
VAR PercentileThreshold =
    [75th Percentile]
VAR ClientRankTable =
    ADDCOLUMNS (
        ClientTable,
        "75th Percentile Rank", IF (
            [Distinct ID Count] >= PercentileThreshold,
            RANKX (
                FILTER ( ClientTable, [Month Year] = EARLIER ( [Month Year] ) ),
                [Distinct ID Count],
                ,
                DESC
            ),
            BLANK ()
        )
    )
RETURN
    AVERAGEX (
        FILTER ( ClientRankTable, NOT ISBLANK ( [75th Percentile Rank] ) ),
        [75th Percentile Rank]
    )

 

You can use similar measures for calculating the 90th, 95th, 99th, and 99.5th percentile values and ranking clients against those percentile values.

 

Hope this helps!


Did I answer your question? Mark my post as a solution, this will help others!

If my response(s) assisted you in any way, don't forget to drop me a "Kudos" 🙂

Kind Regards,
Sahir Maharaj
Data Scientist | Data Engineer | Data Analyst | AI Engineer
P.S. Want me to build your Power BI solution?
➤ Lets connect on LinkedIn: Join my network of 13K+ professionals
➤ Join my free newsletter: Data Driven: From 0 to 100
➤ Website: https://sahirmaharaj.com
➤ Email: sahir@sahirmaharaj.com
➤ Want me to build your Power BI solution? Lets chat about how I can assist!
➤ Join my Medium community of 30K readers! Sharing my knowledge about data science and artificial intelligence
➤ Explore my latest project (350K+ views): Wordlit.net
➤ 100+ FREE Power BI Themes: Download Now
LinkedIn Top Voice in Artificial Intelligence, Data Science and Machine Learning

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.