Solved: Matrix with Percentile Values in the Columns for P...

Anonymous · ‎01-13-2022

Greetings Community!

I have been stuck on this problem for a while now and cannot seem to figure this out.

Situation:

I want to show a matrix with Product Category in the rows and the Min, 25th Percentile, 50th Percentile, 75th Percentile and Max Percentile in the columns and the user can slice the values by Rate.

Example: the user wants to see the Min, 25th Percentile, 50th Percentile, 75th Percentile and Max Percentile for the Gross Sales rate for product categories A, B, C, D, E, F, etc for Fiscal Year 2021. (see below)

**Note: my numbers are on a per pound basis meaning, for example, the 25th Percentile rates are calculated by dividing aggregate amounts for the 25th Percentile Total Gross Sales and the 25th Percentile Total Volume.**

Here is what I have so far and what the problem looks like:

As you can see, the measures are clearly not calculating correctly. For example, row 2, you can see that the 50th Percentile value is larger than the Max value. Another example, row 6, you can see that the Min value is larger than the 50th Percentile value. Both examples are clearly incorrect. These are just two examples, but if you look at each row, you will notice the same issues.

My Set Up in Power BI Desktop:

Data Model:

Business Unit Dimension Table Structure:

Fact Table Structure:

**Note: My fact table has 4.2 million rows in it. The screenshot above only shows "Combined EBIT" in the Line-Item column, but all remaining Line-Items (Gross Sales, Inventory COGS, etc) are present as you scroll down.**

Percentile Rate Slicer Table and Fields Pane:

DAX Measures:

Below is the "Selected Pct Rate" DAX:

Selected Pct Rate = MIN('Percentile Rates'[Pct Rate_ID])

Percentile Measure Formula:

The Min, 25th Percentile, 50th Percentile, 75th Percentile and Max DAX code is the same. The only difference, the "K" value is 0, .25, .50, .75, and 1, respectively.

Below is what the "25th Percentile" DAX looks like:

25th Percentile = 

VAR Volume = 
COALESCE(CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Volume")), 0)

VAR GrossSales = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Gross Sales")),
Volume, 0)

VAR NetSales = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Net Sales")),
Volume, 0)

VAR InventoryCOGS = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Inventory")),
Volume, 0)

VAR OtherCOGS = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Other COGS")),
Volume, 0)

VAR Freight = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Freight")),
Volume, 0)

VAR Warehouse = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Warehouse")),
Volume, 0)

VAR GrossMargin = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Gross Margin")),
Volume, 0)

VAR CombinedEBIT = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Combined EBIT")),
Volume, 0)

//Beginning of VCM Component Calcs

VAR Slotting = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Slotting")),
Volume, 0)

VAR ConsumerPromoIncentives = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Consumer Promo Incentives")),
Volume, 0)

VAR FixedOH = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Fixed OH")),
Volume, 0)

VAR Brokerage = 
DIVIDE(
CALCULATE((
PERCENTILE.INC('Fact Table'[Actual Values], .25)), ('Fact Table'[Line Item] = "Brokerage")),
Volume, 0)

//End of VCM Component Calcs

VAR VariableCM = 
GrossMargin + Slotting + ConsumerPromoIncentives + FixedOH + OtherCOGS - Brokerage

RETURN
SWITCH('Percentile Rates'[Selected Pct Rate],
   1, FORMAT(Volume, "#,##0"),
   2, GrossSales,
   3, NetSales,
   4, InventoryCOGS,
   5, OtherCOGS,
   6, Freight,
   7, Warehouse,
   8, GrossMargin,
   9, CombinedEBIT,
   10, VariableCM)

**Note: that the Variable CM (Pct Rate_ID #10) is a calculation of four separate Line-Items that do not need to be shown in the slicer.**

The reason I wrote my formulas this way is because I didn't want to write ten individual measures per Percentile which would require fifty measure total. I have also tried to do this with a calculated group, and I am still not able to get the correct results.

So, community, what am I doing wrong, and how can I solve this problem?

As always, thank you for taking the time to read my post and helping me with my question. I appreciate your assitance!

OwenAuger · ‎01-16-2022

Hi @Anonymous

I started looking at your question yesterday so just wanted to add my 2 cents 🙂

Consider this a guide as to how you might approach this rather than a "solution".

1. Regarding the strange percentile results, my guess is that is partly related to taking the ratio of two percentiles.

To take an example, if Gross Revenue and Volume each had a uniform distribution, but Gross Revenue's range was 1,000-2,000 and Volume's range was 100-500, the ratios of the percentiles would be decreasing:

Min: 1,000/100 = 10.00
25th: 1,250/200 = 6.25
50th: 1,500/300 = 5.00
75th: 1,750/400 = 4.38
Max: 2,000/50 = 4.00

Would you possibly want to calculate the ratio first, then derive percentiles of that? For example, create a measure such as [Gross Revenue per pound] and derive percentiles of that?

2. On the percentile calculation itself, in my experience it's more common to define the "granularity" of the data points used in the percentile calculation, by using an iterator such as PERCENTILEX.INC with a table computed to give you the desired granularity.

For example, if we wanted to force the granularity of the data points for the percentile calculation to be Year Month Key and Business Hierarchy ID, we could calculate the 25th percentile as this:

Sample 25th Percentile =
PERCENTILEX.INC (
    SUMMARIZE (
        'Fact Table',
        'Date'[Year Month Key],
        'Business Hierarchy'[Business Hierarchy ID]
    ),
    [Your Measure],
    0.25
)

In this example, the SUMMARIZE function produces a table with one row per existing combination of Year Month Key and Business Hierarchy ID, and the measure is calculated for each row of this table.

3. To make your life easier when building the calculations, I would recommend parameterising both the Measure & Percentile selections separately. (Your code above already does this for measure selection.)

There are variations on how to implement this, but I would personally suggest:

Parameter table for Percentile selection, with the actual calculation defined in a Calculation Group, so that it can apply to any measure.
Calculation Group for Measure selection, with one Calculation Item per measure required.

You could also use pure measures and avoid Calculation Groups entirely, but Calculation Groups seem convenient in this situation.

I have attached an example with some Contoso data that I hope is a useful guide based on how I would approach it.

The steps I took to create the attached example:

1. Created a Percentiles table with two columns, Percentile and Percentile Name. Percentile Name is set to sort by Percentile.

2. Created a Calculation Group called Percentile Calculation Group that applies the percentile calculation to the selected measure. In my example, I created a single Calculation Item called Percentile by Product and Month, defined as follows:

VAR K = 
    SELECTEDVALUE ( 'Percentiles'[Percentile] )
RETURN
    IF (
        NOT ISBLANK ( K ),
        PERCENTILEX.INC (
            SUMMARIZE (
                Sales,
                'Date'[Start of Month],
                'Product'[ProductKey] 
            ),
            SELECTEDMEASURE (),
            K
        )
    )

You could created further Calculation Items for other varieties of percentile calculation if needed.

3. Created some measures that the user might want to select (Discount Amount, Sales Amount, Sales Quantity, Sales per Unit)

4. Created a Calculation Group called Measure Selection with Calculation Items corresponding to the measures. Each Calculation Item directly references a measure.
5. Created a sample report page with

Slicers for 'Measure Selecton'[Measure] and 'Date'[Year]
A matrix with 'Product'[Product Category Name] and 'Measure Selection'[Measure] on the rows, and Percentiles[Percentile Name] on the columns. Added an arbitrary measure (to be overridden by calculation groups).
Applied a visual level filter to the matrix: 'Percentile Calculation Group'[Percentile Calculation] = "Percentile by Product and Month"
This appears as follows:

This is just an example, but hopefully useful in figuring it out in your model.

Regards,

Owen 🙂

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

View solution in original post

OwenAuger · ‎02-25-2022

Hi again @Anonymous

I'm well thanks, hope you are too 🙂

First of all, just restating the Sample 25th Percentile measure for reference:

Sample 25th Percentile =
PERCENTILEX.INC (
    SUMMARIZE (
        'Fact Table',
        'Date'[Year Month Key],
        'Business Hierarchy'[Business Hierarchy ID]
    ),
    [Your Measure],
    0.25
)

I'm referring to columns/measures mentioned in this code, but please replace them with the actual columns/measures if different 🙂

I think the most likely cause is that PERCENTILEX.INC includes blank values when calculating percentiles (treating them as equal to zero). So the blank results could be caused by blank values being returned for some rows of the table iterated by PERCENTILEX.INC.

Looking at the Sample 25th Percentile measure, the table iterated by PERCENTILEX.INC is 'Fact Table' summarized by Year Month Key and Business Hierarchy ID, so my guess is that there are combinations of Year Month Key and Business Hierarchy ID which exist in the fact table but for which the measure ([Your Measure] in the sample measure) is blank.

You could verify this by creating a simple measure

# Rows Fact Table = 
COUNTROWS ( 'Fact Table' )

then place [# Rows Fact Table] and [Your Measure] in a table visual, grouped by Year Month Key and Business Hierarchy ID.

If you filter on one particular row of the visual in your screenshot, then sort the new table by [Your Measure] ascending, it should be obvious if there are blanks.

The question is then whether it is valid to include the blanks or not.

To completely exclude blanks, the measure could be changed to:

Sample 25th Percentile =
VAR SummarizedTable =
    ADDCOLUMNS (
        SUMMARIZE (
            'Fact Table',
            'Date'[Year Month Key],
            'Business Hierarchy'[Business Hierarchy ID]
        ),
        "@YourMeasure", [Your Measure]
    )
VAR SummarizedTableNoBlanks =
    FILTER (
        SummarizedTable,
        NOT ISBLANK ( [@YourMeasure] )
    )
RETURN
    PERCENTILEX.INC (
        SummarizedTableNoBlanks,
        [@YourMeasure],
        0.25
    )

Alternatively, blanks could be converted to zero (in this case this is acceptable since the number of rows we are iterating is limited by combinations existing in 'Fact Table'):

Sample 25th Percentile =
PERCENTILEX.INC (
    SUMMARIZE (
        'Fact Table',
        'Date'[Year Month Key],
        'Business Hierarchy'[Business Hierarchy ID]
    ),
    COALESCE ( [Your Measure], 0),
    0.25
)

Hopefully that's of some help!

Regards,

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

View solution in original post

OwenAuger · ‎01-16-2022

Hi @Anonymous

I started looking at your question yesterday so just wanted to add my 2 cents 🙂

Consider this a guide as to how you might approach this rather than a "solution".

1. Regarding the strange percentile results, my guess is that is partly related to taking the ratio of two percentiles.

To take an example, if Gross Revenue and Volume each had a uniform distribution, but Gross Revenue's range was 1,000-2,000 and Volume's range was 100-500, the ratios of the percentiles would be decreasing:

Min: 1,000/100 = 10.00
25th: 1,250/200 = 6.25
50th: 1,500/300 = 5.00
75th: 1,750/400 = 4.38
Max: 2,000/50 = 4.00

Would you possibly want to calculate the ratio first, then derive percentiles of that? For example, create a measure such as [Gross Revenue per pound] and derive percentiles of that?

2. On the percentile calculation itself, in my experience it's more common to define the "granularity" of the data points used in the percentile calculation, by using an iterator such as PERCENTILEX.INC with a table computed to give you the desired granularity.

For example, if we wanted to force the granularity of the data points for the percentile calculation to be Year Month Key and Business Hierarchy ID, we could calculate the 25th percentile as this:

Sample 25th Percentile =
PERCENTILEX.INC (
    SUMMARIZE (
        'Fact Table',
        'Date'[Year Month Key],
        'Business Hierarchy'[Business Hierarchy ID]
    ),
    [Your Measure],
    0.25
)

In this example, the SUMMARIZE function produces a table with one row per existing combination of Year Month Key and Business Hierarchy ID, and the measure is calculated for each row of this table.

3. To make your life easier when building the calculations, I would recommend parameterising both the Measure & Percentile selections separately. (Your code above already does this for measure selection.)

There are variations on how to implement this, but I would personally suggest:

Parameter table for Percentile selection, with the actual calculation defined in a Calculation Group, so that it can apply to any measure.
Calculation Group for Measure selection, with one Calculation Item per measure required.

You could also use pure measures and avoid Calculation Groups entirely, but Calculation Groups seem convenient in this situation.

I have attached an example with some Contoso data that I hope is a useful guide based on how I would approach it.

The steps I took to create the attached example:

1. Created a Percentiles table with two columns, Percentile and Percentile Name. Percentile Name is set to sort by Percentile.

2. Created a Calculation Group called Percentile Calculation Group that applies the percentile calculation to the selected measure. In my example, I created a single Calculation Item called Percentile by Product and Month, defined as follows:

VAR K = 
    SELECTEDVALUE ( 'Percentiles'[Percentile] )
RETURN
    IF (
        NOT ISBLANK ( K ),
        PERCENTILEX.INC (
            SUMMARIZE (
                Sales,
                'Date'[Start of Month],
                'Product'[ProductKey] 
            ),
            SELECTEDMEASURE (),
            K
        )
    )

You could created further Calculation Items for other varieties of percentile calculation if needed.

3. Created some measures that the user might want to select (Discount Amount, Sales Amount, Sales Quantity, Sales per Unit)

4. Created a Calculation Group called Measure Selection with Calculation Items corresponding to the measures. Each Calculation Item directly references a measure.
5. Created a sample report page with

Slicers for 'Measure Selecton'[Measure] and 'Date'[Year]
A matrix with 'Product'[Product Category Name] and 'Measure Selection'[Measure] on the rows, and Percentiles[Percentile Name] on the columns. Added an arbitrary measure (to be overridden by calculation groups).
Applied a visual level filter to the matrix: 'Percentile Calculation Group'[Percentile Calculation] = "Percentile by Product and Month"
This appears as follows:

This is just an example, but hopefully useful in figuring it out in your model.

Regards,

Owen 🙂

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

Anonymous · ‎02-23-2022

Hi @OwenAuger

Hope all is well!

For some reason, I am starting to see some blank data in various places within my Matrix. I was wondering if you would be able to help me diagnose the problem. For all the percentile measures, I am using the "Sample 25th Percentile" you provided above.

Below is a screenshot of what it looks like.

OwenAuger · ‎02-25-2022

Hi again @Anonymous

I'm well thanks, hope you are too 🙂

First of all, just restating the Sample 25th Percentile measure for reference:

Sample 25th Percentile =
PERCENTILEX.INC (
    SUMMARIZE (
        'Fact Table',
        'Date'[Year Month Key],
        'Business Hierarchy'[Business Hierarchy ID]
    ),
    [Your Measure],
    0.25
)

I'm referring to columns/measures mentioned in this code, but please replace them with the actual columns/measures if different 🙂

I think the most likely cause is that PERCENTILEX.INC includes blank values when calculating percentiles (treating them as equal to zero). So the blank results could be caused by blank values being returned for some rows of the table iterated by PERCENTILEX.INC.

Looking at the Sample 25th Percentile measure, the table iterated by PERCENTILEX.INC is 'Fact Table' summarized by Year Month Key and Business Hierarchy ID, so my guess is that there are combinations of Year Month Key and Business Hierarchy ID which exist in the fact table but for which the measure ([Your Measure] in the sample measure) is blank.

You could verify this by creating a simple measure

# Rows Fact Table = 
COUNTROWS ( 'Fact Table' )

then place [# Rows Fact Table] and [Your Measure] in a table visual, grouped by Year Month Key and Business Hierarchy ID.

If you filter on one particular row of the visual in your screenshot, then sort the new table by [Your Measure] ascending, it should be obvious if there are blanks.

The question is then whether it is valid to include the blanks or not.

To completely exclude blanks, the measure could be changed to:

Sample 25th Percentile =
VAR SummarizedTable =
    ADDCOLUMNS (
        SUMMARIZE (
            'Fact Table',
            'Date'[Year Month Key],
            'Business Hierarchy'[Business Hierarchy ID]
        ),
        "@YourMeasure", [Your Measure]
    )
VAR SummarizedTableNoBlanks =
    FILTER (
        SummarizedTable,
        NOT ISBLANK ( [@YourMeasure] )
    )
RETURN
    PERCENTILEX.INC (
        SummarizedTableNoBlanks,
        [@YourMeasure],
        0.25
    )

Alternatively, blanks could be converted to zero (in this case this is acceptable since the number of rows we are iterating is limited by combinations existing in 'Fact Table'):

Sample 25th Percentile =
PERCENTILEX.INC (
    SUMMARIZE (
        'Fact Table',
        'Date'[Year Month Key],
        'Business Hierarchy'[Business Hierarchy ID]
    ),
    COALESCE ( [Your Measure], 0),
    0.25
)

Hopefully that's of some help!

Regards,

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

Anonymous · ‎02-28-2022

Hi @OwenAuger

Thank you for the response! I appreciate it.

I ran your suggested test, and you are right - "there are combinations of Year Month Key and Business Hierarchy ID which exist in the fact table but for which the measure ([Your Measure] in the sample measure) is blank."

I am not sure if I mentioned this, but my measure is a measure slicer. Meaning it's based on a value selected from a parameter table using the SWITCH function. Example: the user wants to see percentile rates for Gross Sales, they would select Gross Sales from the slicer and that measure calculates Gross Sales. DAX is below:

CY Actual = 
SWITCH(
    [Selected Rate],
    0, [Gross Sales],
    1, [Net Sales],
    2, [Inventory COGS],
    3, [Other COGS],
    4, [Freight],
    5, [Warehouse],
    6, [Gross Margin]
)

The reason I am mentioning that is because my measure is showing blanks (using the test you suggested above) but when I do the same test using a simple SUM on the column of values there are no blank rows.

Do you think the problem is due my measure above? Any suggestions on how I could test for this?

OwenAuger · ‎03-01-2022

@Anonymous Thanks for the info 🙂

Yes, I believe you did mention this SWITCH measure earlier.

Hmm...it sounds like there might be an issue with something in the sub-measures referenced by CY Actual.

When you say a simple SUM didn't return blank, do you mean something like

SUM ( FactTable[Gross Sales Column] ) isn't blank, but [CY Actual] with Gross Sales (option 0) selected is blank? That would be weird, and indicates an issue with [Gross Sales] or one of the other measures.

What is the code for [Gross Sales] and the rest of the sub-measures (or a representative sample)?

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

Anonymous · ‎03-01-2022

Hi @OwenAuger

First let's quickly start with how my Fact Table data looks in the system prior to my transformations in Powery Query. My Power Query transformations might be the source of this issue.

When the data is initially pulled into Power Query the columns selected are as follows:

Fact [Date]
Fact [Business ID]
Fact [Product ID]
Fact [Volume] --> whole number
Fact [Gross Sales] --> currency
Fact [Net Sales] --> currency
Fact [Inventory COGS] --> currency
Fact [Other COGS] --> currency
Fact [Freight] --> currency
Fact [Warehouse] --> currency
Fact [Gross Margin] --> currency

What my Fact Table looks like after transformation:

Fact [Date] --> this column is the relationship to my date table dimension - Date [Date]
Fact [Business ID] --> this column is the relationship to my Business table dimension - Business [Business ID]
Fact [Product ID] --> this column is the relationship to my Product table dimension - Product [Product ID]
Fact [Line Item] --> Volume, Gross Sales, Net Sales, Inventory COGS, Other COGS, Freight, Warehouse, and Gross Margin
Fact [Line Item Values] --> Decimal Number

To reduce the size of my Fact Table (it actually has 64 columns when its initially pulled in but for our purposes right now, we will only look at the ones relevant to my problem) I have Unpivoted the Fact [Date], Fact [Business ID], and Fact [Product ID] columns which reduces it to five columns (above) instead of eleven columns. (above) So instead of having one column per Line Item, I have Unpivoted them into one column with the Line Item Name and one column with the Line Item Value. The reason I did it this way is because my Fact Data is very large at first (7 million rows and 64 columns) and my visuals and calculations were very slow, so it was necessary to reduce the size of my table as much as possible.

Second, let's look at the underlying calculations:

I have one master measure which the other measures are based on. This is the SUM measure you are asking about above and the one I said did not return any blanks.

Actual Values = SUM(Fact[Line Item Values]

The sub-measures referenced by CY Actual are:

Volume = CALCULATE([Actual Values],'Fact'[Line Item] = "Volume"

Gross Sales = 
DIVIDE(
    CALCULATE([Actual Values],'Fact'[Line Item] = "Gross Sales"),
    [Volume])

Inventory COGS = 
DIVIDE(
    CALCULATE([Actual Values],'Fact'[Line Item] = "Inventory COGS"),
    [Volume])

etc....

Third, let's look at the Parameter Table:

Rate	Rate_ID
Gross Sales	0
Net Sales	1
Inventory COGS	2
Other COGS	3
Freight	4
Warehouse	5
Gross Margin	6

In the SWITCH measure, you see an expression named [Selected Rate]. The formula for that is as follows:

Selected Rate = MIN('Selected Rate'[Rate_ID])

Which finally brings us to the CY Actual measure:

CY Actual = 
SWITCH(
    [Selected Rate],
    0, [Gross Sales],
    1, [Net Sales],
    2, [Inventory COGS],
    3, [Other COGS],
    4, [Freight],
    5, [Warehouse],
    6, [Gross Margin]
)

Let me know what you think about this approach.

As always, thank you for your time and assitance. I really appreciate your insight.

-GC

OwenAuger · ‎03-03-2022

Thanks for the detail @Anonymous - great explanation of how the model's set up!

It seems likely that we need to change the table provided as the first argument of PERCENTILEX.INC, to ensure that it includes only the valid combinations we want to iterate over for the percentile calculation. Each row of this table corresponds to one data point for the percentile calculation.

Because of the "unpivoted" structure of 'Fact' with the Line Item column, the table produced by SUMMARIZE potentially contains rows that don't relate to the particular Rate selection.

The question then is how to produce the correct table to iterate over?

To clarify what's going on, I would suggest a variation of something you already tried: creating a matrix visual with:

Rows: Business[Business ID] & 'Date'[Year Month Key]
Columns: 'Selected Rate'[Rate]
Values: [CY Actual] & a measure Fact Rows = COUNTROWS ( 'Fact' )

This visual will then show all combinations of Business ID & Year Month Key that exist in Fact (this is equivalent to what SUMMARIZE currently produces), along with the value of CY Actual for each combination.

When blank values occur for [CY Actual] for a particular row & Rate column, we need to decide whether it is valid to include those values in the percentile calculation.

This is where I'm not sure what the right answer is. For example, if "Warehouse" doesn't appear for a particular Business ID & Year Month Key, but other Line Items do, should it be treated as blank (i.e. = 0) or not included at all?

Another question: is Business ID & Year/Month definitely the granularity you want for calculating the percentile?

I realise I've raised more questions than answers, but hopefully this helps get closer to a solution!

Regards,

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

Anonymous · ‎03-03-2022

Hi @OwenAuger

You raise two very good questions and my response to each is below:

Q1) "When blank values occur for [CY Actual] for a particular row & Rate column, we need to decide whether it is valid to include those values in the percentile calculation."

Response: in general, blank values should be removed from all calculations.

Q2) "Another question: is Business ID & Year/Month definitely the granularity you want for calculating the percentile?"

Response: the goal is to see rates for each percentile. Based on our discussion so far, I think I did not clarify the granularity as best as I should have. If we back into what qualitative attributes produce a rate in the first place (example: selling a product) those would be:

Line Item
Line-Item Value
Date the product was sold.
Which business sold what product - that would produce the Business ID (comprised of a three-digit code concatenated into one ID based on the Business Group, Business Family, Business Unit, and Product Category - in that order) An example of that would be:
1. Business Group A: 123
2. Business Family A: 456
3. Business Unit A: 789
4. Product Category A: 101112
  1. Business ID: 123456789191112
Sales Channel the product was sold through.

Based on that, I think we need to increase the level of granularity.

With that said, I think the solution might be this which is based on the sample you provided that completely excludes blanks:

Sample 25th Percentile =
VAR SummarizedTable =
    ADDCOLUMNS (
        SUMMARIZE (
            'Fact Table',
            'Date'[Date],
            'Business'[Business ID],
            'Fact'[Sales Channel],
            'Fact'[Line Item]
        ),
        "@CYActual", [CY Actual]
    )
VAR SummarizedTableNoBlanks =
    FILTER (
        SummarizedTable,
        NOT ISBLANK ( [@CYActual] )
    )
RETURN
    PERCENTILEX.INC (
        SummarizedTableNoBlanks,
        [@CYActual],
        0.25
    )

Why do I think this might be the answer:

It excludes blanks.
It goes down to the level of granularity that produces our quantitative values.

What are your thoughts on this approach?

As always thank you for your time and assitance! I sincerely appreciate it!

OwenAuger · ‎03-04-2022

@Anonymous

It looks like you've answered the question to be honest 🙂

Your reasoning sounds good to me!

The key questions were the level of granularity and blank exclusion. From your description, that SUMMARIZEd table looks like a better representation of the granularity you should have for the percentile calculation.

If anything, I would run this past some of the end users, showing some percentile outputs along with a breakdown of the values feeding into the percentile calculation (based on the columns included in SUMMARIZE), just to confirm it makes sense to them too.

All the best!

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

Anonymous · ‎03-04-2022

Hi @OwenAuger

I would agree with that. It looks like we found our solution!

This was such a great learning opportunity and I thank you for sticking it out with me until the end. I appreciate your time and assistance. Additionally, thank you for sharing your knowledge!

I wish you all the best!

-GC

Anonymous · ‎01-19-2022

Hi @OwenAuger

Thank you for the assistance!

I was unable to achieve the solution using a calculated group. However, I believe I was able to resolve my issue with individual calculations based on "Sample 25th Percentile" you provided above.

Here is what my matrix looks like now and how my measures are stored:

I am noticing a few product categories whose values are all zeros across the board. Not sure if this is due to my data model or simply the data itself. I'll need to do some investigative work on that.

Thank you (again) for your assistance! I really do appreciate you taking the time to help me with my question!

All the best,

Gordon

lbendlin · ‎01-15-2022

Try simplifying the issue by stripping away everything that is not relevant to the problem. (ie you provided too much details)

Please provide sanitized sample data that fully covers your issue, but not more. Paste the data into a table in your post or use one of the file services. Please show the expected outcome.

Matrix with Percentile Values in the Columns for Product Category

Helpful resources

FabCon Global Hackathon

Power BI Monthly Update - September 2025

FabCon Atlanta 2026

How to Get Your Question Answered Quickly

FabCon is coming to Atlanta

Matrix with Percentile Values in the Columns for Product Category

Helpful resources

FabCon Global Hackathon

Power BI Monthly Update - September 2025

FabCon Atlanta 2026

How to Get Your Question Answered Quickly