Scatter plot using percentile lines

Anonymous · ‎12-05-2023

I created scatter plot using Percentile lines (25th, 50th, 75th & 100th) for both X & Y axes in analytics pane option in PowerBI. Now my scatter plot is into 4*4 = 16 grids. In X-axis : Average of epk, Y-axis: Count of distinct date_key column(which is nothing but service operated days). In Values: Service(depot_id + service_code concatenation). By using this scatter plot we can see the high & low performing services based on avg Epk(Earning Per KM) & Service operated days.

Client requirement is based on avg epk & Service operated days they wanted to see each grid wise services (grid1,grid2,..grid16).

For this I have created a new Grid column using the following DAX

------------------------------------------------------------------------------------

Grid =

VAR X_25th_Percentile = PERCENTILEX.INC(ServiceSummary, [Average EPK], 0.25)

VAR Y_25th_Percentile = PERCENTILEX.INC(ServiceSummary, [Service Operated Days], 0.25)

VAR X_50th_Percentile = PERCENTILEX.INC(ServiceSummary, [Average EPK], 0.50)

VAR Y_50th_Percentile = PERCENTILEX.INC(ServiceSummary, [Service Operated Days], 0.50)

VAR X_75th_Percentile = PERCENTILEX.INC(ServiceSummary, [Average EPK], 0.75)

VAR Y_75th_Percentile = PERCENTILEX.INC(ServiceSummary, [Service Operated Days], 0.75)

VAR X_100th_Percentile = MAX(ServiceSummary[Average EPK])

VAR Y_100th_Percentile = MAX(ServiceSummary[Service Operated Days])

RETURN

SWITCH(

TRUE(),

[Service Average EPK] <= X_25th_Percentile && [Service Distinct Date Count] <= Y_25th_Percentile, "Grid 1",

[Service Average EPK] > X_25th_Percentile && [Service Average EPK] <= X_50th_Percentile && [Service Distinct Date Count] <= Y_25th_Percentile, "Grid 2",

[Service Average EPK] > X_50th_Percentile && [Service Average EPK] <= X_75th_Percentile && [Service Distinct Date Count] <= Y_25th_Percentile, "Grid 3",

[Service Average EPK] > X_75th_Percentile && [Service Average EPK] <= X_100th_Percentile && [Service Distinct Date Count] <= Y_25th_Percentile, "Grid 4",

[Service Average EPK] <= X_25th_Percentile && [Service Distinct Date Count] > Y_25th_Percentile && [Service Distinct Date Count] <= Y_50th_Percentile, "Grid 5",

[Service Average EPK] > X_25th_Percentile && [Service Average EPK] <= X_50th_Percentile && [Service Distinct Date Count] > Y_25th_Percentile && [Service Distinct Date Count] <= Y_50th_Percentile, "Grid 6",

[Service Average EPK] > X_50th_Percentile && [Service Average EPK] <= X_75th_Percentile && [Service Distinct Date Count] > Y_25th_Percentile && [Service Distinct Date Count] <= Y_50th_Percentile, "Grid 7",

[Service Average EPK] > X_75th_Percentile && [Service Average EPK] <= X_100th_Percentile && [Service Distinct Date Count] > Y_25th_Percentile && [Service Distinct Date Count] <= Y_50th_Percentile, "Grid 8",

[Service Average EPK] <= X_25th_Percentile && [Service Distinct Date Count] > Y_50th_Percentile && [Service Distinct Date Count] <= Y_75th_Percentile, "Grid 9",

[Service Average EPK] > X_25th_Percentile && [Service Average EPK] <= X_50th_Percentile && [Service Distinct Date Count] > Y_50th_Percentile && [Service Distinct Date Count] <= Y_75th_Percentile, "Grid 10",

[Service Average EPK] > X_50th_Percentile && [Service Average EPK] <= X_75th_Percentile && [Service Distinct Date Count] > Y_50th_Percentile && [Service Distinct Date Count] <= Y_75th_Percentile, "Grid 11",

[Service Average EPK] > X_75th_Percentile && [Service Average EPK] <= X_100th_Percentile && [Service Distinct Date Count] > Y_50th_Percentile && [Service Distinct Date Count] <= Y_75th_Percentile, "Grid 12",

[Service Average EPK] <= X_25th_Percentile && [Service Distinct Date Count] > Y_75th_Percentile && [Service Distinct Date Count] <= Y_100th_Percentile, "Grid 13",

[Service Average EPK] > X_25th_Percentile && [Service Average EPK] <= X_50th_Percentile && [Service Distinct Date Count] > Y_75th_Percentile && [Service Distinct Date Count] <= Y_100th_Percentile, "Grid 14",

[Service Average EPK] > X_50th_Percentile && [Service Average EPK] <= X_75th_Percentile && [Service Distinct Date Count] > Y_75th_Percentile && [Service Distinct Date Count] <= Y_100th_Percentile, "Grid 15",

[Service Average EPK] > X_75th_Percentile && [Service Average EPK] <= X_100th_Percentile && [Service Distinct Date Count] > Y_75th_Percentile && [Service Distinct Date Count] <= Y_100th_Percentile, "Grid 16",

BLANK()

)

----------------------------------------------------

I created New Table : service wise average EPK & Service wise distinct date_key count(Operated Days) using the below DAX:

ServiceSummary =

SUMMARIZE(

agg_route_trip,

agg_route_trip[Service],

"Average EPK", AVERAGE(agg_route_trip[epk]),

"Service Operated Days", DISTINCTCOUNT(agg_route_trip[date_key])

)

-----------------------------------------------------------------------------

I connected this table to agg_route_trip table via 1 to many relationship using Service column

Issue I am facing:

Grid is working fine on the entire data. But the client requirement is month wise each grid wise services they want to see. For example on selection of july month scatter plot will automatically adjust to july month with 16 grids. Now they want to see each grid wise services. This is where I am facing the issue. On selection of any month the grid is not correctly filtering the services/some grids are not showing in the grid column even though in scatter plot all 16 grids visible

PLEASE FIND THE SAMPLE FILES HERE:

https://docs.google.com/spreadsheets/d/1htO-yor7rDfpmeVCJSJt9v4A9OZ2NTtR/edit?usp=sharing&ouid=10810...

https://drive.google.com/file/d/15swrWPLmAgihwGV55O_4uE73SiXnRTqN/view?usp=sharing : Pbix File

mizan2390 · ‎05-06-2026

Since Calculated Columns only evaluate during data refresh, they ignore your month slicer. To get the scatter plot lines and grid assignments to recalculate instantly when a user selects a different month, you need a Dynamic Segmentation approach using DAX measures.

I attached a file in the link below so you can view a working example. Here is the step-by-step breakdown of how to build it:

1. Create Base Measures

First, write standard measures for your X and Y axes instead of using columns.

Average EPK Measure = AVERAGE(agg_route_trip[epk])
Operated Days Measure = DISTINCTCOUNT(agg_route_trip[date_key])

2. Create a Disconnected "Grid List" Table

You can't drop a measure directly into a slicer. Create a simple physical table containing your 16 grids to use as your slicer, and leave it completely disconnected from your main data model.

Grid Reference = 
DATATABLE(
    "Grid no", INTEGER,
    "Grid", STRING,
    {
        {1,"Grid 4"}, {2,"Grid 3"}, {3,"Grid 2"}, {4,"Grid 1"},
        {5,"Grid 8"}, {6,"Grid 7"}, {7,"Grid 6"}, {8,"Grid 5"},
        {9,"Grid 12"}, {10,"Grid 11"}, {11,"Grid 10"}, {12,"Grid 9"},
        {13,"Grid 16"}, {14,"Grid 15"}, {15,"Grid 14"}, {16,"Grid 13"}
    }
)

3. Write the Dynamic Grid Assignment Measure

To prevent your percentiles from shrinking or skewing when a user clicks a row in the table (which causes cross-filtering), wrap the percentile calculations in CALCULATE(..., ALL(YourTable[YourCategory])). This locks the boundaries to the overall month selection. I also wrapped the whole formula in ISINSCOPE to keep the table's total row blank.

Dynamic Grid Assignment =
-- This ensures the Total row is completely blank and doesn't try to assign a grid to the grand total
IF(NOT ISINSCOPE(agg_route_trip[Service]), BLANK(),
    VAR Current_EPK = [Average EPK Measure]
    VAR Current_Days = [Operated Days Measure]
    -- X-Axis Percentiles (EPK) locked using ALL()
    VAR X_25 = CALCULATE(PERCENTILEX.INC(FILTER(ALL(agg_route_trip[Service]), NOT ISBLANK([Average EPK Measure]) && NOT ISBLANK([Operated Days Measure])), [Average EPK Measure], 0.25), ALL(agg_route_trip[Service]))
    VAR X_50 = CALCULATE(PERCENTILEX.INC(FILTER(ALL(agg_route_trip[Service]), NOT ISBLANK([Average EPK Measure]) && NOT ISBLANK([Operated Days Measure])), [Average EPK Measure], 0.50), ALL(agg_route_trip[Service]))
    VAR X_75 = CALCULATE(PERCENTILEX.INC(FILTER(ALL(agg_route_trip[Service]), NOT ISBLANK([Average EPK Measure]) && NOT ISBLANK([Operated Days Measure])), [Average EPK Measure], 0.75), ALL(agg_route_trip[Service]))
    -- Y-Axis Percentiles (Operated Days) locked using ALL()
    VAR Y_25 = CALCULATE(PERCENTILEX.INC(FILTER(ALL(agg_route_trip[Service]), NOT ISBLANK([Average EPK Measure]) && NOT ISBLANK([Operated Days Measure])), [Operated Days Measure], 0.25), ALL(agg_route_trip[Service]))
    VAR Y_50 = CALCULATE(PERCENTILEX.INC(FILTER(ALL(agg_route_trip[Service]), NOT ISBLANK([Average EPK Measure]) && NOT ISBLANK([Operated Days Measure])), [Operated Days Measure], 0.50), ALL(agg_route_trip[Service]))
    VAR Y_75 = CALCULATE(PERCENTILEX.INC(FILTER(ALL(agg_route_trip[Service]), NOT ISBLANK([Average EPK Measure]) && NOT ISBLANK([Operated Days Measure])), [Operated Days Measure], 0.75), ALL(agg_route_trip[Service]))
    -- Assign Grid based on dynamic thresholds

    RETURN
    SWITCH(
        TRUE(),
        ISBLANK(Current_EPK) || ISBLANK(Current_Days), BLANK(),
        Current_EPK <= X_25 && Current_Days <= Y_25, "Grid 1",
        Current_EPK > X_25  && Current_EPK <= X_50 && Current_Days <= Y_25, "Grid 2",
        Current_EPK > X_50  && Current_EPK <= X_75 && Current_Days <= Y_25, "Grid 3",
        Current_EPK > X_75  && Current_Days <= Y_25, "Grid 4",
        Current_EPK <= X_25 && Current_Days > Y_25  && Current_Days <= Y_50, "Grid 5"
        Current_EPK > X_25  && Current_EPK <= X_50 && Current_Days > Y_25 && Current_Days <= Y_50, "Grid 6",
        Current_EPK > X_50  && Current_EPK <= X_75 && Current_Days > Y_25 && Current_Days <= Y_50, "Grid 7",
        Current_EPK > X_75  && Current_Days > Y_25  && Current_Days <= Y_50, "Grid 8",
        Current_EPK <= X_25 && Current_Days > Y_50  && Current_Days <= Y_75, "Grid 9",
        Current_EPK > X_25  && Current_EPK <= X_50 && Current_Days > Y_50 && Current_Days <= Y_75, "Grid 10",
        Current_EPK > X_50  && Current_EPK <= X_75 && Current_Days > Y_50 && Current_Days <= Y_75, "Grid 11",
        Current_EPK > X_75  && Current_Days > Y_50  && Current_Days <= Y_75, "Grid 12",
        Current_EPK <= X_25 && Current_Days > Y_75, "Grid 13",
        Current_EPK > X_25  && Current_EPK <= X_50 && Current_Days > Y_75, "Grid 14",
        Current_EPK > X_50  && Current_EPK <= X_75 && Current_Days > Y_75, "Grid 15",
        Current_EPK > X_75  && Current_Days > Y_75, "Grid 16",
        BLANK()
    )
)

Note: Watch out for exact value ties on your boundary lines. With small whole numbers like Operated Days, you might need to adjust the = sign on your operators depending on which box you want edge-case ties to fall into.

4. Build the Visual Filter Measure

Create a measure to act as a bridge between your disconnected grid table and your main data.

Grid Visual Filter = 
VAR SelectedGrids = VALUES('Grid Reference'[Grid])
RETURN
    IF([Dynamic Grid Assignment] IN SelectedGrids, 1, 0)

Add this measure to the "Filters on this visual" pane of your table and set it to is 1.

5. Add Conditional Formatting for Colors (Optional)

To make the scatter plot dots highlight based on the grid you select, create a color measure and apply it using the fx button under the scatter plot's Marker Colors.

Scatter Plot Color = 
VAR CurrentGrid = [Dynamic Grid Assignment]
VAR IsGridSelected = ISFILTERED('Grid Reference'[Grid])
RETURN
    IF(IsGridSelected && CurrentGrid IN VALUES('Grid Reference'[Grid]), "#118DFF", "#D3D3D3")

New_Report.pbix

If this solves your problem, please give me kudo and mark this as solution.

@meso that I don't lose the thread