- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Optimizing DAX that requires calculating 80th and 20th Percentile Value
Hi, I have a dataset of property sales. POWER BI File download link is available at the last.
I need to calculate the average property price and then calculate the YoY % change compared to last year. This needs to be shown in a line chart with years on the X-axis. We also need to create one line for each property type (another attribute in our dataset).
The problem is, there are a lot of outliers in the dataset. Therefore, it is required to exclude the top 20% and bottom 20% of prices from the dataset and only calculate the average property price for the middle 60% of values. This also needs to be dynamic and must be calculated dynamically for all the filters that we have in the report. For example, if a filter on region is selected, then the top 20% and bottom 20% prices of the filtered dataset will be excluded from the results.
I have created the following three DAX formulas to do this, but they are performing very slowly on my dataset. There are around 4+ million records in the dataset, which is probably causing the issue, but I am wondering if there's a better way to resolve this problem?
Average Property Price DAX
Average Property Price =
VAR _Bottom20 = PERCENTILEX.INC(properties_data, properties_data[price_paid], 0.2) -- 20th Percentile
VAR _Top20 = PERCENTILEX.INC(properties_data, properties_data[price_paid], 0.8) -- 80th Percentile
var _Result =
AVERAGEX(
FILTER(
properties_data,
properties_data[price_paid] > _Bottom20 &&
properties_data[price_paid] < _Top20
),
properties_data[price_paid]
)
return _Result
Average Property Price (Last Year)
Average Property Price (LY) =
var _LastYear = SELECTEDVALUE('Calendar'[Year], MAX('Calendar'[Year])) - 1
var _Month = SELECTEDVALUE('Calendar'[Month])
var _Result =
IF(
ISBLANK(_Month),
CALCULATE(
[Average Property Price],
'Calendar'[Year] = _LastYear,
REMOVEFILTERS('Calendar')
),
CALCULATE(
[Average Property Price],
'Calendar'[Year] = _LastYear,
'Calendar'[Month] = _Month,
REMOVEFILTERS('Calendar')
)
)
return _Result
YoY Change % in Average Property Price
YoY % Change in Average Property Price =
var _CurrentYear = [Average Property Price]
var _LastYear = [Average Property Price (LY)]
var _Result =
DIVIDE(
_CurrentYear - _LastYear,
_LastYear,
0
)
return _Result
I have also tried one more thing: I created a table containing pre-computed values for the 20th percentile and the 80th percentile for all possible combinations of my filters. However, I am still facing issues because it is possible that someone might select "multiple values" from one filter at the same time, which leads to problems with this approach.
You can see my PBI report in the Google Drive link below. In that report, there's a table named "lu_price_percentiles" containing pre-computed percentile values for each possible combination of my filters.
Can anyone provide guidance on how I can improve my DAX for this specific problem?
You can download Power BI Report from this link: https://drive.google.com/file/d/14Sr6IuNFX6sTrqLDlT2eCVBcokZWLYHe/view?usp=sharing
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @HassanAshas ,
Have you already solved the problem? If so, can you share your solution here and mark the correct answer as standard to help other members find it faster? Thank you very much for your co-operation!
Best Regards,
Clara Gong
If there is any post helps, then please consider Accept it as the solution to help the other members find it more quickly.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Simplifying the LY formula
Average Property Price (LY) =
CALCULATE([Average Property Price],SAMEPERIODLASTYEAR('Calendar'[Date]))
Checking the number of rows
Going through all rows of the fact table
Limiting to only the observed values
Average Property Price =
VAR _Bottom20 = PERCENTILEX.INC(VALUES(properties_data[price_paid]), [price_paid], 0.2) -- 20th Percentile
VAR _Top20 = PERCENTILEX.INC(VALUES(properties_data[price_paid]), [price_paid], 0.8) -- 80th Percentile
var _Result =
AVERAGEX(
FILTER(
properties_data,
[price_paid] > _Bottom20 &&
[price_paid] < _Top20
),
[price_paid]
)
return _Result
Confirming better query plan
Still need to get rid of the CallBackDataID
Refactoring to use CALCULATE
Average Property Price =
VAR _Bottom20 = PERCENTILEX.INC(VALUES(properties_data[price_paid]), [price_paid], 0.2) -- 20th Percentile
VAR _Top20 = PERCENTILEX.INC(VALUES(properties_data[price_paid]), [price_paid], 0.8) -- 80th Percentile
var _Result =
CALCULATE(AVERAGE(properties_data[price_paid]),
properties_data[price_paid] > _Bottom20,
properties_data[price_paid] < _Top20
)
return _Result
Looks good.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@lbendlin Oh, god. Thank you so much!
I really need to spend more time understanding DAX and optimizations like this.
There's just one problem, it seems to be giving incorrect result. Please see attached picture below, I added both measures (previous one and updated one) in different cards.
Not exactly sure what's the reason but I am trying to debug it.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I noticed that too. I don't know if the percentile calculation yields different results when using distinct sale prices. You may need to adjust that part back to how it was before.
I didn't understand your LY formula either.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@lbendlin Yes, that is correct.
I understand the issue now. VALUES function yield unique rows against the specified column. PERCENTILE function is now calculating the 20th and 80th percentile against those unique set of rows, instead of considering all rows (which it should).
Due to this reason, we can't really use VALUES in this case. And hence, we are back to the square one.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The other changes should still stick. Continue to evaluate the query plan and try other refactoring techniques.

Helpful resources
Join us at the Microsoft Fabric Community Conference
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Power BI Monthly Update - February 2025
Check out the February 2025 Power BI update to learn about new features.

Subject | Author | Posted | |
---|---|---|---|
09-12-2024 08:07 AM | |||
06-14-2024 05:29 PM | |||
09-27-2024 04:38 AM | |||
06-23-2024 06:34 AM | |||
03-06-2024 01:20 PM |
User | Count |
---|---|
23 | |
12 | |
10 | |
10 | |
8 |
User | Count |
---|---|
16 | |
15 | |
15 | |
12 | |
10 |