
Anonymous
Not applicable

Advice for Filtering Large Data

I created a measure that counts the most recent status for each transaction. With that measure, I created a second measure that calculates the total $$$ value, using the first measure as filter context. The value measure worked with a small data set (Table = "Patterns", roughly 2,800 rows). Now that I am using the same value measure with a larger data set (Table = "All Patterns", roughly 350,000 rows), Power BI Desktop processes for an extremely long time without producing any output. Both data sets come from the same source.

What are your best practices for filtering large data in measures? The measures I am referencing are below, and a sample PBIX is linked on GitHub for context. Any advice will be greatly appreciated.

 

PBIX in GitHub = Pattern Values Count PBIX 

(Measure 1 = counts the most recent status for a transaction)

Pattern Count.ALLOppID = 
SUMX(
    SUMMARIZE(
        ALLSELECTED('All Patterns'),
        'All Patterns'[OPP_ID],
        "Date", MAX('All Patterns'[MIN_LAST_MODIFIED])
    ),
    IF(
        [Date] >= CALCULATE( MIN('All Patterns'[MIN_LAST_MODIFIED]) )
            && [Date] <= CALCULATE( MAX('All Patterns'[MIN_LAST_MODIFIED]) ),
        1, BLANK()
    )
)

(Measure 2 = calculates the total $$$ value, using Measure 1 as the filter context)

Pattern Value 2.ALL = 
CALCULATE(
    [Sum Value.ALL],
    FILTER(
        'All Patterns',
        [Pattern Count.ALLOppID]
    )
)

 

1 ACCEPTED SOLUTION
AlexisOlson
Super User

You are essentially using Measure 1 as a calculated column in Measure 2. A more efficient approach is to define a calculated column that is 1 if the transaction is the latest and blank otherwise. Stored as a calculated column, the value is computed once at refresh time and can be reused in multiple measures.

 

IsLastTransaction = 
IF (
    'All Patterns'[MIN_LAST_MODIFIED]
        = CALCULATE (
            MAX ( 'All Patterns'[MIN_LAST_MODIFIED] ),
            ALLEXCEPT ( 'All Patterns', 'All Patterns'[OPP_ID] )
        ),
    1
)

 

Now your measures can be greatly simplified. The first one doesn't even need the calculated column:

Pattern Count.ALLOppID = DISTINCTCOUNT ( 'All Patterns'[OPP_ID] )

 

The second one does use the calculated column:

Pattern Value 2.ALL = 
CALCULATE(
    [Sum Value.ALL],
    'All Patterns'[IsLastTransaction] = 1
)


Anonymous
Not applicable

@AlexisOlson Thank you! I'm curious why a calculated column worked so well versus a measure. I was always under the impression that measures should be considered first. Regardless, your advice helped and brought relief to a stressful problem. So much gratitude for your support!

Your impression is fairly accurate in general. Calculated columns are a good idea in situations where:

 

  1. The column values do not depend on filter context. (Calculated columns are not dynamic; they are computed once at refresh.)
  2. The column values would otherwise need to be repeatedly recalculated.

With regard to #1, note that this approach would not work if you needed to filter your dates with a slicer and find the maximum date within that filter context. If you need that flexibility, you would need to do everything in measures (which could still be much more efficient than what you had).
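If you do need that slicer-driven behavior, a measure-only variant is possible. A hedged sketch (assuming the same 'All Patterns' columns and the poster's [Sum Value.ALL] base measure; the measure name is hypothetical): iterate the OPP_IDs visible in the current filter context, find each one's latest MIN_LAST_MODIFIED within that context, and sum the value only for rows at that date.

Pattern Value (Dynamic) = 
SUMX (
    VALUES ( 'All Patterns'[OPP_ID] ),
    // Context transition: latest date for this OPP_ID, respecting any slicers
    VAR LatestDate =
        CALCULATE ( MAX ( 'All Patterns'[MIN_LAST_MODIFIED] ) )
    RETURN
        CALCULATE (
            [Sum Value.ALL],
            'All Patterns'[MIN_LAST_MODIFIED] = LatestDate
        )
)

Because the iteration runs over the distinct OPP_ID values rather than all 350,000 rows, this should still perform far better than wrapping a measure inside FILTER over the whole table.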

JirkaZ
Solution Specialist
Solution Specialist

So basically, you need to be wise about your DAX. The code you're using forces the engine to evaluate each row individually (you can verify this in DAX Studio). In general, you want to avoid that with large datasets.

Anonymous
Not applicable

@JirkaZ Thank you for the advice. What I am struggling with is that the requirement is to take the most recent status. I understand that it is best practice not to evaluate each row individually, but how can I structure the measure to avoid that? Apologies, but I am so lost.
