
Anonymous
Not applicable

Advice for Filtering Large Data

I created a measure that counts the most recent status for each transaction. With that measure, I created a second measure that calculates the total $$$ value, using the first measure as filter context. The value measure worked with a small data set (Table = "Patterns", roughly 2,800 rows). Now that I am using the same value measure with a larger data set (Table = "All Patterns", roughly 350,000 rows), Power BI Desktop processes for an extremely long time without producing any output. Both data sets come from the same source.

What are your best practices for filtering large data in measures? The measures I am referencing are below, and a sample PBIX is linked on GitHub for context. Any advice will be greatly appreciated.

 

PBIX in GitHub = Pattern Values Count PBIX 

(Measure 1 = counts the most recent status for a transaction)

Pattern Count.ALLOppID = 
SUMX(
    SUMMARIZE(
        ALLSELECTED('All Patterns'),
        'All Patterns'[OPP_ID],
        "Date", MAX('All Patterns'[MIN_LAST_MODIFIED])
    ),
    IF(
        [Date] >= CALCULATE( MIN('All Patterns'[MIN_LAST_MODIFIED]) )
            && [Date] <= CALCULATE( MAX('All Patterns'[MIN_LAST_MODIFIED]) ),
        1, BLANK()
    )
)

(Measure 2 = calculates the total $$$ value, using Measure 1 as the filter context)

Pattern Value 2.ALL = 
CALCULATE(
    [Sum Value.ALL],
    FILTER(
        'All Patterns',
        [Pattern Count.ALLOppID]
    )
)

 

1 ACCEPTED SOLUTION
AlexisOlson
Super User

You are essentially using Measure 1 as a calculated column in Measure 2. A more efficient approach is to define a calculated column that is 1 if the transaction is the latest and blank otherwise. Stored as a calculated column, the value is computed once at refresh time and can be reused in multiple measures.

 

IsLastTransaction = 
IF (
    'All Patterns'[MIN_LAST_MODIFIED]
        = CALCULATE (
            MAX ( 'All Patterns'[MIN_LAST_MODIFIED] ),
            ALLEXCEPT ( 'All Patterns', 'All Patterns'[OPP_ID] )
        ),
    1
)

 

Now your measures can be greatly simplified. The first one doesn't even need the calculated column:

Pattern Count.ALLOppID = DISTINCTCOUNT ( 'All Patterns'[OPP_ID] )

 

The second one does use the calculated column:

Pattern Value 2.ALL = 
CALCULATE(
    [Sum Value.ALL],
    'All Patterns'[IsLastTransaction] = 1
)


Anonymous
Not applicable

@AlexisOlson Thank you! I'm curious why a calculated column worked so well versus a measure. I was always under the impression that measures should be considered first. Regardless, your advice helped and brought relief to a stressful problem. So much gratitude for your support!

Your impression is fairly accurate in general. Calculated columns are a good idea in situations where:

 

  1. The column values do not depend on filter context. (Calculated columns are not dynamic; they are computed once at refresh.)
  2. The column values would otherwise need to be repeatedly recalculated.

With regard to #1, note that this approach would not work if you needed to filter your dates with a slicer and find the maximum date within that filter context. If you need that flexibility, you would need to do everything in measures (which could still be much more efficient than what you had).
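If you do need that slicer-driven behavior, a measure-only variant is possible. A hedged sketch (assuming the same 'All Patterns' columns and the poster's [Sum Value.ALL] base measure; the measure name is hypothetical): iterate the OPP_IDs visible in the current filter context, find each one's latest MIN_LAST_MODIFIED within that context, and sum the value only for rows at that date.

Pattern Value (Dynamic) = 
SUMX (
    VALUES ( 'All Patterns'[OPP_ID] ),
    // Context transition: latest date for this OPP_ID, respecting any slicers
    VAR LatestDate =
        CALCULATE ( MAX ( 'All Patterns'[MIN_LAST_MODIFIED] ) )
    RETURN
        CALCULATE (
            [Sum Value.ALL],
            'All Patterns'[MIN_LAST_MODIFIED] = LatestDate
        )
)

Because the iteration runs over the distinct OPP_ID values rather than all 350,000 rows, this should still perform far better than wrapping a measure inside FILTER over the whole table.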

JirkaZ
Solution Specialist
Solution Specialist

So basically, you need to be wise about your DAX. The code you're using forces the engine to evaluate each row individually (you can verify this in DAX Studio). In general, you want to avoid that with large datasets.

Anonymous
Not applicable

@JirkaZ Thank you for the advice. What I am struggling with is that the requirement is to take the most recent status. I understand that it is best practice not to evaluate each row individually, but how can I structure the measure to avoid that? Apologies, but I am so lost.
