Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
adamlang
Helper III
Helper III

Survey Results Matched Pairs

Hi there,

 

I'm trying to create a measure that will calculate the average percentage change between two survey results.

 

I've got a .pbix file with my test data here:

https://www.dropbox.com/scl/fo/z8ml2mnjxvehbw8y4gahc/h?rlkey=jn3cijye52o8013sh2xmwz3eq&dl=0 

 

The data looks like this:

 

Enrollment IDSurvey TypeGen Well Being ScoreSurvey Date
ERL01Initial301/01/2023
ERL01Final601/06/2023
ERL02Initial 01/01/2023
ERL03Initial 02/01/2023
ERL03Final401/06/2023
ERL02Final 01/06/2023
ERL04Initial501/02/2023
ERL04Initial501/02/2023
ERL05Initial301/11/2022
ERL05Final401/02/2023

 

Its a bit messy, but I don't want to clean it up in Power Query because there's other survey question data in the same rows and I'd end up with a table for each of about 20 questions, and lots of repeated data.

 

I need the measure to:

 - Calulate the average percent change between surveys, in given date period (determined by the visualisation)

 - It must only look at enrollments that have a score for both the initial and final surveys

 - Blank/null values must be removed from both inital and final surveys.

 - The initial survey can sit outside of an applied date filter - i.e. if a table is filtered to look at the percentage change in 2023, it can do this by including initials surveys from before that year, basically the date context will be taken only from the final survey.

 

So in the data above there are only two valid matched pairs, ERL01 and ERL5. The average percent change therefore should be an 66.6% increase. ERL01 is a 100% increase, and ERL05 is a 33.3% increase.

 

I've tried the following measure adapted from a similar senario I found in these forums, but its just coming up with a blank value.

 

Percent change in Wellbeing Score (Clients with Initial and Final Survey scores) =
VAR ClientsWithTwoSurveys =
    // Varible to identify clients that have more than one response to the General Wellbeing question
    FILTER (
        DISTINCT ( 'Outcome Surveys'[Enrollment ID ] ),
        CALCULATE (
            DISTINCTCOUNT ( 'Outcome Surveys'[Enrollment ID ] ) >= 2,
            // identifies survey responses where there are more than one response
            'Outcome Surveys'[Gen Well Being Score] >= 0 // and where data exists for the General Wellbeing question
        )
    )
VAR AvgRating1 =
    CALCULATE (
        AVERAGE ( 'Outcome Surveys'[Gen Well Being Score] ),
        // finds the average wellbeing score
        ClientsWithTwoSurveys,
        // uses the variable above as a filter so that we only count completed surveys where a second survey for the same client also exists
        'Outcome Surveys'[Survey Type] = "Initial",
        // filters out final surveys
        REMOVEFILTERS ( 'Calendar'[Date] ) // This should remove any filter applied to a visualisation on the date, so that an inital survey that was carried out outside of a date filter are still counted
    )
VAR AvgRating2 =
    // As above with those surveys labelled as final, but date filter should be applied in visualisations, i.e. if a year or month filter is applied this will be based on the date of the final assessment only, and inital assessment will not be filtered.
    CALCULATE (
        AVERAGE ( 'Outcome Surveys'[Gen Well Being Score] ),
        ClientsWithTwoSurveys,
        'Outcome Surveys'[Survey Type] = "Final"
    )
VAR Result =
    // Basic percentage change calulation
    DIVIDE (
        AvgRating1 - AvgRating2,
        AvgRating1
    )
RETURN
    Result 

 

 

Any help would be much appreciated.

 

Many thanks,


Adam

1 ACCEPTED SOLUTION

@adamlang If your year filter is coming from something like a slicer outside the visual then you will need ALL instead. If the filter is coming from inside your visual, then ALLSELECTED can be used instead to preserve your external filters and remove the internal (visual) filters.

Average Percent Change =
    VAR __Valid =
        SELECTCOLUMNS(
            FILTER(
                SUMMARIZE(
                    FILTER( ALL('Table'), [Gen Well Being Score] <> BLANK() ),
                    [Enrollment ID],
                    "__Count", COUNTROWS( 'Table' )
                ),
                [__Count] = 2
            ),
            "__id", [Enrollment ID]
        )
    VAR __Table =
      ADDCOLUMNS(
        ADDCOLUMNS(
            ADDCOLUMNS(
                FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
                "__Prev",
                    VAR __id = [Enrollment ID]
                    VAR __Result = SUMX( FILTER( ALL('Table'), [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
                RETURN
                    __Result
            ),
            "__Diff", [Gen Well Being Score] - [__Prev]
        ),
        "__Percent", DIVIDE( [__Diff], [__Prev], 0 )
    )
    VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
    __Result

Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

10 REPLIES 10
v-xuxinyi-msft
Community Support
Community Support

Hi @adamlang 

 

You can create two measures as follow.

 

Percent =
VAR _fid =
    CALCULATE (
        MAX ( [Enrollment ID] ),
        FILTER (
            'Table',
            [Gen Well Being Score] <> BLANK ()
                && [Survey Type] = "Final"
                && MAX ( [Survey Date] ) >= MIN ( 'Date'[Date] )
                && MAX ( [Survey Date] ) <= MAX ( 'Date'[Date] )
        )
    )
VAR _i =
    CALCULATE (
        MAX ( [Enrollment ID] ),
        FILTER (
            'Table',
            [Gen Well Being Score] <> BLANK ()
                && [Survey Type] = "Initial"
                && YEAR ( [Survey Date] ) = YEAR ( MAX ( 'Date'[Date] ) )
                || YEAR ( [Survey Date] )
                    = YEAR ( MAX ( 'Date'[Date] ) ) - 1
        )
    )
VAR _fscore =
    CALCULATE (
        MAX ( [Gen Well Being Score] ),
        FILTER ( 'Table', [Enrollment ID] = _fid && [Survey Type] = "Final" )
    )
VAR _iscore =
    CALCULATE (
        MAX ( [Gen Well Being Score] ),
        FILTER ( 'Table', [Enrollment ID] = _i && [Survey Type] = "Initial" )
    )
RETURN
    IF (
        _fid = _i
            && _fid <> BLANK ()
            && _i <> BLANK (),
        ( _fscore - _iscore ) / _iscore,
        BLANK ()
    )

 

 

 

AveragePercent = AVERAGEX(VALUES('Table'[Enrollment ID]), 'Table'[Percent])

 

 

vxuxinyimsft_0-1708332576552.png

Is this the result you expect?

 

Best Regards,
Yulia Xu

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi @v-xuxinyi-msft,

 

Thanks for engaging with this problem.

 

I need to present the result in a slightly different way. As a table with the years (probably will be quarters in my actual dataset) as rows. With your measures the average for 2023 comes as 1 (100%) so its only looking at ERL01 in this context.  Dispite the 2023 table filter, I need it to include in the calculation the initial survey from 2022. The result of the average percent for 2023 should be 66%.

adamlang_0-1708338951727.png

Hope that makes sense. Sorry I wasn't as clear about that part as I should have been, most of the measures I use aren't as complex as this one.

 

I think the table for the Percent measure, produces a total, which seems to be the average percent, I just need that to filter correctly by year (within the table) based just on the final survey, and basically ingnoring the date of the inital survey.  Sorry no it doesn't that was Gregs measure above.

adamlang_1-1708339624890.png

Thanks again for responding.

 

Adam

 

Greg_Deckler
Super User
Super User

@adamlang How about this? PBIX is attached below sig.

Average Diff = 
    VAR __Valid = 
        SELECTCOLUMNS( 
            FILTER( 
                SUMMARIZE( 
                    FILTER( 'Table', [Gen Well Being Score] <> BLANK() ), 
                    [Enrollment ID], 
                    "__Count", COUNTROWS( 'Table' ) 
                ), 
                [__Count] = 2 
            ), 
            "__id", [Enrollment ID] 
        )
    VAR __Table = 
        ADDCOLUMNS(
            ADDCOLUMNS(
                FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
                "__Prev", 
                    VAR __id = [Enrollment ID]
                    VAR __Result = SUMX( FILTER( 'Table', [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
                RETURN
                    __Result
            ),
            "__Diff", [Gen Well Being Score] - [__Prev]
        )
    VAR __Result = AVERAGEX( __Table, [__Diff] )
RETURN
    __Result

Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Thanks @Greg_Deckler

 

That's much closer to where I got, isn't it.

 

I think we just need to:

1. remove the date context somehow from the initial survey (so that an initial survey from a prior period will still be used in the calculation even if the visualisation has a date filter applied to it), e.g. if the table is showing the change for each year, I need the year to be determined by the final survey date, and to include an initial survey from a previous year in the calculation.

and,

2. the average difference is useful, but I’m not sure how in this coding to divide the difference by the initial survey score - to return the percentage change. Would that need to be a new column in the virtual table, and then the result is an average of that column?

Thanks again, its a big help.

 

Adam

@adamlang So for the 2nd question, I think this:

Average Diff = 
    VAR __Valid = 
        SELECTCOLUMNS( 
            FILTER( 
                SUMMARIZE( 
                    FILTER( 'Table', [Gen Well Being Score] <> BLANK() ), 
                    [Enrollment ID], 
                    "__Count", COUNTROWS( 'Table' ) 
                ), 
                [__Count] = 2 
            ), 
            "__id", [Enrollment ID] 
        )
    VAR __Table = 
      ADDCOLUMNS(
        ADDCOLUMNS(
            ADDCOLUMNS(
                FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
                "__Prev", 
                    VAR __id = [Enrollment ID]
                    VAR __Result = SUMX( FILTER( 'Table', [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
                RETURN
                    __Result
            ),
            "__Diff", [Gen Well Being Score] - [__Prev]
        )
        "__Percent", DIVIDE( [__Diff], [__Prev], 0 )
    )
    VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
    __Result

Not sure I am following the first question, perhaps it has to do with some kind of context coming from the visual? You can use ALL or ALLEXCEPT or ALLSELECTED to remove context that you do not want.


Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Thanks @Greg_Deckler 

 

That's great for the percentage change, I just added a comma to line 26 to fix a syntax error that was coming up.

 

In terms of the date, in this test example if I filter a visual by the year 2023, I end up with a percentage change of 100%, that's because its ignoring the ERL05 Initial Survey from 2022, and looking only at ERL01. I need it to include that survey (even if a 2023 filter is applied) because the Final survey took place in 2023. So basically filtering by 2023 should still give a percentage change of 66%, rather than 100%.

 

In my actual data, the table will have a row for every quarter (maybe month as well).

 

I tried adding in allselected, like this, but I get a syntax error, thinking that also I may need to add an ALLSELECTED to the _Valid variable as well, but not sure where to do that?

 

Average Percent Change =
    VAR __Valid =
        SELECTCOLUMNS(
            FILTER(
                SUMMARIZE(
                    FILTER( 'Table', [Gen Well Being Score] <> BLANK() ),
                    [Enrollment ID],
                    "__Count", COUNTROWS( 'Table' )
                ),
                [__Count] = 2
            ),
            "__id", [Enrollment ID]
        )
    VAR __Table =
      ADDCOLUMNS(
        ADDCOLUMNS(
            ADDCOLUMNS(
                FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
                "__Prev",
                    VAR __id = [Enrollment ID]
                    VAR __Result = CALCULATE(SUMX( FILTER( 'Table', [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] ), ALLSELECTED('Table'[Survey Date])
                RETURN
                    __Result
            ),
            "__Diff", [Gen Well Being Score] - [__Prev]
        ),
        "__Percent", DIVIDE( [__Diff], [__Prev], 0 )
    )
    VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
    __Result
 
Thanks,
 
Adam

@adamlang If your year filter is coming from something like a slicer outside the visual then you will need ALL instead. If the filter is coming from inside your visual, then ALLSELECTED can be used instead to preserve your external filters and remove the internal (visual) filters.

Average Percent Change =
    VAR __Valid =
        SELECTCOLUMNS(
            FILTER(
                SUMMARIZE(
                    FILTER( ALL('Table'), [Gen Well Being Score] <> BLANK() ),
                    [Enrollment ID],
                    "__Count", COUNTROWS( 'Table' )
                ),
                [__Count] = 2
            ),
            "__id", [Enrollment ID]
        )
    VAR __Table =
      ADDCOLUMNS(
        ADDCOLUMNS(
            ADDCOLUMNS(
                FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
                "__Prev",
                    VAR __id = [Enrollment ID]
                    VAR __Result = SUMX( FILTER( ALL('Table'), [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
                RETURN
                    __Result
            ),
            "__Diff", [Gen Well Being Score] - [__Prev]
        ),
        "__Percent", DIVIDE( [__Diff], [__Prev], 0 )
    )
    VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
    __Result

Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Hi @Greg_Deckler 

 

Thanks for your help on this.  Would it be at all possbile to tweak the scenario a bit. Upon testing the dax measure you created it became apparent that there are instences where enrollments have more than two surveys - either more than one initial survey or more than one final survey, in such cases we need to take the earliest date and where there are multiple final survey we need to take the most recent date.

 

Not sure if I should post as a new thread.

 

Some updated test data would look like:

 

Enrollment ID

Survey Type

Gen Well Being Score

Survey Date

ERL01

Initial

3

01 January 2023

ERL01

Final

6

01 June 2023

ERL02

Initial

 

01 January 2023

ERL03

Initial

 

02 January 2023

ERL03

Final

4

01 June 2023

ERL02

Final

 

01 June 2023

ERL04

Initial

5

01 February 2023

ERL04

Initial

5

01 February 2023

ERL05

Initial

3

01 November 2022

ERL05

Final

4

01 February 2023

ERL01

Initial

2

20 December 2022

ERL05

Final

5

15 February 2023

  

The result would be a table as follows:

 

Year

Average Percent Change Wellbeing

2023

133.3%

 

The 133% come from:

ERL01 First Initial Survey = 2; Final Survey = 6 is a 200% increase (6-2 = 4, 4/2 = 2)

ERL05 Initial Survey = 3; lastest final survey = 5 is a 66.6% increase (5-3 = 2, 2/3 = 0.666)

 

The dax you wrote before filters out enrollments that don't have 2 valid survey responses, and therefore provides a blank for the new text data, where the only valid responses have 3 survey responses.

 

Obviously no obligation to respond, and happy to post as a new thread if that's prefered. You've obviously provided a solution to the question I asked originally.

 

Many thanks,

 

Adam

 

Thanks @Greg_Deckler 

 

Your code works as expected. With ALL as you provided.

 

Apoligies it took be a while to accept it as the solution. It was throwing up an error I couldn't understand for a few days, and then it clicked - my calendar table and the date field in the survey table were not formatted correctly so the relationship wasn't working. I worked it out eventually.

 

Upon further testing with my actual data it turns out that Enrollment IDs sometime have more than 2 completed responses - i.e. they might have two inital survey and one final, or multiple final surveys. I need to do a bit of thinking around what's needed, but its likely in such cases we want to only take the earliest dated initial survey, and the most recent/ latest dated final survey.

 

Thanks again for the help, very much appreciated. Thanks also to @v-xuxinyi-msft.

 

Adam 

Thanks @Greg_Deckler ,

 

Should be ALLSELECTED I need in that case, as the Years will be within the table. I tried just swapping the two 'ALL's to ALLSELECTED, but that just returns a zero.

 

Not sure if it matters but the Year is coming through a relationship between the [Survey Date] and my calendar table.

 

Thanks for all the help.

 

Love the book btw, clearly I need to read more of it.

 

Adam

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.