cancel
Showing results for
Did you mean:

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Helper III

## Survey Results Matched Pairs

Hi there,

I'm trying to create a measure that will calculate the average percentage change between two survey results.

I've got a .pbix file with my test data here:

The data looks like this:

 Enrollment ID Survey Type Gen Well Being Score Survey Date ERL01 Initial 3 01/01/2023 ERL01 Final 6 01/06/2023 ERL02 Initial 01/01/2023 ERL03 Initial 02/01/2023 ERL03 Final 4 01/06/2023 ERL02 Final 01/06/2023 ERL04 Initial 5 01/02/2023 ERL04 Initial 5 01/02/2023 ERL05 Initial 3 01/11/2022 ERL05 Final 4 01/02/2023

Its a bit messy, but I don't want to clean it up in Power Query because there's other survey question data in the same rows and I'd end up with a table for each of about 20 questions, and lots of repeated data.

I need the measure to:

- Calulate the average percent change between surveys, in given date period (determined by the visualisation)

- It must only look at enrollments that have a score for both the initial and final surveys

- Blank/null values must be removed from both inital and final surveys.

- The initial survey can sit outside of an applied date filter - i.e. if a table is filtered to look at the percentage change in 2023, it can do this by including initials surveys from before that year, basically the date context will be taken only from the final survey.

So in the data above there are only two valid matched pairs, ERL01 and ERL5. The average percent change therefore should be an 66.6% increase. ERL01 is a 100% increase, and ERL05 is a 33.3% increase.

I've tried the following measure adapted from a similar senario I found in these forums, but its just coming up with a blank value.

Percent change in Wellbeing Score (Clients with Initial and Final Survey scores) =
VAR ClientsWithTwoSurveys =
// Varible to identify clients that have more than one response to the General Wellbeing question
FILTER (
DISTINCT ( 'Outcome Surveys'[Enrollment ID ] ),
CALCULATE (
DISTINCTCOUNT ( 'Outcome Surveys'[Enrollment ID ] ) >= 2,
// identifies survey responses where there are more than one response
'Outcome Surveys'[Gen Well Being Score] >= 0 // and where data exists for the General Wellbeing question
)
)
VAR AvgRating1 =
CALCULATE (
AVERAGE ( 'Outcome Surveys'[Gen Well Being Score] ),
// finds the average wellbeing score
ClientsWithTwoSurveys,
// uses the variable above as a filter so that we only count completed surveys where a second survey for the same client also exists
'Outcome Surveys'[Survey Type] = "Initial",
// filters out final surveys
REMOVEFILTERS ( 'Calendar'[Date] ) // This should remove any filter applied to a visualisation on the date, so that an inital survey that was carried out outside of a date filter are still counted
)
VAR AvgRating2 =
// As above with those surveys labelled as final, but date filter should be applied in visualisations, i.e. if a year or month filter is applied this will be based on the date of the final assessment only, and inital assessment will not be filtered.
CALCULATE (
AVERAGE ( 'Outcome Surveys'[Gen Well Being Score] ),
ClientsWithTwoSurveys,
'Outcome Surveys'[Survey Type] = "Final"
)
VAR Result =
// Basic percentage change calulation
DIVIDE (
AvgRating1 - AvgRating2,
AvgRating1
)
RETURN
Result

Any help would be much appreciated.

Many thanks,

1 ACCEPTED SOLUTION
Super User

@adamlang If your year filter is coming from something like a slicer outside the visual then you will need ALL instead. If the filter is coming from inside your visual, then ALLSELECTED can be used instead to preserve your external filters and remove the internal (visual) filters.

``````Average Percent Change =
VAR __Valid =
SELECTCOLUMNS(
FILTER(
SUMMARIZE(
FILTER( ALL('Table'), [Gen Well Being Score] <> BLANK() ),
[Enrollment ID],
"__Count", COUNTROWS( 'Table' )
),
[__Count] = 2
),
"__id", [Enrollment ID]
)
VAR __Table =
FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
"__Prev",
VAR __id = [Enrollment ID]
VAR __Result = SUMX( FILTER( ALL('Table'), [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
RETURN
__Result
),
"__Diff", [Gen Well Being Score] - [__Prev]
),
"__Percent", DIVIDE( [__Diff], [__Prev], 0 )
)
VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
__Result``````

Become an expert!: Enterprise DNA
External Tools: MSHGQM
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
10 REPLIES 10
Community Support

You can create two measures as follow.

``````Percent =
VAR _fid =
CALCULATE (
MAX ( [Enrollment ID] ),
FILTER (
'Table',
[Gen Well Being Score] <> BLANK ()
&& [Survey Type] = "Final"
&& MAX ( [Survey Date] ) >= MIN ( 'Date'[Date] )
&& MAX ( [Survey Date] ) <= MAX ( 'Date'[Date] )
)
)
VAR _i =
CALCULATE (
MAX ( [Enrollment ID] ),
FILTER (
'Table',
[Gen Well Being Score] <> BLANK ()
&& [Survey Type] = "Initial"
&& YEAR ( [Survey Date] ) = YEAR ( MAX ( 'Date'[Date] ) )
|| YEAR ( [Survey Date] )
= YEAR ( MAX ( 'Date'[Date] ) ) - 1
)
)
VAR _fscore =
CALCULATE (
MAX ( [Gen Well Being Score] ),
FILTER ( 'Table', [Enrollment ID] = _fid && [Survey Type] = "Final" )
)
VAR _iscore =
CALCULATE (
MAX ( [Gen Well Being Score] ),
FILTER ( 'Table', [Enrollment ID] = _i && [Survey Type] = "Initial" )
)
RETURN
IF (
_fid = _i
&& _fid <> BLANK ()
&& _i <> BLANK (),
( _fscore - _iscore ) / _iscore,
BLANK ()
)``````

``AveragePercent = AVERAGEX(VALUES('Table'[Enrollment ID]), 'Table'[Percent])``

Is this the result you expect?

Best Regards,
Yulia Xu

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helper III

Thanks for engaging with this problem.

I need to present the result in a slightly different way. As a table with the years (probably will be quarters in my actual dataset) as rows. With your measures the average for 2023 comes as 1 (100%) so its only looking at ERL01 in this context.  Dispite the 2023 table filter, I need it to include in the calculation the initial survey from 2022. The result of the average percent for 2023 should be 66%.

Hope that makes sense. Sorry I wasn't as clear about that part as I should have been, most of the measures I use aren't as complex as this one.

I think the table for the Percent measure, produces a total, which seems to be the average percent, I just need that to filter correctly by year (within the table) based just on the final survey, and basically ingnoring the date of the inital survey.  Sorry no it doesn't that was Gregs measure above.

Thanks again for responding.

Super User

``````Average Diff =
VAR __Valid =
SELECTCOLUMNS(
FILTER(
SUMMARIZE(
FILTER( 'Table', [Gen Well Being Score] <> BLANK() ),
[Enrollment ID],
"__Count", COUNTROWS( 'Table' )
),
[__Count] = 2
),
"__id", [Enrollment ID]
)
VAR __Table =
FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
"__Prev",
VAR __id = [Enrollment ID]
VAR __Result = SUMX( FILTER( 'Table', [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
RETURN
__Result
),
"__Diff", [Gen Well Being Score] - [__Prev]
)
VAR __Result = AVERAGEX( __Table, [__Diff] )
RETURN
__Result``````

Become an expert!: Enterprise DNA
External Tools: MSHGQM
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
Helper III

Thanks @Greg_Deckler

That's much closer to where I got, isn't it.

I think we just need to:

1. remove the date context somehow from the initial survey (so that an initial survey from a prior period will still be used in the calculation even if the visualisation has a date filter applied to it), e.g. if the table is showing the change for each year, I need the year to be determined by the final survey date, and to include an initial survey from a previous year in the calculation.

and,

2. the average difference is useful, but I’m not sure how in this coding to divide the difference by the initial survey score - to return the percentage change. Would that need to be a new column in the virtual table, and then the result is an average of that column?

Thanks again, its a big help.

Super User

@adamlang So for the 2nd question, I think this:

``````Average Diff =
VAR __Valid =
SELECTCOLUMNS(
FILTER(
SUMMARIZE(
FILTER( 'Table', [Gen Well Being Score] <> BLANK() ),
[Enrollment ID],
"__Count", COUNTROWS( 'Table' )
),
[__Count] = 2
),
"__id", [Enrollment ID]
)
VAR __Table =
FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
"__Prev",
VAR __id = [Enrollment ID]
VAR __Result = SUMX( FILTER( 'Table', [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
RETURN
__Result
),
"__Diff", [Gen Well Being Score] - [__Prev]
)
"__Percent", DIVIDE( [__Diff], [__Prev], 0 )
)
VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
__Result``````

Not sure I am following the first question, perhaps it has to do with some kind of context coming from the visual? You can use ALL or ALLEXCEPT or ALLSELECTED to remove context that you do not want.

Become an expert!: Enterprise DNA
External Tools: MSHGQM
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
Helper III

Thanks @Greg_Deckler

That's great for the percentage change, I just added a comma to line 26 to fix a syntax error that was coming up.

In terms of the date, in this test example if I filter a visual by the year 2023, I end up with a percentage change of 100%, that's because its ignoring the ERL05 Initial Survey from 2022, and looking only at ERL01. I need it to include that survey (even if a 2023 filter is applied) because the Final survey took place in 2023. So basically filtering by 2023 should still give a percentage change of 66%, rather than 100%.

In my actual data, the table will have a row for every quarter (maybe month as well).

I tried adding in allselected, like this, but I get a syntax error, thinking that also I may need to add an ALLSELECTED to the _Valid variable as well, but not sure where to do that?

Average Percent Change =
VAR __Valid =
SELECTCOLUMNS(
FILTER(
SUMMARIZE(
FILTER( 'Table', [Gen Well Being Score] <> BLANK() ),
[Enrollment ID],
"__Count", COUNTROWS( 'Table' )
),
[__Count] = 2
),
"__id", [Enrollment ID]
)
VAR __Table =
FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
"__Prev",
VAR __id = [Enrollment ID]
VAR __Result = CALCULATE(SUMX( FILTER( 'Table', [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] ), ALLSELECTED('Table'[Survey Date])
RETURN
__Result
),
"__Diff", [Gen Well Being Score] - [__Prev]
),
"__Percent", DIVIDE( [__Diff], [__Prev], 0 )
)
VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
__Result

Thanks,

Super User

@adamlang If your year filter is coming from something like a slicer outside the visual then you will need ALL instead. If the filter is coming from inside your visual, then ALLSELECTED can be used instead to preserve your external filters and remove the internal (visual) filters.

``````Average Percent Change =
VAR __Valid =
SELECTCOLUMNS(
FILTER(
SUMMARIZE(
FILTER( ALL('Table'), [Gen Well Being Score] <> BLANK() ),
[Enrollment ID],
"__Count", COUNTROWS( 'Table' )
),
[__Count] = 2
),
"__id", [Enrollment ID]
)
VAR __Table =
FILTER( 'Table', [Enrollment ID] IN __Valid && [Survey Type] = "Final" ),
"__Prev",
VAR __id = [Enrollment ID]
VAR __Result = SUMX( FILTER( ALL('Table'), [Enrollment ID] = __id && [Survey Type] = "Initial" ), [Gen Well Being Score] )
RETURN
__Result
),
"__Diff", [Gen Well Being Score] - [__Prev]
),
"__Percent", DIVIDE( [__Diff], [__Prev], 0 )
)
VAR __Result = AVERAGEX( __Table, [__Percent] )
RETURN
__Result``````

Become an expert!: Enterprise DNA
External Tools: MSHGQM
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
Helper III

Thanks for your help on this.  Would it be at all possbile to tweak the scenario a bit. Upon testing the dax measure you created it became apparent that there are instences where enrollments have more than two surveys - either more than one initial survey or more than one final survey, in such cases we need to take the earliest date and where there are multiple final survey we need to take the most recent date.

Not sure if I should post as a new thread.

Some updated test data would look like:

 Enrollment ID Survey Type Gen Well Being Score Survey Date ERL01 Initial 3 01 January 2023 ERL01 Final 6 01 June 2023 ERL02 Initial 01 January 2023 ERL03 Initial 02 January 2023 ERL03 Final 4 01 June 2023 ERL02 Final 01 June 2023 ERL04 Initial 5 01 February 2023 ERL04 Initial 5 01 February 2023 ERL05 Initial 3 01 November 2022 ERL05 Final 4 01 February 2023 ERL01 Initial 2 20 December 2022 ERL05 Final 5 15 February 2023

The result would be a table as follows:

 Year Average Percent Change Wellbeing 2023 133.3%

The 133% come from:

ERL01 First Initial Survey = 2; Final Survey = 6 is a 200% increase (6-2 = 4, 4/2 = 2)

ERL05 Initial Survey = 3; lastest final survey = 5 is a 66.6% increase (5-3 = 2, 2/3 = 0.666)

The dax you wrote before filters out enrollments that don't have 2 valid survey responses, and therefore provides a blank for the new text data, where the only valid responses have 3 survey responses.

Obviously no obligation to respond, and happy to post as a new thread if that's prefered. You've obviously provided a solution to the question I asked originally.

Many thanks,

Helper III

Thanks @Greg_Deckler

Your code works as expected. With ALL as you provided.

Apoligies it took be a while to accept it as the solution. It was throwing up an error I couldn't understand for a few days, and then it clicked - my calendar table and the date field in the survey table were not formatted correctly so the relationship wasn't working. I worked it out eventually.

Upon further testing with my actual data it turns out that Enrollment IDs sometime have more than 2 completed responses - i.e. they might have two inital survey and one final, or multiple final surveys. I need to do a bit of thinking around what's needed, but its likely in such cases we want to only take the earliest dated initial survey, and the most recent/ latest dated final survey.

Thanks again for the help, very much appreciated. Thanks also to @v-xuxinyi-msft.

Helper III

Thanks @Greg_Deckler ,

Should be ALLSELECTED I need in that case, as the Years will be within the table. I tried just swapping the two 'ALL's to ALLSELECTED, but that just returns a zero.

Not sure if it matters but the Year is coming through a relationship between the [Survey Date] and my calendar table.

Thanks for all the help.

Love the book btw, clearly I need to read more of it.

Announcements

#### New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

#### Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

#### Fabric certifications survey

Certification feedback opportunity for the community.

Top Solution Authors
Top Kudoed Authors