Solved: Re: Average of answer by question with multiple va...

Max_Kloosterman · ‎11-26-2018

Hi,

I am analysing survey results and would like to average the answers for a subset of the questions based on their question type.

All answers are in a single column. Questiontypes are likert, open and yes/no, the values for likert type questions are numerical from 1 to 5. the answers to the open questions contain text.

I expect I should use the AVERAGEX function and CALCULATETABLE to run the averages on but lack the knowledge to calculate the right table so that it only includes likert type questions and averaging becomes possible. Help is much appreciated!

Kind regards,

Max

Structure of my dataset:

3 Tables which are crosslinked.

The Answers table contains:

AnswerID

QuestionID

RespondentID

Answer

The QuestionsTable contains:

QuestionID

QuestionType

Question

QuestionTopic

The Respondentstable contains:

OrganisationID

ResponentID

Organisation

OrganisationType

LivioLanzo · ‎01-15-2019

Hi @Max_Kloosterman,

if you want to analyze just LIKERT questions, then you should import only these ones into Power BI. Do you know how to do that with Power Query?

Once it is imported it's just a matter of defining relationships and using dax AVERAGE function

Did I answer your question correctly? Mark my answer as a solution!

Proud to be a Datanaut!

View solution in original post

LivioLanzo · ‎11-26-2018

Hello @Max_Kloosterman

would you be able to post a sample of your data and expected results?

Did I answer your question correctly? Mark my answer as a solution!

Proud to be a Datanaut!

Max_Kloosterman · ‎11-26-2018

Hi Livio,

Sure!

example from Answers Table:

Example from questions table:

Example from respondents table:

Expected Results

A table I can base various graphs on that looks like:

Kind regards,

Max

LivioLanzo · ‎11-26-2018

Hi again @Max_Kloosterman

unless I am missing something, I think we're missing a table telling us what's the correct answer per question. Otherwise based on what the score would be calculated?

Did I answer your question correctly? Mark my answer as a solution!

Proud to be a Datanaut!

Max_Kloosterman · ‎11-26-2018

Hi @LivioLanzo

There is no correct answer. The statements are scored disagree completely (1) to agree completely (5).

Hope that clarifies,

Kind regards,

Max

LivioLanzo · ‎11-26-2018

Hi @Max_Kloosterman,

What about the questions which are not LIKERT? The ones with OPEN Text and YN, should they not be considered?

Did I answer your question correctly? Mark my answer as a solution!

Proud to be a Datanaut!

Max_Kloosterman · ‎12-12-2018

Hi @LivioLanzo,

sorry for my late reply!

The open questions can be ignored, the Y/N questions as well. The likert questions are the basis for a comparison between respondents / groups of respondents.

This is probably where I stranded, I couldn't get a measure calculated on the column containing the likert scores (as well as the other scores) to work.

Kind regards,

Max

Max_Kloosterman · ‎01-15-2019

Hi @LivioLanzo

Thanks for your help and questions so far - hope the clarification helped?

Kind regards,

Max

LivioLanzo · ‎01-15-2019

Hi @Max_Kloosterman,

if you want to analyze just LIKERT questions, then you should import only these ones into Power BI. Do you know how to do that with Power Query?

Once it is imported it's just a matter of defining relationships and using dax AVERAGE function

Did I answer your question correctly? Mark my answer as a solution!

Proud to be a Datanaut!

Max_Kloosterman · ‎01-17-2019

Hi @LivioLanzo

Thanks and that makes sense - I should be able to figure out how to do that, part of my learning journey 🙂

One follow-up question on data modelling in this respect:

There are three question types in my survey of which likert is one. Would you recommend having three different fact tables (one for the answers belonging to each question type) or instead to create three columns - one for each question type - and just have one fact table with a lot of null values?

I am using the answer table as my fact table and questions and respondents as dimension tables. Does that make sense?

Kind regards,

Max

LivioLanzo · ‎01-17-2019

Hi @Max_Kloosterman,

if you want to work out the averages as done by your previous example then I would say you need a separate fact table for the Questions LIKERT because for these questions the answers are numeric and for the other questions they are not, therefore you couldn't be averaging strings.

Generally speaking though, you would wanna count how many people gave a certain answer or how people responded to a specific question, therefore you could have everything within one fact table and you wouldn't need to have three columns per each question type. The question type Foreign Key can be in just one column and would be linked to the questions dimensions.

And you can make even more improvements by coming to this final model:

Did I answer your question correctly? Mark my answer as a solution!

Proud to be a Datanaut!

Max_Kloosterman · ‎02-07-2019

Thanks again @LivioLanzo and I am currenlty using the proposed setup to make the report work.

I do run into one problem though with this setup though.

One question asks to select from a multiple choice list:

For example: choose the role that closes fits your own from the list below:

- Director

- SVP

- VP

- Employee of the Month

- Janitor

When no respondent selects janitor: then how do I show in a graph that we have no janitors using the proposed setup? since it was never selected it would not be part of the answer table correct?

LivioLanzo · ‎02-07-2019

Hi @Max_Kloosterman

have you tried within your visual to allow for 'Show Items with No Data'?

Did I answer your question correctly? Mark my answer as a solution!

Proud to be a Datanaut!

Max_Kloosterman · ‎02-07-2019

Hi @LivioLanzo,

Not yet and your suggestion got me thinking about a solution that might work in this specific case. I will work on figuring out what might work tonight.

The thing is that the connector I am using to get data from surveymonkey is giving me two tables:

One table with all questions - essentially the questionnaire

One table with all responses - only the ones that have been selected.

I think I will try the following steps to get to where i need to be:

1. Import both tables from surveymonkey

2. join them into one table - which would indeed give me the choices that have never been selected as lines in the table with no respondents against them.

3. use that base table to create the structure you described earlier - than it might work.