Deepak6642
Frequent Visitor

Filter dataset based on rows belonging to all selected categories.

I am looking to do a cohort analysis of students across multiple grades, with the caveat that the cohort must contain the same students in every grade.

 

For example, a class of 20 students graduates from Grade 1 to Grade 5 over 5 years. In this cohort, some students leave the class, and some new students join over the years.

 

I am struggling to find score averages across the 5 years for only those students who attended every grade from 1 to 5.

 

Consider the following sample dataset:

 

student_id, grade, score
1,1,100
2,1,80
3,1,85
4,1,90
5,1,93
1,2,100 // Graduates to grade 2
2,2,80
3,2,85
4,2,90
6,2,93 // New student joins, and student #5 leaves
1,3,100 // Graduates to grade 3
2,3,80
3,3,85
4,3,90
6,3,93
7,3,87 // New student joins in grade 3

 

So my cohort should only consider student nos. 1 through 4 when computing averages, counts of students across grades, etc.

 

Can you advise?

2 REPLIES
v-frfei-msft
Community Support

Hi @Deepak6642,

 

Based on my test, we can take the following steps to meet your requirement.

 

1. Enter the data (here split into one table per grade, named '1', '2' and '3') and create a calculated table using the formula below. Then create relationships between the tables based on student_id.

 

Table = 
DISTINCT (
    UNION (
        SELECTCOLUMNS ( '1', "new", '1'[student_id] ),
        SELECTCOLUMNS ( '2', "new", '2'[student_id] ),
        SELECTCOLUMNS ( '3', "new", '3'[student_id] )
    )
)

2. Create the measures below. Then create a table visual and filter it by [Measure].

 

Measure = IF(MAX('1'[grade]) = BLANK() || MAX('2'[grade])=BLANK() || MAX('3'[grade])= BLANK(), BLANK(),1)
ave 1 = SUMX(ALLSELECTED('1'),'1'[score])/[count]
ave 2 = SUMX(ALLSELECTED('2'),'2'[score])/[count]
ave 3 = SUMX(ALLSELECTED('3'),'3'[score])/[count]
count = CALCULATE(COUNT('Table'[new]),ALLSELECTED('Table'))

[Image: Capture.PNG]

 

For more details, please check the attached pbix. If it doesn't meet your requirement, kindly share your expected result with me.

 

https://www.dropbox.com/s/be9en7zernaeeyv/ilter%20dataset%20based%20on%20rows%20belonging%20to%20all...

 

Regards,

Frank

Community Support Team _ Frank
If this post helps, please consider accepting it as the solution to help others find it more quickly.

Thank you for the attempt @v-frfei-msft.

 

However, this is not close to the expected result. As listed in the original message, the data for the various grades comes from a single table, not from separate tables.

 

The dashboard must allow the user to choose a start and end grade. That choice should filter the results so that the aggregates are computed over the same student cohort.

 

The resulting graph/table must show the aggregate value across only the students who exist in every grade within the chosen range.

 

[Image: final.png]

 

Above you can see the plot of score aggregates for a subject across the grades. Each aggregate covers only the students who are present in all of the selected grades. It excludes students who joined or left partway through the selected grade range.
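
One possible way to express this requirement against a single table is sketched below. It is only a sketch under assumptions: the table name Scores is hypothetical (the columns student_id, grade and score come from the sample dataset), and the start/end grade choice is assumed to be a slicer placed directly on Scores[grade]. The idea is to count how many of the selected grades each student appears in and keep only the students whose count equals the number of selected grades.

```dax
-- Assumption: a single table named Scores (hypothetical name) with the
-- columns student_id, grade and score, and a slicer on Scores[grade].

Cohort Avg Score =
VAR SelectedGrades = ALLSELECTED ( Scores[grade] )   -- grades chosen in the slicer
VAR GradeCount = COUNTROWS ( SelectedGrades )
VAR CohortStudents =
    FILTER (
        ALL ( Scores[student_id] ),
        -- number of selected grades in which this student has a row
        CALCULATE ( DISTINCTCOUNT ( Scores[grade] ), SelectedGrades ) = GradeCount
    )
RETURN
    -- average for the current grade, restricted to the cohort
    CALCULATE ( AVERAGE ( Scores[score] ), CohortStudents )

Cohort Student Count =
VAR SelectedGrades = ALLSELECTED ( Scores[grade] )
VAR GradeCount = COUNTROWS ( SelectedGrades )
RETURN
    COUNTROWS (
        FILTER (
            ALL ( Scores[student_id] ),
            CALCULATE ( DISTINCTCOUNT ( Scores[grade] ), SelectedGrades ) = GradeCount
        )
    )
```

With the sample dataset and grades 1 to 3 selected, only students 1 through 4 have a row in all three grades, so the cohort count would be 4 and the average for each grade would be (100 + 80 + 85 + 90) / 4 = 88.75. Because ALL ( Scores[student_id] ) ignores any filter on students, a student slicer would not narrow the cohort; swapping in VALUES is one option if that behaviour is wanted.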

 

 
