Design Pattern - Groups and Super Groups!

Greg_Deckler · ‎03-07-2017

The Challenge

A fairly common data model design pattern can be described such that:

Given a set of criteria that are numerically scored
These scored criteria are grouped together in arbitrary sets and the group scored via an aggregation of criteria scores
These groups of scores are further grouped into arbitrary sets (super groups) and the super group scored via an aggregation of group scores

For example, a restaurant rating system might uses a ranking criteria of satisfactory from 0-100. These questions could be grouped into questions about "Service", "Atmosphere", "Quality", "Value" and "Cleanliness" where each group's score is simply an average of the individual question scores. These five groups are then grouped into "Restaurant" and "Food" where the score for each of these super groups is the minimum score of related sub-groups.

The Data Model

The data model to implement this design is relatively straight-forward.

[SuperGroups] 1-* [Groups2SuperGroups] 1-1 [Groups] 1-* [Attributes2Groups] 1-* [AttributeScores]

We will use the following columns and data to build a sample data model via "Enter Data" queries:

AttributeScores

AttributeScores contains two columns, "Attribute" and "Score".

Attribute	Score
Attribute 1	100
Attribute 2	75
Attribute 3	80
Attribute 4	100
Attribute 5	60
Attribute 6	30
Attribute 7	40
Attribute 8	35
Attribute 9	75
Attribute 10	90

Groups

Groups contains a single column "Group"

Group

Group1

Group2

Group3

Group4

SuperGroups

SuperGroups contains a single column "SuperGroup":

SuperGroup

SuperGroup1

SuperGroup2

Attributes2Groups

Attributes2Groups contains 2 columns "Group" and "Attribute":

Group	Attribute
Group1	Attribute1
Group1	Attribute2
Group1	Attribute3
Group2	Attribute4
Group2	Attribute5
Group3	Attribute6
Group3	Attribute7
Group3	Attribute8
Group4	Attribute9
Group4	Attribute10

Groups2SuperGroups

Groups2SuperGroups contains two columns, "Group" and "SuperGroup":

SuperGroup	Group
SuperGroup1	Group1
SuperGroup1	Group2
SuperGroup2	Group3
SuperGroup2	Group4

A Simple Solution

Create an AverageScore column in Groups:

AverageScore = CALCULATE(AVERAGE(AttributeScores[Score]),RELATEDTABLE(AttributeScores))

Groups data is now:

Groups

Group	AverageScore
Group1	85
Group2	80
Group3	35
Group4	82.5

Create a MinScore column in SuperGroups:

MinScore = CALCULATE(MIN(Groups[AverageScore]),RELATEDTABLE(Groups))

SuperGroups data is now:

SuperGroups

SuperGroup	MinScore
SuperGroup1	80
SuperGroup2	35

Adding Complexity

What happens when we change our core AttributeScores table to include an additional field so that we can store multiple "things" in the same table:

Attribute	Score	Thing
Attribute1	100	Thing1
Attribute2	75	Thing1
Attribute3	80	Thing1
Attribute4	100	Thing1
Attribute5	60	Thing1
Attribute6	30	Thing1
Attribute7	40	Thing1
Attribute8	35	Thing1
Attribute9	75	Thing1
Attribute10	90	Thing1
Attribute1	50	Thing2
Attribute2	25	Thing2
Attribute3	30	Thing2
Attribute4	50	Thing2
Attribute5	10	Thing2
Attribute6	50	Thing2
Attribute7	50	Thing2
Attribute8	50	Thing2
Attribute9	90	Thing2
Attribute10	90	Thing2

Note that the first 10 entries have remained the same other than the addition of "Thing1" in the "Thing" column. Now we have a representation of the same survey or scoring conducted on multiple "things".

Houston, We Have a Problem

By simply adding this additional field, we might think that we have not changed the data model to such a degree that our simple solution will suffice. No such luck.

If we simply put our Group and AverageScore in a table visual and add a slicer for "Thing", we get the same numbers for each Thing.

Group	AverageScore
Group1	60
Group2	55
Group3	42.5
Group4	86.25

When we should get:

Thing1

Group	AverageScore
Group1	85
Group2	80
Group3	35
Group4	82.5

Thing2

Group	AverageScore
Group1	35
Group2	30
Group3	50
Group4	90

Similarly, if we add a simple table visualization of SuperGroups to show "SuperGroup" and "MinScore", regardless of slicer selection, these remain:

SuperGroup	MinScore
SuperGroup1	55
SuperGroup2	42.5

If we try using row level security instead of slicers, the results are the same. So what is going on? Basically, calculated columns are essentially static in nature, calculated upon data refresh and are not updated contextually. Note that I realize that if you use the original "Score" value from AttributeScores and choose the dynamic "Average" aggregation that it works at the group level, but this only holds for very simplistic aggregations.

A Slightly More Complex Solution

In order to solve this problem, we switch from calculated columns to measures.

Back to the Groups table (or anywhere), we can create a measure defined such that:

AverageScoreMeasure = CALCULATE(AVERAGE(AttributeScores[Score]),RELATEDTABLE(AttributeScores))

This is EXACTLY the same DAX calculation as we had for our calculated column. However, the calculation for MinScore is dramatically different. For the MinScore as a measure, we are trying to find the minimum score of a measure and thus we must return the measure AverageScoreMeasure to our measure calculation within the correct context. Thus, to do this we define MinScoreMeasure such that:

MinScoreMeasure = MINX ( SUMMARIZE ( Groups, Groups[Group] , "AVG",[AverageScoreMeasure] ), [AVG])

What is going on here is that we are creating a summarized table using the SUMMARIZE function. We are returning two columns, the first is a grouping by our "Groups" column called "Groups" and the second is our AverageScoreMeasure calculation in a column called "AVG". We are then taking the MIN of column AVG.

Now, when we create table visualizations using our "Group" and "AverageScoreMeasure" in one table and our "SuperGroup" and "MinScoreMeasure" in anther table, we receive the correct results for "Thing1" and "Thing2" whether using slicers or RLS.

Thing1

Groups

Group	AverageScoreMeasure
Group1	85
Group2	80
Group3	35
Group4	82.5

SuperGroups

SuperGroup	MinScoreMeasure
SuperGroup1	80
SuperGroup2	35

Thing2

Groups

Group	AverageScoreMeasure
Group1	35
Group2	30
Group3	50
Group4	90

SuperGroups

SuperGroup	MinScoreMeasure
SuperGroup1	30
SuperGroup2	50

Conclusion

The designer pattern described here has many applications to just about any scenario in which scores or rankings need to be aggregated and then aggregated again. The solutions described within this article handle both simple scenarios as well as complex scenarios involving multiple base scoring sets of data. This scenario demonstrates the limitations of custom columns and the corresponding superiority of measures for handling aggregation roll-ups.

Vvelarde · ‎03-07-2017

Great article and I'm happy to have made a small contribution to this pattern.

Pratima · ‎03-08-2017

Can any one help me out on How to work on the Rest API in Power BI, More Guide lines in R Script as well as DAX Expression

Pratima · ‎03-08-2017

Can any one help me out on How to work on the Rest API in Power BI, More Guide lines in R Script as well as DAX Expression

Thanks in Advance,

Pratima

DavidMoss · ‎04-15-2017

Groups, sub groups and super groups....well thought out @Greg_Deckler

Anonymous · ‎04-20-2020

edit: I retract my earlier statement, I had a cardinality error in the relationships that I had missed even though I triple checked prior to posting. The original formula does work properly.

@Greg_Deckler I'm following along in Pwr BI and found a typo in at least on of your formulas. Above you have:

AverageScore = CALCULATE(AVERAGE(AttributeScores[Score]),RELATEDTABLE(AttributeScores))

When in fact the related table should be 'Attributes2Groups' no the actual Attribute table.

Cheers,

-R

Design Pattern - Groups and Super Groups!

How to provide sample data in the Power BI Forum

Re: Supercharge Your Visuals: Easy Conditional For...

Re: Power BI Tip: How To Sort A Column With Duplic...

Enhanced Data Profiling in Power Query: GUI and Ta...

Power BI Quick Reference Sheet