Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
seigneur101
New Member

Nested SUMMARIZE doesn't work with VAR?

Hello everyone!

 

I came up with a solution to a problem that works, but I am not sure whether it is the best solution. However, regardless of that, I don't understand why one of my many other trials doesn't work given my final solution.

 

I want to clarify why my final solution works, but not my other one.

 

Here is a gross summary of what I was trying to achieve:

 

  1. The data look like this:
    • Dimensions: Company (a.k.a. Enterprise/Business), Function (a.k.a. Job/Department), and CompensationType (e.g. base salary, bonuses, ...)
    • Fact table: Compensation fact table (list of salary entries by compensation type, company, and function; many entries for each company, because the underlying data basically represent unique employees)
  2. The calculation required is this: Get the average median salary (by function) of the average salaries (by function by company)
    1. First, we calculate the average salaries for each (company * function) combination for the selected compensation types. The resulting table should have 1 row per company * function, showing the average.
    2. We calculate the median company average for each function. The resulting table should have 1 row per function, showing the median average.
    3. We calculate the average of the medians of the functions selected

 

[OPTIONAL READING: For the curious reader, we always start with company averages rather than taking the granular salary entries to ensure that large companies do not skew the calcs.

 

Furthermore, the third step (taking the average of the medians) becomes "necessary" when someone's job encompasses the responsibilities of two (or more) jobs, but the underlying data is scarse for one or more of the jobs; again, trying not to skew and take the arithmetic average instead.

 

Maybe not the best, I don't know, but that's what the specs were.]

 

Basically, we want the average median of the averages.

 

The following code does exactly what we want:

 

Measure_Works = 
AVERAGEX(
	SUMMARIZE(
		SUMMARIZE(
			'FactCompensation';
			DimCompany[CompanyID]; DimFunction[FunctionId];
			"Average"; 
			AVERAGE(FactCompensation[CompensationAmount])
		);
		DimFunction[FunctionId];
		"Percentile"; 
		PERCENTILEX.INC(
			SUMMARIZE(
				'FactCompensation';
				DimCompany[CompanyID]; DimFunction[FunctionId];
				"Average"; 
				AVERAGE([CompensationAmount])
			); 
			[Average]; 
			0,50
		)
	);
	[Percentile]
)

 

However, this is hard to understand, and the code contains duplicate sections. At first, I had tried to use "var" in order to split the calculations into logical groupings, and also avoid code repetitions altogether. My logic was this:

 

  1. I'll calculate the averages by company/function first;
  2. I'll calculate the medians of 1), and;
  3. I'll calculate the average of the medians in 2).

 

This is what the code looked like (doesn't work):

 

Measure_DoesNotWork = 
var tblAverageByOrg =
    SUMMARIZE(
        FactCompensation;
        DimCompany[CompanyID]; DimFunction[FunctionId];
        "Average"; AVERAGE(FactCompensation[CompensationAmount])
    )

var tblPercentiles =
	summarize(
		tblAverageByOrg;
		DimFunction[FunctionId];
		"Percentile"; PERCENTILEX.INC(tblAverageByOrg; [Average]; 0,50)
	)

return AVERAGEX(tblPercentiles; [Percentile])

If you replace the variables used by their definition, you'll quickly see that the second code snippet becomes exactly the same as the first code snippet (which works).

 

Both measures do not return the same results.

 

Am I mistaken in thinking that they should have returned the same results? Am I missing something?

 

I have prepared a folder in which you'll be able to see what I mean. The folder includes a Power BI with the data imported in it, as well as an Excel spreadsheet that replicates both calculations (one that is OK, one that is wrong), and have provided the DAX calculations for both measures as well. In the Excel spreadsheet, you'll see that the second step of the calculations (calculating the median of the averages by function) does not work; it calculates the median over all the rows, irrespective of function.

 

I have shared the folder from my personal Google Driver. Please find it here: PBI_Embedded_Summarize_Example.

 

Why does the first code snippet work, but not the second one?

 

Thank you!

 

Philippe

1 ACCEPTED SOLUTION
TomMartens
Super User
Super User

Hey,

 

you should just use SUMMARIZE to group rows and add columns using ADDCOLUMNS(...), see here:

https://www.sqlbi.com/articles/best-practices-using-summarize-and-addcolumns/

 

Under certain circumstances it's still favorable to use ADDCOLUMNS(SUMMARIZRE(...),...) instead of SUMMARIZECOLUMNS, but the aritcle above is also mentioning this.

 

Not sure if this will help, but SUMMARIZE is an iterator, meaning maybe you should consider to wrap you expression into a CALCULATE(...) to enforce context transitiont.

 

Just wondering, without using your sample files

 

Regards,

Tom



Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

View solution in original post

1 REPLY 1
TomMartens
Super User
Super User

Hey,

 

you should just use SUMMARIZE to group rows and add columns using ADDCOLUMNS(...), see here:

https://www.sqlbi.com/articles/best-practices-using-summarize-and-addcolumns/

 

Under certain circumstances it's still favorable to use ADDCOLUMNS(SUMMARIZRE(...),...) instead of SUMMARIZECOLUMNS, but the aritcle above is also mentioning this.

 

Not sure if this will help, but SUMMARIZE is an iterator, meaning maybe you should consider to wrap you expression into a CALCULATE(...) to enforce context transitiont.

 

Just wondering, without using your sample files

 

Regards,

Tom



Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.