Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
Anonymous
Not applicable

(Very basic question) How to calculate the average of a subset of a row

First off, apologies for posting such a basic question - I've been working with Power BI for about a week with no real experience in it and I'm having to learn as I go with whatever help I can find on Google!

 

I've merged a couple of tables together to give me some of the fundamental information I need for a profitability report (see examples in screenshot) and have linked it to master data containing an overall "time worked" value from another sheet.  Of course because these aren't unique values (i.e. it's many-many) the merged value ("Total Hours") is duplicating across each applicable line.  I've dealt with this before using grouping/index columns but in this case those tricks won't work.

 

What I need to do therefore is average out the Total Hours column by the count of Staff ID and Period (calculated to be end of month) but I don't know the syntax.  I found another solution which suggested using an average over each but that applies the result to every line item equally and doesn't filter down by unique entry.

 

Please could I get a pointer?  In the example attached lines 2 to 4 would show 15.25, lines 5 to 7 would show 20.5, etc.

 

Untitled.jpg

1 ACCEPTED SOLUTION
Anonymous
Not applicable

I was able to create a solution for myself since originally asking.  I created a concatenate column in the data table (not in PQE) combining the staff ID and period to create a single vector indicator.  Then I created a second column with a count of that filter using COUNTX, FILTER and EARLIER:

Concat Count = COUNTX(FILTER('All Master Data', EARLIER('All Master Data' -Concat- )='All Master Data' -Concat- ),'All Master Data' -Concat- ).
 
From there I divided the billed hours by the Concat Count column and was able to get the proper pro-rata value in my measures.  I appreciate your answer likely would have worked too @v-kkf-msft but as I mentioned at the outset - and sadly something which is often ignored/missed in technical forums like this - I don't yet have the underlying knowledge to know whether a proposal is likely to be technically correct without a lot of stress testing, and can't always explain in the right terms what I'm looking for!

View solution in original post

7 REPLIES 7
Anonymous
Not applicable

I was able to create a solution for myself since originally asking.  I created a concatenate column in the data table (not in PQE) combining the staff ID and period to create a single vector indicator.  Then I created a second column with a count of that filter using COUNTX, FILTER and EARLIER:

Concat Count = COUNTX(FILTER('All Master Data', EARLIER('All Master Data' -Concat- )='All Master Data' -Concat- ),'All Master Data' -Concat- ).
 
From there I divided the billed hours by the Concat Count column and was able to get the proper pro-rata value in my measures.  I appreciate your answer likely would have worked too @v-kkf-msft but as I mentioned at the outset - and sadly something which is often ignored/missed in technical forums like this - I don't yet have the underlying knowledge to know whether a proposal is likely to be technically correct without a lot of stress testing, and can't always explain in the right terms what I'm looking for!
Greg_Deckler
Community Champion
Community Champion

@Anonymous Where does count of staff id come from? And the Period? Sorry, having trouble following, can you post sample data as text and expected output?
Not really enough information to go on, please first check if your issue is a common issue listed here: https://community.powerbi.com/t5/Community-Blog/Before-You-Post-Read-This/ba-p/1116882

Also, please see this post regarding How to Get Your Question Answered Quickly: https://community.powerbi.com/t5/Community-Blog/How-to-Get-Your-Question-Answered-Quickly/ba-p/38490

The most important parts are:
1. Sample data as text, use the table tool in the editing bar
2. Expected output from sample data
3. Explanation in words of how to get from 1. to 2.



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Thank you @Greg_Deckler .  To answer your points:

  • There are several tables holding static and aggregate data - the Staff table for example has a single list of Staff Names and Staff IDs which are used in some mappings.  Revenue data comes from another source and is driven by Staff ID and Client ID.  Period is a list of month end dates taken from time entry data and converted to end of month during transformation.  So as an example (all data invented):
    • Staff ID may be JA/Jack Adams, MB/Melanie Bonner and ST/Steve Toussand for example.
    • Client ID may be ALP01/Alpha Co, TEL02/Telecomms Solutions, ZER04/Zeraya Experts for example.
    • Jack may work with Alpha Co in Period July 2021 and will do 3 hours of work for them.  He may then work with Zeraya for 7 hours that same month.  In August he works 20 hours with Telecomms.  Melanie works only with Alpha and does 5 hours of work in May, June and July.  Steve works on a piecemeal basis and posts 2 hours to Alpha in June 2021, but then later enters another time entry for that same month for a further 3.75 hours to Alpha.  He also posts 1.5 hours to Telecomms in July, 0.75 hours to Zeraya in July and 8 hours to Alpha, 2 hours to Telecomms and 1.75 to Zeraya in August.
    • The resulting data:
      • Client ID        |        Staff ID        |        Period        |        Hours

ALP01                    JA                      July 2021                3.00

ZER04                    JA                      July 2021                7.00

TEL02                    JA                      August 2021           20.00

ALP01                   MB                     May 2021               5.00

ALP01                   MB                     June 2021               5.00

ALP01                   MB                     July 2021                5.00

ALP01                   ST                       June 2021               2.00

ALP01                   ST                       June 2021               3.75

TEL02                    ST                       July 2021                1.50

ZER04                   ST                       July 2021                 0.75

ALP01                   ST                       August 2021            8.00

TEL02                   ST                       August 2021            2.00 

ZER04                   ST                       August 2021            1.75

  • Alongside this are associated cost and revenue rates but these have already been calculated as a fixed value per hour so are not causing any issues at the moment.
  • Each of these three people will bill me for their time at a fixed rate, but for the total hours they work in the month, whether on client work or not (there will be admin time for example):
    • Jack will bill me for [Total Hours] 15.00 hours in July, 22.00 hours in August.
    • Melanie will bill me for 6.50 hours each month.
    • Steve will bill me for 6.50 hours in June, 3.00 hours in July and 15.00 hours in August.
  • The problem therefore is allocating out those hours billed against the total hours worked - the distribution doesn't matter, Jack's bill for July, for example, could be split 50% against each of his July timesheet lines, or split 30%/70% against his time worked - it will be used in the same way in the visualisation whichever way it is presented.
  • However because there is a many to many relationship between Jack's name, the period he works, and the period he bills me for, adding his billing data through a merge will look like this:
    •         Client ID        |        Staff ID        |        Period        |        Hours        |        Billed

ALP01                    JA                      July 2021                3.00                    15.00

ZER04                    JA                      July 2021                7.00                    15.00

  • And therefore his total will be wrong in summation because he will be recorded as having billed 30 hours (two entries at 15.00 hours each because of the duplication).  Ideally the result would look like this:
    •         Client ID        |        Staff ID        |        Period        |        Hours        |        Billed

ALP01                    JA                      July 2021                3.00                    7.50

ZER04                    JA                      July 2021                7.00                    7.50

 

I did try to find this particular issue in the FAQ you posted but I can't see it.  I also can't post the PBIX file as it contains some sensitive data on staff and clients - I'm struggling to explain this as you can see because I'm not yet familiar with the nomenclature, I'm an old hand at Excel and what I could do there doesn't always translate to what I can do here.

@Anonymous Where is Billed coming from? Seems like you could just SUMMARIZE or SUMMARIZECOLUMNS and take the average?



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Hi @Greg_Deckler, as I understand it that's a DAX function whereas I'm processing this in Power Query with the intent of getting a table of standardized data I can use in visualisations - should I be doing this in DAX instead then?

 

The billed value is taken from a static table (linked to a spreadsheet) which is updated by hand at the end of each month.

Hi @Anonymous ,

 

Has your problem been solved? If the problem is still not resolved, please provide detailed error information or the expected result you expect. Let me know immediately, looking forward to your reply.

 

Best Regards,
Winniz

Hi @Anonymous ,

 

Please try the following formula:

 

AverageBilled = 
CALCULATE(
    MAX('Table'[Billed]) / COUNT('Table'[Staff ID]),
    FILTER(
        'Table',
        'Table'[Staff ID] = EARLIER('Table'[Staff ID]) && 'Table'[Period] = EARLIER('Table'[Period])
    )
)

image.png

 

If the problem is still not resolved, please provide detailed error information or the expected result you expect. Let me know immediately, looking forward to your reply.

Best Regards,
Winniz

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors