Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Bigjogle
New Member

Calculating the Average of a group of Calculated Minimum values from a dataset

Hello all,

 

First time posting here but I have been delving through the forums and have not found anyone doing quite what I am trying to accomplish.

 

I have a folder that is being populated with log files, each log file contains run data for one run (Repetition) of an experiment setup (DOE #).

 

I would like to calculate the minimum value for "Load" for each "run" (Repetition) and then calculate the average of these minimums based on experiment setup (DOE #). 

 

Here is a link to the dataset: https://drive.google.com/file/d/1phSWp27iRWmjp0khJq49hrQ7zp0ekosQ/view?usp=sharing

Here is a link to my current pbix: https://drive.google.com/file/d/1T-Ds1G7f6K2UMuxiOiOo7OxluuGB00Ws/view?usp=sharing

 

Any help would be greatly appriciated!

 

Bigjogle_0-1698269934227.png

 

-John

1 ACCEPTED SOLUTION
TomMartens
Super User
Super User

Hey @Bigjogle ,

 

I assume this measure returns what you are looking for:

average of min values = 
SELECTCOLUMNS(
    GROUPBY(
        SELECTCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                    SUMMARIZE(
                        'Log'
                        ,   'Log'[DOE #]
                        ,   'Log'[Repetition]
                    )
                    , ALL( 'Log'[Repetition] )
                )
                , "minV" , [Min Value]
            )
            , "doe #" , [DOE #]
            , "minV" , [minV]
        )
        , [doe #]
        , "avg_minV" , AVERAGEX( CURRENTGROUP() , [minV] )
    )
    , "value" , [avg_minV]
)

At least the values look like the expected ones:
image.png
Hopefully, this provides what you are looking for.

 

Regards,

Tom



Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

View solution in original post

5 REPLIES 5
gmsamborn
Super User
Super User

Hi @Bigjogle 

 

I did this with 2 measures.  I'm sure it can be done in 1 but this seems simpler.

 

zMin = 
    CALCULATE(
        MIN( 'Log'[Load (lbf)] ),
        ALLEXCEPT( 'Log', 'Log'[DOE #], 'Log'[Repetition] )
    )

zAvg = 
    AVERAGEX(
        ALLSELECTED( 'Log'[Repetition], 'Log'[Reading] ),
        [zMin]
    )

 

I hope this helps.

This method also is a solution to the problem and thank you for your time!

Interestingly, the two solutions that are currently here give slightly  different results (likely due to how many digits went into the calculations). 
One solution gives 0.85, the other 0.84. If I manually calculate based on the 2 decimal numbers I get 0.845 (so if one solution was even off by 0.001 it could be swayed in either direction!)

Not a problem for my case, I just thought it was interesting.

TomMartens
Super User
Super User

Hey @Bigjogle ,

 

I assume this measure returns what you are looking for:

average of min values = 
SELECTCOLUMNS(
    GROUPBY(
        SELECTCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                    SUMMARIZE(
                        'Log'
                        ,   'Log'[DOE #]
                        ,   'Log'[Repetition]
                    )
                    , ALL( 'Log'[Repetition] )
                )
                , "minV" , [Min Value]
            )
            , "doe #" , [DOE #]
            , "minV" , [minV]
        )
        , [doe #]
        , "avg_minV" , AVERAGEX( CURRENTGROUP() , [minV] )
    )
    , "value" , [avg_minV]
)

At least the values look like the expected ones:
image.png
Hopefully, this provides what you are looking for.

 

Regards,

Tom



Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Thank you so much for the very rapid response. This worked great!

Hey @Bigjogle , Hey @gmsamborn ,

 

I consider calculating the average value one of the most challenging tasks in DAX. The reason for this is reflecting on the number of rows or - the divisor. When I adjust my solution a little and visualize the divisor like this:

 

 

average of min values = 
countrows(
    // GROUPBY(
        SELECTCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                    SUMMARIZE(
                        'Log'
                        ,   'Log'[DOE #]
                        ,   'Log'[Repetition]
                    )
                    , ALL( 'Log'[Repetition] )
                )
                , "minV" , [Min Value]
            )
            , "doe #" , [DOE #]
            , "minV" , [minV]
        )
    //     , [doe #]
    //     , "avg_minV" , AVERAGEX( CURRENTGROUP() , [minV] )
    // )
    // , [avg_minV]
)

 

 

and @gmsamborn solution like so:

 

 

zAvg = 
    countrows(
    // AVERAGEX(
        ALLSELECTED( 'Log'[Repetition], 'Log'[Reading] )
    //     [zMin]
    )

 

 

There are completey different results:

TomMartens_0-1698377293873.png

This will also explain the different "accuracy" when changing the number of decimals. And by writing this i realized that my initial DAX was not the solution I intended to provide, this is the one I wanted to share:

 

 

average of min values__ = 
AVERAGEX(
    GROUPBY(
        SELECTCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                    SUMMARIZE(
                        'Log'
                        ,   'Log'[DOE #]
                        ,   'Log'[Repetition]
                    )
                    , ALL( 'Log'[Repetition] )
                )
                , "minV" , [Min Value]
            )
            , "doe #" , [DOE #]
            , "minV" , [minV]
        )
        , [doe #]
        , "avg_minV" , AVERAGEX( CURRENTGROUP() , [minV] )
    )
    , [avg_minV]
)

 

 

The difference becomes obvious when the property "Total" is turned on again for the table data visualizaion:

image.png

@Bigjogle hopefully this additional context helps to understand the different results. For this specific use case I favor my appraoch. But as always, it depends 🙂 This is the reason why I bother my colleagues with an exact definition of the divisior (the number of rows) when I'm getting asked to help with calculating an average.

 

Regards,

Tom



Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Fabric Monthly Update - May 2024

Check out the May 2024 Fabric update to learn about new features.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.