Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
maqsud
Helper III
Helper III

Calculate correlation between multiple columns

I have table name Data imported in PowerBI

IDIMPTEXT1IMPTEXT2A
1454.5
2544.3
3444.7

 

 

I want Table in PowerBI which can calculate correlation between IMPTEXT1 with and IMPTEXT2 with A.

Some Column NameA
IMPTEXT10.78 (some correlation value)
IMPTEXT20.85 (some correlation value)

 

I tried following but this measure gives me single correlation value. And i am looking for Table in PowerBI which can be calculated through some DAX.

 

Correlation Coefficient = 
VAR Correlation_Table =
    FILTER (
        ADDCOLUMNS (
            VALUES ( Data[ID] ),
            "Value_X", CALCULATE ( SUM ( Data[IMPTEXT1] ) ),
            "Value_Y", CALCULATE ( SUM ( Data[IMPTEXT2] ) )
        ),
        AND (
            NOT ( ISBLANK ( [Value_X] ) ),
            NOT ( ISBLANK ( [Value_Y] ) )
        )
    )
VAR Count_Items =
    COUNTROWS ( Correlation_Table )
VAR Sum_X =
    SUMX ( Correlation_Table, [Value_X] )
VAR Sum_X2 =
    SUMX ( Correlation_Table, [Value_X] ^ 2 )
VAR Sum_Y =
    SUMX ( Correlation_Table, [Value_Y] )
VAR Sum_Y2 =
    SUMX ( Correlation_Table, [Value_Y] ^ 2 )
VAR Sum_XY =
    SUMX ( Correlation_Table, [Value_X] * [Value_Y] )
VAR Pearson_Numerator =
    Count_Items * Sum_XY - Sum_X * Sum_Y
VAR Pearson_Denominator_X =
    Count_Items * Sum_X2 - Sum_X ^ 2
VAR Pearson_Denominator_Y =
    Count_Items * Sum_Y2 - Sum_Y ^ 2
VAR Pearson_Denominator =
    SQRT ( Pearson_Denominator_X * Pearson_Denominator_Y )
RETURN
    DIVIDE ( Pearson_Numerator, Pearson_Denominator )

 

 

Thank you for time and consideration.

1 ACCEPTED SOLUTION
v-easonf-msft
Community Support
Community Support

Hi , @maqsud 

The Dax you provided is  just to   calculate correlation between IMPTEXT1 and IMPTEXT2 , so it gives you single correlation value.

I think what you want is to calculate  the correlation between IMPTEXT1 and A 

 

Correlation Coefficient_IMPTEXT1 = 
VAR Correlation_Table =
    FILTER (
        ADDCOLUMNS (
            VALUES ( Data[ID] ),
            "Value_X", CALCULATE ( SUM ( Data[IMPTEXT1] ) ),
            "Value_Y", CALCULATE ( SUM ( Data[A] ) )
        ),
        AND (
            NOT ( ISBLANK ( [Value_X] ) ),
            NOT ( ISBLANK ( [Value_Y] ) )
        )
    )
VAR Count_Items =
    COUNTROWS ( Correlation_Table )
VAR Sum_X =
    SUMX ( Correlation_Table, [Value_X] )
VAR Sum_X2 =
    SUMX ( Correlation_Table, [Value_X] ^ 2 )
VAR Sum_Y =
    SUMX ( Correlation_Table, [Value_Y] )
VAR Sum_Y2 =
    SUMX ( Correlation_Table, [Value_Y] ^ 2 )
VAR Sum_XY =
    SUMX ( Correlation_Table, [Value_X] * [Value_Y] )
VAR Pearson_Numerator =
    Count_Items * Sum_XY - Sum_X * Sum_Y
VAR Pearson_Denominator_X =
    Count_Items * Sum_X2 - Sum_X ^ 2
VAR Pearson_Denominator_Y =
    Count_Items * Sum_Y2 - Sum_Y ^ 2
VAR Pearson_Denominator =
    SQRT (ABS( Pearson_Denominator_X * Pearson_Denominator_Y ))
RETURN
    DIVIDE ( Pearson_Numerator, Pearson_Denominator )

 

and the correlation between IMPTEXT2 and A .

 

Correlation Coefficient_IMPTEXT2 = 
VAR Correlation_Table =
    FILTER (
        ADDCOLUMNS (
            VALUES ( Data[ID] ),
            "Value_X", CALCULATE ( SUM ( Data[IMPTEXT2] ) ),
            "Value_Y", CALCULATE ( SUM ( Data[A] ) )
        ),
        AND (
            NOT ( ISBLANK ( [Value_X] ) ),
            NOT ( ISBLANK ( [Value_Y] ) )
        )
    )
VAR Count_Items =
    COUNTROWS ( Correlation_Table )
VAR Sum_X =
    SUMX ( Correlation_Table, [Value_X] )
VAR Sum_X2 =
    SUMX ( Correlation_Table, [Value_X] ^ 2 )
VAR Sum_Y =
    SUMX ( Correlation_Table, [Value_Y] )
VAR Sum_Y2 =
    SUMX ( Correlation_Table, [Value_Y] ^ 2 )
VAR Sum_XY =
    SUMX ( Correlation_Table, [Value_X] * [Value_Y] )
VAR Pearson_Numerator =
    Count_Items * Sum_XY - Sum_X * Sum_Y
VAR Pearson_Denominator_X =
    Count_Items * Sum_X2 - Sum_X ^ 2
VAR Pearson_Denominator_Y =
    Count_Items * Sum_Y2 - Sum_Y ^ 2
VAR Pearson_Denominator =
    SQRT ( ABS(Pearson_Denominator_X * Pearson_Denominator_Y ))
RETURN
    DIVIDE ( Pearson_Numerator, Pearson_Denominator )

 

 

If I misunderstand what you mean ,please feel free to let me know.

 

Best Regards,
Community Support Team _ Eason

 

View solution in original post

3 REPLIES 3
v-easonf-msft
Community Support
Community Support

Hi , @maqsud 

The Dax you provided is  just to   calculate correlation between IMPTEXT1 and IMPTEXT2 , so it gives you single correlation value.

I think what you want is to calculate  the correlation between IMPTEXT1 and A 

 

Correlation Coefficient_IMPTEXT1 = 
VAR Correlation_Table =
    FILTER (
        ADDCOLUMNS (
            VALUES ( Data[ID] ),
            "Value_X", CALCULATE ( SUM ( Data[IMPTEXT1] ) ),
            "Value_Y", CALCULATE ( SUM ( Data[A] ) )
        ),
        AND (
            NOT ( ISBLANK ( [Value_X] ) ),
            NOT ( ISBLANK ( [Value_Y] ) )
        )
    )
VAR Count_Items =
    COUNTROWS ( Correlation_Table )
VAR Sum_X =
    SUMX ( Correlation_Table, [Value_X] )
VAR Sum_X2 =
    SUMX ( Correlation_Table, [Value_X] ^ 2 )
VAR Sum_Y =
    SUMX ( Correlation_Table, [Value_Y] )
VAR Sum_Y2 =
    SUMX ( Correlation_Table, [Value_Y] ^ 2 )
VAR Sum_XY =
    SUMX ( Correlation_Table, [Value_X] * [Value_Y] )
VAR Pearson_Numerator =
    Count_Items * Sum_XY - Sum_X * Sum_Y
VAR Pearson_Denominator_X =
    Count_Items * Sum_X2 - Sum_X ^ 2
VAR Pearson_Denominator_Y =
    Count_Items * Sum_Y2 - Sum_Y ^ 2
VAR Pearson_Denominator =
    SQRT (ABS( Pearson_Denominator_X * Pearson_Denominator_Y ))
RETURN
    DIVIDE ( Pearson_Numerator, Pearson_Denominator )

 

and the correlation between IMPTEXT2 and A .

 

Correlation Coefficient_IMPTEXT2 = 
VAR Correlation_Table =
    FILTER (
        ADDCOLUMNS (
            VALUES ( Data[ID] ),
            "Value_X", CALCULATE ( SUM ( Data[IMPTEXT2] ) ),
            "Value_Y", CALCULATE ( SUM ( Data[A] ) )
        ),
        AND (
            NOT ( ISBLANK ( [Value_X] ) ),
            NOT ( ISBLANK ( [Value_Y] ) )
        )
    )
VAR Count_Items =
    COUNTROWS ( Correlation_Table )
VAR Sum_X =
    SUMX ( Correlation_Table, [Value_X] )
VAR Sum_X2 =
    SUMX ( Correlation_Table, [Value_X] ^ 2 )
VAR Sum_Y =
    SUMX ( Correlation_Table, [Value_Y] )
VAR Sum_Y2 =
    SUMX ( Correlation_Table, [Value_Y] ^ 2 )
VAR Sum_XY =
    SUMX ( Correlation_Table, [Value_X] * [Value_Y] )
VAR Pearson_Numerator =
    Count_Items * Sum_XY - Sum_X * Sum_Y
VAR Pearson_Denominator_X =
    Count_Items * Sum_X2 - Sum_X ^ 2
VAR Pearson_Denominator_Y =
    Count_Items * Sum_Y2 - Sum_Y ^ 2
VAR Pearson_Denominator =
    SQRT ( ABS(Pearson_Denominator_X * Pearson_Denominator_Y ))
RETURN
    DIVIDE ( Pearson_Numerator, Pearson_Denominator )

 

 

If I misunderstand what you mean ,please feel free to let me know.

 

Best Regards,
Community Support Team _ Eason

 

harshnathani
Community Champion
Community Champion

Hi @maqsud ,

 

Please check if this helps.

 

https://www.youtube.com/watch?v=xtRXakHARZg

 

Regards,

Harsh Nathani

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

Power BI Carousel June 2024

Power BI Monthly Update - June 2024

Check out the June 2024 Power BI update to learn about new features.

RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.