DAX vs SQL performance

anton_burtsev · ‎01-23-2017

Hello,

I have a performance issue with cohort analysis in Power BI. I have a log of sessions for a web site in table ga_sessions. It contains 30 million records and has the following columns

gid - visitor id from google analytics
time - beginning of sessions
medium - type of traffic: SEM, EMAIL, SEO, DIRECT, REFERRAL, etc.
cost - price of that session (e.g. click price)
revenue - purchases happend in session

(and also other fields as "ad campaign", "region", etc)

I want to select a cohort of users in past and look how they behave up to present. In T-SQL I write

SELECT s.medium, SUM(s.cost) cost, sum(s.revenue) revenue
FROM (
        SELECT gid, MIN(time) time FROM ga_sessions
        WHERE medium = 'email' AND time >= '2016-03-01' AND time < '2016-04-01'
        GROUP BY gid
) c
JOIN ga_sessions s ON c.gid = s.gid AND s.time >= c.time
GROUP BY s.medium
ORDER BY cost DESC

This query finishes in 300 milliseconds. But I need the same in Power BI:

EVALUATE
CALCULATETABLE (
    SUMMARIZE (
        GENERATE (
            SUMMARIZE (
                'Cohort Sessions',
                'Cohort Sessions'[gid],
                "c_time", MIN ( 'Cohort Sessions'[time] )
            ),
            FILTER (
                Sessions,
                Sessions[gid] = 'Cohort Sessions'[gid] && Sessions[time] >= [c_time]
            )
        ),
        Sessions[medium],
        "cost", SUM ( Sessions[cost] ),
        "revenue", SUM ( Sessions[revenue] ),
    ),
    'Cohort Sessions'[cohort medium] = "email",
    'Cohort Sessions'[time] >= DATE ( 2016, 03, 01 ),
    'Cohort Sessions'[time] < DATE ( 2016, 04, 01 )
)
ORDER BY [cost] desc

DAX query finishes in 7 seconds (9 seconds on cold cache)!

If I comment part Sessions[time] >= [c_time] the query finishes in 1.7 sec, but this condition is very important for me.

I created 2 instance of ga_sessions table ('Sessions' and 'Cohort Sessions') in Power BI since I need to provide user with distinct filters for cohort definition and futher sessions.

How I can optimize DAX query ?

Greg_Deckler · ‎01-23-2017

Any thought around doing the join and such in your query ("M") versus in DAX?

Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: DAX For Humans

DAX is easy, CALCULATE makes DAX hard...