Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
anton_burtsev
New Member

DAX vs SQL performance

Hello,

I have a performance issue with cohort analysis in Power BI. I have a log of sessions for a web site in table ga_sessions. It contains 30 million records and has the following columns

 

  • gid - visitor id from google analytics
  • time - beginning of sessions
  • medium - type of traffic: SEM, EMAIL, SEO, DIRECT, REFERRAL, etc.
  • cost - price of that session (e.g. click price)
  • revenue - purchases happend in session

(and also other fields as "ad campaign", "region", etc)

 

I want to select a cohort of users in past and look how they behave up to present. In T-SQL I write

 

SELECT s.medium, SUM(s.cost) cost, sum(s.revenue) revenue
FROM (
        SELECT gid, MIN(time) time FROM ga_sessions
        WHERE medium = 'email' AND time >= '2016-03-01' AND time < '2016-04-01'
        GROUP BY gid
) c
JOIN ga_sessions s ON c.gid = s.gid AND s.time >= c.time
GROUP BY s.medium
ORDER BY cost DESC

 

This query finishes in 300 milliseconds. But I need the same in Power BI:

 

EVALUATE
CALCULATETABLE (
    SUMMARIZE (
        GENERATE (
            SUMMARIZE (
                'Cohort Sessions',
                'Cohort Sessions'[gid],
                "c_time", MIN ( 'Cohort Sessions'[time] )
            ),
            FILTER (
                Sessions,
                Sessions[gid] = 'Cohort Sessions'[gid] && Sessions[time] >= [c_time]
            )
        ),
        Sessions[medium],
        "cost", SUM ( Sessions[cost] ),
        "revenue", SUM ( Sessions[revenue] ),
    ),
    'Cohort Sessions'[cohort medium] = "email",
    'Cohort Sessions'[time] >= DATE ( 2016, 03, 01 ),
    'Cohort Sessions'[time] < DATE ( 2016, 04, 01 )
)
ORDER BY [cost] desc

 

DAX query finishes in 7 seconds (9 seconds on cold cache)!

 

If I comment part Sessions[time] >= [c_time] the query finishes in 1.7 sec, but this condition is very important for me.

 

I created 2 instance of ga_sessions table ('Sessions' and 'Cohort Sessions') in Power BI since I need to provide user with distinct filters for cohort definition and futher sessions.

ch.png

 

How I can optimize DAX query ?

1 REPLY 1
Greg_Deckler
Community Champion
Community Champion

Any thought around doing the join and such in your query ("M") versus in DAX?



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.