Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

We've captured the moments from FabCon & SQLCon that everyone is talking about, and we are bringing them to the community, live and on-demand. Starts on April 14th. Register now

Reply
anton_burtsev
New Member

DAX vs SQL performance

Hello,

I have a performance issue with cohort analysis in Power BI. I have a log of sessions for a web site in table ga_sessions. It contains 30 million records and has the following columns

 

  • gid - visitor id from google analytics
  • time - beginning of sessions
  • medium - type of traffic: SEM, EMAIL, SEO, DIRECT, REFERRAL, etc.
  • cost - price of that session (e.g. click price)
  • revenue - purchases happend in session

(and also other fields as "ad campaign", "region", etc)

 

I want to select a cohort of users in past and look how they behave up to present. In T-SQL I write

 

SELECT s.medium, SUM(s.cost) cost, sum(s.revenue) revenue
FROM (
        SELECT gid, MIN(time) time FROM ga_sessions
        WHERE medium = 'email' AND time >= '2016-03-01' AND time < '2016-04-01'
        GROUP BY gid
) c
JOIN ga_sessions s ON c.gid = s.gid AND s.time >= c.time
GROUP BY s.medium
ORDER BY cost DESC

 

This query finishes in 300 milliseconds. But I need the same in Power BI:

 

EVALUATE
CALCULATETABLE (
    SUMMARIZE (
        GENERATE (
            SUMMARIZE (
                'Cohort Sessions',
                'Cohort Sessions'[gid],
                "c_time", MIN ( 'Cohort Sessions'[time] )
            ),
            FILTER (
                Sessions,
                Sessions[gid] = 'Cohort Sessions'[gid] && Sessions[time] >= [c_time]
            )
        ),
        Sessions[medium],
        "cost", SUM ( Sessions[cost] ),
        "revenue", SUM ( Sessions[revenue] ),
    ),
    'Cohort Sessions'[cohort medium] = "email",
    'Cohort Sessions'[time] >= DATE ( 2016, 03, 01 ),
    'Cohort Sessions'[time] < DATE ( 2016, 04, 01 )
)
ORDER BY [cost] desc

 

DAX query finishes in 7 seconds (9 seconds on cold cache)!

 

If I comment part Sessions[time] >= [c_time] the query finishes in 1.7 sec, but this condition is very important for me.

 

I created 2 instance of ga_sessions table ('Sessions' and 'Cohort Sessions') in Power BI since I need to provide user with distinct filters for cohort definition and futher sessions.

ch.png

 

How I can optimize DAX query ?

1 REPLY 1
Greg_Deckler
Community Champion
Community Champion

Any thought around doing the join and such in your query ("M") versus in DAX?



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.