March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi,
Context:
I have a table with 7 billion rows in bigquery, the table is partitioned and clustered and performs well when sending queries.
I'm trying to get the number of unique users based on a certain condition and the following SQL syntax works perfectly.
SELECT
COUNT(DISTINCT users) AS cnt_users
FROM (
SELECT
users,
AVG(transactions) AS transactions
FROM
`project.dataset.canales`
WHERE
Date BETWEEN '2023-10-01' AND '2023-11-05'
GROUP BY
users)
WHERE
transactions BETWEEN 15 AND 25;
The result is one row with the value and the query runs in 2-3 seconds.
When translate this query to dax:
light_users =
VAR _users =
SUMMARIZECOLUMNS (
canales[users],
"avg_trx", AVERAGE ( canales[transactions] )
)
VAR _filter =
FILTER ( _users, [avg_trx] >= 15 && [avg_trx] <= 25 )
RETURN
COUNTROWS ( _filter )
The result in power bi is
Looking at the query created by Dax in bigquery I can see that Dax created a temp table but the quantity is 1+ million records
The table returns to Power BI to make the final calculation.
So, How can I calculate all directly in the source and solve the problem with the returned rows?
I tried using vars, without vars and nothing works for me.
Thanks
Hi @jaime_parra ,
In your DAX expression, you are using SUMMARIZECOLUMNS, which may generate intermediate tables with detailed information for subsequent calculations. In some cases, this can lead to poor performance, especially when working with large datasets.
One way to optimize DAX queries is to use a combination of CALCULATETABLE and VALUES. the idea is to create a filtered table directly in the source and reduce the number of rows before bringing it into Power BI for the final computation.
Try formula like below:
light_users =
CALCULATE (
COUNTROWS (
VALUES ( canales[users] )
),
FILTER (
ALL ( canales ),
canales[Date] >= DATE ( 2023, 10, 1 ) &&
canales[Date] <= DATE ( 2023, 11, 5 ) &&
AVERAGE ( canales[transactions] ) >= 15 &&
AVERAGE ( canales[transactions] ) <= 25
)
)
Best Regards,
Adamk Kong
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
I have made the corresponding tests and different variations in the code. However, the result obtained in the query is blank.
When testing and validating what the code does in the dwh I see that it creates the following:
The problem observed with this metric is:
Thank you very much for your collaboration, in case of any other suggestions I will be very attentive.
Regards
Hi @jaime_parra
try without variables
light_users =
COUNTROWS (
FILTER (
SUMMARIZE (
Cancels,
canales[users],
"@avg_trx", AVERAGE ( canales[transactions] )
),
[@avg_trx] >= 15
&& [@avg_trx] <= 25
)
)
Hi @tamerj1
I told you that I did the test with the DAX you sent me and it still generates the same in the data source. So the problem persists.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
23 | |
15 | |
12 | |
9 | |
8 |
User | Count |
---|---|
41 | |
32 | |
29 | |
12 | |
12 |