Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Hi all,
I’m working on a Fabric semantic model sourcing data directly from Microsoft Warehouse (using tables, not views). My model includes (excluding other dimensions):
Fact table: ~3 billion rows
Dimension table: ~500 million rows
Relationship: one-to-many (Dim → Fact)
I need to calculate distinct counts on dimension attributes, based on filters applied on the fact table. The current setup is DirectQuery, and my capacity is F126.
Given the scale of the data, I want to know the most optimum way to perform this DISTINCTCOUNT.
What would be the recommended approach in Fabric / DirectQuery for this scenario?
Thanks in advance for any guidance!
Solved! Go to Solution.
Adding Aggregation table is the best approach.
~ at source (table)
~ at manage aggregations in Power BI
https://learn.microsoft.com/en-us/power-bi/transform-model/aggregations-advanced
goal: smaller aggregated fact table
I noticed that you have dimenion of 500 M rows, which is typically a lot for dimension table. Unless it is a master list of combinations (like bridge table).
Please see if you can reduce the size or row count in other means! again each requirement is different, just sharing my thoughts!
Optional: Read: https://www.sqlbi.com/wp-content/uploads/Understanding-Distinct-Count-in-DAX-Query-Plans.pdf
Hi @alexandra-pbi ,
With the data volumes you described, DirectQuery cannot perform a distinct count efficiently by joining the fact and dimension on the fly, even when the fact table is heavily filtered. The only reliable way to make this operation perform well is to use an aggregated table that precomputes the distinct count at the level your report needs.
If you prefer not to change the Warehouse, you can still create this aggregated table directly in the semantic model as an Import table and map it with Manage Aggregations. A dimension with five hundred million rows is also unusually large, so reducing its size where possible will help. There is no alternative setting or DAX technique that can make a direct distinct count over these tables run efficiently in DirectQuery without pre-aggregation.
Best Regards,
Tejaswi.
Community Support
Hi @alexandra-pbi ,
I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions.
Thank you.
Hi @alexandra-pbi ,
I wanted to follow up and see if you had a chance to review the information shared. If you have any further questions or need additional assistance, feel free to reach out.
Thank you.
Adding Aggregation table is the best approach.
~ at source (table)
~ at manage aggregations in Power BI
https://learn.microsoft.com/en-us/power-bi/transform-model/aggregations-advanced
goal: smaller aggregated fact table
I noticed that you have dimenion of 500 M rows, which is typically a lot for dimension table. Unless it is a master list of combinations (like bridge table).
Please see if you can reduce the size or row count in other means! again each requirement is different, just sharing my thoughts!
Optional: Read: https://www.sqlbi.com/wp-content/uploads/Understanding-Distinct-Count-in-DAX-Query-Plans.pdf
Hi @alexandra-pbi
The best option is to create an Aggregation table in the Fabric Warehouse and then use this table in the Semantic Model.
In your Microsoft Warehouse, create a new table that groups by the attributes you typically filter or group by and pre-calculates the distinct count.
Please find the reference link - https://learn.microsoft.com/en-us/fabric/enterprise/powerbi/aggregations-auto
https://www.sqlbi.com/tv/aggregations-in-power-bi/
Example
CREATE TABLE fact_customer_distinctcount_agg AS
SELECT Cast(f.orderdate AS DATE) AS OrderDate,
p.productcategory,
g.region,
Count(DISTINCT f.customerid) AS DistinctCustomerCount
FROM facttable f
JOIN dimproduct p
ON f.productid = p.productid
JOIN dimgeography g
ON f.geographyid = g.geographyid
GROUP BY Cast(f.orderdate AS DATE),
p.productcategory,
g.region;
Proud to be a Super User! | |
Hi @PijushRoy
Thank you for your prompt reply.
I would like to avoid any changes in the Warehouse.
Do you believe there is no alternative, even with a basic star schema, to execute this distinct count efficiently?
Ideally, I am looking for a way to direct the engine to perform an inner join between the fact and the dimension after applying filters that reduce the fact table size.
Thank you for your guidance.
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!
| User | Count |
|---|---|
| 39 | |
| 37 | |
| 33 | |
| 32 | |
| 29 |
| User | Count |
|---|---|
| 132 | |
| 88 | |
| 82 | |
| 68 | |
| 64 |