Solved: Re: Distinct Count on High-Cardinality Dimension F...

alexandra-pbi · ‎11-17-2025

Hi all,

I’m working on a Fabric semantic model sourcing data directly from Microsoft Warehouse (using tables, not views). My model includes (excluding other dimensions):

Fact table: ~3 billion rows
Dimension table: ~500 million rows
Relationship: one-to-many (Dim → Fact)

I need to calculate distinct counts on dimension attributes, based on filters applied on the fact table. The current setup is DirectQuery, and my capacity is F126.

Given the scale of the data, I want to know the most optimum way to perform this DISTINCTCOUNT.

What would be the recommended approach in Fabric / DirectQuery for this scenario?

Thanks in advance for any guidance!

sevenhills · ‎11-17-2025

Adding Aggregation table is the best approach.
~ at source (table)
~ at manage aggregations in Power BI
https://learn.microsoft.com/en-us/power-bi/transform-model/aggregations-advanced
goal: smaller aggregated fact table

I noticed that you have dimenion of 500 M rows, which is typically a lot for dimension table. Unless it is a master list of combinations (like bridge table).

Please see if you can reduce the size or row count in other means! again each requirement is different, just sharing my thoughts!

Optional: Read: https://www.sqlbi.com/wp-content/uploads/Understanding-Distinct-Count-in-DAX-Query-Plans.pdf

View solution in original post

v-tejrama · ‎11-17-2025

Hi @alexandra-pbi ,

With the data volumes you described, DirectQuery cannot perform a distinct count efficiently by joining the fact and dimension on the fly, even when the fact table is heavily filtered. The only reliable way to make this operation perform well is to use an aggregated table that precomputes the distinct count at the level your report needs.

If you prefer not to change the Warehouse, you can still create this aggregated table directly in the semantic model as an Import table and map it with Manage Aggregations. A dimension with five hundred million rows is also unusually large, so reducing its size where possible will help. There is no alternative setting or DAX technique that can make a direct distinct count over these tables run efficiently in DirectQuery without pre-aggregation.

Best Regards,
Tejaswi.
Community Support

v-tejrama · ‎11-20-2025

Hi @alexandra-pbi ,

I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions.

Thank you.

v-tejrama · ‎11-23-2025

Hi @alexandra-pbi ,

I wanted to follow up and see if you had a chance to review the information shared. If you have any further questions or need additional assistance, feel free to reach out.

Thank you.

sevenhills · ‎11-17-2025

Adding Aggregation table is the best approach.
~ at source (table)
~ at manage aggregations in Power BI
https://learn.microsoft.com/en-us/power-bi/transform-model/aggregations-advanced
goal: smaller aggregated fact table

I noticed that you have dimenion of 500 M rows, which is typically a lot for dimension table. Unless it is a master list of combinations (like bridge table).

Please see if you can reduce the size or row count in other means! again each requirement is different, just sharing my thoughts!

Optional: Read: https://www.sqlbi.com/wp-content/uploads/Understanding-Distinct-Count-in-DAX-Query-Plans.pdf

PijushRoy · ‎11-17-2025

Hi @alexandra-pbi

The best option is to create an Aggregation table in the Fabric Warehouse and then use this table in the Semantic Model.

In your Microsoft Warehouse, create a new table that groups by the attributes you typically filter or group by and pre-calculates the distinct count.

Please find the reference link - https://learn.microsoft.com/en-us/fabric/enterprise/powerbi/aggregations-auto

https://www.sqlbi.com/tv/aggregations-in-power-bi/

Example

CREATE TABLE fact_customer_distinctcount_agg AS
  SELECT Cast(f.orderdate AS DATE)    AS OrderDate,
         p.productcategory,
         g.region,
         Count(DISTINCT f.customerid) AS DistinctCustomerCount
  FROM   facttable f
         JOIN dimproduct p
           ON f.productid = p.productid
         JOIN dimgeography g
           ON f.geographyid = g.geographyid
  GROUP  BY Cast(f.orderdate AS DATE),
            p.productcategory,
            g.region;

Did I answer your question? Mark my post as a solution!
Appreciate your Like/Kudos

Proud to be a Super User!

alexandra-pbi · ‎11-17-2025

Hi @PijushRoy

Thank you for your prompt reply.

I would like to avoid any changes in the Warehouse.

Do you believe there is no alternative, even with a basic star schema, to execute this distinct count efficiently?

Ideally, I am looking for a way to direct the engine to perform an inner join between the fact and the dimension after applying filters that reduce the fact table size.

Thank you for your guidance.

Distinct Count on High-Cardinality Dimension Filtered by Fact in Fabric

Helpful resources

Power BI Dataviz World Championships

Power BI Monthly Update - December 2025

FabCon Atlanta 2026

How to Get Your Question Answered Quickly

FabCon is coming to Atlanta

Distinct Count on High-Cardinality Dimension Filtered by Fact in Fabric

Helpful resources

Power BI Dataviz World Championships

Power BI Monthly Update - December 2025

FabCon Atlanta 2026

How to Get Your Question Answered Quickly