Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

View all the Fabric Data Days sessions on demand. View schedule

Reply
Anonymous
Not applicable

Datamodel: Best practice / Optimize performance on large tables

Hello,

 

I have a business problem that I'm seeking to solve, in what may not be the traditional star schema I would usually go for, due the size of the tables - both tables grow significantly each day.

 

 

Table A: Not sure if to call this a slowly changing dimension table or fact table, but it contains 300m rows of customer subscription info, containing a start and end date for when a customers has subscribed/has access to a given product.

In simple terms,1 line could be described as the following columns: CustomerID, ProductID, Start Date, End Date

A customer will at any given time have multiple products, with varying start / end dates.

 

Table B: Fact table with 900m rows, contaning the products a customer has bought at a specific point in time based on the subscriptions they have in A. There is not a direct relationship between a specific subscription and a product bought. A subscription just give access to a range of products with no relationship defined.

Columns could be: Date, Time, CustomerID, Product Purchased.

 

 

The goal is to calculate:
 - A: How many customers bought a product last month and not this month
- B: Of customers counted in A, how many had a subscription but dit not use it, and how many did no longer have a subscription.

 

A is fairly simple, as that can be calculated with simple dax logic only on table A.

 

B however requires me to check for each customer, if they have subscription that grants them access on a the specific day I'm measuring on.

 

 

 

My initial thougt was to expand table A to a day granularity, so instead of start-/end date, I would have 1 line per day that a given customer has had a access granting subscription.

Using a sample of the data, this table however gets rather big, so I doubt it will work in production.

 

Therefore...   does anyone have experience working with similar large fact / slowly changing dimension tables containing start- / end dates and knows what's best practice in terms of performance?

 

I'm aware that an alternative solution is to keep my table B as is, and then write my way out of it through DAX. Making DAX iterrate over each row to check weather Min/Max dates is within the period I'm seaching for.

 

If example tables / files are needed, I will upload if requested. 

 

3 REPLIES 3
v-lili6-msft
Community Support
Community Support

hi, @Anonymous

      Could you share some simple sample data and the expected output? You can upload it to OneDrive or Dropbox and post the link here. Do mask sensitive data before uploading. 

 

Best Regards,

Lin

Community Support Team _ Lin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

 

@v-lili6-msft PBIX sample is avaliable here, with dataset limited to a single "customer" - PBIX Sample

 

My described table A would be "fAccess_Granting_From_To_Dates' and 'fProducts Sold' is table B.

Table 'fAccess_Granting_Days' is the same information, but from and to dates expanded to individual dates, as described above.

 

Note:

I'm aware that I can summarize table 'fAccess_Granting_Days' by removing the "Access_Granting_Product" column. But I'm later expecting to be able to answer which products customers had access through, so I'm keeping it for now.

 

As you can see from the table row counts, expanding the dates increases row count from 19 to 1.146 - the cardinality will however not explode dramatically.

 

The following measure is not finished, but somewhat describes what I want to achieve:

Customers without re-purchase, but Access now = 
VAR Access_Now = CALCULATE( COUNTROWS( fAccess_Granting_Days ) )
VAR Products_Sold_This_Month = CALCULATE( COUNTROWS( 'fProducts Sold' ) )
VAR Products_Sold_Last_Month = CALCULATE( COUNTROWS( 'fProducts Sold' ) ; PREVIOUSMONTH( dDates[Date] ) )
RETURN
CALCULATE( DISTINCTCOUNT( dCustomer[CustomerID] );
    //Count Distinct Customers
    FILTER( dCustomer ;
    //Filter Customer Table to find customers with no sales this month, but sales last month and have access now
        Products_Sold_This_Month = 0 &&    
        Products_Sold_Last_Month > 0 &&
        Access_Now > 0 )
)

 

I'm aware that for DAX / end-user performance, table 'fAccess_Granting_Days'  is preferable to "fAccess_Granting_From_To_Dates' is it should return much faster query times, as a measure on "fAccess_Granting_From_To_Dates' would have to itterate over the entire table to check for the correct dates.

I'm however afraid that 'fAccess_Granting_Days' will grow far to big.

 

So what is the suggestion for best practice in such cases?

 

 

hi, @Anonymous

     Use Table 'fAccess_Granting_Days' will make your model bigger and bigger, and usually, we need a date table,

then use crossjoin function to add a measure like below:

Customers without re-purchase, but Access now = 
VAR _Table= FILTER(CROSSJOIN(fAccess_Granting_From_To_Dates,'Date'),'Date'[Date]>=fAccess_Granting_From_To_Dates[Access_From_Date]&&'Date'[Date]<=fAccess_Granting_From_To_Dates[Access_To_Date]) VAR Access_Now = CALCULATE( COUNTROWS( _Table ) ) VAR Products_Sold_This_Month = CALCULATE( COUNTROWS( 'fProducts Sold' ) ) VAR Products_Sold_Last_Month = CALCULATE( COUNTROWS( 'fProducts Sold' ) ; PREVIOUSMONTH( dDates[Date] ) ) RETURN CALCULATE( DISTINCTCOUNT( dCustomer[CustomerID] ); //Count Distinct Customers FILTER( dCustomer ; //Filter Customer Table to find customers with no sales this month, but sales last month and have access now Products_Sold_This_Month = 0 && Products_Sold_Last_Month > 0 && Access_Now > 0 ) )

  Measure will take up memory when you use, but column or table will take up memory all the time.

 

Best Regards,

Lin

Community Support Team _ Lin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
November Power BI Update Carousel

Power BI Monthly Update - November 2025

Check out the November 2025 Power BI update to learn about new features.

Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors
Top Kudoed Authors