Solved: Re: Caculating the sales date at which a certain o...

MJEnnis · ‎04-16-2023

I am trying to help my wife solve a problem related to occupancy rates for accommodations. I have created measures that accurately calculate the number of units occupied on any given date ([Occupancy]), and I have created another measure that calculates the corresponding occupancy rate ([Occupancy %]). But what I need to figure out now is something called a “stop sale” date. More specifically, that is the sales date on which the occupancy for a given type of unit reached 90% for a given occupancy date. (The date on which a unit is sold almost always proceeds the date on which the guests arrive.) Note that there are numerous different types of units (called a “risorsa”, that is, “resource”)

There are two relevant tables in the model: a table that lists all arrivals, and a table that lists all sales. Each row in the arrivals table represents one unit booked. Each row in the sales table, however, may refer to multiple booked units. There is also a calendar table in the model (referenced in the Occupancy measure below).

I can easily calculate cumulative sales, cumulative units sold, etc. either by the sale date or the occupancy date. I just cannot seem to wrap my head around how to create a dynamic calculation of cumulative occupancy rate for each unit type and each occupancy date based on the sales date. Other than creating some massive calculated table.

I did figure out a way to return the last date of sale for any unit that should have a “stop sale” imposed. But this is not helpful, because the date needs to be pegged not to the current occupancy rate but the point in time at which occupancy reached 90%. So the measure is really only telling me what I already know: that we have passed 90% occupancy.

Any ideas?

Here is the structure of the sales table

Order ID	Totale Conto	Data Acquisizione
1111B	£500.	14.04.23
2222A	£500.	13.04.23
3333C	£1,000.	12.04.23
4444D	£500.	04.04.23
5555E	£1,000.	04.04.23

Here is the basic structure of the arrivals table:

Arrivo	Partenza	ID Risorsa	Order ID
01.08.23 00:00	08.08.23	AA	1111B
01.07.23 00:00	14.07.23	AA	2222A
02.07.23 00:00	13.07.23	BB	3333C
01.07.23 00:00	08.07.23	BB	4444D
05.08.23 00:00	12.08.23	CC	5555E
02.07.23 00:00	13.07.23	BB	3333C
05.08.23 00:00	12.08.23	CC	5555E

Here is the Occupancy Measure

Occupancy = 
IF(ISBLANK(
Sumx(
    'Calendar', 
Calculate(
    COUNTROWS('Lista arrivi'),
FILTER(
    'Lista arrivi', 
'Calendar'[Date] >= 'Lista arrivi'[Arrivo] && 
'Calendar'[Date] < 'Lista arrivi'[Partenza])))), 0,
Sumx(
    'Calendar', 
Calculate(
    COUNTROWS('Lista arrivi'),
FILTER(
    'Lista arrivi', 
'Calendar'[Date] >= 'Lista arrivi'[Arrivo] && 
'Calendar'[Date] < 'Lista arrivi'[Partenza])))
)

Here is the measure that calculates Occupancy Rate

Occupancy % = DIVIDE(DIVIDE([Occupancy], [Units]), COUNTROWS('Calendar'))

Here is the (useless) measure that returns the last date of sale on record for a given occupancy date.

Stop Sale Date = 

VAR MaxDate = MAXX('Calendar', 'Calendar'[Date])

VAR ORDER_IDS = VALUES('Lista arrivi'[Order ID])

VAR Tab1 = ADDCOLUMNS(SELECTCOLUMNS('Calendar', 
"Date", 'Calendar'[Date]), "Occupancy", [Occupancy %])

VAR Tab2 = ADDCOLUMNS(Tab1, 
"Stop Sale", IF([Occupancy] > 0.895, 1, 0))

VAR Tab3 = ADDCOLUMNS(Tab2, 
"STOP DATE", CALCULATE(MAX(Bookings[Data Acquisizione]), 
    FILTER(ALLSELECTED(Bookings), 
    Bookings[Data Acquisizione] <= MaxDate
    && CONTAINS(ORDER_IDS,'Lista arrivi'[Order ID], Bookings[Order ID]))
    )
    )

VAR OCCUP = AVERAGEX(Tab1, [Occupancy %])

VAR STOP_SALE = AVERAGEX(Tab2, [Stop Sale])

RETURN IF([Occupancy %] < 0.895, "NO", MAXX(Tab3, [STOP DATE]))

And this is what the visulizations need to look like (left table is occupancy rates and right side is supposed to report the date on which 90% was reached, but currently reports the last day on which the unit type, aka "resource", was sold):

MJEnnis · ‎04-19-2023

SSD = 

VAR d = MAX('Calendar'[Date])
VAR u = SUM('Numero e ID risorse'[Numero risorse])

VAR T1 = SELECTCOLUMNS('Lista Arrivi',
            "ID", 'Lista Arrivi'[Order ID],
            "A", 'Lista Arrivi'[Arrivo],
            "P", 'Lista Arrivi'[Partenza]
            )

VAR T2 = ADDCOLUMNS(T1,
        "SD", LOOKUPVALUE(Bookings[DataAcquisizione], Bookings[Order ID], [ID])
            )  

VAR T3 = SELECTCOLUMNS('Calendar',
            "D", 'Calendar'[Date]
            )

VAR T4 = CROSSJOIN(T2, T3)

VAR T5 = FILTER(T4, [D] >= [A] && [D] < [P])

VAR T6 = SUMMARIZE(T5, [SD], [D], 
        "OCC", COUNTROWS(FILTER(T5, [D] = d && [SD] = EARLIER([SD]))),
        "U", u
        )

VAR T7 = ADDCOLUMNS(T6, "OCC%", DIVIDE([OCC], [U]))

VAR T8 = SUMMARIZE(T7, [SD], [D],
        "T_OCC%", SUMX(FILTER(T7, [D] = d), [OCC%])
        )

VAR T9 = FILTER(T8, [T_OCC%] >= 0.895 && [D] = d)

VAR MIN_OCC = MINX(T9, [T_OCC%])

VAR T10 = FILTER(T9, [T_OCC%] = MIN_OCC)

VAR SSD = MAXX(T10, [SD])

RETURN (IF(ISBLANK(SSD), "NO", SSD))

View solution in original post

MJEnnis · ‎04-19-2023

SSD = 

VAR d = MAX('Calendar'[Date])
VAR u = SUM('Numero e ID risorse'[Numero risorse])

VAR T1 = SELECTCOLUMNS('Lista Arrivi',
            "ID", 'Lista Arrivi'[Order ID],
            "A", 'Lista Arrivi'[Arrivo],
            "P", 'Lista Arrivi'[Partenza]
            )

VAR T2 = ADDCOLUMNS(T1,
        "SD", LOOKUPVALUE(Bookings[DataAcquisizione], Bookings[Order ID], [ID])
            )  

VAR T3 = SELECTCOLUMNS('Calendar',
            "D", 'Calendar'[Date]
            )

VAR T4 = CROSSJOIN(T2, T3)

VAR T5 = FILTER(T4, [D] >= [A] && [D] < [P])

VAR T6 = SUMMARIZE(T5, [SD], [D], 
        "OCC", COUNTROWS(FILTER(T5, [D] = d && [SD] = EARLIER([SD]))),
        "U", u
        )

VAR T7 = ADDCOLUMNS(T6, "OCC%", DIVIDE([OCC], [U]))

VAR T8 = SUMMARIZE(T7, [SD], [D],
        "T_OCC%", SUMX(FILTER(T7, [D] = d), [OCC%])
        )

VAR T9 = FILTER(T8, [T_OCC%] >= 0.895 && [D] = d)

VAR MIN_OCC = MINX(T9, [T_OCC%])

VAR T10 = FILTER(T9, [T_OCC%] = MIN_OCC)

VAR SSD = MAXX(T10, [SD])

RETURN (IF(ISBLANK(SSD), "NO", SSD))

MJEnnis · ‎04-17-2023

@amitchandak and @lbendlin, I see that both of you have tackled problems with occupancy rates before. Do you have any hints? (I am really stuck on this one...) Thanks!

lbendlin · ‎04-17-2023

BB and CC resources look like they are overbooked? Not clear where you get your percentage from.

MJEnnis · ‎04-17-2023

@lbendlin,

BB and CC are dummy data.

Maybe this will help...

To calculate the occupancy on any given day, I first calculate the number of units available:

Units = Sum('Numero e ID risorse'[Numero risorse])

And then I calculate the number of units occupied on a given date (a unit is occupied starting on the arrival date, but is no longer occupied on the departure date):

Occupancy = 

VAR OCC =
Sumx(
    'Calendar', 
Calculate(
    COUNTROWS('Lista arrivi'),
FILTER(
    'Lista arrivi', 
'Calendar'[Date] >= 'Lista arrivi'[Arrivo] && 
'Calendar'[Date] < 'Lista arrivi'[Partenza])))
)

RETURN IF(ISBLANK(OCC), 0, OCC)

Finally, I calculate the rate by dividing the units occupied by the units available, and then dividing this by the number of days that occur during the selected/filtered period:

Occupancy % = DIVIDE(DIVIDE([Occupancy], [Units]), COUNTROWS('Calendar'))

This works perfectly. The matrix on the left above has the first day of each week on the rows, and the day of the week on the columns. The values are simply the measure [Occupancy %]. Not pictured are the slicers which filter the date range and select unit types.

But what I need to do now is not calculate the Occupancy for each date, but return the date of sale on which occupancy reached 90% for each date. And I need to do this in such a way that the date slicers and unit type slicers work on that calculation as well.

The visualization on the right uses the other measure above ([Stop Sale Date]) to return the last date of sale on record before the last refresh, in the event that we are already past the 90% threshold. It works with the filters. But what I need is the sale date on which each occupancy reached 90%. So I need to calculate the occupancy rate in such a way that it not only considers arrival and departure date, but also the sale date... I am completely stuck....

lbendlin · ‎04-17-2023

Cannot help you if you cannot provide usable sample data that completely covers your scenario. I also have an opinion or two on the data model but if it works for you then these opinions don't matter.

MJEnnis · ‎04-18-2023

@lbendlin Here is a mock up of what I have right now.

The visualization on the left is correct. The one on the right is not even halfway there.

I want the second visulization to display the sale date ('Bookings'[DataAcquisizione]) on which the occupancy date indicated by the matrix reached an occupancy of 90%. I want the filters for "Risorsa" and the date range to work on this claculation as well.

Let me know if that does not make sense...

Thanks for any help!

lbendlin · ‎04-18-2023

Thank you for the sample data. It is not granular enough to show the 90% step-over ( goes from 80 or 83 directly to 100). I changed the criteria to 80% to have something to play with. Next you will need to decide if that filter should be applied across resources or by resource.

(Note: I made some minor modifications to simplify your formulas)

See attached.

MJEnnis · ‎04-18-2023

@lbendlin ,

1) Your visual is not loading for me. If I remove the resource ID from the column field, it loads. But if I drop it back in I get an error and the visual doesn't load. It is strange, because the resouce IDs can still be used in a slicer... I haven't update PowerBi in a while, so maybe something changed with how columns work in matrices or something in the meantime...

2) You have definitely simplified the forumlas. And thanks a lot for that! However, this is merely a simplified version of what I was already calculating: basic occupancy rate for all occupancy dates. Your visualization reports the same results as mine, simply putting the unit types in the matrix rather than in a slicer. What I need is more complex. For each occupancy date in question I need to know the sale date at which 90% occupancy was reached, filterable/segementable by unit type.

Maybe a more detailed explanation of what this is used for will help. Let's say I am looking at all bookings made as of today (18 April 2023) for the month of August. I need to be able to determine which units I should stop selling today via third parties (e.g., Booking.com) for all days in the month of August. If August 1 is at 80% capacity for a particular unit, I can keep selling it for that date. If August 2 is at 90% capacity for the same unit, then I should impose a "stop sale" as of today. Your visualization and mine can help me make these types of decisions. However, if August 3 is at 95% capacity, then I should have imposed a "stop sale" before today. What I want to know is the date on which 90% was reached for every day in August, for example, filterable/segementable by unit type.

What I want in the visualization is not the current occupancy rate for past, present and future occupancy dates. I want to see the sale date (typically before the present) at which 90% was reached. To do this, the formula has to somehow also consider [DataAcquisizione] from the Bookings table.

What I thought about doing was creating a table var which contains all possible combinations of sales dates and occupancy dates in two vectors and then adds the occupancy and/or occupancy % measure to this as columns. I could probably figure out this solution with some trial and error, but I do not think it would be very efficient. So I am hoping there is a much simpler solution...

Really sorry for any confusion!

MJEnnis · ‎04-18-2023

This almost does the trick, but the occupancy % is off...

STOP SALE DATE_ = 

var d = max('Calendar'[Date])

VAR T1 = DISTINCT(SELECTCOLUMNS(Bookings,
        "SD", Bookings[DataAcquisizione]
))

VAR T2 = FILTER(DISTINCT(SELECTCOLUMNS('Calendar',
        "D", 'Calendar'[Date])),
        [D] = d)

VAR T3 = CROSSJOIN(T1, T2)

VAR T4 = ADDCOLUMNS(T3, 
"OCC", DIVIDE([Occupancy], [Units],0)
)

VAR STOP_DATE = MINX(FILTER(T4, [OCC] >= 0.895), [SD])

RETURN MAXX(T4, [OCC])

MJEnnis · ‎04-18-2023

Seems to work if you embed the other measures as vars...

STOP SALE DATE_ = 

var d = max('Calendar'[Date])

VAR O = COALESCE(Sumx('Calendar', Calculate(COUNTROWS('Lista arrivi'),'Lista arrivi'[Arrivo]<=d , 'Lista arrivi'[Partenza]>d)), 0)

VAR OP = DIVIDE([Occupancy], [Units],0)

VAR T1 = DISTINCT(SELECTCOLUMNS(Bookings,
        "SD", Bookings[DataAcquisizione]
))

VAR T2 = FILTER(DISTINCT(SELECTCOLUMNS('Calendar',
        "D", 'Calendar'[Date])),
        [D] = d)

VAR T3 = CROSSJOIN(T1, T2)

VAR T4 = ADDCOLUMNS(T3, 
"OCC", OP
)

VAR STOP_DATE = MINX(FILTER(T4, [OCC] >= 0.895), [SD])

RETURN STOP_DATE

Can you simplify, @lbendlin?

lbendlin · ‎04-18-2023

instead of DISTINCT(SELECTCOLUMNS() ) you should be able to use VALUES()

lbendlin · ‎04-18-2023

I think it might help to add the expected arrival and departure dates to the bookings table. That would simplify the calculations (at the cost of potentially duplicating these columns in both tables, unless you differentiate between estimated and actual).

MJEnnis · ‎04-18-2023

@lbendlin

So... the last measure above was not working. There were two problems: 1) the occupancy rate was still calculated only based on the occupancy date (without considering the sales date), 2) it was returning the first sales date on record (not the date at which 90% was achieved).

So I have gone at it with the complex table var strategy and this is what I have come up with. I have crossjoined all combinations of sales dates and occupancy dates in the data set. And then I calculated a cumulative occupancy rate for each occupancy date so that I could look up the first sales date on which the threshold was reached.

It works!

And it does not seem to be slowing things down working with a very small sample of dummy data. But I am concerned when I apply it to the real data, especially when additional years of data and other tables and measures are added to the model.

Do you have any suggestions on how to improve the efficiency?

SSD = 
VAR d = MAX('Calendar'[Date])

VAR T1 = SELECTCOLUMNS('Lista Arrivi',
            "ID", 'Lista Arrivi'[Order ID],
            "A", 'Lista Arrivi'[Arrivo],
            "P", 'Lista Arrivi'[Partenza]
            )

VAR T2 = ADDCOLUMNS(T1,
        "SD", LOOKUPVALUE(Bookings[DataAcquisizione], Bookings[Order ID], [ID])
            )  

VAR T3 = SELECTCOLUMNS('Calendar',
            "D", 'Calendar'[Date]
            )

VAR T4 = CROSSJOIN(T2, T3)

VAR T5 = FILTER(T4, [D] >= [A] && [D] < [P])

VAR T6 = SUMMARIZE(T5, [SD], [D], 
        "OCC", COUNTROWS(FILTER(T5, [D] = EARLIER([D]) && [SD] = EARLIER([SD]))),
        "U", SUMX('Numero e ID risorse', 'Numero e ID risorse'[Numero risorse])
        )

VAR T7 = ADDCOLUMNS(T6, "OCC%", DIVIDE([OCC], [U]))

VAR T8 = SUMMARIZE(T7, [SD], [D],
        "T_OCC%", SUMX(FILTER(T7, [D] = EARLIER([D])), [OCC%])
        )

VAR SSD = MINX(FILTER(T8, [T_OCC%] >= 0.895 && [D] = d), [SD])

RETURN (IF(ISBLANK(SSD), "NO", SSD))

lbendlin · ‎04-18-2023

Looks ok-ish. "EARLIER" is a sign of weakness 🙂 Use variables instead.

At this point you can whip out DAX Studio to look at the query plan and decide if it is performant enough. (Check for excessive number of records)

MJEnnis · ‎04-19-2023

I can replace EARLIER([D]) with the VAR d and it works fine, because the visualization segements the data by individual dates.

However, I do not know of way to store [SD] in the current row context as a VAR, as it would have to somehow come from the T5 table VAR.

SSD = 

VAR d = MAX('Calendar'[Date])

VAR T1 = SELECTCOLUMNS('Lista Arrivi',
            "ID", 'Lista Arrivi'[Order ID],
            "A", 'Lista Arrivi'[Arrivo],
            "P", 'Lista Arrivi'[Partenza]
            )

VAR T2 = ADDCOLUMNS(T1,
        "SD", LOOKUPVALUE(Bookings[DataAcquisizione], Bookings[Order ID], [ID])
            )  

VAR T3 = SELECTCOLUMNS('Calendar',
            "D", 'Calendar'[Date]
            )

VAR T4 = CROSSJOIN(T2, T3)

VAR T5 = FILTER(T4, [D] >= [A] && [D] < [P])

VAR T6 = SUMMARIZE(T5, [SD], [D], 
        "OCC", COUNTROWS(FILTER(T5, [D] = d && [SD] = EARLIER([SD]))),
        "U", SUMX('Numero e ID risorse', 'Numero e ID risorse'[Numero risorse])
        )

VAR T7 = ADDCOLUMNS(T6, "OCC%", DIVIDE([OCC], [U]))

VAR T8 = SUMMARIZE(T7, [SD], [D],
        "T_OCC%", SUMX(FILTER(T7, [D] = d), [OCC%])
        )

VAR SSD = MINX(FILTER(T8, [T_OCC%] >= 0.895 && [D] = d), [SD])

RETURN (IF(ISBLANK(SSD), "NO", SSD))

MJEnnis · ‎04-18-2023

The bookings table does have the same "arrivo" and "partenza" columns in the real file. I left them out of the mock up to simplify.

Is there a way to simplify my tab var solution considering this? (note that the columns have the same name in 'Bookings'.

PS, my code above is simply returning the first date of sale... I guess it needs some tweaking.

MJEnnis · ‎04-18-2023

Thanks a lot @lbendlin!

I am going to check this out and get back to you!

MJEnnis · ‎04-17-2023

I cannot share the raw data of my wife's company. But if no one answers by tomorrow my time (Central Europe), then I might find time to prepare a PBI file that is representative.

In the meantime, I would be interested in any suggestions regarding the model. I have done quite a bit of work with measures and visualizations and would not want to change anything that affects those drastically. But I still quite early in this project and I am already planning on optimizing a few things... I never mind tips from gurus!

MJEnnis · ‎04-17-2023

I forgot the measure that calculates the total units available, based on another table.

Units = Sum('Numero e ID risorse'[Numero risorse])

Caculating the sales date at which a certain occupancy rate was reached for each occupancy date

Helpful resources

Join us at the Microsoft Fabric Community Conference

We want your feedback!

Microsoft Fabric Community Conference 2025

A Year in Review - December 2024

Join us at the 2025 Microsoft Fabric Community Conference