Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
Anonymous
Not applicable

Modelling/Transforming rows with overlapping Start Dates and End Dates to determine client status

Hi all,

 

I have written my problem below. I hope with enough clarity. I would prefer a solution in Power Query/M, but I'll gladly settle for a solution in DAX.

 

The use case:

My organization manages "subscriptions" for health care products in various categories and I need to determine whether clients are NEW, RETURN or CONTINUED clients. A subscription is the period from Start Date up to and including the End Date for a particular product within a specific product category. A client is NEW when there (s)he has not had a previous subscription with the organization. A client is a RETURN customer when the date difference between the latest Start Date and the previous End Date is more than 31 days. A client is CONTINUED when this difference is less than 31 days. 

 

My issue:

A client can have subscriptions to multiple products within a specific product category with overlapping Start/End Dates. For example. A client can have a subscription for a electric bicycle from 1-1-2021 to 30-6-2021 and later have a concurrent subscription for a taxi service from 1-4-2021 to 31-12-2021. From a business perspective we consider this subscription to last from 1-1-2021 to 31-12-2021.

 

If these data issues didn't exist, I would Merge the table onto itself using the Index / Index - 1 trick, and then calculate the DateDiff between the previous End Date and the later Start Date. However, this does not work without smoothing over the above mentioned issue. 

 

Sample data

My data looks like this. Note that column 'Product' is irrelevant for my requirements but kept in for clarity (as to how the issue is introduced into the data). Column 'Serial#' is not native to the data, but a group index I added to join the table onto itself. Maybe somewhat unintuitively it indexes from the most recent subscription. This can be changed of course.

 

IDClient IDProdCat IDProd IDStart DateEnd Date Serial#
1007AA011-8-202231-12-20221
2007AA011-1-202231-7-20222
3007AA011-1-202130-6-20213
4007AA021-6-20201-3-20214
5008AA011-1-202231-12-20221
6008AA011-1-202131-12-20212
7008AA021-6-202030-9-20203

 

My issue is illustrated by the rows with ID 3 and 4. From a business perspective these rows are actually one single subscription starting 1-6-2020 (row 4) and ending 30-6-2021 (row 3). In this example there are two overlapping row. In practice there is theoretically no limit to the number of overlapping products.

 

Any idea/suggestions on how to achieve the transformation/modelling of these overlapping rows to show the subscription length of the overarching product category rather than the individual products?

 

To be sure, I ultimately need to determine whether clients are NEW, RETURN or CONTINUED based on Product Category subscriptions. So if there are alternative solutions, I'm open to those.

 

Thanks in advance!

1 ACCEPTED SOLUTION
v-kkf-msft
Community Support
Community Support

Hi @Anonymous ,

 

Please add these custom columns in Power Query Editor.

 

= List.Dates([Start Date], Duration.Days([End Date]-[Start Date])+1 ,#duration(1, 0, 0, 0))
= let CID = [Client ID], LDate = [List_Dates],
mylist = Table.SelectRows(#"Added Custom", each [Client ID] = CID and
List.ContainsAny([List_Dates], LDate))[#" Serial#"]
in Text.Combine(List.Transform(mylist, Text.From), ",")

vkkfmsft_3-1660015418321.png

 

Then calculate the maximum and minimum dates by grouping in [Client ID] and [Custom] columns.

 

vkkfmsft_1-1660015276636.pngvkkfmsft_2-1660015336309.png

 

If the problem is still not resolved, please provide detailed error information or the expected result you expect. Let me know immediately, looking forward to your reply.
Best Regards,
Winniz
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

View solution in original post

3 REPLIES 3
v-kkf-msft
Community Support
Community Support

Hi @Anonymous ,

 

Please add these custom columns in Power Query Editor.

 

= List.Dates([Start Date], Duration.Days([End Date]-[Start Date])+1 ,#duration(1, 0, 0, 0))
= let CID = [Client ID], LDate = [List_Dates],
mylist = Table.SelectRows(#"Added Custom", each [Client ID] = CID and
List.ContainsAny([List_Dates], LDate))[#" Serial#"]
in Text.Combine(List.Transform(mylist, Text.From), ",")

vkkfmsft_3-1660015418321.png

 

Then calculate the maximum and minimum dates by grouping in [Client ID] and [Custom] columns.

 

vkkfmsft_1-1660015276636.pngvkkfmsft_2-1660015336309.png

 

If the problem is still not resolved, please provide detailed error information or the expected result you expect. Let me know immediately, looking forward to your reply.
Best Regards,
Winniz
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Anonymous
Not applicable

Thanks @v-kkf-msft 

 

It solves the problem, so I accept your solution. I also learned a lot from it, so that's even better. It does wreak havoc on the data load/memory with my dataset, eventually loading 9GB of data on a 4mb source file, with calculation lasting almost a day. 

 

But that's a secondary concern. Primary concern has been solved. Thanks!

v-kkf-msft
Community Support
Community Support

Hi @Anonymous ,

 

I will try to solve it, please be patient.

Helpful resources

Announcements
Sept PBI Carousel

Power BI Monthly Update - September 2024

Check out the September 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

Sept NL Carousel

Fabric Community Update - September 2024

Find out what's new and trending in the Fabric Community.

Top Kudoed Authors