The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hello,
I have a customer table with multiple status that relate to a date range. I've used these date ranges to create and a unique reference for each customer&status.
i.e
Customer - A, Status - Green, startdate - 01/01/2022 enddate - 25/02/2022 = unique reference AGreen
Customer - A, Status - Blue, startdate - 05/04/2022 enddate - 30/04/2022 = unique reference ABlue
My other table contains orders with various start and end dates.
i.e
Customer A, Order 9875, start date - 04/01/2022, order end - 06/01/2022
Customer A, Order 9917, start date - 07/04/2022, order end - 05/05/2022
I need to add the unique customer&status reference to the order table where the order was 'live' at any point on or within the start and end dates of the customer status.
desired result
Order 9875 unique reference = AGreen
Order 9917 unique reference = ABlue
Some orders may still be live (holding null in enddate) and some customer status end dates maybe null also
I've tried merging queries but I can't figure out the best way to use a custom column / calculated column to look up the correct value.
Can anyone point me in the right direction please?
Many thanks
H
Solved! Go to Solution.
Another option is to start with the Orders table, merge in the Customer table (matching on the Customer column), expand the date and unique reference columns, and then filter out the rows that don't satisfy the time overlap.
Here's some example code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUbK0MDcFUib6hvpGBkZGQKYZjBmrA1ViaWgOpMz1TWBKTPVNoUpiAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t, Order = _t, #"Start Date" = _t, #"Order End" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Order", Int64.Type}, {"Start Date", type date}, {"Order End", type date}}, "en-IN"),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Customer"}, Customers, {"Customer"}, "Customers", JoinKind.LeftOuter),
#"Expanded Customers" = Table.ExpandTableColumn(#"Merged Queries", "Customers", {"Start Date", "End Date", "Unique Reference"}, {"Customers.Start Date", "Customers.End Date", "Customers.Unique Reference"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Customers", each ([Customers.Start Date] <= [Order End] and [Customers.End Date] >= [Start Date]))
in
#"Filtered Rows"
Another option is to start with the Orders table, merge in the Customer table (matching on the Customer column), expand the date and unique reference columns, and then filter out the rows that don't satisfy the time overlap.
Here's some example code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUbK0MDcFUib6hvpGBkZGQKYZjBmrA1ViaWgOpMz1TWBKTPVNoUpiAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t, Order = _t, #"Start Date" = _t, #"Order End" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Order", Int64.Type}, {"Start Date", type date}, {"Order End", type date}}, "en-IN"),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Customer"}, Customers, {"Customer"}, "Customers", JoinKind.LeftOuter),
#"Expanded Customers" = Table.ExpandTableColumn(#"Merged Queries", "Customers", {"Start Date", "End Date", "Unique Reference"}, {"Customers.Start Date", "Customers.End Date", "Customers.Unique Reference"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Customers", each ([Customers.Start Date] <= [Order End] and [Customers.End Date] >= [Start Date]))
in
#"Filtered Rows"
Hi, Thank you for this. I have managed to implement this into my model. Just checking through the data to see that it's all doing as expected but so far looking great.
thanks again.
H
Hi @Anonymous
You can do this in Power Query my modifying the tables. You might create tables that aren't loaded but are just used in a merge for this purpose. My file is here - I did this in Excel for simplicity's sake.
What I did:
In each table I created a DateRange field that is a list of the dates from the StartDate to EndDate. It is actually numeric, but you convert to dates in a second:
You then hit the double-arrows in the upper right of that new column, Expand into Rows, then change the type to date. You'll get this:
Do that with both tables, then from the order table, merge it with the customer table as shown, using 2 columns:
Do not expand the merged "Customers" column. Instead, add a custom column with this formula:
if Table.RowCount([Customers]) > 0 then [Customers][Status]{0} else null
If the merged Customers column has more than 1 record, this will get the Status value of the first record, which is Green in this case.
At this point, keep only the columns you want (get rid of the Customers and DateRange for example), then use Table.Distinct (remove duplicates on the whole table) and you get this:
The order table can be brought in as is. The Customer table would need a copy of the original before all of this work. You could duplicate it and remove all of the steps I added, or create a reference to it to create this expanded table, etc. Many ways to approach it.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingHi,
thanks so much for your reply. I am trying to replicate this, but adding the first column I get
Expression.Error: The name 'Changed Type' wasn't recognized. Make sure it's spelled correctly
any idea why this is?
#"Changed Type" refers to the previous step. Change it to whatever the name of your previous step is.
Hi @Anonymous ,
I would advise doing this in DAX. Power Query can do this, but performance will be awful compared to DAX.
In DAX you would create a new calculated column in your Orders table, something like this:
..customerUniqueReference =
CALCULATE(
VAR __customer = VALUES(ordersTable[Customer])
VAR __orderStart = VALUES(ordersTable[StartDate])
VAR __orderEnd = VALUES(ordersTable[EndDate])
RETURN
MAXX(
FILTER(
customersTable,
customersTable[Customer] = __customer
&& customersTable[StartDate] <= __orderEnd
&& customersTable[EndDate] >= __orderStart
),
customersTable[UniqueReference]
)
)
You may want to fiddle wth the '>=' and '<=' bits to get the exact matching behaviour that you need, but this basic structure used in DAX should be super-fast.
Pete
Proud to be a Datanaut!
Thank you for your reply, its good to know there are many ways to achieve the desired result.
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.