Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
I need to join some data (in a method I don't entirely agree with but I don't see any other option based on my dataset) and although I have a solution it takes an age to load due to 'many to many' relationships and large datasets so I want to find a better/more efficient method and it seems to be the merge that causes the issue so that seems a logical place to start with the optimisation....
Scenario:
Invoice Table and Job Table, standard practice is to join the 2 via Job number so we can compare the Job Date and Invoice Date. However if there is no Job Number then we want to join by Item/Product then find the last Job Date <= Invoice Date and use that for comparison.
Invoice Table Sample
Job Table Sample
Desired Result
Achieved using the basic/slow/memory heavy method of:
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Invoice Number", Int64.Type}, {"InvoiceDate", type date}, {"Item", type text}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Item"}, Job, {"Prod"}, "Job", JoinKind.LeftOuter),
#"Expanded Job" = Table.ExpandTableColumn(#"Merged Queries", "Job", {"JobDate"}, {"JobDate"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Job", each [JobDate] <= [InvoiceDate]),
#"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"JobDate", Order.Descending}}),
#"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Invoice Number"})
in
#"Removed Duplicates"
So I found a way to merge only the Job Date <= Invoice Date using the following...
= Table.AddColumn(#"Changed Type", "Custom", (I)=> Table.SelectRows(Job,(J)=> J[Prod] = I[Item] and J[JobDate] <= I[InvoiceDate]))
and once the above is expanded I can return only a single line per Invoice with the latest Job Date
= Table.SelectRows(#"Changed Type1", each [JobDate] = List.Max(let latest = [Item] in Table.SelectRows(#"Changed Type1", each [JobDate] <= [InvoiceDate] and [Item] = latest)[JobDate]))
Full Advanced Code
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Invoice Number", Int64.Type}, {"InvoiceDate", type date}, {"Item", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", (I)=> Table.SelectRows(Job,(J)=> J[Prod] = I[Item] and J[JobDate] <= I[InvoiceDate])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"JobDate"}, {"JobDate"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Custom",{{"JobDate", type date}}),
#"Added Custom1" = Table.SelectRows(#"Changed Type1", each [JobDate] = List.Max(let latest = [Item] in Table.SelectRows(#"Changed Type1", each [JobDate] <= [InvoiceDate] and [Item] = latest)[JobDate]))
in
#"Added Custom1"
However what I can't do is get both these steps into one and stop the additional lines being created during the merge (ok yes its reduced slightly by only pulling those line with an Job Date <= Invoice Date but still we are talking millions of additional matches).
Any advice on this would be appreciated as I'm all out of answers and can't find the same use case else where.
Thanks,
Solved! Go to Solution.
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
JobGroup=Table.Buffer(Table.Group(Job,"Prod",{"n",each List.Sort([JobDate],1)})),
Custom1=Table.AddColumn(Source,"JobDate",each List.Skip(JobGroup{[Prod=[Item]]}?[n]? ??{},(x)=>x>[InvoiceDate]){0}?)
in Custom1
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
JobGroup=Table.Buffer(Table.Group(Job,"Prod",{"n",each List.Sort([JobDate],1)})),
Custom1=Table.AddColumn(Source,"JobDate",each List.Skip(JobGroup{[Prod=[Item]]}?[n]? ??{},(x)=>x>[InvoiceDate]){0}?)
in Custom1
Absolute hero! Thank you very much now I just need to work out what its doing 😁
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
13 | |
13 | |
10 | |
8 | |
7 |
User | Count |
---|---|
17 | |
10 | |
7 | |
7 | |
6 |