Solved: Steps for Creating Multiple Step Joins on Differen...

MarkDGaal · ‎05-11-2016

I have two tables, Table 1; always has the full detail:

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours
WAC      | Plane      |Civil       | Recon      | Air        |  1000
WAC      | Plane      | Military   | Patrol     | Land       |  1000
WAC      | Plane      | Civil      |Survey      |Water       |  1000

Table 2 sometime has the full detail and sometimes it does not and never has the "Hours" field:

ID#| Location | Category 1 | Category 2 | Category 3 | Category 4
001| WAC      | Plane      | Civil      | Recon      | Air            
002| WAC      | Plane      | Civil      | Recon      | Air                    
003| WAC      | Plane      | Military   | Patrol     |            
004| WAC      | Plane      | Civil      |            |

*at minimum Table2 always has a match to Table1 based on Location and Category1

I want to create a multiple step join in which the first time I merge Table1 & Table2 it uses Location & Category 1-4 to bring over a count of matching ID#'s as well as the ID# itself. Then, the next time I only use Location & Category 1-3. Followed by Location & Category 1-2.... etc For a final output that looks like:

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours | CountMatchingTable2.ID#| Table2.ID#
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 2                      | 001 (match on Loc & C1-4)
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 2                      | 002 (match on Loc & C1-4) 
WAC      | Plane      | Military   | Patrol     | Land       |  1000 | 1                      | 003 (match on Loc & C1-3) 
WAC      | Plane      | Civil      |Survey      |Water       |  1000 | 1                      | 004 (match on Loc & C1-2)

Thus far I've created Table2 Grouped by Location & Category 1-4, Table 2 grouped by Location & Category 1-3, Table2 Grouped by Location & Category 1-2, etc... Then, I merge (left outer join) Table1 with Table2 on Location and Category1-4. This duplicates the first record in Table1 and ascribes ID# 001 and ID#002 to the records. The 3rd and 4th Records in Table1 are null because Category 4 and 3 do not match respectively.

Now my problem is I would like to only do the remaining (progressively less detailed) joins on the unmatched records in Table1. However, I can't figure out how to do that exclusively on the unmatched records and when I try to merge Table1 again with Table2 on Location & Category1-3 I duplicate the already matched (1st and 2nd records) from Table1. The end goal here is to divide Hours by the number of matching records in Table2 and then create a relationship back to Table2 using ID#.

Also if there is a way to do this without having 4 separate queries for Table2 grouped 4 different ways kindly let me know. Thanks in advance.

ImkeF · ‎05-11-2016

Somehow this request feels a bit strange, but anyway - here we go:

// Table2
let
    Source = YourTable2,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}})
in
    #"Changed Type"

// Table1
let
    Source = YourTable1,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Hours", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    MatchCat4 = Table.NestedJoin(#"Added Index",{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"Cat4",JoinKind.LeftOuter),
    StrikesCat4 = Table.AggregateTableColumn(MatchCat4, "Cat4", {{"First", List.Last, "Cat4"}}),
    MatchCat3 = Table.NestedJoin(StrikesCat4,{"Location", "Category 1", "Category 2", "Category 3"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat3 = Table.AggregateTableColumn(MatchCat3, "NewColumn", {{"First", List.Last, "Cat3"}}),
    MatchCat2 = Table.NestedJoin(StrikesCat3,{"Location", "Category 1", "Category 2"},DistinctCats,{"Location", "Category 1", "Category 2"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat2 = Table.AggregateTableColumn(MatchCat2, "NewColumn", {{"First", List.Last, "Cat2"}}),
    HighestMatchLevel = Table.AddColumn(StrikesCat2, "HighestMatchLevel", each List.Min({[Cat4],[Cat3],[Cat2]}))
in
    HighestMatchLevel

// DistinctCats
let
    Source = Table2,
    #"Grouped Rows" = Table.Group(Source, {"Location", "Category 1", "Category 2", "Category 3", "Category 4"}, {{"First", each List.Min([ID]), type number}, {"CountIDs", each Table.RowCount(_), type number}})
in
    #"Grouped Rows"

// Result
let
    Source = Table2,
    MergeDistinctCats = Table.NestedJoin(Source,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"NewColumn",JoinKind.LeftOuter),
    ShowLookups = Table.ExpandTableColumn(MergeDistinctCats, "NewColumn", {"First", "CountIDs", "AllIDs"}, {"First", "CountIDs", "AllIDs"}),
    MergeTable1OnID = Table.NestedJoin(ShowLookups,{"First"},Table1,{"HighestMatchLevel"},"NewColumn",JoinKind.LeftOuter),
    ShowTotalHours = Table.ExpandTableColumn(MergeTable1OnID, "NewColumn", {"Hours"}, {"TotalHours"}),
    CalculateHoursPerID = Table.AddColumn(ShowTotalHours, "HoursPerID", each [TotalHours]/[CountIDs])
in
    CalculateHoursPerID

The query-names are given as //... at the beginning of each query

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

ImkeF · ‎05-11-2016

Somehow this request feels a bit strange, but anyway - here we go:

// Table2
let
    Source = YourTable2,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}})
in
    #"Changed Type"

// Table1
let
    Source = YourTable1,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Hours", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    MatchCat4 = Table.NestedJoin(#"Added Index",{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"Cat4",JoinKind.LeftOuter),
    StrikesCat4 = Table.AggregateTableColumn(MatchCat4, "Cat4", {{"First", List.Last, "Cat4"}}),
    MatchCat3 = Table.NestedJoin(StrikesCat4,{"Location", "Category 1", "Category 2", "Category 3"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat3 = Table.AggregateTableColumn(MatchCat3, "NewColumn", {{"First", List.Last, "Cat3"}}),
    MatchCat2 = Table.NestedJoin(StrikesCat3,{"Location", "Category 1", "Category 2"},DistinctCats,{"Location", "Category 1", "Category 2"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat2 = Table.AggregateTableColumn(MatchCat2, "NewColumn", {{"First", List.Last, "Cat2"}}),
    HighestMatchLevel = Table.AddColumn(StrikesCat2, "HighestMatchLevel", each List.Min({[Cat4],[Cat3],[Cat2]}))
in
    HighestMatchLevel

// DistinctCats
let
    Source = Table2,
    #"Grouped Rows" = Table.Group(Source, {"Location", "Category 1", "Category 2", "Category 3", "Category 4"}, {{"First", each List.Min([ID]), type number}, {"CountIDs", each Table.RowCount(_), type number}})
in
    #"Grouped Rows"

// Result
let
    Source = Table2,
    MergeDistinctCats = Table.NestedJoin(Source,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"NewColumn",JoinKind.LeftOuter),
    ShowLookups = Table.ExpandTableColumn(MergeDistinctCats, "NewColumn", {"First", "CountIDs", "AllIDs"}, {"First", "CountIDs", "AllIDs"}),
    MergeTable1OnID = Table.NestedJoin(ShowLookups,{"First"},Table1,{"HighestMatchLevel"},"NewColumn",JoinKind.LeftOuter),
    ShowTotalHours = Table.ExpandTableColumn(MergeTable1OnID, "NewColumn", {"Hours"}, {"TotalHours"}),
    CalculateHoursPerID = Table.AddColumn(ShowTotalHours, "HoursPerID", each [TotalHours]/[CountIDs])
in
    CalculateHoursPerID

The query-names are given as //... at the beginning of each query

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-12-2016

Hummm, okay I've gotten through the code and I think we have a bit of a misunderstanding, let me illustrate with my original tables:

Table1 (i've added an Index as you did in your code):

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours |Index
WAC      | Plane      |Civil       | Recon      | Air        |  1000 |1
WAC      | Plane      | Military   | Patrol     | Land       |  1000 |2
WAC      | Plane      | Civil      |Survey      |Water       |  1000 |3

Table2 from my original post:

ID#| Location | Category 1 | Category 2 | Category 3 | Category 4
001| WAC      | Plane      | Civil      | Recon      | Air            
002| WAC      | Plane      | Civil      | Recon      | Air                    
003| WAC      | Plane      | Military   | Patrol     |            
004| WAC      | Plane      | Civil      |            |

When I do the Matching on my most detialed level ID# 001 AND 002 both match with Index #1.

Accordingly in my final output table (Index added):

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours | Index| CountMatchingTable2.ID#| Table2.ID#
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 1    | 2                      | 001 (match on Loc & C1-4)
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 1    | 2                      | 002 (match on Loc & C1-4) 
WAC      | Plane      | Military   | Patrol     | Land       |  1000 | 2    | 1                      | 003 (match on Loc & C1-3) 
WAC      | Plane      | Civil      |Survey      |Water       |  1000 | 3    | 1                      | 004 (match on Loc & C1-2)

Index 1 has duplicated and 1000 will be divided by 2 matching ID's splitting 500 hours to ID#001 and ID#002. I believe by using "List.Last" in your "Strikes" this sort of duplication will not be possible.....

I hear you on this feeling odd, but really all that I am trying to accomplish is making sure that my Hours make it back to the ID#'s at the most detailed levels possible and then not have the hours ever double-counted. I'm open to a different approach if you can think of one...

ImkeF · ‎05-12-2016

Hi Mark,

please have a look at my code again. It should do exactly what you've requested (at least at this point): The result-query starts with query 2 (which is in your example the table with the ID's and incomplete Category-allocations).

So please check your table names and the according references in your code.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-13-2016

Hmmm... no it's basically not allowing ID# 002 to match back with Table1 only ID#001....

Could it be the List.Min in the DistinctCats query? Wouldn't that always just take ID#001 rather than both ID 001 and 002 when the count of IDs is > 1?

ImkeF · ‎05-13-2016

This is the result I got:

Please check if this is your desired result.

If that’s your desired result, please post your query code so that I can have a look to check.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-16-2016

Okay, phew finally got my head around this, thanks ImkeF you are correct the code handles the duplication correctly! Thanks for all of your help.

MarkDGaal · ‎05-12-2016

Seems I've hit a snag in the Results query....

That query refers to an "ALLIDS" field that is not present in the //Distincats query did you mean to add an "All Rows" Aggregation to the Group By?

ImkeF · ‎05-12-2016

Sorry, yes, I did that and later removed it. Just throw it out:

ShowLookups = Table.ExpandTableColumn(MergeDistinctCats, "NewColumn", {"First", "CountIDs"}, {"First", "CountIDs"}),

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

ImkeF · ‎05-12-2016

I've changed the code for the first 2 queries. Now the first "Source"-steps reference the 2 tables which should be provided ("YourTable1" and "YourTable2")

So if these tables already exist in your model the easiest way would be to rename them to: "YourTable1" and "YourTable2".

That way these queries will work instantly.

Yes, this will create some redundant code and queries. So if you feel confident enough, just clean up afterwards 🙂

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-12-2016

@ImkeF Thank you, I'm starting through your code now (I like to learn so I'm doing it step-by-step). It appears that at the first Aggregation of the most detailed level this step takes 4-5 mins of processing time (2cores/4threads at 50% utilization):

#"StrikesCat4" = Table.AggregateTableColumn(#"MatchCat4", "NewColumn", {{"First", List.Last, "Cat4IDMatch"__li__ng_escape____li__ng_escape____li__ng_escape__}__li__ng_escape____li__ng_escape____li__ng_escape__})

Table2 is 493 rows and Table1 is 1478 rows (after Grouping, source is 170,000rows)... should it be that processor intensive? I should also add that ID from Table2 is not an integer field but rather a combination of letters and numbers. Also, what does "List.Last" do as that doesn't appear to be a function I can do through the Aggregation GUI?

EDIT: Table2 Source is a SharePoint Library Table1 Source is an Excel Doc.

EDIT2: Note Saving the Excel Doc source for Table1 already Grouped (1478rows vs 170,000) did not improve the processor time for Matching&Striking in Table1

ImkeF · ‎05-12-2016

Sorry, didn't pay attention to the performance at all. This is about finding the right place for the List.Buffer or Table.Buffer basically.

Try it here:

// DistinctCats
let
    Source = Table2,
    #"Grouped Rows" = Table.Buffer(Table.Group(Source, {"Location", "Category 1", "Category 2", "Category 3", "Category 4"}, {{"First", each List.Min([ID]), type number}, {"CountIDs", each Table.RowCount(_), type number}}))
in
    #"Grouped Rows"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-12-2016

Well that helped! All the Matchs&Strikes run in 5secs now as opposed to 50mins. Getting to the Results Table now should be done shortly, will let you know if I have any issues.

Steps for Creating Multiple Step Joins on Different Keys

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly

Party with Power BI’s own Guy in a Cube

Steps for Creating Multiple Step Joins on Different Keys

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly