Re: Iterative matching using weighting of multiple...

BobBonner · ‎06-20-2018

I have to come up with an algorithm to find suitable matches for mentors and proteges. To simplify it (maybe), we have a table with employee id, role (mentor or protegee), 5 columns of answers to preferences (where the weighting will happen). In the columns for preferences, an employee has ranked his/her preferences 1 through 5. E.g. PreferenceTopic1 = 4, PreferenceTopic2 = 3, PreferenceTopic3 = 1, PreferenceTopic4 = 5, and PreferenceTopic5 = 2. If this person were a protege, a good mentor would have rank similarly. So, I have to iterate through combinations of employee ids who are proteges against the employee ids who are mentors and weight them on the shortest differences in preference topics. E.g. If possible mentor rates PT1 = 4, PT2 = 3, PT3 = 2, PT4 = 5, PT5 = 1, then they would be a really good match for the original example. There low number (good match) would be 0+0+1+0+1 = 2. I'm not worried about perfectly matching every mentor to a protege, but care about ranking them as "best fits". Not sure if this is best suited for m language in editor or DAX in formula bar.

OwenAuger · ‎06-21-2018

@BobBonner

Assuming you use @ImkeF's method to calculate the "difference" between all possible pairs of Mentors and Proteges, do you also want to come up with an optimal pairing of Mentors and Proteges?

This would be an example of the Assignment Problem (assuming you define your objective as minimizing the sum of differences between pairs selected), which you could solve using lp.assign in the lpSolve R package, within Power Query.

I have previously used the lpSolve package in the Power BI service to solve linear programming problems, so there shouldn't be a problem using that package in the service.

It would be possible but difficult to write your own M code to solve the Assignment Problem in Power Query without using R, so would certainly recommend R.

Post back if you need help with that.

Regards,

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

ImkeF · ‎06-21-2018

Thanks @OwenAuger for pointing towards this pattern!

As this seems to be a general pattern, I thought it is worth a try in M 😉

let
    CatFields = {"pref1","pref2","pref3","pref4","pref5"}, 
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUcpNzSvJLwIyQBwjIDYGYhMgNlWK1YkGi8CVwKQg0jAlxuhKjKDYEIxBSkAaCoryS1LTU6FaTZDUQtSYoqjB7hozFDXGUGwIxUA1sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"employee id" = _t, role = _t, pref1 = _t, pref2 = _t, pref3 = _t, pref4 = _t, pref5 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"employee id", Int64.Type}, {"role", type text}, {"pref1", Int64.Type}, {"pref2", Int64.Type}, {"pref3", Int64.Type}, {"pref4", Int64.Type}, {"pref5", Int64.Type}}),
    Mentors = Table.Buffer(Table.SelectRows(#"Changed Type", each ([role] = "mentor"))),
    Proteges = Table.SelectRows(#"Changed Type", each ([role] = "protege")),

	IterativeCombine = List.Skip(List.Generate( ()=>
		[Proteges = Proteges, Counter = 0],
		each [Counter]<=Table.RowCount(Mentors),
		each [	SelectedMentor = Record.FieldValues(Record.SelectFields(Mentors{[Counter]}, CatFields)),
				SelectVals = Table.AddColumn([Proteges], "CatVals", each Record.FieldValues(Record.SelectFields(_, CatFields))),
				CombineWithSelectedMentor = Table.AddColumn(SelectVals, "Combine", (x) => List.Zip({x[CatVals],List.Repeat(SelectedMentor, List.Count(x[CatVals]))})),
				Score = Table.AddColumn(CombineWithSelectedMentor, "Score", each List.Sum(List.Transform([Combine], each Number.Abs(_{0}-_{1})))),
				EmpID = Table.SelectRows(Score, each ([Score] = List.Min(Score[Score]))){0}[employee id],
				Proteges = Table.SelectRows([Proteges], each [employee id] <> EmpID),
				Counter = [Counter]+1],
		each [EmpID]
	)),
    CreateTable = Table.FromColumns({Mentors[employee id], IterativeCombine}, {"Mentor", "Protege"})
in
    CreateTable

It doesn't use the algorithm you mentioned, just iterates through the (table of) mentors, checks who is closest, and returns the first of that result. Then it eliminates the allocated protege from the table of proteges that goes into the next round of matching.

So not a real optimization, as it is dependent on the order of the elements, but a reasonalbe result.

Would be interested if that solves your problem @BobBonner and how the performance of this code is. Thx!

(You have to replace the code in step "Source" by a reference to your table and eventually adjust the values in step "CatFields")

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

ImkeF · ‎06-20-2018

You can do this in the query editor:

1) create a list with the profiles attributes for both

2) do a cartesian join, so that each item will be compared with all of the others.

3) For the comparison use List.Sum(List.Transform(List.Zip({protegeList, mentorList}), each Number.Abs(_{0}-_{1}))). That will add up all deviations

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Iterative matching using weighting of multiple columns

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly

Party with Power BI’s own Guy in a Cube

Iterative matching using weighting of multiple columns

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly