cancel
Showing results for
Did you mean:

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

## Algorithm for the comparer functions in Table.Group, List.Min, List.Max

Dear Team,

In Power Query M Code I am trying to figure out how the internal algorithm works for the comparer functions:

• Comparer.Equals
• Comparer.FromCulture
• Comparer.Ordinal
• Comparer.OrdinalIgnoreCase
• Custom Function for comparing like: if x > y then 1 else if x < y then -1 else 0

Help says that these functions generate an array of ones and zeros and negative ones to help determine ordering, sorting, grouping, max and min calculations and other calculations in functions like List.Sort, Table.Group, List.Min, Lit.Max and many more.

But help is very sparse on exactly what x and y represent and how the resultant array of ones, zeroes and negative ones actually determine the final result.

As for as I can tell, for a single column (comparing sequential rows in a column) comparison:

x = current row in a column

y = next row in column

and

x > y then 1 or x < y then -1 or x = y then 0.

My first question about the algorithm is, if we have these formulas:

MinAlgorithmn = List.Min({-1,-5,0,6,6,3},null,(x,y) => if x > y then 1 else if x < y then -1 else 0) = -5

DefaultMin = List.Min({-1,-5,0,6,6,3}) = -5

MaxAlgorithmn = List.Max({-1,-5,0,6,6,3},null,(x,y) => if x > y then 1 else if x < y then -1 else 0) = 6

DefaultMax = List.Max({-1,-5,0,6,6,3})

Then the original data set would be:

{-1,-5,0,6,6,3}

The algorithm would be:

-1 > -5 algorithm yields:= 1

-5 < 0 algorithm yields:= -1

0 < 6 algorithm yields:= -1

6 = 6 algorithm yields:= 0

6 > 3 algorithm yields:= 1

The resultant array would be: {1,-1,-1,0,1,??}

How in the world does this array {1,-1,-1,0,1,??} pick out from the array, {-1,-5,0,6,6,3}, the -5 for the minimum value and 6 for the maximum value? What pattern am I missing? Or is this not at all how the List.Min and List.Max functions work?

My second question about the algorithm is, if we have this starting table:

 Supplier Costs Cœur Work Landscape 2100 Coeur Work Landscape 1522.36 Aerial Surveyor 3582.44 Ærial Surveyor 550.75 Æsthetic Landscape 1992.25 Cœur Work Landscape 481.26 Ærial Surveyor 1005.75 Coeur Work Landscape 250.99

If we use this formula:

=Table.Group(AddDataTypes, {"Supplier"}, {{"TotalCosts", each List.Sum([Costs]), type nullable number}})

We get:

 Supplier TotalCosts Cœur Work Landscape 2581.26 Coeur Work Landscape 1773.35 Aerial Surveyor 3582.44 Ærial Surveyor 1556.5 Æsthetic Landscape 1992.25

If we use this formula:

= Table.Group(AddDataTypes, {"Supplier"}, {{"TotalCosts", each List.Sum([Costs]), type nullable number}},null,Comparer.FromCulture("en-US",true))

We get:

 Supplier TotalCosts Cœur Work Landscape 4354.61 Aerial Surveyor 5138.94 Æsthetic Landscape 1992.25

I have broken apart how it looks like the algorithm works here so that we can see the resultant array:

 Supplier Costs x y Comparer.FromCulture Cœur Work Landscape 2100 Cœur Work Landscape Coeur Work Landscape 0 Coeur Work Landscape 1522.36 Coeur Work Landscape Aerial Surveyor 1 Aerial Surveyor 3582.44 Aerial Surveyor Ærial Surveyor 0 Ærial Surveyor 550.75 Ærial Surveyor Æsthetic Landscape -1 Æsthetic Landscape 1992.25 Æsthetic Landscape Cœur Work Landscape -1 Cœur Work Landscape 481.26 Cœur Work Landscape Ærial Surveyor 1 Ærial Surveyor 1005.75 Ærial Surveyor Coeur Work Landscape -1 Coeur Work Landscape 250.99 Coeur Work Landscape

How in the world does this array {0,1,0,-1,-1,1,-1,??} make the final grouping group based in “en-US” equivalent letters? What pattern am I missing? Or is it working some other way?

Thanks, Team!

4 REPLIES 4
Super User

@mgirvin  and @Rickmaurinus , what an interesting topic, and great points on the sparse documentation.  These do need much love and attention.    @Rickmaurinus , great article, thank you!  How can these questions, thoughts and other noodlings turn into one or more videos in the future?  Would the two of you consider coop'ting on one or more?

Helper V

Hey Mike,

That is an interesting topic. This week I finished an article on using Table.Group with Custom Comparer functions, which can you read here: https://gorilla.bi/power-query/table-group-custom-comparers/

For a Comparer function to be valid, it needs to be symmetric. So defining a comparer function, should result in 1, 0 or -1 as a result. What I mean with symmetric is that if row 1 is evaluated first and row 2 after then the resulting order should be identical to when row 2 is evaluated first and row 1 second.

Now, I don't think the result of evaluating the comparer logic is used independently for each row in the dataset. How I see it, is that the function is applied repeatedly to all the rows, to form groups. So the logic tries row 1 and 2, and says row 2 comes first. It then applies the logic to row 3, and compares it to row 1 and 2, and positions row 3. Then it performs the logic for row 4, and looks where in the order row 4 belongs.

Also, whereas using GroupKind.Global requires a symmetric comparer function, you can 'get away' with a simplified comparer function when using GroupKind.Local.

Have a look at my article and I'd be happy to hear your thoughts. You can also see another example illustrated here.

Regards,

Rick

Community Support

Hi @mgirvin

It's an interesting topic, however there is no public documentation about how these Power Query native functions are working in the back end.

My guess is that after every comparison, a temporary max or min value will be stored and used to compare with the next value. For example if we hope to get List.Max from {-1,-5,0,6,6,3}, the first time we get -1 as the temporary max value and give it to x. When comparing x with -5, x will still be -1 as it's larger than -5. When comparing x with 0, x will be replaced by 0. Then use the new x to compare with 6 and get a new max value 6 for x. Repeating this comparison for the rest values in the list until the last one.

However this is just my simple guess as we don't know how the native function is written. In addition, from a programming point of view, this is also not the most efficient approach.

Best Regards,
Jing
If this post helps, please Accept it as Solution to help other members find it. Appreciate your Kudos!

What gets me, however, is why does Micrsoft help always metion the if x > y then 1 else if x < y then -1 else 0. Why even mention the numbers? What is the point of telling people that the comparer functions assign 1, -1 and 0, without any further explanation...

Announcements

#### Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

#### Power BI Monthly Update - June 2024

Check out the June 2024 Power BI update to learn about new features.

#### New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

Top Solution Authors
Top Kudoed Authors