Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
mgirvin
Advocate I
Advocate I

Algorithm for the comparer functions in Table.Group, List.Min, List.Max

Dear Team,

In Power Query M Code I am trying to figure out how the internal algorithm works for the comparer functions:

  • Comparer.Equals
  • Comparer.FromCulture
  • Comparer.Ordinal
  • Comparer.OrdinalIgnoreCase
  • Custom Function for comparing like: if x > y then 1 else if x < y then -1 else 0

Help says that these functions generate an array of ones and zeros and negative ones to help determine ordering, sorting, grouping, max and min calculations and other calculations in functions like List.Sort, Table.Group, List.Min, Lit.Max and many more.

But help is very sparse on exactly what x and y represent and how the resultant array of ones, zeroes and negative ones actually determine the final result.

As for as I can tell, for a single column (comparing sequential rows in a column) comparison:

x = current row in a column

y = next row in column

and

x > y then 1 or x < y then -1 or x = y then 0.

 

My first question about the algorithm is, if we have these formulas:

MinAlgorithmn = List.Min({-1,-5,0,6,6,3},null,(x,y) => if x > y then 1 else if x < y then -1 else 0) = -5

DefaultMin = List.Min({-1,-5,0,6,6,3}) = -5

MaxAlgorithmn = List.Max({-1,-5,0,6,6,3},null,(x,y) => if x > y then 1 else if x < y then -1 else 0) = 6

DefaultMax = List.Max({-1,-5,0,6,6,3})

Then the original data set would be:

{-1,-5,0,6,6,3}

The algorithm would be:

-1 > -5 algorithm yields:= 1

-5 < 0 algorithm yields:= -1

0 < 6 algorithm yields:= -1

6 = 6 algorithm yields:= 0

6 > 3 algorithm yields:= 1

The resultant array would be: {1,-1,-1,0,1,??}

How in the world does this array {1,-1,-1,0,1,??} pick out from the array, {-1,-5,0,6,6,3}, the -5 for the minimum value and 6 for the maximum value? What pattern am I missing? Or is this not at all how the List.Min and List.Max functions work?

 

My second question about the algorithm is, if we have this starting table:

Supplier

Costs

Cœur Work Landscape

2100

Coeur Work Landscape

1522.36

Aerial Surveyor

3582.44

Ærial Surveyor

550.75

Æsthetic Landscape

1992.25

Cœur Work Landscape

481.26

Ærial Surveyor

1005.75

Coeur Work Landscape

250.99

 

If we use this formula:

=Table.Group(AddDataTypes, {"Supplier"}, {{"TotalCosts", each List.Sum([Costs]), type nullable number}})

We get:

Supplier

TotalCosts

Cœur Work Landscape

2581.26

Coeur Work Landscape

1773.35

Aerial Surveyor

3582.44

Ærial Surveyor

1556.5

Æsthetic Landscape

1992.25

 

If we use this formula:

= Table.Group(AddDataTypes, {"Supplier"}, {{"TotalCosts", each List.Sum([Costs]), type nullable number}},null,Comparer.FromCulture("en-US",true))

We get:

Supplier

TotalCosts

Cœur Work Landscape

4354.61

Aerial Surveyor

5138.94

Æsthetic Landscape

1992.25

 

I have broken apart how it looks like the algorithm works here so that we can see the resultant array:

Supplier

Costs

x

y

Comparer.FromCulture

Cœur Work Landscape

2100

Cœur Work Landscape

Coeur Work Landscape

0

Coeur Work Landscape

1522.36

Coeur Work Landscape

Aerial Surveyor

1

Aerial Surveyor

3582.44

Aerial Surveyor

Ærial Surveyor

0

Ærial Surveyor

550.75

Ærial Surveyor

Æsthetic Landscape

-1

Æsthetic Landscape

1992.25

Æsthetic Landscape

Cœur Work Landscape

-1

Cœur Work Landscape

481.26

Cœur Work Landscape

Ærial Surveyor

1

Ærial Surveyor

1005.75

Ærial Surveyor

Coeur Work Landscape

-1

Coeur Work Landscape

250.99

Coeur Work Landscape

  

 

How in the world does this array {0,1,0,-1,-1,1,-1,??} make the final grouping group based in “en-US” equivalent letters? What pattern am I missing? Or is it working some other way?

Thanks, Team!

4 REPLIES 4
foodd
Super User
Super User

@mgirvin  and @Rickmaurinus , what an interesting topic, and great points on the sparse documentation.  These do need much love and attention.    @Rickmaurinus , great article, thank you!  How can these questions, thoughts and other noodlings turn into one or more videos in the future?  Would the two of you consider coop'ting on one or more?

Rickmaurinus
Helper V
Helper V

Hey Mike,

 

That is an interesting topic. This week I finished an article on using Table.Group with Custom Comparer functions, which can you read here: https://gorilla.bi/power-query/table-group-custom-comparers/

For a Comparer function to be valid, it needs to be symmetric. So defining a comparer function, should result in 1, 0 or -1 as a result. What I mean with symmetric is that if row 1 is evaluated first and row 2 after then the resulting order should be identical to when row 2 is evaluated first and row 1 second.

 

Now, I don't think the result of evaluating the comparer logic is used independently for each row in the dataset. How I see it, is that the function is applied repeatedly to all the rows, to form groups. So the logic tries row 1 and 2, and says row 2 comes first. It then applies the logic to row 3, and compares it to row 1 and 2, and positions row 3. Then it performs the logic for row 4, and looks where in the order row 4 belongs. 

 

Also, whereas using GroupKind.Global requires a symmetric comparer function, you can 'get away' with a simplified comparer function when using GroupKind.Local. 

 

Have a look at my article and I'd be happy to hear your thoughts. You can also see another example illustrated here.

 

Regards,

Rick

v-jingzhan-msft
Community Support
Community Support

Hi @mgirvin 

 

It's an interesting topic, however there is no public documentation about how these Power Query native functions are working in the back end. 

 

My guess is that after every comparison, a temporary max or min value will be stored and used to compare with the next value. For example if we hope to get List.Max from {-1,-5,0,6,6,3}, the first time we get -1 as the temporary max value and give it to x. When comparing x with -5, x will still be -1 as it's larger than -5. When comparing x with 0, x will be replaced by 0. Then use the new x to compare with 6 and get a new max value 6 for x. Repeating this comparison for the rest values in the list until the last one. 

 

However this is just my simple guess as we don't know how the native function is written. In addition, from a programming point of view, this is also not the most efficient approach.

 

Best Regards,
Jing
If this post helps, please Accept it as Solution to help other members find it. Appreciate your Kudos!

Thank you very much for your reply : )

What gets me, however, is why does Micrsoft help always metion the if x > y then 1 else if x < y then -1 else 0. Why even mention the numbers? What is the point of telling people that the comparer functions assign 1, -1 and 0, without any further explanation...

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.

Users online (556)