Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Next up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now

Reply
Anonymous
Not applicable

How to Group By the distinct elements of two columns ?

I'm working on an ATP matches dataset, and I wanna create a Players table out of a Group By on the Matches table, but the players are split between a Winner and a Loser column, and I want a way to create that Players table by taking the list of distinct players in the two columns, and then adding calculated columns such as the number of matches they had, the win percentage, etc. 

 

Thanks in advance.

2 ACCEPTED SOLUTIONS
Johanno
Continued Contributor
Continued Contributor

Hi, I think your solution sounds good. You goal is to have a new PlayersTable and that's what you get? If I create a new table with:

PlayersTable = UNION(DISTINCT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1]);DISTINCT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player2]))

I get a new table with all players, 991 at the moment. 

View solution in original post

Johanno
Continued Contributor
Continued Contributor

Another thought:

- On each row in your table you will be able to find a winner and a looser - we already have the winner so the looser is the other guy (a calculated column):

 

Looser = IF('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1]='ATP Dataset_2012-01 to 2017-07_Int_V4'[Winner];'ATP Dataset_2012-01 to 2017-07_Int_V4'[Player2];'ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1])

 

- We still need a complete table with all the players, as before create a new table:

 

PlayersTable = SUMMARIZECOLUMNS('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1];"Int";MIN('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1_Int]);"Rank average";AVERAGE('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1_Rank]))

 

- We need a relationship between the two tables, so set an active relationship from PlayersTable to Winner and a passive to Looser

- Then in our fact table we can create a measure with win rate (for instance):

 

Win rate = COUNT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Winner])/(COUNT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Winner])+CALCULATE(COUNT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Looser]);USERELATIONSHIP('ATP Dataset_2012-01 to 2017-07_Int_V4'[Looser];PlayersTable[Player1])))

 

- Create a visual table with player from PlayersTable and our new measure where you can add columns and slicers:

Skärmklipp.JPG

 

I havn't looked at rank, I guess you could create something similar based on the rank from the last match? Any thoughts? 🤨

 

View solution in original post

6 REPLIES 6
Johanno
Continued Contributor
Continued Contributor

Hi, do you have some sample data and can show the disered result?

Anonymous
Not applicable

@Johannohere's the dataset; so far I ended up going around the problem by creating a new table as the union of the distinct elements of Player1 and Player2 and adding calculated columns, but I'm wondering if there's any way to group a table by a union of two columns without taking the shortcut I took.

Johanno
Continued Contributor
Continued Contributor

Hi, I think your solution sounds good. You goal is to have a new PlayersTable and that's what you get? If I create a new table with:

PlayersTable = UNION(DISTINCT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1]);DISTINCT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player2]))

I get a new table with all players, 991 at the moment. 

Anonymous
Not applicable

@Johanno  Thing is, when I use this solution, filters do not apply, for example even if I have a "players" and a "win percentage" column, I can't visualize who has the highest win ratio on clay for example because I don't see any way to link the two tables in a way that'd make the filter recognize the Players table as being related to Matches. Do you know how to resolve this issue by any chance ?

Johanno
Continued Contributor
Continued Contributor

You're right, I didn't think all the way. Guess the challenge is that data is grouped per match and we don't have one row per player and match. Tried to use Unpivot in Power Query to get around this but didn't succeed.

 

Another way to create a table with desired columns is:

PlayersTable = SUMMARIZECOLUMNS('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1];"Int";MIN('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1_Int]);"Rank average";AVERAGE('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1_Rank]))

but that only uses player1..

Johanno
Continued Contributor
Continued Contributor

Another thought:

- On each row in your table you will be able to find a winner and a looser - we already have the winner so the looser is the other guy (a calculated column):

 

Looser = IF('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1]='ATP Dataset_2012-01 to 2017-07_Int_V4'[Winner];'ATP Dataset_2012-01 to 2017-07_Int_V4'[Player2];'ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1])

 

- We still need a complete table with all the players, as before create a new table:

 

PlayersTable = SUMMARIZECOLUMNS('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1];"Int";MIN('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1_Int]);"Rank average";AVERAGE('ATP Dataset_2012-01 to 2017-07_Int_V4'[Player1_Rank]))

 

- We need a relationship between the two tables, so set an active relationship from PlayersTable to Winner and a passive to Looser

- Then in our fact table we can create a measure with win rate (for instance):

 

Win rate = COUNT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Winner])/(COUNT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Winner])+CALCULATE(COUNT('ATP Dataset_2012-01 to 2017-07_Int_V4'[Looser]);USERELATIONSHIP('ATP Dataset_2012-01 to 2017-07_Int_V4'[Looser];PlayersTable[Player1])))

 

- Create a visual table with player from PlayersTable and our new measure where you can add columns and slicers:

Skärmklipp.JPG

 

I havn't looked at rank, I guess you could create something similar based on the rank from the last match? Any thoughts? 🤨

 

Helpful resources

Announcements
New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

March Power BI Update Carousel

Power BI Community Update - March 2026

Check out the March 2026 Power BI update to learn about new features.