KelvinMorel
Helper II

Nested duplicated MLB Players list

Hi,

I’m having a tricky “duplicates” problem and I’m out of options. I have a list of MLB player stats going back to the early '90s. Each player appears once for every year they were active, and I would like to create a list of unique MLB players to use as a dimension table.

 

 I've extracted a list of names, handedness and position attributes. There are 2 issues, and the first can’t be solved because of the second.

  1. Players could play multiple positions, e.g. Adam Dunn played outfield [OF], infield [IF] and designated hitter [DH].
    I could remove duplicates based on [FullName] + [Handedness] but…

  2. Multiple players could also share the same name, e.g. Adam Eaton (pitcher) and Adam Eaton (OF),
    and players could even share the same name and handedness, e.g. Will Smith (pitcher) and Will Smith (catcher).


Sample of this list

 

Yeah I know… Other options are welcome!

 

Thx,

2 REPLIES
KelvinMorel
Helper II

Hi @HotChilli,

 

Thx, I like the "primary position" suggestion. Adding an additional piece of info for each player isn't an option.

 

Grtz,

 

HotChilli
Super User

For Item 2, you need an additional piece of info to make each player unique, e.g. birthdate, middle name, height or something. Hopefully you can add it to your data early in the process and without doing it manually, but if that's the way it has to be done, then that's what you have to do.
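
To illustrate why the extra attribute matters, here is a rough Python sketch (not Power BI code). The column names, the `unique_keys` helper and the sample rows are all assumptions made up for the example, and the birth dates and handedness values are placeholders, not real data. Deduplicating on name and handedness alone merges the two players, while adding a birth date keeps them apart.

```python
# Illustrative rows only: handedness and birth dates are placeholders, not real data.
rows = [
    {"full_name": "Will Smith", "handedness": "R", "position": "P", "birth_date": "1900-01-01"},
    {"full_name": "Will Smith", "handedness": "R", "position": "C", "birth_date": "1900-02-02"},
    {"full_name": "Will Smith", "handedness": "R", "position": "C", "birth_date": "1900-02-02"},
]


def unique_keys(rows, key_columns):
    """Return the distinct key tuples in first-seen order (hypothetical helper)."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        seen.setdefault(key, None)  # dicts keep insertion order
    return list(seen)


# Name + handedness cannot tell the two players apart.
name_only = unique_keys(rows, ["full_name", "handedness"])

# Adding one more attribute separates them.
with_birth = unique_keys(rows, ["full_name", "handedness", "birth_date"])

print(len(name_only), len(with_birth))
```

Any attribute that differs between the namesakes would do the same job; birth date is just the first one on the list above.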

 

Item 1 isn't really a problem - it just depends on what you are using the data for.  It works as a fact table if you want to keep track of the different positions players have played.  If you want to use it as a dimension table, then use a primary position field or get rid of the positions and remove duplicates.
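
One possible way to get a primary position field, sketched in Python rather than Power BI: count how often each position shows up per player and keep the most frequent one. "Most frequent" is only one assumed definition of primary position, and the function name, the row layout and the sample values (including the handedness) are made up for illustration. The key here is name plus handedness, so it would still need the extra attribute from Item 2 to separate namesakes.

```python
from collections import Counter, defaultdict

# Illustrative season rows (full_name, handedness, position); values are placeholders.
season_rows = [
    ("Adam Dunn", "L", "OF"),
    ("Adam Dunn", "L", "OF"),
    ("Adam Dunn", "L", "IF"),
    ("Adam Dunn", "L", "DH"),
]


def build_player_dimension(rows):
    """Collapse season rows to one row per player with a primary position."""
    position_counts = defaultdict(Counter)
    for full_name, handedness, position in rows:
        position_counts[(full_name, handedness)][position] += 1

    dimension = []
    for (full_name, handedness), counts in position_counts.items():
        # most_common(1) returns the position with the highest count;
        # ties go to the position that was seen first.
        primary_position = counts.most_common(1)[0][0]
        dimension.append(
            {
                "full_name": full_name,
                "handedness": handedness,
                "primary_position": primary_position,
            }
        )
    return dimension


players = build_player_dimension(season_rows)
print(players)
```

The original rows with all positions can stay as they are for the fact table; only the dimension table needs the single row per player.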
