Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Reply
rhaining
Employee
Employee

want derived table with only distinct values

My source table has dupes, by design.  It's denormalized.  Imagine something like this:

 

Item ID | Item Name | Item Color | Owning Org | Owning Person

=========================================

1 | Car | Green | Finance | Joe

1 | Car | Green | Finance | Sally

1 | Car | Green | HR | Bob

 

This data is exactly what I need for most of my views -- the same item must be reported under multiple owners -- but I want a 2nd view that has no duplicates.  All item attributes are identical across all rows (item #1 will always be a car and green), but some or all of the ownership attributes will vary across near duplicate rows.

 

In effect, I want to summarize by Item ID, but I never want to sum or count.  Instead, for item attributes, I want to take any or the first value -- they will be the same, so whatever is cheapest / fastest in terms of compute -- and for the ownership attributes, my ideal would be to take the "mode" or most common value, but I'd happily start with just any or the first value.  However, for all other ownership attributes, I want to take the matching values.  E.g. I can't have Owning Org be HR and Owning Person to be Joe.

 

Almost none of the attributes / fields are scalars, and even for those that are, I still don't want sum or count.  I could probably use average here but that seems needlessly complex given the values will all be the same.

 

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

@rhaining Maybe:

 

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    "Owning Org", MIN('Table'[Owning Org]),
    "Owning Person", MIN('Table'[Owning Person])
  )


or

New Table = 
  VAR __Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    [Owning Org],
    [Owning Person],
    "Count", COUNTROWS('Table')
  )
  VAR __Max = MAXX(__Table, [Count])
  VAR __Result = TOPN( 1, FILTER(__Table, [Count] = __Max))
RETURN
  __Result

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

2 REPLIES 2
Greg_Deckler
Super User
Super User

@rhaining Maybe:

 

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    "Owning Org", MIN('Table'[Owning Org]),
    "Owning Person", MIN('Table'[Owning Person])
  )


or

New Table = 
  VAR __Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    [Owning Org],
    [Owning Person],
    "Count", COUNTROWS('Table')
  )
  VAR __Max = MAXX(__Table, [Count])
  VAR __Result = TOPN( 1, FILTER(__Table, [Count] = __Max))
RETURN
  __Result

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

I haven't yet tried either of your specific suggestions, but they were enough to unblock me.  Thank you!  I decided to start from here:

 

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color]
)

 

In effect throwing away all the ownership information.  At least I have a correctly deduped table.

 

I worry about your 1st suggestion -- how does MAX function w.r.t. strings?  And would it guarantee that the Owning Org + Owning Person were a valid pair?  Your 2nd suggestion looks like it would work perfectly and I may try that soon.

 

I have a follow-up question which I'm hoping you or someone else can answer.  I have my main page working fine, and users can filter in many different ways, and all changes they make reflect in all views -- great.  Now I wish my deduped item data set was also filtered to the selections made on report page 1, and would dynamically update as filter changes were applied on that page.

 

Helpful resources

Announcements
Fabric Community Conference

Microsoft Fabric Community Conference

Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.

February 2024 Update Carousel

Power BI Monthly Update - February 2024

Check out the February 2024 Power BI update to learn about new features.

Fabric Career Hub

Microsoft Fabric Career Hub

Explore career paths and learn resources in Fabric.

Fabric Partner Community

Microsoft Fabric Partner Community

Engage with the Fabric engineering team, hear of product updates, business opportunities, and resources in the Fabric Partner Community.

Top Solution Authors