Solved: want derived table with only distinct values

rhaining · ‎05-01-2023

My source table has dupes, by design. It's denormalized. Imagine something like this:

Item ID | Item Name | Item Color | Owning Org | Owning Person

=========================================

1 | Car | Green | Finance | Joe

1 | Car | Green | Finance | Sally

1 | Car | Green | HR | Bob

This data is exactly what I need for most of my views -- the same item must be reported under multiple owners -- but I want a 2nd view that has no duplicates. All item attributes are identical across all rows (item #1 will always be a car and green), but some or all of the ownership attributes will vary across near duplicate rows.

In effect, I want to summarize by Item ID, but I never want to sum or count. Instead, for item attributes, I want to take any or the first value -- they will be the same, so whatever is cheapest / fastest in terms of compute -- and for the ownership attributes, my ideal would be to take the "mode" or most common value, but I'd happily start with just any or the first value. However, for all other ownership attributes, I want to take the matching values. E.g. I can't have Owning Org be HR and Owning Person to be Joe.

Almost none of the attributes / fields are scalars, and even for those that are, I still don't want sum or count. I could probably use average here but that seems needlessly complex given the values will all be the same.

Greg_Deckler · ‎05-01-2023

@rhaining Maybe:

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    "Owning Org", MIN('Table'[Owning Org]),
    "Owning Person", MIN('Table'[Owning Person])
  )


or

New Table = 
  VAR __Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    [Owning Org],
    [Owning Person],
    "Count", COUNTROWS('Table')
  )
  VAR __Max = MAXX(__Table, [Count])
  VAR __Result = TOPN( 1, FILTER(__Table, [Count] = __Max))
RETURN
  __Result

Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

Greg_Deckler · ‎05-01-2023

@rhaining Maybe:

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    "Owning Org", MIN('Table'[Owning Org]),
    "Owning Person", MIN('Table'[Owning Person])
  )


or

New Table = 
  VAR __Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    [Owning Org],
    [Owning Person],
    "Count", COUNTROWS('Table')
  )
  VAR __Max = MAXX(__Table, [Count])
  VAR __Result = TOPN( 1, FILTER(__Table, [Count] = __Max))
RETURN
  __Result

Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

rhaining · ‎05-02-2023

I haven't yet tried either of your specific suggestions, but they were enough to unblock me. Thank you! I decided to start from here:

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color]
)

In effect throwing away all the ownership information. At least I have a correctly deduped table.

I worry about your 1st suggestion -- how does MAX function w.r.t. strings? And would it guarantee that the Owning Org + Owning Person were a valid pair? Your 2nd suggestion looks like it would work perfectly and I may try that soon.

I have a follow-up question which I'm hoping you or someone else can answer. I have my main page working fine, and users can filter in many different ways, and all changes they make reflect in all views -- great. Now I wish my deduped item data set was also filtered to the selections made on report page 1, and would dynamically update as filter changes were applied on that page.

want derived table with only distinct values

Helpful resources

Power BI Monthly Update - June 2025

Fabric Community Update - June 2025

Become a Certified Power BI Data Analyst!

want derived table with only distinct values

Helpful resources

Power BI Monthly Update - June 2025

Fabric Community Update - June 2025