cancel
Showing results for 
Search instead for 
Did you mean: 

Fabric is Generally Available. Browse Fabric Presentations. Work towards your Fabric certification with the Cloud Skills Challenge.

Reply
rhaining
Microsoft
Microsoft

want derived table with only distinct values

My source table has dupes, by design.  It's denormalized.  Imagine something like this:

 

Item ID | Item Name | Item Color | Owning Org | Owning Person

=========================================

1 | Car | Green | Finance | Joe

1 | Car | Green | Finance | Sally

1 | Car | Green | HR | Bob

 

This data is exactly what I need for most of my views -- the same item must be reported under multiple owners -- but I want a 2nd view that has no duplicates.  All item attributes are identical across all rows (item #1 will always be a car and green), but some or all of the ownership attributes will vary across near duplicate rows.

 

In effect, I want to summarize by Item ID, but I never want to sum or count.  Instead, for item attributes, I want to take any or the first value -- they will be the same, so whatever is cheapest / fastest in terms of compute -- and for the ownership attributes, my ideal would be to take the "mode" or most common value, but I'd happily start with just any or the first value.  However, for all other ownership attributes, I want to take the matching values.  E.g. I can't have Owning Org be HR and Owning Person to be Joe.

 

Almost none of the attributes / fields are scalars, and even for those that are, I still don't want sum or count.  I could probably use average here but that seems needlessly complex given the values will all be the same.

 

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

@rhaining Maybe:

 

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    "Owning Org", MIN('Table'[Owning Org]),
    "Owning Person", MIN('Table'[Owning Person])
  )


or

New Table = 
  VAR __Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    [Owning Org],
    [Owning Person],
    "Count", COUNTROWS('Table')
  )
  VAR __Max = MAXX(__Table, [Count])
  VAR __Result = TOPN( 1, FILTER(__Table, [Count] = __Max))
RETURN
  __Result

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

2 REPLIES 2
Greg_Deckler
Super User
Super User

@rhaining Maybe:

 

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    "Owning Org", MIN('Table'[Owning Org]),
    "Owning Person", MIN('Table'[Owning Person])
  )


or

New Table = 
  VAR __Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color],
    [Owning Org],
    [Owning Person],
    "Count", COUNTROWS('Table')
  )
  VAR __Max = MAXX(__Table, [Count])
  VAR __Result = TOPN( 1, FILTER(__Table, [Count] = __Max))
RETURN
  __Result

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

I haven't yet tried either of your specific suggestions, but they were enough to unblock me.  Thank you!  I decided to start from here:

 

New Table = 
  SUMMARIZE(
    'Table',
    [ID],
    [Item Name],
    [Item Color]
)

 

In effect throwing away all the ownership information.  At least I have a correctly deduped table.

 

I worry about your 1st suggestion -- how does MAX function w.r.t. strings?  And would it guarantee that the Owning Org + Owning Person were a valid pair?  Your 2nd suggestion looks like it would work perfectly and I may try that soon.

 

I have a follow-up question which I'm hoping you or someone else can answer.  I have my main page working fine, and users can filter in many different ways, and all changes they make reflect in all views -- great.  Now I wish my deduped item data set was also filtered to the selections made on report page 1, and would dynamically update as filter changes were applied on that page.

 

Helpful resources

Announcements
PBI November 2023 Update Carousel

Power BI Monthly Update - November 2023

Check out the November 2023 Power BI update to learn about new features.

Community News

Fabric Community News unified experience

Read the latest Fabric Community announcements, including updates on Power BI, Synapse, Data Factory and Data Activator.

Dashboard in a day with date

Exclusive opportunity for Women!

Join us for a free, hands-on Microsoft workshop led by women trainers for women where you will learn how to build a Dashboard in a Day!

Power BI Fabric Summit Carousel

The largest Power BI and Fabric virtual conference

130+ sessions, 130+ speakers, Product managers, MVPs, and experts. All about Power BI and Fabric. Attend online or watch the recordings.

Top Solution Authors
Top Kudoed Authors