Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.

Reply
ns29
Helper I
Helper I

Power BI question (Filter -> Group By -> Join)

I have a table that identifies videos with certain phrases and the risk level associated with those phrases and a source where the phrase was identified. Let's call this the phrase table. By definition, a video ID can appear multiple times in this table matching the number of occurrences of certain phrases that we look for. 

 

I have a second table that has revenue for a given video ID. Here, by definition, there's only one record per video ID. Let's call this the revenue table. 

 

What I'm looking to develop is an interactive tool in Power BI that will, in the following order of operations:

  1. Allow me to filter the phrase table for one or more sources 
  2. Then, Group by video ID and the calculate the max risk level observed for that video ID
  3. This will ensure that I have distilled the phrase table down to one record per video ID
  4. Then, join to the revenue table to allow me to understand the impact on Revenue when I filter by source (from Step 1)

The output would be in the form of a Power BI matrix  where the max level is the row header and the value is the revenue.

 

ns29_2-1616649587308.png

 

ns29_1-1616649435167.png

Thanks in advance!

6 REPLIES 6
ns29
Helper I
Helper I

Hi @Jihwan_Kim - thank you so much for your response. Your suggested path to a solution seems exactly right. I know how to solve it in SQL very easily, see query below - just don't know how to set it up in Power BI. I'm attaching 2 sample CSV's, they may differ slightly from the original post.

 

 

SELECT level_max, sum(revenue) AS revenue
FROM
(
	SELECT video_id_a, sum(revenue) AS revenue
	FROM #revenue
	GROUP BY video_id_a
)r
INNER JOIN
(
	SELECT video_id_a, max(level) AS level_max
	FROM #phrase
	WHERE source IN ('tag')
	GROUP BY video_id_a
)p USING (video_id_a)
GROUP BY level_max
ORDER BY level_max
;

 

The key is to be able to execute in Power BI the analogous of the WHERE clause in the second sub-query first before any other calculations happen. 

 

Some sample results:

image.png

 

Sample data can be downloaded from here: https://drive.google.com/drive/folders/1VY2yV7MKE-DHRpQcgl9e2U9Xd4YbYTxV?usp=sharing

 

Very excited to see what you come up with!

Hi, @ns29 

Please kindly check the below picture if it is what you are looking for.

Please correct me if I am wrong.

 

the sample pbix link is down below, and you can check all DAX measures in the file.

The crucial part was to group the phrase table by videoID where each videoID shows maximum Level.

But, showing by Video ID & showing by Level need a little bit different approach.

I think there should be a better way, but at this moment, I could only come up with this.

Once I find a better way, I will share it.

 

Picture1.pngPicture2.png

 

https://www.dropbox.com/s/dwd2ggzn6f9r8x6/ns29%202021%200326.pbix?dl=0 

 

Did I answer your question? Mark my post as a solution!

Appreciate your Kudos!!

 

If this post helps, then please consider accepting it as the solution to help other members find it faster, and give a big thumbs up.


Visit my LinkedIn page by clicking here.


Schedule a meeting with me to discuss further by clicking here.

@Jihwan_Kim thank you for continuing to work with me. The solution offered will be difficult to scale if there are more sources in the future, I need a solution where I can actually use the source as a multi-select slicer.

 

Couple of things I found in my research since yesterday, I just don't know enough DAX to test them, but they all are along the lines of not solving using calculated measures but rather using a single calculated column. Something like a MAX OVER window function in SQL - it calculates the MAX of the level for each video id across all the rows that contain that particular video id. 

 

  • For example, this solution below actually works perfectly and is infinitely sclabale when selecting a single source, see screenshot and compare to sample results earlier. However, it doesn't work when I select multiple sources or if I select all.

image.png

  • If I select all sources, the solution below works

image.png
 

 

Hopefully this generates some additional ideas.

 

Thank you!

Hi, @ns29 

Thank you very much for your feedback.

I just added a slicer (SOURCE) for the multi-selection.

And it works. I just changed one function inside my measure.

Please check the below link.

 

https://www.dropbox.com/s/24obefjlv8tfuve/ns29%202021%200326.pbix?dl=0 

 

Jihwan Kim

If this post helps, then please consider accept it as the solution to help the other members find it more quickly.

 

If this post helps, then please consider accepting it as the solution to help other members find it faster, and give a big thumbs up.


Visit my LinkedIn page by clicking here.


Schedule a meeting with me to discuss further by clicking here.

Unfortunately, this doesn't work since the solution provided is solving for a static set/combination of sources, it has to be dynamic since the number of sources are likely to change. 

 

Any thoughts of going on the window function route?

Jihwan_Kim
Super User
Super User

Hi, @ns29 

Please correct me if I understood wrongly.

I think, in this case, it is very much related to designing a relationship between tables, and creating dim-table(s).

If it is OK with you, please share a sample data by file, then I can try to look into it to come up with a desirable outcome.

 

Thank you very much.

 

Did I answer your question? Mark my post as a solution! Appreciate your Kudos!!

If this post helps, then please consider accepting it as the solution to help other members find it faster, and give a big thumbs up.


Visit my LinkedIn page by clicking here.


Schedule a meeting with me to discuss further by clicking here.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

FebPBI_Carousel

Power BI Monthly Update - February 2025

Check out the February 2025 Power BI update to learn about new features.

March2025 Carousel

Fabric Community Update - March 2025

Find out what's new and trending in the Fabric community.

Top Kudoed Authors