Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI DataViz World Championships are on! With four chances to enter, you could win a spot in the LIVE Grand Finale in Las Vegas. Show off your skills.

Reply
rouelandrew
Frequent Visitor

How do I remove duplicates specific to a set of customer names only?

Hello everyone. Hope you can help me with this one. I'm creating a line chart for 7 different customer names in one table.

 

3 of the 7 customer names are what is considered TARGET ones while the 4 are Customer names that contain all device_ids. The device_ids in the TARGET ones are also showing under the 4 Customer names that has all.

I can create the line chart just fine but in order for the data to be accurate, I need to show the TARGET device_ids, and make sure those device_ids do not show up under the other 4 Customer names.

 

My query looks like this:

select * from mtd_availability

where customer_name IN ('TARGET_SITES_ALL','TARGET_SITES_OURS','TARGET_SITES_BURROUGHS','CUSTOMER_1','CUSTOMER_2','CUSTOMER_3','CUSTOMER_4')

and device_id is not null

--and fom = '2025-01'

UNION

select * from historical_availability

where customer_name IN ('TARGET_SITES_ALL','TARGET_SITES_OURS','TARGET_SITES_BURROUGHS','CUSTOMER_1','CUSTOMER_2','CUSTOMER_3','CUSTOMER_4')

and device_id is not null

--and fom = '2025-01'

order by customer_availability

 

For those who want to know I'm doing a Union since the mtd_availability only pulls recent month and not past months which I need for my visualization.

 

Now the problem is, the device_ids in TARGET_SITES_ALL, TARGET_SITES_OURS, TARGET_SITES_BURROUGHS also show up under the customer names CUSTOMER_1,CUSTOMER_2, CUSTOMER_3, and CUSTOMER_4

 

How do I make my query remove all the device IDs that are in TARGET_SITES_ALL, TARGET_SITES_OURS, TARGET_SITES_BURROUGHS from the ones in CUSTOMER_1,CUSTOMER_2, CUSTOMER_3, and CUSTOMER_4???

 

I tried doing this:

select * from mtd_availability

where customer_name IN ('TARGET_SITES_ALL','TARGET_SITES_OURS','TARGET_SITES_BURROUGHS')

where customer_name IN ('CUSTOMER_1','CUSTOMER_2','CUSTOMER_3','CUSTOMER_4')

and device_id not in (**List of device IDs that were in the TARGETs**)

--and fom = '2025-01'

UNION

select * from historical_availability

where customer_name IN ('TARGET_SITES_ALL','TARGET_SITES_OURS','TARGET_SITES_BURROUGHS')

where customer_name IN ('CUSTOMER_1','CUSTOMER_2','CUSTOMER_3','CUSTOMER_4')

and device_id not in (**List of device IDs that were in the TARGETs**)

--and fom = '2025-01'

order by customer_availability

 

But I get "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS."

 

Also, Is there a way to do it in PBI or Power Query even without changing the original Query? 

 

Really appreciate anyone who can help with this.

1 ACCEPTED SOLUTION
rohit1991
Super User
Super User

To solve this in Power BI without altering the original query:

  1. Load your data: Import your data into Power BI from both mtd_availability and historical_availability.

  2. Create a Calculated Table for TARGET device IDs:

    • Go to Modeling > New Table and create a table:

TargetDeviceIDs =DISTINCT(UNION(SELECTCOLUMNS(FILTER(mtd_availability,mtd_availability[customer_name] IN {"TARGET_SITES_ALL", "TARGET_SITES_OURS", "TARGET_SITES_BURROUGHS"}),"device_id",

mtd_availability[device_id]),SELECTCOLUMNS(FILTER(historical_availability,historical_availability[customer_name] IN {"TARGET_SITES_ALL", "TARGET_SITES_OURS", "TARGET_SITES_BURROUGHS"}),"device_id", historical_availability[device_id])))

 

Filter Non-TARGET Customers:

  • Use a calculated column or measure to exclude device_ids that exist in the TargetDeviceIDs table:
  • IsTargetDevice =
    NOT (RELATED(SELECTCOLUMNS(TargetDeviceIDs,"device_id", TargetDeviceIDs[device_id])))

View solution in original post

3 REPLIES 3
v-shex-msft
Community Support
Community Support

Hi @rouelandrew ,

Did the above suggestions help with your scenario? if that is the case, you can consider Kudo or Accept the helpful suggestions to help others who faced similar requirements.

If these also don't help, please share more detailed information and description to help us clarify your scenario to test.

How to Get Your Question Answered Quickly 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.
rohit1991
Super User
Super User

To solve this in Power BI without altering the original query:

  1. Load your data: Import your data into Power BI from both mtd_availability and historical_availability.

  2. Create a Calculated Table for TARGET device IDs:

    • Go to Modeling > New Table and create a table:

TargetDeviceIDs =DISTINCT(UNION(SELECTCOLUMNS(FILTER(mtd_availability,mtd_availability[customer_name] IN {"TARGET_SITES_ALL", "TARGET_SITES_OURS", "TARGET_SITES_BURROUGHS"}),"device_id",

mtd_availability[device_id]),SELECTCOLUMNS(FILTER(historical_availability,historical_availability[customer_name] IN {"TARGET_SITES_ALL", "TARGET_SITES_OURS", "TARGET_SITES_BURROUGHS"}),"device_id", historical_availability[device_id])))

 

Filter Non-TARGET Customers:

  • Use a calculated column or measure to exclude device_ids that exist in the TargetDeviceIDs table:
  • IsTargetDevice =
    NOT (RELATED(SELECTCOLUMNS(TargetDeviceIDs,"device_id", TargetDeviceIDs[device_id])))
Ashish_Mathur
Super User
Super User

Hi,

Share some data to work with and show the expected result.  Share data in a format that can be pasted in an MS Excel file.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

Helpful resources

Announcements
Feb2025 Sticker Challenge

Join our Community Sticker Challenge 2025

If you love stickers, then you will definitely want to check out our Community Sticker Challenge!

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.