Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Finally gave up looking for examples to learn from, and decided to post my first question to the community.
Hello, i have a table similar to this:
Row# Col A Col B
1 A1 B1
2 A1 B2
3 A1 B3
4 A2 B2
5 A2 B3
6 A3 B1
7 A3 B3
I would like to remove duplicates on Col A such that -- whenever Col B = B1, I want to keep that row. So between rows 1, 2 and 3, I would like to keep row 1. Between rows 6 and 7, I will keep row 6.
When Col A has duplicated and Col B doesn't have any B1 (a specific value), I don't care which row I keep as long as I keep only one. So between rows 4 and 5, I could keep either, doesn't matter.
My resulting table should look like this:
Row# Col A Col B
1 A1 B1
4 A2 B2
6 A3 B1
OR
Row# Col A Col B
1 A1 B1
5 A2 B3
6 A3 B1
(as i said above, rows 4 and 5 mean the same to me)
Can anyone help out please?
Solved! Go to Solution.
Thanks Zubair. Sorry I didn't specify that B1 in my data could appear on any row -- not necessarily on the first row all the time. That makes the remove duplicates function not suitable for my requirement.
I got help from a colleague and ended up learning a bit of M to get the job done. My code looks something like this:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], #"B1 rows" = Table.Distinct(Table.SelectRows(Source, each ([Col B] = "B1")), {"Col A"}), #"A1 list" = Table.Column(#"B1 rows", "Col A"), #"Non B1 Rows" = Table.Distinct(Table.SelectRows(#"Table 1", each not List.Contains(#"A1 list", [DNS])), {"Col A"}), #"Appended Query" = Table.Combine({#"B1 rows", #"Non B1 Rows"}) in #"Appended Query"
The exception join (not List.Contains) must not be efficient because my query runs 10 minutes or longer. But it gets the job done, and that's what i care about.
If your data is sorted (as in your sample data) i.e. B1 is the first row for each unique ColA item
you simple have to select Col A>>right click>>remove duplicates
Thanks Zubair. Sorry I didn't specify that B1 in my data could appear on any row -- not necessarily on the first row all the time. That makes the remove duplicates function not suitable for my requirement.
I got help from a colleague and ended up learning a bit of M to get the job done. My code looks something like this:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], #"B1 rows" = Table.Distinct(Table.SelectRows(Source, each ([Col B] = "B1")), {"Col A"}), #"A1 list" = Table.Column(#"B1 rows", "Col A"), #"Non B1 Rows" = Table.Distinct(Table.SelectRows(#"Table 1", each not List.Contains(#"A1 list", [DNS])), {"Col A"}), #"Appended Query" = Table.Combine({#"B1 rows", #"Non B1 Rows"}) in #"Appended Query"
The exception join (not List.Contains) must not be efficient because my query runs 10 minutes or longer. But it gets the job done, and that's what i care about.
Thanks Zubair. Sorry I didn't specify that B1 in my data could appear on any row -- not necessarily on the first row all the time. That makes the remove duplicates function is not suitable for my requirement.
I got help from a colleague and ended up learning a bit of M to get the job done. My code looks something like this:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], #"B1 rows" = Table.Distinct(Table.SelectRows(Source, each ([Col B] = "B1")), {"Col A"}), #"A1 list" = Table.Column(#"B1 rows", "Col A"), #"Non B1 Rows" = Table.Distinct(Table.SelectRows(#"Table 1", each not List.Contains(#"A1 list", [DNS])), {"Col A"}), #"Appended Query" = Table.Combine({#"B1 rows", #"Non B1 Rows"}) in #"Appended Query"
The exception join (not List.Contains) must not be efficient because my query runs 10 minutes or longer. But it gets the job done, and that's what i care about.