Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hello all
I am working with data with duplicates; however, I can't remove them as I need to use other columns in other charts.
I would ideally like to create the below - where the first three columns are checked and, based on how many times they are or aren't duplicated, create a column with their counts. This will allow me to filter duplicates on the relevant charts.
Note that I don't need to know if the duplicate is 2,3, etc. It could just be "first" and "duplicate"—whichever is easier.
Many thanks
Nicole
Solved! Go to Solution.
Hi, @collis
You can try the following methods. Start by adding the index column to the Power query.
Count =
CALCULATE ( COUNT ( 'Table'[Site] ),
FILTER (
ALLEXCEPT ('Table','Table'[Scheduled Start],'Table'[Scheduled End],'Table'[Site]),
[Index] <= EARLIER ( 'Table'[Index] )
)
)
Is this the result you were expecting?
Best Regards,
Community Support Team _Charlotte
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Create a new column
Duplicate Status =
VAR CurrentRowKey =
[Scheduled Start] & " " & [Scheduled End] & " " & [Site]
VAR RowCount =
CALCULATE(
COUNTROWS('YourTable'),
FILTER(
'YourTable',
'YourTable'[Scheduled Start] = EARLIER('YourTable'[Scheduled Start]) &&
'YourTable'[Scheduled End] = EARLIER('YourTable'[Scheduled End]) &&
'YourTable'[Site] = EARLIER('YourTable'[Site])
)
)
VAR RowNumber =
RANKX(
FILTER(
'YourTable',
'YourTable'[Scheduled Start] = EARLIER('YourTable'[Scheduled Start]) &&
'YourTable'[Scheduled End] = EARLIER('YourTable'[Scheduled End]) &&
'YourTable'[Site] = EARLIER('YourTable'[Site])
),
'YourTable'[Scheduled Start],
,
ASC,
Dense
)
RETURN
IF(RowNumber = 1, "First", "Duplicate")
Give this a try and let me know how it works for you!
Hi,
Share data in a format that can be pasted in an MS Excel file.
Scheduled Start | Scheduled End | Site |
3/11/2023 9:00 | 3/11/2023 17:00 | NSW |
8/11/2023 9:00 | 8/11/2023 17:00 | NSW |
10/11/2023 9:30 | 10/11/2023 10:30 | NSW |
10/11/2023 10:30 | 10/11/2023 11:30 | NSW |
10/11/2023 10:30 | 10/11/2023 11:30 | NSW |
13/11/2023 13:00 | 13/11/2023 16:00 | NSW |
13/11/2023 13:30 | 13/11/2023 21:30 | NSW |
14/11/2023 9:30 | 14/11/2023 17:30 | NSW |
16/11/2023 9:00 | 16/11/2023 17:00 | NSW |
16/11/2023 12:00 | 16/11/2023 20:00 | NSW |
17/11/2023 11:30 | 17/11/2023 19:30 | NSW |
22/11/2023 11:30 | 22/11/2023 11:30 | NSW |
22/11/2023 11:30 | 22/11/2023 19:30 | NSW |
23/11/2023 9:30 | 23/11/2023 17:30 | NSW |
24/11/2023 9:00 | 24/11/2023 17:00 | NSW |
24/11/2023 9:00 | 24/11/2023 17:00 | NSW |
24/11/2023 9:00 | 24/11/2023 17:00 | NSW |
24/11/2023 9:00 | 24/11/2023 17:00 | NSW |
27/11/2023 11:00 | 27/11/2023 14:00 | NSW |
29/11/2023 9:30 | 29/11/2023 17:30 | NSW |
29/11/2023 9:30 | 29/11/2023 17:30 | NSW |
29/11/2023 9:30 | 29/11/2023 17:30 | NSW |
29/11/2023 10:00 | 29/11/2023 18:00 | NSW |
Many thanks
Hi, @collis
You can try the following methods. Start by adding the index column to the Power query.
Count =
CALCULATE ( COUNT ( 'Table'[Site] ),
FILTER (
ALLEXCEPT ('Table','Table'[Scheduled Start],'Table'[Scheduled End],'Table'[Site]),
[Index] <= EARLIER ( 'Table'[Index] )
)
)
Is this the result you were expecting?
Best Regards,
Community Support Team _Charlotte
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi,
This M code works
let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
#"Grouped Rows" = Table.Group(Source, {"Scheduled Start", "Scheduled End", "Site"}, {{"Count", each Table.AddIndexColumn(_,"Index",1)}}),
#"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"Index"}, {"Index"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Count",{{"Scheduled Start", type datetime}, {"Scheduled End", type datetime}, {"Site", type text}, {"Index", Int64.Type}})
in
#"Changed Type"
Hope this helps.
Thank you. So fast. This works well, but it deletes other columns. How would I keep other columns to be used for other charts?
Thank you again for the reply. Unfortunately, my knowledge isn't advanced enough. Trying though... If I list all the columns, the count won't work as the data in the other columns isn't duplicated. I only need to check duplicates across three columns, and then the other columns I need to keep to be able to do other charts. I have seen people add
{"all", each _, type table [membercode=number, capacity=number]}
but I can't get it to work. Sorry, this is really stretching my brain!
You are welcome. In the Grouped Rows step, mention all the columns.
Sorry - I just quickly put together a demonstration to help communicate (and keep the data private). Imagine they are duplicates.
Hi Nicole,
The last 3 rows in your image aren't duplicates because the time in the Scheduled End column are different?
Regards
Phil
Proud to be a Super User!
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
72 | |
71 | |
57 | |
38 | |
36 |
User | Count |
---|---|
82 | |
67 | |
61 | |
46 | |
45 |