Anonymous
Not applicable

Conditionally remove duplicates

Hello Community,

 

Is there a way to conditionally remove duplicates in Power Query or Excel?

Quick context: the data contains employee ID numbers, their planned hours for the year, the financial year, and hours of absence. However, the raw data repeats each employee's yearly planned hours on every row, and I want to keep only one per financial year.

I've explored the forum and tried using a helper column, but have not had any success.

 

Any help is greatly appreciated.

 

 

 PowerBi Question.JPG

2 ACCEPTED SOLUTIONS
parry2k
Super User

@Anonymous here is an example Power Query script that you can use to achieve this.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTI0ABFA7BZpaKYUq4MkaoRP1BxV1BhZFKTECCRqgmwCXNQUqyh2E/Cai12tBVa1lljdABSNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Emp = _t, Planned = _t, Hours = _t, Year = _t]),
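    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Emp", Int64.Type}, {"Planned", Int64.Type}, {"Hours", Int64.Type}, {"Year", type text}}),
    // Group by employee and year, numbering the rows inside each group from 1
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Emp", "Year"}, {{"Rank", each Table.AddIndexColumn(_, "Rank", 1)}}),
    // Expand the nested tables back into rows; the row number comes out as "Rank.1"
    #"Expanded Rank" = Table.ExpandTableColumn(#"Grouped Rows", "Rank", {"Hours", "Planned", "Rank"}, {"Hours", "Planned", "Rank.1"}),
    // Keep Planned only on the first row of each employee/year group, null elsewhere
    #"Added Custom" = Table.AddColumn(#"Expanded Rank", "New Planned", each (if [Rank.1] = 1 then [Planned] else null), type number),
    // Drop the original Planned column and the helper row number
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Planned", "Rank.1"}),
    // Hours loses its type when expanded, so set it again
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"Hours", Int64.Type}})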
in
    #"Changed Type1"
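
For anyone who wants to apply the same idea to their own table rather than the embedded sample, here is a rough sketch that wraps the logic in a function. The name KeepFirstPerGroup, the helper column names RowInGroup and KeptValue, and the sample rows are all made up for illustration and are not part of the original script. It assumes the source table is not empty and does not already contain columns with those two helper names.

```powerquery
let
    // Hypothetical helper: keeps the value in valueColumn on the first row of
    // each group of keyColumns and sets it to null on the remaining rows.
    KeepFirstPerGroup = (source as table, keyColumns as list, valueColumn as text) as table =>
        let
            // Number the rows 1..n inside each key group
            Grouped = Table.Group(source, keyColumns, {{"Rows", each Table.AddIndexColumn(_, "RowInGroup", 1, 1), type table}}),
            // Stack the per-group tables back into a single table
            Combined = Table.Combine(Grouped[Rows]),
            // Copy the value only on the first row of each group
            Blanked = Table.AddColumn(Combined, "KeptValue", each if [RowInGroup] = 1 then Record.Field(_, valueColumn) else null),
            // Swap the new column in for the original one
            Removed = Table.RemoveColumns(Blanked, {valueColumn, "RowInGroup"}),
            Renamed = Table.RenameColumns(Removed, {{"KeptValue", valueColumn}})
        in
            Renamed,
    // Made-up sample rows, using the column names from the script above
    Sample = #table(
        type table [Emp = Int64.Type, Planned = Int64.Type, Hours = Int64.Type, Year = text],
        {
            {1, 100, 10, "FY16"},
            {1, 100, 20, "FY16"},
            {2, 150, 5, "FY16"}
        }
    ),
    Result = KeepFirstPerGroup(Sample, {"Emp", "Year"}, "Planned")
in
    Result
```

All rows are kept, so no absence hours are lost; only the repeated Planned values are blanked. The Planned column ends up as the last column and loses its data type, so you may want to reorder and retype it afterwards.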


V-lianl-msft
Community Support

Hi @Anonymous ,
 
In the Power Query Editor, select the "pers.no" and "FY" columns, then click "Remove Rows" → "Remove Duplicates".
remove_duplicates.png
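
For reference, that menu action corresponds to a Table.Distinct step in M. Below is a minimal sketch, assuming a table with "pers.no" and "FY" columns as named above; the sample rows and the "Planned Hrs" and "Absence Hrs" column names are invented for illustration.

```powerquery
let
    // Made-up sample rows; only "pers.no" and "FY" are names taken from the post
    Source = #table(
        type table [#"pers.no" = Int64.Type, FY = text, #"Planned Hrs" = Int64.Type, #"Absence Hrs" = Int64.Type],
        {
            {1001, "FY19", 1800, 8},
            {1001, "FY19", 1800, 16},
            {1001, "FY20", 1850, 4}
        }
    ),
    // Keep one row per combination of pers.no and FY
    #"Removed Duplicates" = Table.Distinct(Source, {"pers.no", "FY"})
in
    #"Removed Duplicates"
```

Note that this removes whole rows, so the absence hours on the dropped rows are removed as well. It fits best when you only need one row per employee and financial year, for example in a separate planned-hours query.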
 
Best Regards,
Liang
If this post helps, please consider accepting it as the solution to help other members find it more quickly.

3 REPLIES
V-lianl-msft
Community Support

Hi @Anonymous ,
 
Has this problem been solved?
If not, please let me know.
 
Best Regards,
Liang
If this post helps, please consider accepting it as the solution to help other members find it more quickly.