Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes! Register now.
Hi Team,
I have a Employee Dataset and I need to remove duplicates in the power query based certain condtions.
Remove duplicates where the Employee_status = "Inactive" and where majority of the columns are blank.
emp ID | name | pno | add | state | emp_status |
1000001 | xxx | 2433 | Bangalore | Karnataka | Inactive |
1000001 | xxx | Bangalore | Inactive | ||
1000002 | yyy | 456 | Inactive | ||
1000002 | yyy | NY | US | Inactive |
In the above example. for 100001 - second row where pno and state are blank has to be removed.
for 100002 - first row where add & state are blank has to be removed.
Please help me how this can be achieved in powerquery.
Solved! Go to Solution.
Hi @LP280388 ,
Here is the whole M syntax. Please paste it to the Advanced Editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQAAUMlHaWKigogaWRibAyknBLz0hNz8otSgWzvxKK8xJLE7EQg2zMvMbkksyxVKVYHU68CmkYF7BqMgMKVlZVA0sTUDKqMoFKQAr9IIBEajKI0FgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"emp ID" = _t, name = _t, pno = _t, add = _t, state = _t, emp_status = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,"","“ ”",Replacer.ReplaceValue,{"emp ID", "name", "pno", "add", "state", "emp_status"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each List.Count( List.Select( Record.FieldValues(_), each _ <> " " ))),
#"Grouped Rows" =Table.Group(#"Added Custom", {"emp ID", "name"}, {{"All", each _, type table [emp ID=nullable text, name=nullable text, pno=nullable text, add=nullable text, state=nullable text, emp_status=nullable text, Custom=number]}} ),
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Max([All],"Custom")),
#"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom1", "Custom", {"pno", "add", "state", "emp_status"}, {"pno", "add", "state", "emp_status"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"All"})
in
#"Removed Columns"
Output:
In addition, if you want do something in Power Query later , there is a specialized forum: Power Query - Microsoft Power BI Community You could post your issues here to get further helps.
Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @LP280388 ,
Here is the whole M syntax. Please paste it to the Advanced Editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQAAUMlHaWKigogaWRibAyknBLz0hNz8otSgWzvxKK8xJLE7EQg2zMvMbkksyxVKVYHU68CmkYF7BqMgMKVlZVA0sTUDKqMoFKQAr9IIBEajKI0FgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"emp ID" = _t, name = _t, pno = _t, add = _t, state = _t, emp_status = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,"","“ ”",Replacer.ReplaceValue,{"emp ID", "name", "pno", "add", "state", "emp_status"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each List.Count( List.Select( Record.FieldValues(_), each _ <> " " ))),
#"Grouped Rows" =Table.Group(#"Added Custom", {"emp ID", "name"}, {{"All", each _, type table [emp ID=nullable text, name=nullable text, pno=nullable text, add=nullable text, state=nullable text, emp_status=nullable text, Custom=number]}} ),
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Max([All],"Custom")),
#"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom1", "Custom", {"pno", "add", "state", "emp_status"}, {"pno", "add", "state", "emp_status"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"All"})
in
#"Removed Columns"
Output:
In addition, if you want do something in Power Query later , there is a specialized forum: Power Query - Microsoft Power BI Community You could post your issues here to get further helps.
Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @LP280388 ,
You could try to create a custom column as below(outside the PQ) and use it as filer.
_filter =
VAR _blnk_count =
IF ( TRIM ( 'Table'[state] ) = BLANK (), 1, 0 )
+ IF ( TRIM ( 'Table'[pno] ) = BLANK (), 1, 0 )
+ IF ( TRIM ( 'Table'[name] ) = BLANK (), 1, 0 )
+ IF ( TRIM ( 'Table'[add] ) = BLANK (), 1, 0 )
VAR result =
COUNTROWS (
FILTER (
'Table',
'Table'[emp ID] = EARLIER ( 'Table'[emp ID] )
&& 'Table'[emp_status] = "Inactive"
&& _blnk_count >= 2
)
)
RETURN
IF ( result >= 2, 0, 1 )
BR,
Samarth
Best Regards,
Samarth
If this post helps, please consider accepting it as the solution to help the other members find it more quickly.
Appreciate your Kudos!!
Connect on Linkedin
@Samarth_18 Thank you for the response. i cant have a calculated column. I need to filter out the data in model itself. So i request you to let me know if there how this can be achieved in PowerQuery.