Cevola
Helper I

Pseudonymizing the personal data

Hello all,

 

I have 2 columns with personal data in my table (Employee ID and Badge ID). I need to provide the data to a customer, but I cannot share these 2 columns as they contain personal data.

 

How can I pseudonymize these 2 columns, either separately or together? Also to be noted the ps

 

Thanks

1 ACCEPTED SOLUTION
PwerQueryKees
Super User

Here it is:

Original table:

PwerQueryKees_0-1733481323610.png

The indexes created. In this sample each combination occurs only once, so the mapping runs 1:1, but I assume that in the actual data the same combination of employee id and badge id will occur multiple times:

PwerQueryKees_1-1733481413471.png

The merge:

PwerQueryKees_2-1733481442349.png

The end result:

PwerQueryKees_3-1733481499046.png

All queries:

// Original Table
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Employee Id", Int64.Type}, {"Start Location", type text}, {"End Location", type text}, {"Start Time", type datetime}, {"End Time", type datetime}, {"Day of the Week", type text}, {"Duration", type time}, {"Badge ID", Int64.Type}})
in
    #"Changed Type"

// Indexes
let
    Source = #"Original Table",
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Employee Id", "Badge ID"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns"),
    #"Added Index" = Table.AddIndexColumn(#"Removed Duplicates", "Index", 1, 1, Int64.Type)
in
    #"Added Index"

// End Result
let
    Source = #"Original Table",
    #"Merged Queries" = Table.NestedJoin(Source, {"Employee Id", "Badge ID"}, Indexes, {"Employee Id", "Badge ID"}, "Indexes", JoinKind.LeftOuter),
    #"Expanded Indexes" = Table.ExpandTableColumn(#"Merged Queries", "Indexes", {"Index"}, {"Index"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Indexes",{"Employee Id", "Badge ID"})
in
    #"Removed Columns"
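The queries above replace the pair of columns with one index per combination. Since the question also asks about handling the columns separately, here is a minimal sketch of that variant, assuming each column should get its own independent key. The inline sample rows, the helper name MakeLookup, and the output column names "Employee Key" and "Badge Key" are made up for illustration and are not part of the original solution.

```powerquery
let
    // Hypothetical sample data standing in for the original table
    Source = #table(
        type table [#"Employee Id" = Int64.Type, #"Badge ID" = Int64.Type, #"Start Location" = text],
        {
            {1001, 501, "Gate A"},
            {1002, 502, "Gate B"},
            {1001, 503, "Gate A"}
        }
    ),

    // Hypothetical helper: distinct values of one column plus a running index
    MakeLookup = (tbl as table, col as text, keyName as text) as table =>
        Table.AddIndexColumn(
            Table.Distinct(Table.SelectColumns(tbl, {col})),
            keyName, 1, 1, Int64.Type
        ),

    EmployeeLookup = MakeLookup(Source, "Employee Id", "Employee Key"),
    BadgeLookup = MakeLookup(Source, "Badge ID", "Badge Key"),

    // Merge each lookup back on its own column and expand the key
    WithEmployeeKey = Table.ExpandTableColumn(
        Table.NestedJoin(Source, {"Employee Id"}, EmployeeLookup, {"Employee Id"}, "E", JoinKind.LeftOuter),
        "E", {"Employee Key"}
    ),
    WithBadgeKey = Table.ExpandTableColumn(
        Table.NestedJoin(WithEmployeeKey, {"Badge ID"}, BadgeLookup, {"Badge ID"}, "B", JoinKind.LeftOuter),
        "B", {"Badge Key"}
    ),

    // Drop the sensitive columns; only the keys remain
    Result = Table.RemoveColumns(WithBadgeKey, {"Employee Id", "Badge ID"})
in
    Result
```

With separate keys, the customer can still see that the same employee used different badges, which may or may not be acceptable. In either variant the index depends on the order in which distinct values are encountered, so the mapping can change between refreshes if the source data changes, and the lookup tables themselves should not be shared.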


6 REPLIES 6
Anonymous
Not applicable

Hi @Cevola 

Did the solution PwerQueryKees offered solve your problem? If so, please consider accepting it as the solution so that more users can refer to it. If you have other questions, please share more information so that we can offer further suggestions.

 

Best Regards!

Yolo Zhu

 

PwerQueryKees
Super User

And replacing these with a unique number per combination of employee id and badge id will meet your goal?

yes

PwerQueryKees
Super User

Hard to judge whether it meets your use case, but I would do something in PowerQuery like:

  • Make a reference to the base table
  • Remove all columns except the two sensitive columns
  • Remove duplicates
  • Add an index
  • Make a new query referencing the base table
  • Merge the indexed table on the 2 columns
  • Expand the index column
  • Remove the sensitive columns
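
The steps above can be sketched as a single self-contained query. This is only an illustration under assumptions: the column names follow the thread, but the inline sample rows and the step names are made up, and a real setup would reference the actual base table instead of the #table literal.

```powerquery
let
    // Hypothetical base table; the values are invented for illustration
    Base = #table(
        type table [#"Employee Id" = Int64.Type, #"Badge ID" = Int64.Type, #"Day of the Week" = text],
        {
            {1001, 501, "Monday"},
            {1002, 502, "Monday"},
            {1001, 501, "Tuesday"}
        }
    ),
    // Keep only the two sensitive columns and remove duplicates
    Pairs = Table.Distinct(Table.SelectColumns(Base, {"Employee Id", "Badge ID"})),
    // Add an index: one number per distinct combination
    Indexed = Table.AddIndexColumn(Pairs, "Index", 1, 1, Int64.Type),
    // Merge the indexed table back on both columns and expand the index
    Merged = Table.NestedJoin(Base, {"Employee Id", "Badge ID"}, Indexed, {"Employee Id", "Badge ID"}, "Indexed", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Indexed", {"Index"}),
    // Remove the sensitive columns
    Result = Table.RemoveColumns(Expanded, {"Employee Id", "Badge ID"})
in
    Result
```

Here the first and third rows share the same employee id and badge id, so they receive the same index, while the second row gets a different one.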

Share a sample of your data and a sample of the desired output if you would like me to give it a try.

Cevola_0-1733479830474.png

I want to share this data, but I want to pseudonymize the employee id and badge id as they contain personal information.
