Hi everyone,
Got this interesting request for masking data under Family Educational Rights and Privacy Act (FERPA). Below are the parameters of the request from my colleague. My initial thoughts are in Blue. Can you let me know if I am correct in my thinking?
Here is the fake summarized data set. Any raw value under 10 would get masked under the agency's FERPA rule.
I believe the gist of the request is "can we mask data under FERPA and still have the data roll up to an accurate count?"
Assuming multiple tables do not need to be created because possible duplicates were assigned to campus of first reporting:
- Can Power BI ‘mask’ data through the application, or must a masked table be used? I only know of row level security.
- Row level security results in:
- Lower query performance - Not that I am aware.
- More complex data models - I don't believe RLS complicates the data model.
- Limited Power BI features enabled
- With RLS, we can have:
- Fewer duplication of content, i.e., one table per multiple types of users - correct
- Privilege users - correct
- One workspace for dashboard development - correct
- If we go without RLS, what is best way to code null values versus masked values in the data source? - I've seen index columns used to mask sensitive data. Is there another way?
- We would want to distinguish true null versus masked; however, we will need to consult with FERPA officer as in our Tri-Agency data sharing agreement, TEA forced other two agencies to mask zeros as well.
- This is still good question though.
- If we go without RLS, what is best way to roll up data with masked and null values to get accurate counts?
- Example, sum of campus data for district data is inaccurate if Null/masked set to other value.
- In past, I have used “OTHER” category that sums counts below 5. With complimentary masking, sometimes, OTHER cannot be shown as total allows you to figure out missing value.
- My current solution is to create these other “MASKED AGGREGATE” rows with a “STATE” code that then gets included with roll up. In this example, it would be an additional non-existent CAMPUS code that gets included with TOTAL when all campuses are aggregated. - this is the part that I would need the most crowdsourcing.