The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Snowflake Data Masking implementation in Power BI: Has anyone implemented Snowflake data masking implementation in Power BI?
We did implement the SSO with Snowflake, doing a POC with data masking functionality using Snowflake Policies but it works only when:
Each User accessing the report has to take over the semantic model and refresh it before accessing the Report. It doesn't work when the Users directly access the report without refreshing the Semantic model.
Are there any additional things to configure in Power BI other than SSO setup?
I tried below:
Say User1 creates a Power BI "Report1" using Direct query mode in Power BI desktop and publishes to Power BI Server. This Report1 uses Person1's OAUTH credentials to login to Snowflake and fetches data into the semantic model.
Now, if the Report1 is being accessed by User2, they would directly see the data in the Report1 which was from the semantic model refreshed using User1's credentials. That is, User2 just views what's available in Report1 and its underlying semantic model but would not connect to the snowflake.
Now when the User 2 takes over the sematic model and refreshes it using his OAUTH credentials, he can see the report with the data masking policies implemented to his account in Snowflake. While User1 also see the same report that User2 sees irrespective of his data masking policies in Snowflake.
So how do we implement the Snowflake data masking policy to the Power BI reports?
The issue here is: When a dashboard developer creates a Dashboard, then publishes the Dashboard to Power BI service, it will continue to use the original credential set EVEN IF SNOWFLAKE SSO IS ENABLED. So, you basically get RBAC for the development process but you're then stuck once you deploy. It's a tricky hole to fall into.
Power BI service needs to offer a way to say, "When SSO is configured, use the Viewers credentials to access the data. Do not use the Developers credentials." That needs to be coupled with a way to say, "Always query this data" or "Use a per-user caching mechagnism to persist RBAC behavior from the data source." I think the best part about this approach is that it shouldn't be uppended by whichever user pool you use in the backend. This is just general interoperability functionality that Microsoft could prioritize to have a more robust ecosystem, but for some reason have not--a possibly simple change in behavior, order of operations on the backend, namespace configuration for caching, ... It shouldn't require they partner with Snowflake at all.
What would be great is to see this show up as a feature request, get prioritized, and be released. Not to scope creep this, but it would also be amazing if Power BI service could do this with any remote User Pool backend, such as AWS Cognito. Because we'd love to do the same thing but withn S3, Glue, Athena, and LakeFormation based lakehouse.
The data masking policies applied in Snowflake are not reflected in Power BI reports unless the semantic model is refreshed with the credentials of the user viewing the report.
This situation is typically caused by the way Power BI caches data and handles DirectQuery connections.
Instead of relying solely on Snowflake's data masking policies, you can implement row-level security (RLS) in Power BI. RLS allows you to control data access based on the user accessing the report.
Roles and rules can be defined in Power BI Desktop to filter data based on user identification.
Row-level security (RLS) with Power BI - Microsoft Fabric | Microsoft Learn
Regards,
Nono Chen
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
@Anonymous Thank you for your response! I know we can do it by RLS but was checking to see if anyone could directly implement the masking policies in Snowflake.