Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
New article. Hope this can help 🙂
In a Power BI Embedded App Owns Data implementation, targeting Paginated Reports, it is possible to overcome the character size limitation on RLS filter values.
This can be beneficial in a variety of scenarios, one of which is using Kusto as a data source (aka Azure Data Explorer, aka Microsoft Fabric Real-Time Analytics).
Implementing row level security (RLS) in a Power BI Paginated Report requires manually adding predicates to filter the data. In a Power BI Embedded App Owns Data scenario, the standard flow to accomplish RLS is as follows:
There is a caveat though - the username field cannot exceed 256 characters. While this upper bound may be raised in the future, there are cases where a much longer string is beneficial. One such case is passing a list of RLS filter values to a Paginated Report which uses Azure Data Explorer (ADX) as a data source. Some background is required to explain the reasoning.
ADX is a big data analytical PaaS offering. It runs on the Kusto engine which provides impressive performance, auto optimizations, and scale-out support.
Each ADX cluster also provides management instances which orchestrate the data ingestion processes to the Kusto engine (in streams or batches). Ingested data can be wrangled using Data Mappings and Update Policies.
The native Kusto Query Language (KQL) is user friendly and elegant. It provides many useful built-in functions for handling semi-structured data, performing advanced statistical analytics, and rendering results.
Besides KQL, ADX can be queried using a subset of the T-SQL syntax (obviously, the recommended approach is to use the native KQL). In fact, ADX supports the Tabular Data Stream (TDS) protocol, thus it can present itself as a Microsoft SQL Server DB. This is how a Paginated Report can query ADX.
The Kusto engine and the core ADX capabilities are also available, in a SaaS-like flavor, as part of Microsoft Fabric Real-Time Analytics.
Kusto stores its data in a partitioned columnar format. This is a common concept with varied implementations and augmentations.
In Kusto terminology, the main data-holding units are called Extents. In addition to compressed data structures and metadata, Extents hold an index for each column (including detailed indexes for dynamic data type columns which represent semi-structured data).
Over time, configurable automatic backend processes maintain the Extents:
Technically speaking, Extents are immutable. Thus, ingesting data to Kusto always involves an append operation.
Deletion of a subset of an Extent’s records is accomplished by recomputing the whole Extent and transactionally switching it with the original version (this is referred to as Data Purge in Kusto terminology). A more performant option is to use the Soft Delete feature. Soft deletion creates a new Extent, with minimal changes, using pointers to the original Extent and an additional column indicating a soft deletion flag (this is a similar feature to the Delta Lake Deletion Vectors optimization).
There is no update records command. Updating records involve performing separate append and delete operations which require planning to be executed transactionally. Materialized Views can be used to easily automate update records processes transactionally. However, they have some limitations and a query performance/latency cost.
Simply put, deleting or updating records in Kusto is a resource intensive (and potentially non-trivial) task. It should be planned carefully and, if acceptable, be avoided. This is the main reason for preferring to pass the list of RLS filter values to a Paginated Report.
Permissions change over time and often require many-to-many types of logic. Implementing an RLS table (i.e., a mapping of user IDs to RLS values used as foreign keys in the fact tables) in Kusto can be cumbersome and error prone. Even more so at streaming ingestion scenarios that may require tombstone logic to imitate deletes.
Another reason for preferring to pass the list of RLS filter values to a Paginated Report has to do with performance.
Relying on an RLS table requires the use of Exists logic (i.e., Semi-Join/In) at query time.
Depending on the size of the tables and the complexity of the query, this can lead to non-neglectable performance overhead. Populating the list of RLS values in the query string is one way to achieve applicative session level caching.
So how can we securely pass a long string value from the web backend to a Paginated Report?
We can use common techniques by employing the native capability of Paginated Reports to run Visual Basic (VB) code.
A mandatory disclaimer: You should consult a professional digital security expert before attempting to implement custom security related solutions, like those described in this blog post.
One technique is to use hashing. A standard hashing algorithm takes any string as an input and produces a short, fixed length string as an output. Additionally, it has the following characteristics:
For example, sha-256 is a commonly used hashing algorithm. It has a practically non existing collision chance, and its output can be expressed as 64 hexadecimal characters.
In this technique, the web backend uses a known hashing algorithm and passes the following to the web frontend:
The Paginated Report employs VB to hash the plain text value using the known hashing algorithm. It then verifies the result by comparing it to the hashed value.
Security wise, this technique may be used only if the long string value does not contain sensitive information and the implications of a possible collision are acceptable.
Another technique is to use symmetric encryption. A standard symmetric encryption algorithm takes any string as an input and produces an encrypted output based on a set of parameters (a key and other parameters specific to the algorithm). Additionally, it has the following characteristics:
For example, aes-256-cbc is a commonly used symmetric encryption algorithm which requires a Key and an Initialization Vector (IV) parameters.
In this technique, the web backend uses a known symmetric encryption algorithm with randomly generated parameter values. It passes the following to the web frontend:
The paginated report extracts the parameter values (they can be separated based on number of characters or a predefined delimiter). It then employs VB to decrypt the long text value using the known symmetric encryption algorithm.
The two techniques can also be combined by passing an encrypted long string value that should be decrypted before comparing it to a hashed value.
Hi @SpartaBI ,
Thanks for your sharing and contribution. Much appreciated!
Best Regards
Check out the September 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
41 | |
4 | |
4 | |
3 | |
3 |