Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Recently someone asked me what is the best way to filter on multi-valued columns in Power BI. The question was in the context of Tags property of the Work Items - Today table in the Visual Studio Team Services [1]. Although the original question was very specific, the solution I came up with can be generalized to multivalued columns in any data model.
Multivalued column is a database design pattern where instead of normalizing and splitting data across multiple tables you keep multiple values in a single table. You can see it typically in the data warehouses where normalization would lead to a too granular fact tables. One of the best examples is the Categories column in the Product table where want to allow users to select multiple values, but you don't want to create a separate Categories table.
In the context of Visual Studio Team Services there is one table where this pattern was applied - Work Items - Today. It contains Tags column which is a "; " delimited list of tags like in the example below.
Work Item Id | Title | Tags |
1 | Add column A to table B | database; milestone1 |
2 | Create migration script | database |
3 | Improve performance of slow queries | performance; database |
... | ... | ... |
Given table with a multivalued column prepare data model that will allow users to easily filter on distinct values. For example, we can start with the table below, which has multivalued Tags column.
let Source = #table( {"Work Item Id", "Title", "Tags"}, { { "1", "Add column A to table B", "database; milestone1" }, { "2", "Create migration script", "database" }, { "3", "Improve performance of slow queries", "performance; database" } }) in Source
If we simply selected Tags for the slicer it would produce the following result. Instead of values users could only select combinations that appear in the dataset. That's not what we want.
A much better design is to extract distinct values from the Tags column so that we can build the following slicer.
The solution I would like to show you is based on the post by SQLJason where he talks about handling delimited rows [2]. I modernized and improved it a little to cover columns of arbitrary length and to avoid contaminating model with auxiliary tables. The idea stays the same and can be broken down into the following steps.
In my previous post "Creating index table in DAX" I explained how to create index table for a given N. Here, N should be selected as the max number of elements in the multivalued column.
MaxLength =
VAR Separator = "; "
RETURN
MAXX (
'Work Items - Today',
1 + LEN ( [Tags] )
- LEN ( SUBSTITUTE ( [Tags], Separator, "" ) )
)
Now we can use this DAX expression and create Indexes table.
Indexes =
VAR Separator = "; "
RETURN
FILTER (
SUMMARIZE (
ADDCOLUMNS (
CALENDAR (
DATE ( 2000, 1, 1 ),
DATE ( 2000
+ MAXX (
'Work Items - Today',
1 + LEN ( [Tags] )
- LEN ( SUBSTITUTE ( [Tags], Separator, "" ) )
), 1, 1 )
),
"Index", YEAR ( [Date] ) - 2000
),
[Index]
),
[Index] > 0
)
Index |
1 |
2 |
3 |
We do not need to store this table in our model. Instead, we can simply save it in the DAX variable and reuse later.
The final expression consists of the following operations:
Tags =
VAR Separator = "; "
VAR Indexes =
FILTER (
SUMMARIZE (
ADDCOLUMNS (
CALENDAR (
DATE ( 2000, 1, 1 ),
DATE ( 2000
+ MAXX (
'Work Items - Today',
1 + LEN ( [Tags] )
- LEN ( SUBSTITUTE ( [Tags], Separator, "" ) )
), 1, 1 )
),
"Index", YEAR ( [Date] ) - 2000
),
[Index]
),
[Index] > 0
)
RETURN
SUMMARIZE (
ADDCOLUMNS (
FILTER (
CROSSJOIN (
ADDCOLUMNS (
'Work Items - Today',
"TagsCount", 1
+ ( LEN ( [Tags] ) - LEN ( SUBSTITUTE ( [Tags], Separator, "" ) ) )
/ LEN ( Separator )
),
Indexes
),
[Index] <= [TagsCount]
),
"Tag", PATHITEM ( SUBSTITUTE ( [Tags], Separator, "|" ), [Index] )
),
[Work Item Id],
[Tag]
)
It will produce the following result. This table captures relationship between work items and tags.
Work Item Id | Tag |
1 | database |
2 | database |
3 | performance |
1 | milestone1 |
3 | database |
Now, we need to define relationships and specify cross filtering direction. First, task is easy because most likely Power BI will automatically detect the relationship like in the example below.
The automatic relationship is a standard one-to-many relationship, which means that it will allow us to filter Tags based on Work Item - Today selection. That is exactly opposite of what we need. Double-click on the relationship to open the advanced editor and under "Cross filter direction" select "Both".
Finally, create a new slicer with Tag field from the newly created Tags table to get the best filtering experience! You can also try out amazing Smart Filter custom visual, which fits perfectly for this scenario.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.