Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

v-kongfanf-msft

Fuzzy queries by using DAX codes

Scenario:
We often use the concept of fuzzy queries in our life or work. In Power BI Desktop, we can do the same thing with fuzzy queries. In this article, I will show you how to write fuzzy queries using DAX code.

 

Expected Result:

I have a slicer and a table in a report. I hope to realize the following results:

  • When I select a letter, it filters the table to show all results that have that letter.

vkongfanfmsft_0-1716961963856.png

vkongfanfmsft_1-1716961970267.png

  • When I select two or three letters, it filters results with all of these letters.

vkongfanfmsft_2-1716961988181.png

vkongfanfmsft_3-1716961994283.png

  • I select more than three letters, I can also do fuzzy queries without adding DAX code.

vkongfanfmsft_4-1716962013331.png

 

Sample Data:
First I create two tables Table A and Table B, and I use Table A column in a table visual and Table B column in a slicer.

GetImage.png

 

How:
Firstly I want to realize the expected result when selecting only one character. I create the following measure. It uses FIND function to identify whether [Column a] string contains the selected character from the slicer. Then add the measure to the table visual.

 

 

MEASURE =
VAR _Slicer =
    SELECTEDVALUE ( 'Table B'[Column b] )
RETURN
    IF (
        FIND ( _Slicer, SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () ) <> BLANK (),
        _Slicer
    )

 

 

 

vkongfanfmsft_5-1716962324473.png

 

However, the previous measure only works when selecting only one character. If we hope to filter by two or three characters, I think of another measure. Here we need to use the FIND function and MID function. FIND function is to return a text string in another text string in the starting position. MID function is to provide the starting position and length of the case, from the text string to return to the middle of the string.

 

 

 

Measure 2 =
VAR _slicer =
    CONCATENATEX ( 'Table B', 'Table B'[Column b] )
RETURN
    IF (
        COUNT ( 'Table B'[Column b] ) = 1
            && FIND ( MID ( _slicer, 1, 1 ), SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () )
                <> BLANK (),
        _slicer,
        IF (
            COUNT ( 'Table B'[Column b] ) = 2
                && FIND ( MID ( _slicer, 1, 1 ), SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () )
                    <> BLANK ()
                && FIND ( MID ( _slicer, 2, 1 ), SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () )
                    <> BLANK (),
            _slicer,
            IF (
                COUNT ( 'Table B'[Column b] ) = 3
                    && FIND ( MID ( _slicer, 1, 1 ), SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () )
                        <> BLANK ()
                    && FIND ( MID ( _slicer, 2, 1 ), SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () )
                        <> BLANK ()
                    && FIND ( MID ( _slicer, 3, 1 ), SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () )
                        <> BLANK (),
                _slicer
            )
        )
    )

 

 

 

vkongfanfmsft_9-1716962941310.png

vkongfanfmsft_10-1716962947874.png

 

 

Through the above DAX code we can indeed select three or fewer than three characters to filter the table. However it’s not very smart as if we hope to select more characters, it won’t work. And I don’t want to repeat many similar codes like

 

 

FIND ( MID ( _slicer, 1, 1 ), SELECTEDVALUE ( 'Table A'[Column a] ),, BLANK () )

 

 

to achieve the expected result. So I think if it can be optimized further. After some research, I finally get the following DAX code.

 

 

 

 

Measure_4 =
VAR _Slicer =
    VALUES ( 'Table B'[Column b] ) // This variable gets all the unique values of 'Column b' in 'Table B'. 
VAR _vtable =
    ADDCOLUMNS (
        GENERATE ( 'Table A', _Slicer ),
        "AAA",
            IF ( FIND ( [Column b], [Column a],, BLANK () ) <> BLANK (), [Column b] ) // Combine ‘Table A’ with _Slicer 
    )
VAR _vtable2 =
    SUMMARIZE (
        _vtable,
        [Column a],
        "BBB",
            IF (
                LEN ( CONCATENATEX ( _Slicer, [Column b] ) )
                    = LEN (
                        CONCATENATEX ( FILTER ( _vtable, [Column a] = EARLIER ( [Column a] ) ), [AAA] )
                    ),
                CONCATENATEX ( FILTER ( _vtable, [Column a] = EARLIER ( [Column a] ) ), [AAA] )
            )
    )
RETURN
    IF (
        CONCATENATEX ( FILTER ( _vtable2, [Column a] = 'Table A'[Column a] ), [BBB] )
            <> BLANK (),
        CONCATENATEX ( FILTER ( _vtable2, [Column a] = 'Table A'[Column a] ), [BBB] )
    )

 

 

 

 

This allows fuzzy queries no matter how many characters you select.

vkongfanfmsft_7-1716962824303.png

vkongfanfmsft_8-1716962834743.png

 

Summary:

As you can see from the above example, when we keep updating the DAX code, we can change from the original more clunky approach to the final dynamic yet concise approach. In addition, when we come to a practical scenario that we may want to filter by more strings in the slicer, the last DAX solution is applicable too. Below is a fuzzy query I did for a car brand, the exact steps can be found in my .pbix file for your own reference.

vkongfanfmsft_6-1716962777431.png

 

Author:Dragon Chou
Reviewer:Ula and Kerry