Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Learn from the best! Meet the four finalists headed to the FINALS of the Power BI Dataviz World Championships! Register now

Reply
NewbiePBIX88
New Member

Count the matching string between tables

Hi Everyone, 

I am new to PowerBI and hope you can help me with this case 🙂

I have table 1

Nr
12345
12345678
1234567
21342145
1234
1231341


and table 2:

Nr
12345
12345678


And Table 3: 

Nr
1234567
213A-4/2145


I am able to achieve the following Result, while comparing table 1 vs 2 and 3

Nr

Found

12345yes
12345678yes
1234567

yes

21342145no
1234no
1231341no


My question is: For the no result, I would like to count the matching string (when more than 4 character are matching): so the end result should look like this:

NrFoundMatching
12345yes100%
12345678yes100%
1234567yes100%
21342145nopartly matching
1234no0 %
1231341no0 %


Please let me know if that can be achieveable. 
Thank you so mi,




1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi, @NewbiePBIX88 

Based on the data you provided, I used the following example data:

vjianpengmsft_0-1725415284936.png

Use the following DAX expression to determine whether the string in Table1 is contained in Table2 or Table3:

Found = 
IF(
    OR(
        CONTAINS(Table2, Table2[Nr], Table1[Nr]),
        CONTAINS(Table3, Table3[Nr], Table1[Nr])
    ),
    "yes",
    "no"
)

vjianpengmsft_1-1725415387403.png

Use the following DAX to determine if there is a 100% match:

Column = 
VAR _table2= SELECTCOLUMNS('Table2','Table2'[Nr])
VAR _table3 = SELECTCOLUMNS('Table3','Table3'[Nr])
RETURN
IF(
    'Table1'[Nr] IN _table2 || 'Table1'[Nr] IN _table3,"100%", IF(
        OR(
        CONTAINS(Table2, Table2[Nr], Table1[Nr]),
        CONTAINS(Table3, Table3[Nr], Table1[Nr])
    ),"Partly","0%"
    )
)

vjianpengmsft_2-1725415479442.png

So there's a problem, as you described, when there are more than 4 character matches, a partial match is displayed, otherwise 0% is displayed.
Take the example you provided, for example, there is a 21342145 in Table, and there is 213A-4/2145 in Table3. The two strings should show a partial match.
In Power BI, each cell is treated as a finished string, and it's hard to exhaustively enumerate each character in another column.
Using the above two strings as an example, power BI treats the 21342145 as a whole and searches for them at 213A-4/2145. But 213A-4/2145 strings, DAX or PowerQuery are difficult to implement, or you need to split each character into separate columns and then match each of these columns.
It's going to be a complicated process. On the contrary, this requirement can be easily implemented in Python. In Python I used the pandas library and then the Python connector in Power BI desktop:

vjianpengmsft_3-1725417049020.png

 

 

Use the following Python script:

import pandas as pd

# Create data tables
table1 = pd.DataFrame({'Nr': ['12345', '12345678', '1234567', '21342145', '1234', '1231341']})
table2 = pd.DataFrame({'Nr': ['12345', '12345678']})
table3 = pd.DataFrame({'Nr': ['1234567', '213A-4/2145']})

# Define matching function
def match_percentage(nr, table):
    for t_nr in table['Nr']:
        # Check if more than 4 characters match
        if len(set(nr) & set(t_nr)) > 4:
            return "partly matching"
    return "0 %"

# Calculate matching results
results = []
for nr in table1['Nr']:
    if nr in table2['Nr'].values or nr in table3['Nr'].values:
        results.append({'Nr': nr, 'Found': 'yes', 'Matching': '100%'})
    else:
        matching = match_percentage(nr, table2) if match_percentage(nr, table2) == "partly matching" else match_percentage(nr, table3)
        results.append({'Nr': nr, 'Found': 'no', 'Matching': matching})

# Convert to DataFrame
result_df = pd.DataFrame(results)

print(result_df)

vjianpengmsft_4-1725417205822.png

Here are the results:

vjianpengmsft_9-1725417329368.png

 

 

vjianpengmsft_6-1725417268242.png

vjianpengmsft_7-1725417277829.png

vjianpengmsft_8-1725417289205.png

The above is a feasible idea or a method that can be implemented at the moment. I've provided the PBIX file used this time below.

 

 

 

Best Regards

Jianpeng Li

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

View solution in original post

1 REPLY 1
Anonymous
Not applicable

Hi, @NewbiePBIX88 

Based on the data you provided, I used the following example data:

vjianpengmsft_0-1725415284936.png

Use the following DAX expression to determine whether the string in Table1 is contained in Table2 or Table3:

Found = 
IF(
    OR(
        CONTAINS(Table2, Table2[Nr], Table1[Nr]),
        CONTAINS(Table3, Table3[Nr], Table1[Nr])
    ),
    "yes",
    "no"
)

vjianpengmsft_1-1725415387403.png

Use the following DAX to determine if there is a 100% match:

Column = 
VAR _table2= SELECTCOLUMNS('Table2','Table2'[Nr])
VAR _table3 = SELECTCOLUMNS('Table3','Table3'[Nr])
RETURN
IF(
    'Table1'[Nr] IN _table2 || 'Table1'[Nr] IN _table3,"100%", IF(
        OR(
        CONTAINS(Table2, Table2[Nr], Table1[Nr]),
        CONTAINS(Table3, Table3[Nr], Table1[Nr])
    ),"Partly","0%"
    )
)

vjianpengmsft_2-1725415479442.png

So there's a problem, as you described, when there are more than 4 character matches, a partial match is displayed, otherwise 0% is displayed.
Take the example you provided, for example, there is a 21342145 in Table, and there is 213A-4/2145 in Table3. The two strings should show a partial match.
In Power BI, each cell is treated as a finished string, and it's hard to exhaustively enumerate each character in another column.
Using the above two strings as an example, power BI treats the 21342145 as a whole and searches for them at 213A-4/2145. But 213A-4/2145 strings, DAX or PowerQuery are difficult to implement, or you need to split each character into separate columns and then match each of these columns.
It's going to be a complicated process. On the contrary, this requirement can be easily implemented in Python. In Python I used the pandas library and then the Python connector in Power BI desktop:

vjianpengmsft_3-1725417049020.png

 

 

Use the following Python script:

import pandas as pd

# Create data tables
table1 = pd.DataFrame({'Nr': ['12345', '12345678', '1234567', '21342145', '1234', '1231341']})
table2 = pd.DataFrame({'Nr': ['12345', '12345678']})
table3 = pd.DataFrame({'Nr': ['1234567', '213A-4/2145']})

# Define matching function
def match_percentage(nr, table):
    for t_nr in table['Nr']:
        # Check if more than 4 characters match
        if len(set(nr) & set(t_nr)) > 4:
            return "partly matching"
    return "0 %"

# Calculate matching results
results = []
for nr in table1['Nr']:
    if nr in table2['Nr'].values or nr in table3['Nr'].values:
        results.append({'Nr': nr, 'Found': 'yes', 'Matching': '100%'})
    else:
        matching = match_percentage(nr, table2) if match_percentage(nr, table2) == "partly matching" else match_percentage(nr, table3)
        results.append({'Nr': nr, 'Found': 'no', 'Matching': matching})

# Convert to DataFrame
result_df = pd.DataFrame(results)

print(result_df)

vjianpengmsft_4-1725417205822.png

Here are the results:

vjianpengmsft_9-1725417329368.png

 

 

vjianpengmsft_6-1725417268242.png

vjianpengmsft_7-1725417277829.png

vjianpengmsft_8-1725417289205.png

The above is a feasible idea or a method that can be implemented at the moment. I've provided the PBIX file used this time below.

 

 

 

Best Regards

Jianpeng Li

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

February Power BI Update Carousel

Power BI Monthly Update - February 2026

Check out the February 2026 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.