Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
Msri
Regular Visitor

splitting string by delimiter and comparing the results for mismatch values

Hi All,

 

I have a dataset containing two string columns(Region names). The values are combined with a delimiter ; 

I need a distinct count of rows that have a mismatch between RegionA and RegionB for a date.

I will need to use the measure in a table to show the count of mismatch rows (values not matching or when there is null) when Agency = "Yes".

From the below example, For 11/1/2024, the total should be 2(345 is not counted as Agency = No),

for 11/8/2024 the total will be 1

Thanks in advance for your time and help.

DateColumnAAgencyRegionARegionBMismatch
11/1/2024123YesPhoenix;CharlottePhoenix1
11/8/2024123YesPhoenix;CharlottePhoenix;Charlotte0
11/1/2024345NoEdison;Atlanta 1
11/1/2024567Yes Tampa;Raleigh1
11/8/2024345YesEdison;AtlantaEdison1
11/8/2024567YesTampa;RaleighTampa;Raleigh0
1 ACCEPTED SOLUTION
v-kaiyue-msft
Community Support
Community Support

Hi @Msri ,

 

You can try the following expression.

Column = 
var _regiona=
CONCATENATEX(
    FILTER(ALL('Table'),
    [Date]=EARLIER([Date])&&[ColumnA]=EARLIER([ColumnA])&&[Agency]="Yes"),
    [RegionA],"-")
var _regionb=
CONCATENATEX(
    FILTER(ALL('Table'),
    [Date]=EARLIER([Date])&&[ColumnA]=EARLIER([ColumnA])&&[Agency]="Yes"),
    [RegionB],"-")
var _ifb=
IF(
    CONTAINSSTRING(
        _regiona,[RegionB])=TRUE()&&[RegionB]<>BLANK(),1,0)
var _ifa=
IF(
    CONTAINSSTRING(
        _regionb,[RegionA])=TRUE()&&[RegionA]<>BLANK(),1,0)
return
IF(
    _ifa=1&&_ifb=1,1,0)
Column 2 = 
CALCULATE(DISTINCTCOUNT('Table'[ColumnA]),FILTER(ALL('Table'),[Date]=EARLIER([Date])&&[Column]=1))

vkaiyuemsft_0-1732001878796.png

If your Current Period does not refer to this, please clarify in a follow-up reply.

 

Best Regards,

Clara Gong

If there is any post helps, then please consider Accept it as the solution  to help the other members find it more quickly.

View solution in original post

11 REPLIES 11
v-kaiyue-msft
Community Support
Community Support

Hi @Msri ,

 

You can try the following expression.

Column = 
var _regiona=
CONCATENATEX(
    FILTER(ALL('Table'),
    [Date]=EARLIER([Date])&&[ColumnA]=EARLIER([ColumnA])&&[Agency]="Yes"),
    [RegionA],"-")
var _regionb=
CONCATENATEX(
    FILTER(ALL('Table'),
    [Date]=EARLIER([Date])&&[ColumnA]=EARLIER([ColumnA])&&[Agency]="Yes"),
    [RegionB],"-")
var _ifb=
IF(
    CONTAINSSTRING(
        _regiona,[RegionB])=TRUE()&&[RegionB]<>BLANK(),1,0)
var _ifa=
IF(
    CONTAINSSTRING(
        _regionb,[RegionA])=TRUE()&&[RegionA]<>BLANK(),1,0)
return
IF(
    _ifa=1&&_ifb=1,1,0)
Column 2 = 
CALCULATE(DISTINCTCOUNT('Table'[ColumnA]),FILTER(ALL('Table'),[Date]=EARLIER([Date])&&[Column]=1))

vkaiyuemsft_0-1732001878796.png

If your Current Period does not refer to this, please clarify in a follow-up reply.

 

Best Regards,

Clara Gong

If there is any post helps, then please consider Accept it as the solution  to help the other members find it more quickly.

Msri
Regular Visitor

i split both the columns in power query and then use DAX to do a comparision on the columns for every ID column and date? 

Please let me know the DAX that is needed for that situation

DateColumnAAgencyRegionARegionB
11/1/2024123YesPhoenixPhoenix
11/1/2024123YesCharlotte 
11/8/2024123YesPhoenixPhoenix
11/8/2024123YesCharlotteCharlotte
11/1/2024345NoEdison 
11/1/2024345NoAtlanta 
11/1/2024567Yes Tampa
11/1/2024567Yes Raleigh
11/1/2024567Yes Memphis
11/8/2024345YesAtlantaEdison
11/8/2024345YesEdison 
11/8/2024567YesTampaTampa
11/8/2024567YesRaleighRaleigh
shafiz_p
Super User
Super User

Hi @Msri  Try below code:

MismatchCountByDate_V2 = 
CALCULATE(
    COUNTROWS(
        FILTER(
            'Table',
            [Agency] = "Yes" &&
            (
                ISBLANK([RegionA]) || 
                ISBLANK([RegionB]) || 
                [RegionA] <> [RegionB]
            )
        )
    ),
    ALLEXCEPT('Table', 'Table'[Date])
)

 

Output:

shafiz_p_0-1731554726911.png

Hope this helps!!

If this solved your problem, please accept it as a solution and a kudos!!

 

Best Regards,
Shahariar Hafiz

Thanks Shahariar, it is giving me the same count for every zone.

FreemanZ
Super User
Super User

hi @Msri ,

 

try to plot a visual with date column and a measure like:

measure = 
COUNTROWS(
    CALCULATETABLE(    
        data, 
        data[Agency]="Yes",
        data[RegionA]<>data[RegionB]
    )
)

 

it worked like:

FreemanZ_0-1731551892851.png

 

@FreemanZ 

I guess the problem "which is not described clearly" is when the regions order is different between Region A and Region B columns yet they're still need to be considered as match. 

hi @tamerj1 , 

 

Highly probably yes, as that looks more like reality. 

 

If that is the case, DAX code will need to split, sort, and then combine the regions, before any comparison. Or first process it in Power Query.

 

How would you solve issues like this?

@FreemanZ 

Exactly 

Yes, in SQL i used string_split which didnt need to sort the dataset, but that is what needed to be done in DAX. Both the Region columns have grouped rows. Will it be better to have seperate DAX to split them and then use it to do comparision on both columns. Can you help me with splitting out the columns as well

@Msri 

This calculation can be very expensive in terms of performance. Therefore, I would recommend to have it as a calculated column. The measure can be a simple sum of the column. 

Mismatched =
VAR StringA = 'Table'[RegionA]
VAR ItemsA =
SUBSTITUTE ( StringA, ";", "|" )
VAR LengthA =
COALESCE ( PATHLENGTH ( ItemsA ), 1 )
VAR T1A =
GENERATESERIES ( 1, LengthA, 1 )
VAR T2A =
SELECTCOLUMNS ( T1A, "@ItemA", PATHITEM ( ItemsA, [Value] ) )
VAR StringB = 'Table'[RegionB]
VAR ItemsB =
SUBSTITUTE ( StringB, ";", "|" )
VAR LengthB =
COALESCE ( PATHLENGTH ( ItemsB ), 1 )
VAR T1B =
GENERATESERIES ( 1, LengthB, 1 )
VAR T2B =
SELECTCOLUMNS ( T1B, "@ItemB", PATHITEM ( ItemsA, [Value] ) )
VAR Result =
INT ( NOT ISEMPTY ( EXCEPT ( INTERSECT ( T2A, T2B ), T2A ) ) )
&& 'Table'[Agency] = "Yes"
RETURN
Result

Thank you @tamerj1 , Will it be better if i split both the columns in power query and then use DAX to do a comparision on the columns for every ID column and date? 

Please let me know the DAX that is needed for that situation

DateColumnAAgencyRegionARegionB
11/1/2024123YesPhoenixPhoenix
11/1/2024123YesCharlotte 
11/8/2024123YesPhoenixPhoenix
11/8/2024123YesCharlotteCharlotte
11/1/2024345NoEdison 
11/1/2024345NoAtlanta 
11/1/2024567Yes Tampa
11/1/2024567Yes Raleigh
11/1/2024567Yes Memphis
11/8/2024345YesAtlantaEdison
11/8/2024345YesEdison 
11/8/2024567YesTampaTampa
11/8/2024567YesRaleighRaleigh

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.