We've captured the moments from FabCon & SQLCon that everyone is talking about, and we are bringing them to the community, live and on-demand. Starts on April 14th. Register now
Hi and thanks for reading this.
Have tried to find an answer for this but no luck.
Is it possible to do a textsearch that search in blocks of every 3 letter in a string.
Lets say i am looking for the string abc
My text is:ababcbcazabccad
In this text there is 2 abc but only one if its based on every 3 letter.(The boxes are: aba,bcb,caz,abc,cab)
Thanks
Solved! Go to Solution.
Hey @Wresen ,
first I created a table that contains the "predefined blocks", I called this table "searchstrings" and the column "searchstring", the table looks like this:
The table is NOT related to the item-table.
Then I created this measure:
Measure =
var __SearchstringsToFind = VALUES( 'Searchstrings'[searchstring] )
var lengthOfBlock = 3
return
COUNTX(
Sheet1
, var afunnystring = 'Sheet1'[String]
var lengthofstring = LEN( afunnystring )
var noofoccurrences =
IF( MOD( lengthofstring , lengthOfBlock ) = 0
, INT( DIVIDE( lengthofstring , lengthOfBlock ) )
, INT( DIVIDE( lengthofstring , lengthOfBlock ) ) + 1
)
var __blocks =
ADDCOLUMNS(
GENERATESERIES( 1 , noofoccurrences )
, "block" , MID( afunnystring , ''[Value] * lengthOfBlock - 2 , lengthOfBlock )
)
return
COUNTROWS(
FILTER(
__blocks
, [block] in __SearchstringsToFind
)
)
)
It seems the measure is returning the expected result:
Hopefully, this provides what you are looking for.
Regards,
Tom
Hey @Wresen ,
this can be done using DAX inside a measure. The idea behind this approach is the following:
You have to be aware that this can become complex and slow, as text analysis is not the domain of Power BI.
Please take the time and create some sample data and provide a link to your sample data, if there is more than the "simple" example you provided.
Regards,
Tom
Hi @TomMartens
First i would like to apoligize for a very late answer and reply ( have been on vaction)
I have tried it now and it seems to work perfectly as i want , thank you so much for this.
Hi @TomMartens
I have added some more data , still simple but more:
If we look for abc , the correct ones are Green , Yellow, Black and Purple.
My thoug it that the user should be able trough a filter choose combinations and get the value of the items summed .
Ex he chooses abc and cee . The sum would be total 5 items with these combo (Green, Yellow, Black, Purple and also now Red (cee combo)
I hope you understand what i try to explain .
/Thanks
edit , added link:
https://drive.google.com/file/d/1VTKJf6ZXxG99vrZBEa6lbzGedRzuyVsz/view?usp=sharing
| Item | String |
| Green | abcbabcdd |
| Blue | cdeababcd |
| Yellow | bbcabcabd |
| Brown | babcabcda |
| Black | aabcbaabc |
| Red | ababcacde |
| Purple | acdacaabc |
Hey @Wresen ,
first I created a table that contains the "predefined blocks", I called this table "searchstrings" and the column "searchstring", the table looks like this:
The table is NOT related to the item-table.
Then I created this measure:
Measure =
var __SearchstringsToFind = VALUES( 'Searchstrings'[searchstring] )
var lengthOfBlock = 3
return
COUNTX(
Sheet1
, var afunnystring = 'Sheet1'[String]
var lengthofstring = LEN( afunnystring )
var noofoccurrences =
IF( MOD( lengthofstring , lengthOfBlock ) = 0
, INT( DIVIDE( lengthofstring , lengthOfBlock ) )
, INT( DIVIDE( lengthofstring , lengthOfBlock ) ) + 1
)
var __blocks =
ADDCOLUMNS(
GENERATESERIES( 1 , noofoccurrences )
, "block" , MID( afunnystring , ''[Value] * lengthOfBlock - 2 , lengthOfBlock )
)
return
COUNTROWS(
FILTER(
__blocks
, [block] in __SearchstringsToFind
)
)
)
It seems the measure is returning the expected result:
Hopefully, this provides what you are looking for.
Regards,
Tom
Hi @Wresen ,
My final result is this
The overall process is to copy a table first, then separate it, find the row where abc is located, filter and retain abc, use GROUPBY to count, and finally merge the two tables.
You can check the details from the attachment.
Best Regards,
Stephen Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
HI @Anonymous and thx so much for your help.
I am not sure this will work since its need to be in a measure since in the end the user will be able to filter on some predefined 3 blocks to get a resualt.
Also it need to be in 3 blocks it does its search (string possition 1-3,4-6,7-9 and so on, the sting is very long and can contain duplicats if its not done in a 3block search, otherwise its is unique)
/Thanks
Hi @Wresen ,
Believe this can be done using an adjustment of a code used before but the question is on the split by the values you are searching.
The link is the following
In your case I have made the following measure:
Find Word Formula =
// Character that split phrase into words
VAR SplitByCharacter = "abc"
// Temporary table that splits selected phrase into words
VAR Words_table =
--FILTER (
ADDCOLUMNS (
GENERATE (
SELECTCOLUMNS (
ALLSELECTED ( 'Table'[Column1] ),
"Find_Text", 'Table'[Column1]
),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Find_Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( SUBSTITUTE ( [Find_Text], SplitByCharacter, "|" ), [Value] )
)
RETURN
COUNTROWS(Words_table) - 1
Has you can see below waht I'm doing is picking up your text line and replacing the text you are looking by a "|" and then creating a line for each of the |
In this case you would get the following table:
This is your text but without the ABC if we count the number of rows and subtract 1 we get howe many ABC were in the text:
Please see attach file.
Regards
Miguel Félix
Proud to be a Super User!
Check out my blog: Power BI em PortuguêsHi @MFelix and thx so much for the code.
I am not sure the code will work since the code "only" looks for abc , not that its in the right place in the string (possition 1-3 or 4 -6 or7-9 and so on, that what i call in 3 blocks)
Also the file will have over 500k rows of items that has a this long string on each row and my thougt was that the user could filter these 500k items on a few predifined 3 block strings (the 3 block strings is a defining of the object).
In Power Query, you can split the column by number of characters (repeatedly).
Now you can search for the string in each column (if TextColumn.1 = "abc" or TextColumn.2 = "abc"...).
Proud to be a Super User!
Thanks so much , but the problem is that string is VERY big so i belive it will not be good to spilt it up in a colum for each string.
Also in the long run my thougt is to be able to do a filtersearch for some of these sting of 3 letters.
/thanks
Another option is to split into rows:
Proud to be a Super User!
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.
Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.
| User | Count |
|---|---|
| 53 | |
| 38 | |
| 33 | |
| 17 | |
| 17 |
| User | Count |
|---|---|
| 67 | |
| 62 | |
| 38 | |
| 34 | |
| 22 |