Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi All,
We have a column named Tags against each ID which contains the bill numbers. Have to pull out the bill numbers against each ID from tags column. Bill number doesn't have standard format. Few Syntax from below screens hot -- Itemname:BillBillnumber , Itemname:BILLBillnumber , Itemname: BillBillnumber. Each tag is seperated by a ";" . Each ID can have many bill numbers or dont have it. There may be some other number in the tags list other than Bill number.
Tried using Power Query and also dax(new column), unable to figure it out. Please help.
Solved! Go to Solution.
@Abbi Still some clean up to do but this should get you started:
Bill number =
VAR __Text = [Tags]
VAR __1stBill = SEARCH("bill", __Text,,0)
VAR __1stSemi = IF(IF(__1stBill<>0,SEARCH(";", __Text, __1stBill,0),0)=0,LEN(__Text)+1,SEARCH(";", __Text, __1stBill,0))
VAR __2ndBill = IF(__1stSemi<>0,SEARCH("bill", __Text,__1stSemi,0),0)
VAR __2ndSemi = IF(IF(__2ndBill<>0,SEARCH(";", __Text, __2ndBill,0),0)=0,LEN(__Text)+1,SEARCH(";", __Text, __2ndBill,0))
VAR __1stBillNum = IF(__1stBill=0,BLANK(),MID(__Text,__1stBill+4,__1stSemi-__1stBill-4))
VAR __2ndBillNum = IF(__2ndBill=0,BLANK(),MID(__Text,__2ndBill+4,__2ndSemi - __2ndBill - 4))
VAR __1stTales = SEARCH("tales:", __Text,,0)
VAR __1stTalesSemi = IF(IF(__1stBill<>0,SEARCH(";", __Text, __1stTales,0),0)=0,LEN(__Text)+1,SEARCH(";", __Text, __1stTales,0))
VAR __1stTalesNum = IF(__1stTales=0,BLANK(),MID(__Text,__1stTales+6,__1stTalesSemi - __1stTales - 6))
RETURN
SWITCH(TRUE(),
ISBLANK(__1stTalesNum) && ISBLANK(__2ndBillNum),__1stBillNum,
ISBLANK(__1stTalesNum) && NOT(ISBLANK(__2ndBillNum)),__1stBillNum & ";" & __2ndBillNum,
ISBLANK(__2ndBillNum) && NOT(ISBLANK(__1stTalesNum)),__1stBillNum & ";" & __1stTalesNum,
__1stBillNum & ";" & __2ndBillNum & ";" & __1stTalesNum
)
@Abbi You didn't post sample data as text so you may get a few syntax errors because I can't test this but try:
Bill number column =
VAR __Text = [Tags]
VAR __1stBill = SEARCH("bill", __Text,,BLANK())
VAR __1stSemi = SEARCH(";", __Text, __1stBill,BLANK())
VAR __2ndBill = SEARCH("bill", __Text,__1stSemi,BLANK())
VAR __2ndSemi = SEARCH(";", __Text, __2ndBill,BLANK())
VAR __1stBillNum = IF(ISBLANK(__1stBill),BLANK(),MID(__Text,__1stBill+4,__1stSemi - 1stBill - 4))
VAR __2ndBillNum = IF(ISBLANK(__2ndBill),BLANK(),MID(__Text,__2ndBill+4,__2ndSemi - 2ndBill - 4))
RETURN
IF(ISBLANK(__2ndBillNum),__1stBillNum, __1stBillNum & ";" & __2ndBillNum)
@Greg_Deckler thanks for looking at my post. Pasting down the data for you to explore. I modified the data. Please help me out.
ID | Tags | Bill number |
456324 | Salt:Bill3334223; shoe;Paper;Note;6428 | 3334223 |
43548 | Chain;Keys;Socks:BILL7685;Tomato;Billdetails | 7685 |
545666 | Figs;Pops;Goods: Bill37682 | 37682 |
32466 | Hats;Yogurt: BiLL 21889; Salt:Bill48205 | 21889;48205 |
3350987 | Fanta;Clips;Onti:bill 23487;Billonfine | 23487 |
34228 | Sause;Grapes;Pads | |
58791 | Napkens;Thread:Bill 396742;Eggs;Trays: BILL 341730 | 396742;3411730 |
876537 | Bowl;Jackets: Bill23710;gloves:Bill365901;Tales:610472 | 23710;365901;610472 |
@Abbi you can achieve this with native PQ
let
regex=
let
fx=(input)=>
Web.Page(
"<script>
var x='"&input&"'; // this is the input string for regex
var b=x.match(/bill\d+|bill \d+/gmi); // specify the desired regular expression inside string.match()
//https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match
var c =b.toString().replace(/bill|bill /gmi,'')
document.write(c);
</script>"){0}[Data]{0}[Children]{1}[Children]{0}[Text]
in
fx,
Source = Web.BrowserContents("https://community.powerbi.com/t5/DAX-Commands-and-Tips/How-to-Split-a-specific-value-from-a-column/m-p/2199869#M51627"),
#"Extracted Table From Html" = Html.Table(Source, {{"Column1", "TABLE:nth-child(3) > * > TR > :nth-child(1)"}, {"Column2", "TABLE:nth-child(3) > * > TR > :nth-child(2)"}, {"Column3", "TABLE:nth-child(3) > * > TR > :nth-child(3)"}}, [RowSelector="TABLE:nth-child(3) > * > TR"]),
#"Promoted Headers" = Table.PromoteHeaders(#"Extracted Table From Html", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Tags", type text}, {"Bill number", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Bill number"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Bill Number", each try regex([Tags]) otherwise null)
in
#"Added Custom"
Apart from PQ, this regex pattern is easier,
/(?<=bill) *\d+/gmi
The following snippet of js code works well,
<Script>
var regex = /(?<=bill) *\d+/gmi;
var str = 'Hats;Yogurt: BiLL 21889; Salt:Bill48205';
var matches = str.matchAll(regex);
console.log(Array.from(matches, x => x[0].trim()).join('; '))
</Script>
but it throws error when embedded in PQ. PQ doesn't support embedded js with lookbehind assertion, I'm afraid; but I didn't find any official documentation in this regard. Pls share it if you find.
On top of regex, I'd use native functions in this simple scenario,
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("PZDLbsIwEEV/xcq6i8TvZHaglj4iikQ2FWLhEjdYuDGKTSv+vhODurXvmblndruCC8koLx6KrfGpWTjvGWOcUgYkHoOFjTnbCdYhWZCc6mL/gAwTXCOyPBo3wpu9RtiGwyk2i5e2VVIL6MK3SQHmcb1NxvmYQYHbpETyyQ0RNuEcYRVCHxuSFyNKcw4b5dizSRE+wnCZ0hxpW0IrrWsg/2W5pqW4MUyUtVbzcDMmA0vvcPz7mFzziUFCGdcqNwrjlxvtDUJTneUv0cJqQlnsZfp7Xa3qCn/X5nyyY4TuOFnT572E1VJxCo8DinSTuc4OaE8YrxQrM66VFGwutAi/Hl7N4WTTXZUyVZUw+PBj4+3oUtRlBZ3x+CCrkiu8xP4P", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Tags = _t]),
Bills = Table.AddColumn(Source, "Bills", each Text.Combine(List.Accumulate(Text.Split([Tags], ";"), {}, (s,c) => let num = if Text.Contains(c, "bill", Comparer.OrdinalIgnoreCase) then Text.Select(c, {"0".."9"}) else "" in if Text.Length(num) > 0 then s & {num} else s), "; "))
in
Bills
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
@CNENFRNL positive lookbehind doesnot work in PQ and my best guess is because the browser in PQ is IE11. IE11 does not support positive lookbehind Hence, my js code was improvised.
https://caniuse.com/?search=positive%20lookbehind
Also, console.log will not work, you need to use document.write.
@Abbi Still some clean up to do but this should get you started:
Bill number =
VAR __Text = [Tags]
VAR __1stBill = SEARCH("bill", __Text,,0)
VAR __1stSemi = IF(IF(__1stBill<>0,SEARCH(";", __Text, __1stBill,0),0)=0,LEN(__Text)+1,SEARCH(";", __Text, __1stBill,0))
VAR __2ndBill = IF(__1stSemi<>0,SEARCH("bill", __Text,__1stSemi,0),0)
VAR __2ndSemi = IF(IF(__2ndBill<>0,SEARCH(";", __Text, __2ndBill,0),0)=0,LEN(__Text)+1,SEARCH(";", __Text, __2ndBill,0))
VAR __1stBillNum = IF(__1stBill=0,BLANK(),MID(__Text,__1stBill+4,__1stSemi-__1stBill-4))
VAR __2ndBillNum = IF(__2ndBill=0,BLANK(),MID(__Text,__2ndBill+4,__2ndSemi - __2ndBill - 4))
VAR __1stTales = SEARCH("tales:", __Text,,0)
VAR __1stTalesSemi = IF(IF(__1stBill<>0,SEARCH(";", __Text, __1stTales,0),0)=0,LEN(__Text)+1,SEARCH(";", __Text, __1stTales,0))
VAR __1stTalesNum = IF(__1stTales=0,BLANK(),MID(__Text,__1stTales+6,__1stTalesSemi - __1stTales - 6))
RETURN
SWITCH(TRUE(),
ISBLANK(__1stTalesNum) && ISBLANK(__2ndBillNum),__1stBillNum,
ISBLANK(__1stTalesNum) && NOT(ISBLANK(__2ndBillNum)),__1stBillNum & ";" & __2ndBillNum,
ISBLANK(__2ndBillNum) && NOT(ISBLANK(__1stTalesNum)),__1stBillNum & ";" & __1stTalesNum,
__1stBillNum & ";" & __2ndBillNum & ";" & __1stTalesNum
)
@Greg_Deckler the above DAX formula is working good, but when i have Bill number in Tags for more than twice then we will be missing all the bill number after 2nd occurences. So can we re-model the formula so that any number occurences of Bill number can acheived.
@Abbi Well, another approach would be to use Text to Table: Text to Table - Microsoft Power BI Community
Column =
VAR __Separator = ";"
VAR __SearchText = [Tags]
VAR __Text = SUBSTITUTE(__SearchText,__Separator,"|")
VAR __Table =
ADDCOLUMNS(
GENERATESERIES(1,__Count,1),
"__Word",PATHITEM(__Text,[Value],TEXT)
)
Basically from there you would add a column that searches for your key words and returns 1 if found and 0 if not. Then you filter to just the 1's. Then you use CONCATENATEX to return all of them.
User | Count |
---|---|
25 | |
11 | |
8 | |
6 | |
6 |
User | Count |
---|---|
27 | |
13 | |
11 | |
9 | |
6 |