Don't miss your chance to take the Fabric Data Engineer (DP-700) exam on us!
Learn moreWe've captured the moments from FabCon & SQLCon that everyone is talking about, and we are bringing them to the community, live and on-demand. Starts on April 14th. Register now
Greetings. I am trying to figure out how to extract text from a column which is proving to be more complicated than I anticipated. Here is some of the sample data from that column. I need to extract InTASC (or INTASC) plus the number/letter after.
| Current Data | Data Needed |
| [NAEYC 2010 5a, NAEYC 2020 5a, InTASC 4l] | InTASC 4l |
| [NAEYC 2020 6a] | (blank) |
| [NAEYC 2020 6e, InTASC 10j] | InTASC 10j |
| [InTASC 10m, NAEYC 2020 3d, ISTE 2.4.d.] | InTASC 10m |
| [ISTE 2.3.a., INTASC 2m] | INTASC 2m |
| [NAEYC 2020 1b, InTASC 1e, ISTE 2.5.a.] | InTASC 1e |
I am open to any method - DAX, M Code, etc. Thank you!
@afaherty Try this:
let
Source = Excel.CurrentWorkbook(){[ Name = "Table1" ]}[Content],
Answer = Table.AddColumn (
Source,
"Answer",
( x ) =>
[
a = x[Current Data],
b = Text.Remove ( Text.Remove ( a, "[" ), "]" ),
c = Text.Split ( b, ", " ),
d = List.Select ( c, ( y ) => Text.StartsWith ( Text.Lower ( y ), "intasc" ) ),
e = List.First ( d )
][e],
type nullable text
)
in
Answer
DAX isn't designed for string manipulation; M doesn't consist of native regex so far. Thus, use embedded python if you're in possession of knowledge of it.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WivZzdI10VjAyMDRQME3UUYBxjSBcz7wQx2BnBZOcWKVYHSTVQGmzRCxiqXAthgZZUHm4QC6K8cYpQLXBIa4KRnomeil6QI4fWJ2ZYTFMI0TWWC8RIWuUi2mrYRLC1lS4oaZAbUC1sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Current Data" = _t]),
#"Run Python script" = Python.Execute("df['Extract'] = df['Current Data'].str.findall(r'intasc .+?\b', 34).str.join(' | ')",[df=Source]),
df = #"Run Python script"{[Name="df"]}[Value]
in
df
In fact, Excel formula is already powerful enough to handle it easily.
Expertise = List.Accumulate( {Days as from Today}, {Skills and Knowledge}, (Current, Everyday) => Current & Day.LearnAndPractise(Everyday) ) |
Here's a more pedestrian version with "native" Power Query
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WivZzdI10VjAyMDRQME3UUYBxjSBcz7wQx2BnBZOcWKVYHSTVQGmzRCxiqXAthgZZUHm4QC6K8cYpQLXBIa4KRnomeil6QI4fWJ2ZYTFMI0TWWC8RIWuUi2mrYRLC1lS4oaZAbUC1sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Current Data" = _t]),
#"Added Custom" = Table.AddColumn(
Source,
"Result",
each
let
Start = Text.PositionOf(Text.Upper([Current Data]), "INTASC"),
Count = Text.PositionOf(Text.Replace(Text.Range([Current Data], Start), "]", ","), ",")
in
if Start = - 1 then null else Text.Range([Current Data], Start, Count)
)
in
#"Added Custom"
@lbendlin Thank you. I think something is going awry however. It is pulling in incorrect values into the new custom column (example below). Also, I apologize but I neglected to mention that some rows have more than 1 "InTASC" within them. For example: [InTASC 10m, NAEYC 2020 3d, ISTC 2.4.d., INTASC 61s]
Here is an example of the problem that is now happening which can be seen in the attached file.
| Current Data | Custom.Current Data | Custom.Result |
| [CEC 2020 2.1] | [NAEYC 2010 5a, NAEYC 2020 5a, InTASC 4l] | InTASC 4l |
| [CEC 2020 2.1] | [NAEYC 2020 6a] | |
| [CEC 2020 2.1] | [NAEYC 2020 6e, InTASC 10j] | InTASC 10j |
| [CEC 2020 2.1] | [InTASC 10m, NAEYC 2020 3d, ISTE 2.4.d., INTASC 61s] | InTASC 10m |
| [CEC 2020 2.1] | [ISTE 2.3.a., INTASC 2m] | INTASC 2m |
| [CEC 2020 2.1] | [NAEYC 2020 1b, InTASC 1e, ISTE 2.5.a.] | InTASC 1e |
Thanks again!
What is your expected outcome in such situations?
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.
Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.
| User | Count |
|---|---|
| 9 | |
| 6 | |
| 3 | |
| 2 | |
| 2 |
| User | Count |
|---|---|
| 22 | |
| 14 | |
| 10 | |
| 6 | |
| 5 |