Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

We've captured the moments from FabCon & SQLCon that everyone is talking about, and we are bringing them to the community, live and on-demand. Starts on April 14th. Register now

Reply
afaherty
Helper V
Helper V

Extracting complicated text from a column

Greetings. I am trying to figure out how to extract text from a column which is proving to be more complicated than I anticipated. Here is some of the sample data from that column. I need to extract InTASC (or INTASC) plus the number/letter after. 

 

Current DataData Needed
[NAEYC 2010 5a, NAEYC 2020 5a, InTASC 4l]InTASC 4l
[NAEYC 2020 6a](blank)
[NAEYC 2020 6e, InTASC 10j]InTASC 10j
[InTASC 10m, NAEYC 2020 3d, ISTE 2.4.d.]InTASC 10m
[ISTE 2.3.a., INTASC 2m]INTASC 2m
[NAEYC 2020 1b, InTASC 1e, ISTE 2.5.a.]InTASC 1e

 

I am open to any method - DAX, M Code, etc. Thank you!

5 REPLIES 5
AntrikshSharma
Super User
Super User

@afaherty Try this:

let
    Source = Excel.CurrentWorkbook(){[ Name = "Table1" ]}[Content],
    Answer = Table.AddColumn (
        Source,
        "Answer",
        ( x ) =>
            [
                a = x[Current Data],
                b = Text.Remove ( Text.Remove ( a, "[" ), "]" ),
                c = Text.Split ( b, ", " ),
                d = List.Select ( c, ( y ) => Text.StartsWith ( Text.Lower ( y ), "intasc" ) ),
                e = List.First ( d )
            ][e],
        type nullable text
    )
in
    Answer

 

ThxAlot
Super User
Super User

DAX isn't designed for string manipulation; M doesn't consist of native regex so far. Thus, use embedded python if you're in possession of knowledge of it.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WivZzdI10VjAyMDRQME3UUYBxjSBcz7wQx2BnBZOcWKVYHSTVQGmzRCxiqXAthgZZUHm4QC6K8cYpQLXBIa4KRnomeil6QI4fWJ2ZYTFMI0TWWC8RIWuUi2mrYRLC1lS4oaZAbUC1sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Current Data" = _t]),

    #"Run Python script" = Python.Execute("df['Extract'] = df['Current Data'].str.findall(r'intasc .+?\b', 34).str.join(' | ')",[df=Source]),
    df = #"Run Python script"{[Name="df"]}[Value]
in
    df

ThxAlot_0-1775683871592.png

 

In fact, Excel formula is already powerful enough to handle it easily.

ThxAlot_2-1775684565240.png



Expertise = List.Accumulate(


        {Days as from Today},


        {Skills and Knowledge},


        (Current, Everyday) => Current & Day.LearnAndPractise(Everyday)


)



Here's a more pedestrian version with "native"  Power Query

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WivZzdI10VjAyMDRQME3UUYBxjSBcz7wQx2BnBZOcWKVYHSTVQGmzRCxiqXAthgZZUHm4QC6K8cYpQLXBIa4KRnomeil6QI4fWJ2ZYTFMI0TWWC8RIWuUi2mrYRLC1lS4oaZAbUC1sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Current Data" = _t]),
    #"Added Custom" = Table.AddColumn(
    Source, 
    "Result", 
    each 
      let
        Start = Text.PositionOf(Text.Upper([Current Data]), "INTASC"), 
        Count = Text.PositionOf(Text.Replace(Text.Range([Current Data], Start), "]", ","), ",")
      in
        if Start = - 1 then null else Text.Range([Current Data], Start, Count)
  )
in
  #"Added Custom"

@lbendlin Thank you. I think something is going awry however. It is pulling in incorrect values into the new custom column (example below). Also, I apologize but I neglected to mention that some rows have more than 1 "InTASC" within them. For example: [InTASC 10m, NAEYC 2020 3d, ISTC 2.4.d., INTASC 61s]

 

Here is an example of the problem that is now happening which can be seen in the attached file. 

 

Current DataCustom.Current DataCustom.Result
[CEC 2020 2.1][NAEYC 2010 5a, NAEYC 2020 5a, InTASC 4l]InTASC 4l
[CEC 2020 2.1][NAEYC 2020 6a] 
[CEC 2020 2.1][NAEYC 2020 6e, InTASC 10j]InTASC 10j
[CEC 2020 2.1][InTASC 10m, NAEYC 2020 3d, ISTE 2.4.d., INTASC 61s]InTASC 10m
[CEC 2020 2.1][ISTE 2.3.a., INTASC 2m]INTASC 2m
[CEC 2020 2.1][NAEYC 2020 1b, InTASC 1e, ISTE 2.5.a.]InTASC 1e

 

Thanks again!

What is your expected outcome in such situations?

Helpful resources

Announcements
New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

March Power BI Update Carousel

Power BI Community Update - March 2026

Check out the March 2026 Power BI update to learn about new features.