Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
mukhan311
Frequent Visitor

Splitting text into separate column with inconsistent delimter

Please find below the image of the result I am  trying to achieve wtih the delimiters being inconsistent in power query

Raw data   
Company item    
A_B_C_96_50140_1   
A_B_C_D_96_50140_1   
X_Y_for PJ227270   
    
Result    
Company item    
A_B_C96501401
A_B_C_D96501401
X_Y_for PJ227270   

 

Any help would be appreciated.

Thank you. 

 

3 ACCEPTED SOLUTIONS
ronrsnfld
Super User
Super User

If your Raw Data encompasses the actual possible patterns, then the following will work:

Raw Data

ronrsnfld_0-1720179379605.png

Code (inserted after the step producing the above screen shot):

    #"Added Custom" = Table.AddColumn(#"Previous Step", "Split", (r)=> 
        let 
            Split=Text.Split(r[#"Company item "],"_"),
            LastTwo = Text.Combine(List.LastN(Split,2),""),
            Test = (try Number.From(LastTwo))[HasError]=true,
            SplitRest = if Test then r[#"Company item "] else List.RemoveLastN(Text.Split(r[#"Company item "],"_"),2),
            Mid = if Test then null else List.Last(SplitRest),
            First = if Test then SplitRest else Text.Combine(List.RemoveLastN(SplitRest),"_")
        in 
            {First, Mid, if not Test then LastTwo else null}, type {text}),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Split", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "Split", 
        Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Split.1", "Split.2", "Split.3"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Company item ", type text}, {"Split.1", type text}, {"Split.2", Int64.Type}, {"Split.3", Int64.Type}})
in
    #"Changed Type"

Results:

ronrsnfld_1-1720179647175.png

 

 

View solution in original post

AlienSx
Super User
Super User

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    split = Table.ToList(
        Source, 
        (x) => Splitter.SplitTextByCharacterTransition({"_"}, (x) => Text.Contains("0123456789", x))(x{0})
    ),
    tbl = Table.FromList(
        split, 
        (x) => 
            [trim = List.Buffer(List.Transform(x, (w) => Text.Trim(w, "_"))),
            out = {trim{0}, trim{1}?, trim{2}? & trim{3}?}][out]
    )
in
    tbl

View solution in original post

dufoq3
Super User
Super User

Hi @mukhan311

 

Result

dufoq3_0-1720457179024.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcox3ineOtzSLNzUwNDGIN1SK1YEJuqALR8RHxqflFykEeBkZmRuZGyjFxgIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Company item" = _t]),
    Ad_Helper = Table.AddColumn(Source, "Splitted", each 
        [ a = Splitter.SplitTextByCharacterTransition({"_"}, (x)=> List.Contains({"0".."9"}, x))([Company item]),
          b = List.Transform(List.FirstN(a, 3), (x)=> Text.Trim(x, {"_", " "})),
          c = Text.Combine(b, "||")
        ][c], type text),
    #"Split Column by Delimiter" = Table.SplitColumn(Ad_Helper, "Splitted", Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv))
in
    #"Split Column by Delimiter"

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

6 REPLIES 6
dufoq3
Super User
Super User

Hi @mukhan311

 

Result

dufoq3_0-1720457179024.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcox3ineOtzSLNzUwNDGIN1SK1YEJuqALR8RHxqflFykEeBkZmRuZGyjFxgIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Company item" = _t]),
    Ad_Helper = Table.AddColumn(Source, "Splitted", each 
        [ a = Splitter.SplitTextByCharacterTransition({"_"}, (x)=> List.Contains({"0".."9"}, x))([Company item]),
          b = List.Transform(List.FirstN(a, 3), (x)=> Text.Trim(x, {"_", " "})),
          c = Text.Combine(b, "||")
        ][c], type text),
    #"Split Column by Delimiter" = Table.SplitColumn(Ad_Helper, "Splitted", Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv))
in
    #"Split Column by Delimiter"

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

AlienSx
Super User
Super User

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    split = Table.ToList(
        Source, 
        (x) => Splitter.SplitTextByCharacterTransition({"_"}, (x) => Text.Contains("0123456789", x))(x{0})
    ),
    tbl = Table.FromList(
        split, 
        (x) => 
            [trim = List.Buffer(List.Transform(x, (w) => Text.Trim(w, "_"))),
            out = {trim{0}, trim{1}?, trim{2}? & trim{3}?}][out]
    )
in
    tbl
ronrsnfld
Super User
Super User

If your Raw Data encompasses the actual possible patterns, then the following will work:

Raw Data

ronrsnfld_0-1720179379605.png

Code (inserted after the step producing the above screen shot):

    #"Added Custom" = Table.AddColumn(#"Previous Step", "Split", (r)=> 
        let 
            Split=Text.Split(r[#"Company item "],"_"),
            LastTwo = Text.Combine(List.LastN(Split,2),""),
            Test = (try Number.From(LastTwo))[HasError]=true,
            SplitRest = if Test then r[#"Company item "] else List.RemoveLastN(Text.Split(r[#"Company item "],"_"),2),
            Mid = if Test then null else List.Last(SplitRest),
            First = if Test then SplitRest else Text.Combine(List.RemoveLastN(SplitRest),"_")
        in 
            {First, Mid, if not Test then LastTwo else null}, type {text}),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Split", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "Split", 
        Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Split.1", "Split.2", "Split.3"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Company item ", type text}, {"Split.1", type text}, {"Split.2", Int64.Type}, {"Split.3", Int64.Type}})
in
    #"Changed Type"

Results:

ronrsnfld_1-1720179647175.png

 

 

mukhan311
Frequent Visitor

This is because there are field which has been listed as components for certain project numbers, but the other fields are different types of components with their their specification. 

Hi @mukhan311 ,

Based on your description, I think it might be a bit tricky to achieve your desired result because you need to manually differentiate between the parts that don't need to be merged and the parts that do need to be merged.
I can give you the following suggestions:
First, I suggest you add an identifying column with the number 0 to identify all the rows that do not need to be split (e.g. the third row of the data you provided), and the number 1 to identify all the other rows that need to be split.
Then, if all the rows you need to be split are as you provided, with English letters in the front and numbers in the back, then you can do Non-Digit to Digit split for all the rows identified as 1, and then split them according to "_", and then merge the parts you need to merge at last;

vjunyantmsft_0-1720162861730.png

however, if your data is not as you provided However, if your data is not as regular as you have provided, then it is not recommended to split it in Power Query, and it may be more convenient to split it manually.

Best Regards,
Dino Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

v-junyant-msft
Community Support
Community Support

Hi @mukhan311 ,

What is the basis of your judgment in deciding whether the row should be split or not?

vjunyantmsft_0-1720058854200.png

And what judgmental basis is there for deciding which parts are merged together and which parts remain separate after the row has been split?

vjunyantmsft_1-1720059015555.png

Without a fixed basis for judging splits and merges, I'm having a hard time realizing your needs.

Best Regards,
Dino Tao

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.