Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
Rinspiration
Regular Visitor

Splitting a messy text column and re-arange the splitted text in the right columns

Hello to all,

I have a column [Description] with text and I want to split this text into seperate columns so that each part fits into a particular column. The problem is that this text column is a inconsistent messy bunch of text. If I split it by demiliter ":" or " " for example, it will not give the result I want. I tried all kind of things, (listst, trim, split, conditional columns, etc...) but honestly I am not experience enough to solve this problem in a efficient way. I have absolutely no clue how to solve this. Would somebody like to help me?

 

Below you will see the raw data and the result I want to achieve.

IDDescription
1name: andrew walker description: runninglaps from 1 to 11 numero: ab1000 item: 001a date: 2013-10-28
2name: grocery shop ltd button description: payment cash numero: ab1012 date: 2013-11-01
3name: e.m. wellington description: 2022111310 numero: de123314 date: 2013-11-04
4transaction from abber 2 2012-10-20 date: 2013-11-15 
5name: e.m. wellington description: 2022111310 numero: de123314 date: 2013-11-20
6name: w.a. east description: railrunner card numero: ab1102 date: 2013-12-02
7name: bording'28 description: paybill nr 495. description: contribution 2014-01-01 numero: ab887765 item: 20 id number: nn10zz34992245 date: 2014-01-02 
8passnumber: 011 2014-01-03 14:53 action: 69669a term: 1thq date: 2014-01-04
9name: cz group description: item 80182234 ref.nr. 4044493 period 10/11 numero: df101123 item: 200ab id number: 67994 id: uta mister zz23nn232003
10passnumber: 012 2014-04-03 14:23 action: 693539a term: 2thq date: 2014-04-04
11name: farmecy lid description: peanuts and beans numero: ab4536 date: 2014-04-16
12name: Ai photoshop b.v. description: 0000000001123 / period 2014-04-20 / ab1102 numero: ab8877 item: a22340 id number: nna468-01 power mgm
131 oct t/m 31 dec 2014 bakery n.v. date: 2014-09-01
14name: m.a. mol description: richard adems numero: ab689500 item: 4656544 date: 2015-09-20
15passnumber: 011 2014-09-22 15:56 action: 5959m3 term: 94gm04 date: 201409-22
16borne-lane    8798 passnumber 011 2014-11-13 14:01 action 10301 date: 2014-11-12
17name: taxes description: month January no. 11988r 2014 www.sddf.df numero: ab1224 item: aa34 date: 2014-12-10

 

IDNameDescriptionNumeroItemPassnumberActionTermID numberDate
1andrew walkerrunninglaps from 1 to 11ab1000001a    28-10-2013
2grocery shop ltd buttonpayment cashab1012     1-11-2013
3e.m. wellington2022111310de123314     4-11-2013
4transaction from abber 2 2012-10-20       15-11-2013
5e.m. wellington2022111310de123314     20-11-2013
6w.a. eastrailrunner cardab1102     2-12-2013
7bording'28paybill nr 495. description: contribution 2014-01-01ab88776520   nn10zz349922452-1-2014
8    011 2014-01-03 14:5369669a1thq 4-1-2014
9cz groupitem 80182234 ref.nr. 4044493 period 10/11df101123200ab   67994 id: uta mister zz23nn232003 
10    012 2014-04-03 14:23693539a2thq 4-4-2014
11farmecy lidpeanuts and beansab4536     16-4-2014
12Ai photoshop b.v.0000000001123 / period 2014-04-20 / ab1102ab8877a22340   nna468-01 power mgm 
131 oct t/m 31 dec 2014 bakery n.v.       1-9-2014
14m.a. molrichard ademsab6895004656544    20-9-2015
15    011 2014-09-22 15:565959m394gm04 22-9-2014
16borne-lane    8798   011 2014-11-13 14:0110301  12-11-2014
17taxesmonth January no. 11988r 2014 www.sddf.dfab1224aa34    10-12-2014
2 ACCEPTED SOLUTIONS
Akash_Varuna
Super User
Super User

Hi @Rinspiration , You could use Text.BetweenDelimeters funtiona nd create a custom columns for each requirement
example : if Text.Contains([Description], "description:") then Text.BetweenDelimiters([Description], "description: ", " numero:") else null 
Like the above adjust for each column and create custom columns
If this post helped please do give a kudos and accept this as a solution 
Thanks In Advance

View solution in original post

Check this:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rVXJkpswEP2VLp9jrNaCgVuu+YXJHATINjVIECHHGX99WmBjYCq3UD54kV6/pbv99rbD3bed09YUoF3tzQ1uuv0wHmozVL7pQ9O5AvzVucadW90PcPKdBYTQASK4qzW+o7slMsagCcYWwBhqqHUgTM5Q7JHtebZ7//a243Oxs+8q4z9huHQ9tKGG8hpC59Zle/1pjQtQ6eGyKoV8hY97hiO+mPFNYhO4mbYl2l9wOeMcEQWyGbU2yIVAucWVI64k3OC1G3QVESYPdFmSTzye5aNGtrmMCsbb6n+z4mzETWfcW6ITMHoIm9h008boiGalfb20ENnaQr5nfAQ9zqBl52vi+fPKGD/y7Es0ZdO24DzIXCXrH6vOBd9QotErwpeUD70W9bPseEzVo1/IuGYkR34W4Byy+13IPOdcqhfJCYRPlmbEstfD8LzEqBfnMwJQFkrAFFYBaZ6muYZgPBXDcPm1BZ1Czmfl1T3257Vfq4pkIWOYcS4keHNKnE9AMillLqA3vulqQHZYjEV9ol6lCGehTJdLrekxzyV9UcA1aLDNQBzhfufCOS7otBiJIfuilj/Iy4davlQrlJjl8q1c+ZSLr8k/aW9N9QktUVunbLS7hiGuBijp/bCIUCqRbpAxnZBfY/69gf7ShW4c8zL5vWkU9nxGkw5PE5941BiHZ7Oue+dhqI5RbLpHyzSLzdZ3NzLTnu3EKa4GhK4KEA4WBBKRaiwEpf6Im8iN7BZ68udWQTnrsXHQbNdu5qypLnG+dG3s0qI0y9W8FWWqUiUXo6xiiccoo/pnQ9MZDqgKlc4Rq1zlVjwSzuXZsgWsHG9MqHFD0Bg7s2+1M0BPdswzeBV61Yn7amwlsu6x5ZAJ+rCwJJ55IL/WRNB/zLD2w9L8X+AH9Y6OxnYJ/VXkWeYnv2+3WzLU9SmpT8uNRNP+TFULuSob1+vu/f0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t]),
    Ad_T = Table.AddColumn(Source, "T", each
        [ a = Text.Split([Description], ": "),
          b = List.Accumulate({1..List.Count(a)-1},
                    { if not List.Contains({"name", "passnumber"}, a{0}) then {"name", Text.BeforeDelimiter(a{0}, " ", {0, RelativePosition.FromEnd})} else {a{0}, Text.BeforeDelimiter(a{1}, " ", {0, RelativePosition.FromEnd})}}, (st, cur)=> st & {
                    let Y = if List.Contains({"name", "passnumber"}, a{0}) then 0 else 1 in if a{cur+1-Y}? = null then {"Remove"} else {Text.AfterDelimiter(a{cur-Y}, " ", {0, RelativePosition.FromEnd}), let X = Text.BeforeDelimiter(a{cur+1-Y}, " ", {0, RelativePosition.FromEnd}) in if X = "" then a{cur+1-Y} else X }}),
          c = Table.TransformColumnNames(Table.RemoveColumns(Table.PromoteHeaders(Table.FromColumns({{"ID", [ID]}} & b)), {"Remove"}, MissingField.Ignore), (x)=> if x = "ID" then x else Text.Proper(x))
        ][c], type table),
    T = Table.Combine(Ad_T[T])
in
    T

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

7 REPLIES 7
dufoq3
Super User
Super User

Hi @Rinspiration, check this, but I'm not sure what would you like to do with these 3 rows:

 

dufoq3_0-1739439099578.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rVXJkpswEP2VLp9jrNaCgVuu+YXJHATINjVIECHHGX99WmBjYCq3UD54kV6/pbv99rbD3bed09YUoF3tzQ1uuv0wHmozVL7pQ9O5AvzVucadW90PcPKdBYTQASK4qzW+o7slMsagCcYWwBhqqHUgTM5Q7JHtebZ7//a243Oxs+8q4z9huHQ9tKGG8hpC59Zle/1pjQtQ6eGyKoV8hY97hiO+mPFNYhO4mbYl2l9wOeMcEQWyGbU2yIVAucWVI64k3OC1G3QVESYPdFmSTzye5aNGtrmMCsbb6n+z4mzETWfcW6ITMHoIm9h008boiGalfb20ENnaQr5nfAQ9zqBl52vi+fPKGD/y7Es0ZdO24DzIXCXrH6vOBd9QotErwpeUD70W9bPseEzVo1/IuGYkR34W4Byy+13IPOdcqhfJCYRPlmbEstfD8LzEqBfnMwJQFkrAFFYBaZ6muYZgPBXDcPm1BZ1Czmfl1T3257Vfq4pkIWOYcS4keHNKnE9AMillLqA3vulqQHZYjEV9ol6lCGehTJdLrekxzyV9UcA1aLDNQBzhfufCOS7otBiJIfuilj/Iy4davlQrlJjl8q1c+ZSLr8k/aW9N9QktUVunbLS7hiGuBijp/bCIUCqRbpAxnZBfY/69gf7ShW4c8zL5vWkU9nxGkw5PE5941BiHZ7Oue+dhqI5RbLpHyzSLzdZ3NzLTnu3EKa4GhK4KEA4WBBKRaiwEpf6Im8iN7BZ68udWQTnrsXHQbNdu5qypLnG+dG3s0qI0y9W8FWWqUiUXo6xiiccoo/pnQ9MZDqgKlc4Rq1zlVjwSzuXZsgWsHG9MqHFD0Bg7s2+1M0BPdswzeBV61Yn7amwlsu6x5ZAJ+rCwJJ55IL/WRNB/zLD2w9L8X+AH9Y6OxnYJ/VXkWeYnv2+3WzLU9SmpT8uNRNP+TFULuSob1+vu/f0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t]),
    Ad_T = Table.AddColumn(Source, "T", each
        [ a = Text.Split([Description], ": "),
          b = List.Accumulate({1..List.Count(a)-2}, {{a{0}, Text.BeforeDelimiter(a{1}, " ", {0, RelativePosition.FromEnd})}}, (st, cur)=> st & {
                     {Text.AfterDelimiter(a{cur}, " ", {0, RelativePosition.FromEnd}), let X = Text.BeforeDelimiter(a{cur+1}, " ", {0, RelativePosition.FromEnd}) in if X = "" then a{cur+1} else X }}),
          c = Table.TransformColumnNames(Table.PromoteHeaders(Table.FromColumns({{"ID", [ID]}} & b)), (x)=> if x = "ID" then x else Text.Proper(x))
        ][c], type table),
    CombinedT = Table.Combine(Ad_T[T])
in
    CombinedT

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Hi @dufoq3 ,

It works! 😀 I so am happy. Thanks a lot! 

 

May I ask two more questions?

- The rows you highlighted are inconcistent rows and do not start with "name:" or "passnumber:". The first part of that text I want to place in the column [Name]. How should you code this in m-code?

-  In real life, this table includes over more then 20k rows and is growing over time. How to use Table.Buffer to speed up the handlings?

 Greetings, Rinspiration

Check this:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rVXJkpswEP2VLp9jrNaCgVuu+YXJHATINjVIECHHGX99WmBjYCq3UD54kV6/pbv99rbD3bed09YUoF3tzQ1uuv0wHmozVL7pQ9O5AvzVucadW90PcPKdBYTQASK4qzW+o7slMsagCcYWwBhqqHUgTM5Q7JHtebZ7//a243Oxs+8q4z9huHQ9tKGG8hpC59Zle/1pjQtQ6eGyKoV8hY97hiO+mPFNYhO4mbYl2l9wOeMcEQWyGbU2yIVAucWVI64k3OC1G3QVESYPdFmSTzye5aNGtrmMCsbb6n+z4mzETWfcW6ITMHoIm9h008boiGalfb20ENnaQr5nfAQ9zqBl52vi+fPKGD/y7Es0ZdO24DzIXCXrH6vOBd9QotErwpeUD70W9bPseEzVo1/IuGYkR34W4Byy+13IPOdcqhfJCYRPlmbEstfD8LzEqBfnMwJQFkrAFFYBaZ6muYZgPBXDcPm1BZ1Czmfl1T3257Vfq4pkIWOYcS4keHNKnE9AMillLqA3vulqQHZYjEV9ol6lCGehTJdLrekxzyV9UcA1aLDNQBzhfufCOS7otBiJIfuilj/Iy4davlQrlJjl8q1c+ZSLr8k/aW9N9QktUVunbLS7hiGuBijp/bCIUCqRbpAxnZBfY/69gf7ShW4c8zL5vWkU9nxGkw5PE5941BiHZ7Oue+dhqI5RbLpHyzSLzdZ3NzLTnu3EKa4GhK4KEA4WBBKRaiwEpf6Im8iN7BZ68udWQTnrsXHQbNdu5qypLnG+dG3s0qI0y9W8FWWqUiUXo6xiiccoo/pnQ9MZDqgKlc4Rq1zlVjwSzuXZsgWsHG9MqHFD0Bg7s2+1M0BPdswzeBV61Yn7amwlsu6x5ZAJ+rCwJJ55IL/WRNB/zLD2w9L8X+AH9Y6OxnYJ/VXkWeYnv2+3WzLU9SmpT8uNRNP+TFULuSob1+vu/f0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t]),
    Ad_T = Table.AddColumn(Source, "T", each
        [ a = Text.Split([Description], ": "),
          b = List.Accumulate({1..List.Count(a)-1},
                    { if not List.Contains({"name", "passnumber"}, a{0}) then {"name", Text.BeforeDelimiter(a{0}, " ", {0, RelativePosition.FromEnd})} else {a{0}, Text.BeforeDelimiter(a{1}, " ", {0, RelativePosition.FromEnd})}}, (st, cur)=> st & {
                    let Y = if List.Contains({"name", "passnumber"}, a{0}) then 0 else 1 in if a{cur+1-Y}? = null then {"Remove"} else {Text.AfterDelimiter(a{cur-Y}, " ", {0, RelativePosition.FromEnd}), let X = Text.BeforeDelimiter(a{cur+1-Y}, " ", {0, RelativePosition.FromEnd}) in if X = "" then a{cur+1-Y} else X }}),
          c = Table.TransformColumnNames(Table.RemoveColumns(Table.PromoteHeaders(Table.FromColumns({{"ID", [ID]}} & b)), {"Remove"}, MissingField.Ignore), (x)=> if x = "ID" then x else Text.Proper(x))
        ][c], type table),
    T = Table.Combine(Ad_T[T])
in
    T

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Hello @dufoq3 ,

Thanks again for your help. It works and finally I understand how the code works. I would never came to this solution myself. It's dynamic and it works fine. Thanks a lot.

Greetings

You're welcome 😉


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Akash_Varuna
Super User
Super User

Hi @Rinspiration , You could use Text.BetweenDelimeters funtiona nd create a custom columns for each requirement
example : if Text.Contains([Description], "description:") then Text.BetweenDelimiters([Description], "description: ", " numero:") else null 
Like the above adjust for each column and create custom columns
If this post helped please do give a kudos and accept this as a solution 
Thanks In Advance

Hi @Akash_Varuna ,

I want to thank you very much for your help and to think along with me to find a solution. I got two solutions. Both solutions works very well. Your solution is easier to follow. The other solution is more advanced. So it took me a while to understand. finally I choose the other solution, because it is more dynamic.

But any way.... I give you my kudo's!

Greetings from Rinspiration

 

 

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors