Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more

Reply
Rinspiration
Regular Visitor

Splitting a messy text column and re-arange the splitted text in the right columns

Hello to all,

I have a column [Description] with text and I want to split this text into seperate columns so that each part fits into a particular column. The problem is that this text column is a inconsistent messy bunch of text. If I split it by demiliter ":" or " " for example, it will not give the result I want. I tried all kind of things, (listst, trim, split, conditional columns, etc...) but honestly I am not experience enough to solve this problem in a efficient way. I have absolutely no clue how to solve this. Would somebody like to help me?

 

Below you will see the raw data and the result I want to achieve.

IDDescription
1name: andrew walker description: runninglaps from 1 to 11 numero: ab1000 item: 001a date: 2013-10-28
2name: grocery shop ltd button description: payment cash numero: ab1012 date: 2013-11-01
3name: e.m. wellington description: 2022111310 numero: de123314 date: 2013-11-04
4transaction from abber 2 2012-10-20 date: 2013-11-15 
5name: e.m. wellington description: 2022111310 numero: de123314 date: 2013-11-20
6name: w.a. east description: railrunner card numero: ab1102 date: 2013-12-02
7name: bording'28 description: paybill nr 495. description: contribution 2014-01-01 numero: ab887765 item: 20 id number: nn10zz34992245 date: 2014-01-02 
8passnumber: 011 2014-01-03 14:53 action: 69669a term: 1thq date: 2014-01-04
9name: cz group description: item 80182234 ref.nr. 4044493 period 10/11 numero: df101123 item: 200ab id number: 67994 id: uta mister zz23nn232003
10passnumber: 012 2014-04-03 14:23 action: 693539a term: 2thq date: 2014-04-04
11name: farmecy lid description: peanuts and beans numero: ab4536 date: 2014-04-16
12name: Ai photoshop b.v. description: 0000000001123 / period 2014-04-20 / ab1102 numero: ab8877 item: a22340 id number: nna468-01 power mgm
131 oct t/m 31 dec 2014 bakery n.v. date: 2014-09-01
14name: m.a. mol description: richard adems numero: ab689500 item: 4656544 date: 2015-09-20
15passnumber: 011 2014-09-22 15:56 action: 5959m3 term: 94gm04 date: 201409-22
16borne-lane    8798 passnumber 011 2014-11-13 14:01 action 10301 date: 2014-11-12
17name: taxes description: month January no. 11988r 2014 www.sddf.df numero: ab1224 item: aa34 date: 2014-12-10

 

IDNameDescriptionNumeroItemPassnumberActionTermID numberDate
1andrew walkerrunninglaps from 1 to 11ab1000001a    28-10-2013
2grocery shop ltd buttonpayment cashab1012     1-11-2013
3e.m. wellington2022111310de123314     4-11-2013
4transaction from abber 2 2012-10-20       15-11-2013
5e.m. wellington2022111310de123314     20-11-2013
6w.a. eastrailrunner cardab1102     2-12-2013
7bording'28paybill nr 495. description: contribution 2014-01-01ab88776520   nn10zz349922452-1-2014
8    011 2014-01-03 14:5369669a1thq 4-1-2014
9cz groupitem 80182234 ref.nr. 4044493 period 10/11df101123200ab   67994 id: uta mister zz23nn232003 
10    012 2014-04-03 14:23693539a2thq 4-4-2014
11farmecy lidpeanuts and beansab4536     16-4-2014
12Ai photoshop b.v.0000000001123 / period 2014-04-20 / ab1102ab8877a22340   nna468-01 power mgm 
131 oct t/m 31 dec 2014 bakery n.v.       1-9-2014
14m.a. molrichard ademsab6895004656544    20-9-2015
15    011 2014-09-22 15:565959m394gm04 22-9-2014
16borne-lane    8798   011 2014-11-13 14:0110301  12-11-2014
17taxesmonth January no. 11988r 2014 www.sddf.dfab1224aa34    10-12-2014
2 ACCEPTED SOLUTIONS
Akash_Varuna
Resident Rockstar
Resident Rockstar

Hi @Rinspiration , You could use Text.BetweenDelimeters funtiona nd create a custom columns for each requirement
example : if Text.Contains([Description], "description:") then Text.BetweenDelimiters([Description], "description: ", " numero:") else null 
Like the above adjust for each column and create custom columns
If this post helped please do give a kudos and accept this as a solution 
Thanks In Advance

View solution in original post

Check this:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rVXJkpswEP2VLp9jrNaCgVuu+YXJHATINjVIECHHGX99WmBjYCq3UD54kV6/pbv99rbD3bed09YUoF3tzQ1uuv0wHmozVL7pQ9O5AvzVucadW90PcPKdBYTQASK4qzW+o7slMsagCcYWwBhqqHUgTM5Q7JHtebZ7//a243Oxs+8q4z9huHQ9tKGG8hpC59Zle/1pjQtQ6eGyKoV8hY97hiO+mPFNYhO4mbYl2l9wOeMcEQWyGbU2yIVAucWVI64k3OC1G3QVESYPdFmSTzye5aNGtrmMCsbb6n+z4mzETWfcW6ITMHoIm9h008boiGalfb20ENnaQr5nfAQ9zqBl52vi+fPKGD/y7Es0ZdO24DzIXCXrH6vOBd9QotErwpeUD70W9bPseEzVo1/IuGYkR34W4Byy+13IPOdcqhfJCYRPlmbEstfD8LzEqBfnMwJQFkrAFFYBaZ6muYZgPBXDcPm1BZ1Czmfl1T3257Vfq4pkIWOYcS4keHNKnE9AMillLqA3vulqQHZYjEV9ol6lCGehTJdLrekxzyV9UcA1aLDNQBzhfufCOS7otBiJIfuilj/Iy4davlQrlJjl8q1c+ZSLr8k/aW9N9QktUVunbLS7hiGuBijp/bCIUCqRbpAxnZBfY/69gf7ShW4c8zL5vWkU9nxGkw5PE5941BiHZ7Oue+dhqI5RbLpHyzSLzdZ3NzLTnu3EKa4GhK4KEA4WBBKRaiwEpf6Im8iN7BZ68udWQTnrsXHQbNdu5qypLnG+dG3s0qI0y9W8FWWqUiUXo6xiiccoo/pnQ9MZDqgKlc4Rq1zlVjwSzuXZsgWsHG9MqHFD0Bg7s2+1M0BPdswzeBV61Yn7amwlsu6x5ZAJ+rCwJJ55IL/WRNB/zLD2w9L8X+AH9Y6OxnYJ/VXkWeYnv2+3WzLU9SmpT8uNRNP+TFULuSob1+vu/f0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t]),
    Ad_T = Table.AddColumn(Source, "T", each
        [ a = Text.Split([Description], ": "),
          b = List.Accumulate({1..List.Count(a)-1},
                    { if not List.Contains({"name", "passnumber"}, a{0}) then {"name", Text.BeforeDelimiter(a{0}, " ", {0, RelativePosition.FromEnd})} else {a{0}, Text.BeforeDelimiter(a{1}, " ", {0, RelativePosition.FromEnd})}}, (st, cur)=> st & {
                    let Y = if List.Contains({"name", "passnumber"}, a{0}) then 0 else 1 in if a{cur+1-Y}? = null then {"Remove"} else {Text.AfterDelimiter(a{cur-Y}, " ", {0, RelativePosition.FromEnd}), let X = Text.BeforeDelimiter(a{cur+1-Y}, " ", {0, RelativePosition.FromEnd}) in if X = "" then a{cur+1-Y} else X }}),
          c = Table.TransformColumnNames(Table.RemoveColumns(Table.PromoteHeaders(Table.FromColumns({{"ID", [ID]}} & b)), {"Remove"}, MissingField.Ignore), (x)=> if x = "ID" then x else Text.Proper(x))
        ][c], type table),
    T = Table.Combine(Ad_T[T])
in
    T

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

7 REPLIES 7
dufoq3
Super User
Super User

Hi @Rinspiration, check this, but I'm not sure what would you like to do with these 3 rows:

 

dufoq3_0-1739439099578.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rVXJkpswEP2VLp9jrNaCgVuu+YXJHATINjVIECHHGX99WmBjYCq3UD54kV6/pbv99rbD3bed09YUoF3tzQ1uuv0wHmozVL7pQ9O5AvzVucadW90PcPKdBYTQASK4qzW+o7slMsagCcYWwBhqqHUgTM5Q7JHtebZ7//a243Oxs+8q4z9huHQ9tKGG8hpC59Zle/1pjQtQ6eGyKoV8hY97hiO+mPFNYhO4mbYl2l9wOeMcEQWyGbU2yIVAucWVI64k3OC1G3QVESYPdFmSTzye5aNGtrmMCsbb6n+z4mzETWfcW6ITMHoIm9h008boiGalfb20ENnaQr5nfAQ9zqBl52vi+fPKGD/y7Es0ZdO24DzIXCXrH6vOBd9QotErwpeUD70W9bPseEzVo1/IuGYkR34W4Byy+13IPOdcqhfJCYRPlmbEstfD8LzEqBfnMwJQFkrAFFYBaZ6muYZgPBXDcPm1BZ1Czmfl1T3257Vfq4pkIWOYcS4keHNKnE9AMillLqA3vulqQHZYjEV9ol6lCGehTJdLrekxzyV9UcA1aLDNQBzhfufCOS7otBiJIfuilj/Iy4davlQrlJjl8q1c+ZSLr8k/aW9N9QktUVunbLS7hiGuBijp/bCIUCqRbpAxnZBfY/69gf7ShW4c8zL5vWkU9nxGkw5PE5941BiHZ7Oue+dhqI5RbLpHyzSLzdZ3NzLTnu3EKa4GhK4KEA4WBBKRaiwEpf6Im8iN7BZ68udWQTnrsXHQbNdu5qypLnG+dG3s0qI0y9W8FWWqUiUXo6xiiccoo/pnQ9MZDqgKlc4Rq1zlVjwSzuXZsgWsHG9MqHFD0Bg7s2+1M0BPdswzeBV61Yn7amwlsu6x5ZAJ+rCwJJ55IL/WRNB/zLD2w9L8X+AH9Y6OxnYJ/VXkWeYnv2+3WzLU9SmpT8uNRNP+TFULuSob1+vu/f0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t]),
    Ad_T = Table.AddColumn(Source, "T", each
        [ a = Text.Split([Description], ": "),
          b = List.Accumulate({1..List.Count(a)-2}, {{a{0}, Text.BeforeDelimiter(a{1}, " ", {0, RelativePosition.FromEnd})}}, (st, cur)=> st & {
                     {Text.AfterDelimiter(a{cur}, " ", {0, RelativePosition.FromEnd}), let X = Text.BeforeDelimiter(a{cur+1}, " ", {0, RelativePosition.FromEnd}) in if X = "" then a{cur+1} else X }}),
          c = Table.TransformColumnNames(Table.PromoteHeaders(Table.FromColumns({{"ID", [ID]}} & b)), (x)=> if x = "ID" then x else Text.Proper(x))
        ][c], type table),
    CombinedT = Table.Combine(Ad_T[T])
in
    CombinedT

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Hi @dufoq3 ,

It works! 😀 I so am happy. Thanks a lot! 

 

May I ask two more questions?

- The rows you highlighted are inconcistent rows and do not start with "name:" or "passnumber:". The first part of that text I want to place in the column [Name]. How should you code this in m-code?

-  In real life, this table includes over more then 20k rows and is growing over time. How to use Table.Buffer to speed up the handlings?

 Greetings, Rinspiration

Check this:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rVXJkpswEP2VLp9jrNaCgVuu+YXJHATINjVIECHHGX99WmBjYCq3UD54kV6/pbv99rbD3bed09YUoF3tzQ1uuv0wHmozVL7pQ9O5AvzVucadW90PcPKdBYTQASK4qzW+o7slMsagCcYWwBhqqHUgTM5Q7JHtebZ7//a243Oxs+8q4z9huHQ9tKGG8hpC59Zle/1pjQtQ6eGyKoV8hY97hiO+mPFNYhO4mbYl2l9wOeMcEQWyGbU2yIVAucWVI64k3OC1G3QVESYPdFmSTzye5aNGtrmMCsbb6n+z4mzETWfcW6ITMHoIm9h008boiGalfb20ENnaQr5nfAQ9zqBl52vi+fPKGD/y7Es0ZdO24DzIXCXrH6vOBd9QotErwpeUD70W9bPseEzVo1/IuGYkR34W4Byy+13IPOdcqhfJCYRPlmbEstfD8LzEqBfnMwJQFkrAFFYBaZ6muYZgPBXDcPm1BZ1Czmfl1T3257Vfq4pkIWOYcS4keHNKnE9AMillLqA3vulqQHZYjEV9ol6lCGehTJdLrekxzyV9UcA1aLDNQBzhfufCOS7otBiJIfuilj/Iy4davlQrlJjl8q1c+ZSLr8k/aW9N9QktUVunbLS7hiGuBijp/bCIUCqRbpAxnZBfY/69gf7ShW4c8zL5vWkU9nxGkw5PE5941BiHZ7Oue+dhqI5RbLpHyzSLzdZ3NzLTnu3EKa4GhK4KEA4WBBKRaiwEpf6Im8iN7BZ68udWQTnrsXHQbNdu5qypLnG+dG3s0qI0y9W8FWWqUiUXo6xiiccoo/pnQ9MZDqgKlc4Rq1zlVjwSzuXZsgWsHG9MqHFD0Bg7s2+1M0BPdswzeBV61Yn7amwlsu6x5ZAJ+rCwJJ55IL/WRNB/zLD2w9L8X+AH9Y6OxnYJ/VXkWeYnv2+3WzLU9SmpT8uNRNP+TFULuSob1+vu/f0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t]),
    Ad_T = Table.AddColumn(Source, "T", each
        [ a = Text.Split([Description], ": "),
          b = List.Accumulate({1..List.Count(a)-1},
                    { if not List.Contains({"name", "passnumber"}, a{0}) then {"name", Text.BeforeDelimiter(a{0}, " ", {0, RelativePosition.FromEnd})} else {a{0}, Text.BeforeDelimiter(a{1}, " ", {0, RelativePosition.FromEnd})}}, (st, cur)=> st & {
                    let Y = if List.Contains({"name", "passnumber"}, a{0}) then 0 else 1 in if a{cur+1-Y}? = null then {"Remove"} else {Text.AfterDelimiter(a{cur-Y}, " ", {0, RelativePosition.FromEnd}), let X = Text.BeforeDelimiter(a{cur+1-Y}, " ", {0, RelativePosition.FromEnd}) in if X = "" then a{cur+1-Y} else X }}),
          c = Table.TransformColumnNames(Table.RemoveColumns(Table.PromoteHeaders(Table.FromColumns({{"ID", [ID]}} & b)), {"Remove"}, MissingField.Ignore), (x)=> if x = "ID" then x else Text.Proper(x))
        ][c], type table),
    T = Table.Combine(Ad_T[T])
in
    T

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Hello @dufoq3 ,

Thanks again for your help. It works and finally I understand how the code works. I would never came to this solution myself. It's dynamic and it works fine. Thanks a lot.

Greetings

You're welcome 😉


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Akash_Varuna
Resident Rockstar
Resident Rockstar

Hi @Rinspiration , You could use Text.BetweenDelimeters funtiona nd create a custom columns for each requirement
example : if Text.Contains([Description], "description:") then Text.BetweenDelimiters([Description], "description: ", " numero:") else null 
Like the above adjust for each column and create custom columns
If this post helped please do give a kudos and accept this as a solution 
Thanks In Advance

Hi @Akash_Varuna ,

I want to thank you very much for your help and to think along with me to find a solution. I got two solutions. Both solutions works very well. Your solution is easier to follow. The other solution is more advanced. So it took me a while to understand. finally I choose the other solution, because it is more dynamic.

But any way.... I give you my kudo's!

Greetings from Rinspiration

 

 

Helpful resources

Announcements
March PBI video - carousel

Power BI Monthly Update - March 2025

Check out the March 2025 Power BI update to learn about new features.

March2025 Carousel

Fabric Community Update - March 2025

Find out what's new and trending in the Fabric community.