MarkusEng1998
Resolver II

Split Paragraph into sentences with numbering

I want to compare paragraphs sentence by sentence. I can already split a paragraph into sentences when they are separated by a line feed and/or carriage return. Do you have any suggestions for how I might split a paragraph that is written as a numbered list into its separate numbered items?

1. Bicycle racks shall be installed on a durable surface, preferably near the associate entrance without pedestrian route conflicts. 2. Designated bicycle corridors connecting the public right-of-way with the bicycle parking shall not be provided unless required per local code. If required and provided, the bike corridor shall consist of two (1.2m) wide pavement lanes (on the outside of travel lanes, one in each direction) separated from the vehicle travel lane by pavement striping. Bike lanes should be signed and striped in accordance to local code and regulations for all traffic signs. 3. At sites with a designated bike corridor, bikes should not travel on sidewalks (with exception of parking) or in vehicle travel lanes.

I can split this paragraph on a period followed by a space, but that also splits after the item numbers (1., 2., 3., etc.).
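
To make the problem concrete, here is a minimal sketch using a shortened, made-up sample (not the actual paragraph). Splitting on a period plus space with Text.Split breaks the text after each item number and also between sentences inside an item:

```powerquery
let
    // Hypothetical shortened sample with the same shape as the real paragraph
    Sample = "1. First item. 2. Second item. More text. 3. Third item.",
    // Splitting on ". " cuts after every item number and between sentences
    Naive = Text.Split(Sample, ". ")
in
    Naive
    // {"1", "First item", "2", "Second item", "More text", "3", "Third item."}
```

That is seven pieces instead of the three that are wanted.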

The desired output is three records, one for each numbered item.

Please advise.

1 ACCEPTED SOLUTION
AlexisOlson
Super User

One possible method is to find the positions of all the digits and then keep only those where the digit is immediately followed by a period and a space. Those are the positions you need to split on.

Here's an example query putting this all together:

let
    Source = Table.FromRows({{"1. Bicycle racks shall be installed on a durable surface, preferably near the associate entrance without pedestrian route conflicts. 2. Designated bicycle corridors connecting the public right-of-way with the bicycle parking shall not be provided unless required per local code. If required and provided, the bike corridor shall consist of two (1.2m) wide pavement lanes (on the outside of travel lanes, one in each direction) separated from the vehicle travel lane by pavement striping. Bike lanes should be signed and striped in accordance to local code and regulations for all traffic signs. 3. At sites with a designated bike corridor, bikes should not travel on sidewalks (with exception of parking) or in vehicle travel lanes."}}, type table [Text = text]),
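    SplitToList = Table.AddColumn(Source, "Split", each
        [
            // Zero-based positions of every digit in the text
            Digits = Text.PositionOfAny([Text], {"0".."9"}, Occurrence.All),
            // Keep only the digits that are immediately followed by ". " (the item numbers)
            Positions = List.Select(Digits, (i) => Text.Middle([Text], i + 1, 2) = ". "),
            // Cut the text at each of those positions
            Split = Splitter.SplitTextByPositions(Positions)([Text])
        ][Split], type {text}),
    // Expand the list so that each piece gets its own row
    ExpandList = Table.ExpandListColumn(SplitToList, "Split")
in
    ExpandList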
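
One caveat with the query above: because it looks at single digits, a two-digit item number such as "10. " would be cut between the "1" and the "0". If the real paragraphs go past nine items, a variation along these lines might help. This is only a sketch under assumptions: the sample text and the NumberStart helper are made up for illustration, and an item number is assumed to sit at the start of the text or right after a space.

```powerquery
let
    // Hypothetical sample that includes two-digit item numbers
    Sample = "9. Ninth item. Still the ninth. 10. Tenth item (1.2m wide). 11. Eleventh item.",
    Digits = {"0".."9"},
    // Zero-based positions of every ". " in the text
    DotSpace = Text.PositionOf(Sample, ". ", Occurrence.All),
    // For a ". " at position p, return the start of the run of digits that ends
    // just before p, or null if there is no such run or it is not at a word boundary
    NumberStart = (p as number) as nullable number =>
        let
            Before = Text.Start(Sample, p),
            Run = Text.Length(Before) - Text.Length(Text.TrimEnd(Before, Digits)),
            Start = p - Run,
            AtBoundary = (Start = 0) or (Text.At(Sample, Start - 1) = " ")
        in
            if Run > 0 and AtBoundary then Start else null,
    // Split positions: the first digit of each item number
    Positions = List.RemoveNulls(List.Transform(DotSpace, NumberStart)),
    Pieces = Splitter.SplitTextByPositions(Positions)(Sample),
    // Remove the trailing space left at the end of each piece
    Result = List.Transform(Pieces, Text.Trim)
in
    Result
```

With this sample the result should be three items beginning with "9.", "10." and "11.". A sentence that happens to end in a number followed by a period and a space (for example "... built in 2025. ") would still be treated as an item number, so check the output against real data.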

3 REPLIES
dufoq3
Super User

Hi @MarkusEng1998, another solution here:

let
    Source = Table.FromRows({{"1. Bicycle racks shall be installed on a durable surface, preferably near the associate entrance without pedestrian route conflicts. 2. Designated bicycle corridors connecting the public right-of-way with the bicycle parking shall not be provided unless required per local code. If required and provided, the bike corridor shall consist of two (1.2m) wide pavement lanes (on the outside of travel lanes, one in each direction) separated from the vehicle travel lane by pavement striping. Bike lanes should be signed and striped in accordance to local code and regulations for all traffic signs. 3. At sites with a designated bike corridor, bikes should not travel on sidewalks (with exception of parking) or in vehicle travel lanes."}}, type table [Text = text]),
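    Ad_Splitted = Table.AddColumn(Source, "Splitted", each
        [ // Split the text into space-separated tokens
          a = Text.Split([Text], " "),
          // Keep the tokens that end with a period
          b = List.Select(a, (x)=> Text.EndsWith(x, ".")),
          // Of those, keep the tokens that are a number once the period is removed ("1.", "2.", ...)
          delimiters = List.Select(b, (x)=> (try Number.From(Text.Remove(x, ".")) otherwise false) is number),
          // For each number token, take the text up to the next number token, or to the end of the text for the last one
          d = List.Accumulate( {0..List.Count(delimiters)-1}, {}, (s,c)=> s & { try delimiters{c} & " " & Text.Trim(Text.BetweenDelimiters([Text], delimiters{c}, delimiters{c+1})) otherwise delimiters{c} & " " & Text.Trim(Text.AfterDelimiter([Text], delimiters{c})) } )
        ][d], type list),
    // Expand the list so that each item gets its own row
    ExpandedSplitted = Table.ExpandListColumn(Ad_Splitted, "Splitted")
in
    ExpandedSplitted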
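
If the goal is to compare paragraphs item by item, it may help to keep the item number in its own column so that two paragraphs can be matched on it. A minimal sketch, assuming the expanded column is called Split as in the accepted solution (the Items table and the Number and Sentence column names are made up for illustration):

```powerquery
let
    // Hypothetical stand-in for the expanded output of a split query
    Items = Table.FromRows(
        {{"1. First item."}, {"2. Second item. More text."}, {"3. Third item."}},
        type table [Split = text]),
    // Split each row once, at the first ". ", into the number and the remaining text
    Separated = Table.SplitColumn(
        Items, "Split",
        Splitter.SplitTextByEachDelimiter({". "}, QuoteStyle.None, false),
        {"Number", "Sentence"}),
    // Store the item number as a whole number so it can be sorted or merged on
    Typed = Table.TransformColumnTypes(Separated, {{"Number", Int64.Type}, {"Sentence", type text}})
in
    Typed
```

Because the split happens only at the first period plus space in each row, later sentences within the same item stay together in the Sentence column.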


MarkusEng1998
Resolver II

Thank you, @AlexisOlson. I will give this a try.
