heriberto_mb
Frequent Visitor

Extract data after html tag

I want to extract the system status from the following page: 

https://www.akamaistatus.com/

 

I was able to pull the HTML data. Now I just want to extract the line that says "All Systems Operational". Is there a way to use <div class="page-status status-none"> as a delimiter so I can extract the text that appears two rows below it?

 

P.S. I would rather use the HTML tag as a delimiter than the line number, because the line number might change over time.


 

1 ACCEPTED SOLUTION
Ahmedx
Super User

Please try this:

 

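let
    // Download the page and decode the binary response as text
    Source = Text.FromBinary(Web.Contents("https://www.akamaistatus.com/")),
    // Keep only the text between the status span's opening tag and the last-updated stamp
    Text = Text.BetweenDelimiters( Source, "<span class=""status font-large"">", "<span class=""last-updated-stamp  font-small""></span>" ),
    // Trim each line, drop the empty ones, and return the first line that remains
    ImportedText = List.RemoveMatchingItems( List.Transform(Lines.FromText(Text),(x)=> Text.Trim(x)),{""}){0}
in
    ImportedText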

 


5 REPLIES 5

Thank you @Ahmedx ! That works perfectly...

 

Can you explain the following line and tell me where I can learn more about it?

ImportedText = List.RemoveMatchingItems( List.Transform(Lines.FromText(Text),(x)=> Text.Trim(x)),{""}){0}

 

 

  • Lines.FromText(Text): Converts the variable Text into a list of lines. Each line of the text becomes one element of the list.

  • List.Transform(..., (x) => Text.Trim(x)): Applies a transformation to every line in the list returned by Lines.FromText. Here the transformation is Text.Trim(x), which removes whitespace from the beginning and end of each line.

  • List.RemoveMatchingItems(..., {""}): Removes every empty string ("") from the trimmed list. Lines that contained only whitespace are removed here, because trimming turned them into empty strings.

  • {0}: Selects the first element of the list that remains once the empty strings are gone, that is, the first non-empty line of the original text.
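
The same expression can be written one step at a time, which makes each intermediate list visible in the Power Query editor. This is only a sketch: the Sample value is a made-up stand-in for the text returned by Text.BetweenDelimiters, and the step names AllLines, Trimmed, NonEmpty and Result are illustrative.

```powerquery
let
    // Hypothetical sample standing in for the text between the two delimiters
    Sample = "#(lf)   #(lf)    All Systems Operational#(lf)  #(lf)",
    // 1. Split the text into a list with one element per line
    AllLines = Lines.FromText(Sample),
    // 2. Trim leading and trailing whitespace from every line
    Trimmed = List.Transform(AllLines, (x) => Text.Trim(x)),
    // 3. Drop the lines that are now empty strings
    NonEmpty = List.RemoveMatchingItems(Trimmed, {""}),
    // 4. Take the first remaining line
    Result = NonEmpty{0}
in
    Result
```

With this sample, Result should be the text All Systems Operational. Note that NonEmpty{0} raises an error when the list is empty, for example if the page markup changes; List.First(NonEmpty, null) returns null in that case instead.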

 

lbendlin
Super User

It is best to use Web.BrowserContents and then experiment with Html.Table parsing:

 

let
    Source = Web.BrowserContents("https://www.akamaistatus.com/"),
    #"Extracted Table From Html" = Html.Table(
        Source,
        {
            {"Column1", ".font-small + *"},
            {"Column2", ".component-inner-container:nth-child(2) .name"},
            {"Column3", ".component-inner-container:nth-child(3) .name"},
            {"Column4", ".component-status.tool"},
            {"Column5", ".component-inner-container:nth-child(2) .component-status"},
            {"Column6", ".component-inner-container:nth-child(3) .component-status"},
            {"Column7", ".component-inner-container:nth-child(4) .name"},
            {"Column8", ".component-inner-container:nth-child(4) .component-status"}
        },
        [RowSelector=".component-container"]
    )
in
    #"Extracted Table From Html"

Thank you very much for your answer. I see multiple components with their individual statuses, but I only wanted to capture the overall status of the page: "All Systems Operational".
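
For the overall status alone, the Html.Table approach can probably be reduced to a single column with a single CSS selector. The sketch below is untested against the live page: SampleHtml is a hypothetical fragment modeled on the div and span quoted earlier in this thread, and the selector ".page-status .status" and the step names are assumptions.

```powerquery
let
    // Hypothetical fragment modeled on the markup quoted in this thread
    SampleHtml = "<div class=""page-status status-none""><span class=""status font-large"">All Systems Operational</span></div>",
    // One column, one CSS selector: the status span inside the page-status block
    StatusTable = Html.Table(SampleHtml, {{"Status", ".page-status .status"}}),
    // First row, Status column, trimmed in case the real page pads the text
    OverallStatus = Text.Trim(StatusTable{0}[Status])
in
    OverallStatus
```

To try it on the real page, replace SampleHtml with Web.BrowserContents("https://www.akamaistatus.com/"), as in the reply above. The selector deliberately leaves out the status-none class, on the assumption that this class may change when the status changes.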
