Anonymous
Not applicable

Using 2 functions to web scrape

I am scraping cost-to-own data from Edmunds. I used the site map to get all the cost-to-own links (site map here).

So I have a table of links, and I used a function to get the model names from each of the pages (see example page here). Now I have two columns: one with links and a second with the model names of the cars. I want a third column that contains the cost-to-own table specific to each model. Right now it just gives me the first one, and I can't find anything in the code that would let me use the model-name drop-down menu in a second function. How can I do this? I used this video as a guide.

image_2021-08-24_191805.png

As you can see above, I made `PageStart` my variable, which I plug the column loc into below, and it outputs the column Model. I tried to do the same with the Cost to Own column, but instead of pulling the data for the specific model it just pulled the first model it found. I need it to go to the link and pull the data specific to the model.

image_2021-08-21_122840.png

 

3 REPLIES
lbendlin
Super User

Your URL is incomplete. You need to add the `style` query parameter, which selects the specific model variation. For example:

 

2016 Lexus LS 460: True Cost to Own | Edmunds
https://www.edmunds.com/lexus/ls-460/2016/cost-to-own/?style=401580679

Here's a potential way to get that value:

 

 

let
    // Load the sitemap; each row's [loc] holds one cost-to-own page URL
    Source = Xml.Tables(Web.Contents("https://www.edmunds.com/sitemap_web54-mmy-cost-to-own.xml")),
    Table0 = Source{0}[Table],
    // Keep only the first 10 rows while testing
    #"Kept First Rows" = Table.FirstN(Table0,10),
    // Fetch the rendered HTML of each page
    #"Added Custom" = Table.AddColumn(#"Kept First Rows", "Custom", each Web.BrowserContents([loc])),
    // Cut out the first <optgroup>...</optgroup> block; 11 is the length of "</optgroup>"
    #"Added Custom3" = Table.AddColumn(#"Added Custom", "Custom.3", each try Text.Range([Custom],Text.PositionOf([Custom],"<optgroup"),Text.PositionOf([Custom],"</optgroup>")-Text.PositionOf([Custom],"<optgroup")+11) otherwise "<optgroup/>"),
    // Parse the fragment as XML, then expand each <option> into its text (Model) and value attribute (Style)
    #"Parsed XML" = Table.TransformColumns(#"Added Custom3",{{"Custom.3", Xml.Tables}}),
    #"Expanded Custom.3" = Table.ExpandTableColumn(#"Parsed XML", "Custom.3", {"option"}, {"option"}),
    #"Expanded option" = Table.ExpandTableColumn(#"Expanded Custom.3", "option", {"Element:Text", "Attribute:value"}, {"Model", "Style"})
in
    #"Expanded option"

 

Then you can fetch the true URL for each model variation to get the cost table.
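As a rough sketch of that last step, the query below appends the `style` parameter to each page URL and fetches the style-specific page. The sample row, the `StyleUrl` and `CostToOwn` column names, and the assumption that [loc] ends in "/cost-to-own/" with no existing query string are all illustrative, not taken from the original query. In practice `Styles` would be the output of the query above.

```powerquery
let
    // Hypothetical sample row standing in for the output of the query above
    Styles = #table(
        type table [loc = text, Model = text, Style = text],
        {{"https://www.edmunds.com/lexus/ls-460/2016/cost-to-own/", "Example style name", "401580679"}}
    ),
    // Build the style-specific URL (assumes [loc] has no query string yet)
    WithUrl = Table.AddColumn(Styles, "StyleUrl", each [loc] & "?style=" & [Style], type text),
    // Fetch the rendered page and let Web.Page list the HTML tables it finds;
    // null marks pages that could not be loaded or parsed
    WithPage = Table.AddColumn(
        WithUrl,
        "CostToOwn",
        each try Web.Page(Web.BrowserContents([StyleUrl])) otherwise null
    )
in
    WithPage
```

Each `CostToOwn` cell should then hold a table of the page's detected tables; which entry is the actual cost-to-own table has to be checked against the live page before expanding it.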

 

 

Anonymous
Not applicable

Thank you! I posted my function code to get the model name. Clearly your code is more efficient. But I was confused about how to pull the link and then the model to get the specific cost to own. I tried to clarify in my original question.

As I said, you can use my code as the basis to get the model numbers. Then you can use your approach to fetch the table for each of the models.
