Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
Anonymous
Not applicable

Using 2 functions to web scrape

I am scraping cost to own data from edmunds. I used the site map to get all the Cost to own links site map here.

so I have a table of links and I used a function to get the model names from each of the pages (see example page here). So now I have 2 columns one with links and a second with the model names of the cars. I want a thired column that contains the cost to own table specific to each model. Right now it just gives me the first one and I can't find anything in the code to use the model name drop down menu as another secondary function. How can I do this? I used this video as a guide.image_2021-08-24_191805.png

As you can see above. I made `PageStart` my variable wich I plug the column loc into below and it outputs the column Model I tired to do the same with the Cost to Own column but instead of pulling the data from the specific model it just pulled the first model it found. I need it to go the link and pull the data specific to the model.

image_2021-08-21_122840.png

 

3 REPLIES 3
lbendlin
Super User
Super User

Your URL is incomplete.  You need to add the style query parameter

 

2016 Lexus LS 460: True Cost to Own | Edmunds

 

 

 

https://www.edmunds.com/lexus/ls-460/2016/cost-to-own/?style=401580679

 

 

 

Here's a potential way to get that value:

 

 

let
    Source = Xml.Tables(Web.Contents("https://www.edmunds.com/sitemap_web54-mmy-cost-to-own.xml")),
    Table0 = Source{0}[Table],
    #"Kept First Rows" = Table.FirstN(Table0,10),
    #"Added Custom" = Table.AddColumn(#"Kept First Rows", "Custom", each Web.BrowserContents([loc])),
    #"Added Custom3" = Table.AddColumn(#"Added Custom", "Custom.3", each try Text.Range([Custom],Text.PositionOf([Custom],"<optgroup"),Text.PositionOf([Custom],"</optgroup>")-Text.PositionOf([Custom],"<optgroup")+11) otherwise "<optgroup/>"),
    #"Parsed XML" = Table.TransformColumns(#"Added Custom3",{{"Custom.3", Xml.Tables}}),
    #"Expanded Custom.3" = Table.ExpandTableColumn(#"Parsed XML", "Custom.3", {"option"}, {"option"}),
    #"Expanded option" = Table.ExpandTableColumn(#"Expanded Custom.3", "option", {"Element:Text", "Attribute:value"}, {"Model", "Style"})
in
    #"Expanded option"

 

Then you can fetch the true URL for each model variation to get the cost table.

 

 

Anonymous
Not applicable

Thank you!  I posted my function code to get the model name. Clearly your code is more efficient. But I was confused about how to have pull the link and then the model to get the specific cost to own. I tried to calify in my orogonal question.

As I said you can use my code as the basis to get the model numbers. Then you can use your approach to fetch the table for each of the models.

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.