Good morning,
I'm hoping someone can help a real beginner in M. I've been reading up on web scraping from various sources to see how to scrape image URLs from websites. I've been trying my luck with the following link: http://classicyachtinfo.com/yachts/a-day-at-the-races/
And this is the code I have so far:
let
    Source =
        Web.BrowserContents("http://classicyachtinfo.com/yachts/a-day-at-the-races"),
    Image =
        Html.Table(
            Source,
            {{"Image", "img", each [Attributes][src]}})
in
    Image
The thing is, this code brings in all the images on the page.
Is there a way to restrict the image URL import to the relevant ones? For example, from this page it would be the images related to the yacht itself (no ads etc.).
I have been trying to use RowSelector but keep getting an error. (My knowledge of M is far too basic to work my way out of the error, and I have tried a few things over the last hour...)
Thanks for any help!
Proud to be a Super User!
Paul on Linkedin.
I'm not sure how to get that specific image as I don't know how you'd know the file name. However, I did put together a list of all images on that page that have the word "yacht" in them, excluding ads.
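I started with your code, then lowercased the URLs, extracted the text after "classicyachtinfo.com/wp-content/", and kept only the rows where that text contains "yacht".
I got 4 results, but I'm still not sure which image is the one you are looking for, or whether it is any of them. This approach assumes the file name contains the word you are searching for; the image might be called "sailboatbybeach" on that site. 🤷‍♂️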
let
    Source =
        Web.BrowserContents("http://classicyachtinfo.com/yachts/a-day-at-the-races"),
    Image =
        Html.Table(
            Source,
            {{"Image", "img", each [Attributes][src]}}),
    #"Lowercased Text" = Table.TransformColumns(Image, {{"Image", Text.Lower, type text}}),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Lowercased Text", "Text After Delimiter", each Text.AfterDelimiter([Image], "classicyachtinfo.com/wp-content/"), type text),
    #"Filtered Rows" = Table.SelectRows(#"Inserted Text After Delimiter", each Text.Contains([Text After Delimiter], "yacht"))
in
    #"Filtered Rows"
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reporting
Thank you for taking the time to give this a shot. Unfortunately the images rendered are not what I am looking for:
I actually think my original code does not include the images I am looking for. If I go into the webpage's code, what I'm trying to get at is the following:
or...
which is why I started playing around with "RowSelector", but as I say, I know veeeery little about M and even less about CSS selectors.
The thing is, I need to write queries for a number of webpages to import image URLs (which will of course all have different structures), and I'm trying to understand the coding pattern to apply to each website (which entails providing the code needed to reach a particular segment of the webpage's code).
Thanks again!
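One pattern that may help here is to narrow the CSS selector passed to Html.Table so that it only matches images inside the page's main content container. The following is a minimal sketch under stated assumptions: the inline HTML stands in for the result of Web.BrowserContents, and the class names "sidebar-ad" and "entry-content" are hypothetical, not taken from the real page. The actual container would have to be found by inspecting each site's markup.

```powerquery
let
    // Hypothetical stand-in for Web.BrowserContents(...).
    // The class names and image paths are assumptions for illustration only.
    SampleHtml =
        "<div class=""sidebar-ad""><img src=""/ads/banner.png""></div>"
        & "<div class=""entry-content"">"
        & "<img src=""/wp-content/uploads/yacht-1.jpg"">"
        & "<img src=""/wp-content/uploads/yacht-2.jpg"">"
        & "</div>",
    // "div.entry-content img" is a descendant selector: it matches only
    // <img> elements nested inside the assumed content container,
    // so the image in the ad block is skipped.
    Image =
        Html.Table(
            SampleHtml,
            {{"Image", "div.entry-content img", each [Attributes][src]}})
in
    Image
```

For a different website, only the selector string should need to change, which keeps the rest of the query reusable across pages.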
If you change my final filter to look for "races" instead of "yacht", does that not return the two images you are referring to?
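As a rough sketch of that change, the query below applies the same steps with "races" as the filter term. To keep it self-contained, the Html.Table step is replaced by a small hard-coded table; the two URLs in it are made up for illustration and are not real files on the site.

```powerquery
let
    // Hypothetical sample rows standing in for the Html.Table output
    Image =
        #table(
            type table [Image = text],
            {
                {"http://classicyachtinfo.com/wp-content/uploads/A-Day-At-The-Races-1.jpg"},
                {"http://classicyachtinfo.com/wp-content/uploads/yacht-logo.png"}
            }),
    // Lowercase first so the text match is effectively case-insensitive
    #"Lowercased Text" = Table.TransformColumns(Image, {{"Image", Text.Lower, type text}}),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Lowercased Text", "Text After Delimiter", each Text.AfterDelimiter([Image], "classicyachtinfo.com/wp-content/"), type text),
    // Only the search term differs from the earlier query: "races" instead of "yacht"
    #"Filtered Rows" = Table.SelectRows(#"Inserted Text After Delimiter", each Text.Contains([Text After Delimiter], "races"))
in
    #"Filtered Rows"
```

With these sample rows, only the first URL should pass the filter, since the logo file name does not contain "races".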
Thanks again for your suggestion. I did eventually succeed in importing the image URLs, albeit by tweaking the M code for each of the different websites. Just for general information, I imported about 3,500 different URLs from about 1,650 different pages on 4 different websites. The whole process probably took over 6 hours (in fact I started the queries at around 20:00, checked well past midnight, and it was more or less halfway there...).
I subsequently decided I was also interested in importing some text from each of those pages, but have given up trying to work out the code for the selectors needed...
Thanks again!
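For the text import mentioned above, the same Html.Table call can probably be reused. When the third element of a column definition is left out, as in the Html.Table documentation examples, the column holds the matched element's text rather than an attribute. This is only a sketch: the inline HTML and the "entry-title" and "entry-content" class names are assumptions, and each site would need its own selector.

```powerquery
let
    // Hypothetical stand-in for Web.BrowserContents(...); markup is assumed
    SampleHtml =
        "<h1 class=""entry-title"">A Day at the Races</h1>"
        & "<div class=""entry-content"">"
        & "<p>First paragraph.</p>"
        & "<p>Second paragraph.</p>"
        & "</div>",
    // No attribute function is given, so the column takes each matched
    // element's text content
    Paragraphs =
        Html.Table(
            SampleHtml,
            {{"Text", "div.entry-content p"}})
in
    Paragraphs
```

Swapping the selector for something like "h1.entry-title" would target the heading instead, again assuming the page uses that class.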