Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
daanmot
Frequent Visitor

Add URL part from a column in Get Data > Web

How can I put my column as a part of the URL in the Get Data > Web > Advanced?

Example:

Part 1: www.example1.com/getid/
part2: [mycolumnid]/
part3: stats

 

and it goes like: www.example1.com/getid/mycolumnid/stats

 

I have ids from 1 to 100 for example, and I want to request all this ids stats.

 

 

1 ACCEPTED SOLUTION
PhilipTreacy
Super User
Super User

Hi @daanmot 

 

Download example PBIX file

 

You'll need to do some manual coding to get this to work, but it's not hard.  The process is explained here

 

Scrape multiple pages from web with Power Query function

 

In the following example I'll get multiple pages from Microsoft's blog, you can adapt the technique to your own URL

 

Start by creating a new Web query and just enter the URL in the Basic part of the connector dialog

web1.png

 

I'll come back to this later and change the page number for a variable.

 

After clicking OK it'll ask you how to connect (Anonymous) and then present you with the Navigator so you can choose what part of the web page you want to get.

nav1.png

 

Once that is selected and you click OK it'll load the query.

 

Now, open the query in the Advanced Editor and you'll see something like this

 

let
    Source = Web.BrowserContents("https://blogs.microsoft.com/page/1/"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", "ABBR"}, {"Column2", ".m-chevron"}, {"Column3", ".c-hyperlink"}}, [RowSelector=".m-preview"]),
    #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Column1", type date}, {"Column2", type text}, {"Column3", type text}})
in
    #"Changed Type"

 

 

 

You want to change this to a function.  It's easily done just by adding 1 line at the top of the code

 

(PageNum as text) =>
let
    Source = Web.BrowserContents("https://blogs.microsoft.com/page/"&PageNum&"/"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", "ABBR"}, {"Column2", ".m-chevron"}, {"Column3", ".c-hyperlink"}}, [RowSelector=".m-preview"]),
    #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Column1", type date}, {"Column2", type text}, {"Column3", type text}})
in
    #"Changed Type"

 

 

For your situation it'll look something like

(id as text) =>
let
    Source = Web.BrowserContents("https://example.com/getid/"&id&"/stats"),
    .......
    .......

 

Notice that I've also replaced the number 1 in the URL with the new variable PageNum. 

 

This variable is text so if your columns of ID's are numbers, they need to be converted to text to be used in the URL.  There are a few ways to do this, depending on the situation.

 

Clicking Done on the Advanced editor will now create a function from that code that can be called.  You can double click the query to rename, I've called mine GetWeb.

 

You should have another table that has a column of ID's something like this

nums.png

 

With that table selected, click on Add Column in the Ribbon and Invoke Custom Function.  In the dialog box, complete it like this

invk.png

 

Clicking OK calls the function for each row of the table i.e. for every PageNum.

 

The result is a table with a new column called Web Data that contains the data I want from the first 5 pages of the Microsoft blog.

webd.png

 

regards

 

Phil

 

 



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


View solution in original post

1 REPLY 1
PhilipTreacy
Super User
Super User

Hi @daanmot 

 

Download example PBIX file

 

You'll need to do some manual coding to get this to work, but it's not hard.  The process is explained here

 

Scrape multiple pages from web with Power Query function

 

In the following example I'll get multiple pages from Microsoft's blog, you can adapt the technique to your own URL

 

Start by creating a new Web query and just enter the URL in the Basic part of the connector dialog

web1.png

 

I'll come back to this later and change the page number for a variable.

 

After clicking OK it'll ask you how to connect (Anonymous) and then present you with the Navigator so you can choose what part of the web page you want to get.

nav1.png

 

Once that is selected and you click OK it'll load the query.

 

Now, open the query in the Advanced Editor and you'll see something like this

 

let
    Source = Web.BrowserContents("https://blogs.microsoft.com/page/1/"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", "ABBR"}, {"Column2", ".m-chevron"}, {"Column3", ".c-hyperlink"}}, [RowSelector=".m-preview"]),
    #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Column1", type date}, {"Column2", type text}, {"Column3", type text}})
in
    #"Changed Type"

 

 

 

You want to change this to a function.  It's easily done just by adding 1 line at the top of the code

 

(PageNum as text) =>
let
    Source = Web.BrowserContents("https://blogs.microsoft.com/page/"&PageNum&"/"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", "ABBR"}, {"Column2", ".m-chevron"}, {"Column3", ".c-hyperlink"}}, [RowSelector=".m-preview"]),
    #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Column1", type date}, {"Column2", type text}, {"Column3", type text}})
in
    #"Changed Type"

 

 

For your situation it'll look something like

(id as text) =>
let
    Source = Web.BrowserContents("https://example.com/getid/"&id&"/stats"),
    .......
    .......

 

Notice that I've also replaced the number 1 in the URL with the new variable PageNum. 

 

This variable is text so if your columns of ID's are numbers, they need to be converted to text to be used in the URL.  There are a few ways to do this, depending on the situation.

 

Clicking Done on the Advanced editor will now create a function from that code that can be called.  You can double click the query to rename, I've called mine GetWeb.

 

You should have another table that has a column of ID's something like this

nums.png

 

With that table selected, click on Add Column in the Ribbon and Invoke Custom Function.  In the dialog box, complete it like this

invk.png

 

Clicking OK calls the function for each row of the table i.e. for every PageNum.

 

The result is a table with a new column called Web Data that contains the data I want from the first 5 pages of the Microsoft blog.

webd.png

 

regards

 

Phil

 

 



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.