Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
mrpowrbihelp
Frequent Visitor

scrapping amazon

I'm able to load the webpage and create a table in Power BI. Below is the advanced editor.

 

let
Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/0062420704"),
#"Extracted Table From Html" = Html.Table(Source, {{"Seller Name", ".olpSellerName"}, {"Seller Price", ".a-color-price"}, {"Condition", ".olpConditionColumn > .a-section"}, {"Ships From", ".a-unordered-list.olpFastTrack > :nth-child(2):nth-last-child(2)"}}, [RowSelector=".olpOffer"]),
#"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Seller Name", type text}, {"Seller Price", Currency.Type}, {"Condition", type text}, {"Ships From", type text}})
in
#"Changed Type"

 

I want to be able to have a table of ASINs that are at the end of the website address (highlighted in red) so i can scrap and log multiple products information each week. I will also need to scrap page 1,2,3, ect depending on how many there are.

 

But to start...I've watched videos to add a line at the top of the code

=(ASIN as text) as table=>

and change the website address to..Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/"&ASIN"),

 

but I keep getting a Token Literal expected error.

What is going wrong? 

7 REPLIES 7
v-cherch-msft
Microsoft Employee
Microsoft Employee

Hi @mrpowrbihelp

 

Here are two articles for you to see if they are helpful to you:

Loop through Multiple Web Pages using Power Query

Iterating over multiple pages of web data using Power Query

 

 

Regards,

Cherie

Community Support Team _ Cherie Chen
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thank you! I apologize this is all new to me.

I can get it to loop thru pages using 

https://www.mattmasson.com/2014/11/iterating-over-an-unknown-number-of-pages-in-power-query/

 

with my Code looking like..

(Page as number) as table =>

let
Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/B01MQWUXZS/ref=olp_page_next?ie=UTF8&f_all=true&startIndex=" & Number.ToText(Page)),
#"Extracted Table From Html" = Html.Table(Source, {{"Seller", ".olpSellerName"}, {"Price", ".a-color-price"}}, [RowSelector=".olpOffer"])
in
#"Extracted Table From Html"

 

let
PageRange = {0,10,20,30,40,50},
Source = List.Transform(PageRange, each try {_, GetData(_)} otherwise null),
First = List.FirstN(Source, each _ <> null),
Table = Table.FromRows(First, {"Page", "Column1"}),
Expanded = Table.ExpandTableColumn(Table, "Column1", {"Seller", "Price"}, {"Seller", "Price"})
in
Expanded

 

BUT I'm having trouble combining the two,  I need it to also go thru all the different ASINs and I want my code to look more like

 

(Page as number, Asin as text) as table =>

let
Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/" & Asin & "/ref=olp_page_next?ie=UTF8&f_all=true&startIndex=" & Number.ToText(Page)),
#"Extracted Table From Html" = Html.Table(Source, {{"Seller", ".olpSellerName"}, {"Price", ".a-color-price"}}, [RowSelector=".olpOffer"])
in
#"Extracted Table From Html"

 

How do I adjust the next Query to look at a list of Page numbers and a list of Asins?

hi @mrpowrbihelpty this code for replacing the query starting with PageRange 

let
    PageRange = {0,10,20,30,40,50},
    #"Converted to Table" = Table.FromList(PageRange, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Pages"}}),
    AddASIN = Table.AddColumn(#"Renamed Columns", "ASIN", each {"aaa","bbb","ccc"}),
    #"Expanded ASIN" = Table.ExpandListColumn(AddASIN, "ASIN"),
    GetData = Table.AddColumn(#"Expanded ASIN", "Custom", each (try {_, GetData([Pages],[ASIN])} otherwise null)),
    GetTable = Table.AddColumn(GetData, "Table", each [Custom]{1}),      //this part may return error if the output of GetData doesn't have 2 parameters
    #"Expanded Table" = Table.ExpandTableColumn(GetTable, "Table", {"Seller", "Price"}, {"Seller", "Price"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Table",{"Custom"})
in
    #"Removed Columns"

it converts list to table, and adds second column with ASIN code
GetData in next step is using these 2 columns as input for the parameters



Did I answer your question? Mark my post as a solution!
Thank you for the kudos 🙂

Hi @Stachu,

 

Thank you for your response!

 

I tried using your suggestion with a random ASIN. (see below). This example had 3 pages of data it should have pulled in (24 lines), but the third page of data repeated for the pages that should have been blank resulting in 36 lines of data.

 

let
PageRange = {0,10,20,30,40,50},
#"Converted to Table" = Table.FromList(PageRange, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Pages"}}),
AddASIN = Table.AddColumn(#"Renamed Columns", "ASIN", each {"B01M0GB8CC"}),
#"Expanded ASIN" = Table.ExpandListColumn(AddASIN, "ASIN"),
GetData = Table.AddColumn(#"Expanded ASIN", "Custom", each (try {_, GetData([Pages],[ASIN])} otherwise null)),
GetTable = Table.AddColumn(GetData, "Table", each [Custom]{1}), //this part may return error if the output of GetData doesn't have 2 parameters
#"Expanded Table" = Table.ExpandTableColumn(GetTable, "Table", {"Seller", "Price"}, {"Seller", "Price"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table",{"Custom"})
in
#"Removed Columns"

 

Also, can I bring in an excel file with a list of ASIN's in stead of how you have "AddASIN = Table.AddColumn(#"Renamed Columns", "ASIN", each {"B01M0GB8CC"})," to list each one?

 

Thank you!

Hi @mrpowrbihelp,

 

I've combined the code into one query: 

 

let

fnScrapeASIN = (ASIN as text) => 
 List.Generate( 
                () => [Table = null, page = 10, PrevTable = 0],
                each [Table] <> [PrevTable],
                each [Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/" & ASIN & "/ref=olp_page_13?ie=UTF8&f_all=true&startIndex=" & Text.From(page)),
                    Table = Html.Table(Source, {{"Seller Name", ".olpSellerName"}, {"Seller Price", ".a-color-price"}, {"Condition", ".olpConditionColumn > .a-section"}, {"Ships From", ".a-unordered-list.olpFastTrack > :nth-child(2):nth-last-child(2)"}}, [RowSelector=".olpOffer"]),
                    PrevTable = [Table],
                    page = [page]+10],
                    each [Table]
                    ),

    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQzNDUxNzA2MVGK1YlWMjA3NjWztDA2jVCKjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ASIN = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ASIN", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each List.Skip(fnScrapeASIN([ASIN]),1)),
    #"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
    #"Expanded Custom1" = Table.ExpandTableColumn(#"Expanded Custom", "Custom", {"Seller Name", "Seller Price", "Condition", "Ships From"}, {"Seller Name", "Seller Price", "Condition", "Ships From"})
in
    #"Expanded Custom1"

All you have to do is to replace the bold part in the Source-step by a reference to a table with your ASIN-numbers (column named "ASIN").

 

This is a recursive query that retrieves the next page (+10) if the result of the current request is not equal to the result of the previous request (as the page seems to always return the last page, even if the numbers go up)

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Stachu
Community Champion
Community Champion

my guess is unnecessary paranthesis, try this

Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/" & ASIN),


Did I answer your question? Mark my post as a solution!
Thank you for the kudos 🙂

That worked. Thanks!

 

Would you know how to loop thru the next pages. I found tutorials for how to do it if the page number is shown in the URL, but that is not the case with Amazon. When you click to page 2 the URL changes to the below.

 

https://www.amazon.com/gp/offer-listing/0062420704/ref=olp_page_next?ie=UTF8&f_all=true&startIndex=1...

 

What could I write to get it to loop thru all the pages of data?

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.