March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi,
I have a small project working on a Power BI model for demoing features etc. The report used to run, but recently has become absurdly slow.
It first consists of this function:
let
MetaCritic = (page as number) as table =>
let
Source = Web.BrowserContents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Number.ToText(page), [WaitFor=[Timeout=#duration(0, 5, 0, 0)]]),
#"Extracted Table From Html" = Html.Table(Source, {{"Column2", ".numbered"}, {"Column3", ".numbered + *"}, {"Column4", ".next_to_side_col .data"}, {"Column5", ".platform + *"}, {"Column7", ".score:nth-child(1)"}, {"Column8", ".summary"}}, [RowSelector="TR"])
in
#"Extracted Table From Html"
in
MetaCritic
When I feed this function any variable (page number), I get a rather quick response with a table corresponding to the 100 rows of the page number provided as variable.
Then I create an iterative query, that should return each page up until the max page (280) in succession:
let
Source = {0..280},
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_2([Page])),
#"Expanded MC" = Table.ExpandTableColumn(#"Added Custom", "MC", {"Column2", "Column3", "Column4", "Column5", "Column7", "Column8"}),
#"Removed Other Columns" = Table.SelectColumns(#"Expanded MC",{"Page", "Column2", "Column3", "Column4", "Column5", "Column7", "Column8"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Page", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type date}, {"Column7", type number}, {"Column8", type text}}),
#"Renamed Columns1" = Table.RenameColumns(#"Changed Type",{{"Column2", "Rank"}, {"Column3", "Title"}, {"Column4", "Platform"}, {"Column5", "Date"}, {"Column7", "Score"}, {"Column8", "Description"}})
in
#"Renamed Columns1"
When I click close and apply to this, it startes to load rows into the model, 100 at a time, så the counter goes up from 100, to 200, to 300 etc loaded rows. The problem is however, that each 100 rows takes HOURS to load, so It would at least need to be running a week for 280 pages times 100 rows each would load...
My question is, as it is clearly the iterative query that is creating some kind of bottle neck, how can I improve this query? The function returns pages quickly, and entering the page specific urls in a browser, the page takes a few seconds to load...
I also tried both parallele and serial query load, but it seems to be the same regardless.
Solved! Go to Solution.
This code takes about 6 seconds per 100 entries. So about half an hour to fetch everything. Adjust the page range as needed.
let
Source = {0..20},
Extract = (page)=> Html.Table(Web.BrowserContents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Text.From(page)),
{{"Rank", ".numbered"}, {"Name", ".numbered + *"}, {"Platform", ".next_to_side_col .data"}, {"Released", ".platform + *"}, {"User Score", ".user"}, {"Critic Review", ".score:nth-child(1)"}, {"Summary", ".summary"}}, [RowSelector="TR"]),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each Extract([Column1])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Rank", "Name", "Platform", "Released", "User Score", "Critic Review", "Summary"})
in
#"Expanded Custom"
How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done".
That worls wonders. Thank you!
Would it be possible for it to also determine how many pages are actually present, right now I think it is 271, and then make that number dynamic based on this? Thanks again.
That worls wonders. Thank you!
Would it be possible for it to also determine how many pages are actually present, right now I think it is 271, and then make that number dynamic based on this? Thanks again.
Use List.Generate with an exit condition when the page doesn't return any more payload data. The disadvantage of that approach will be that you have to lug the entire resultset around inside the List.Generate, so you better have plenty of memory. Or you could play dirty tricks and keep trying to fetch pages, throw them away, determine the last page, then run the original code in the hope that your web browser cache has all the data locally available.
This code takes about 6 seconds per 100 entries. So about half an hour to fetch everything. Adjust the page range as needed.
let
Source = {0..20},
Extract = (page)=> Html.Table(Web.BrowserContents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Text.From(page)),
{{"Rank", ".numbered"}, {"Name", ".numbered + *"}, {"Platform", ".next_to_side_col .data"}, {"Released", ".platform + *"}, {"User Score", ".user"}, {"Critic Review", ".score:nth-child(1)"}, {"Summary", ".summary"}}, [RowSelector="TR"]),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each Extract([Column1])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Rank", "Name", "Platform", "Released", "User Score", "Critic Review", "Summary"})
in
#"Expanded Custom"
How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done".
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
129 | |
90 | |
75 | |
58 | |
53 |
User | Count |
---|---|
200 | |
104 | |
101 | |
67 | |
55 |