March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Scenario:
In Power Query, when we encounter intricate data structures and the native functions in M script are insufficient, we can resort to other languages for implementation. Given that Web.Page is compatible with HTML, I will utilize JavaScript for the execution of regular expressions, which is a more straightforward approach compared to using M script.
How:
1.Understanding What Web.Page Returns
Initially, we employ Web.Page to retrieve the data. It’s observable that the DOM data is procured, which encompasses additional arrays.
Immediately after that we can continue to look for data down the line
= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0})
This step we get data such as the request type:
= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0})
Request header and request body:
For Request Header
= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0}[Children]{0})
For request body
= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0}[Children]{1})
It can be noticed that the request body returns the data we need.
= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0})
We can output the returned data
2.Using JavaScript Regular Expressions
By following the aforementioned procedures, we can effortlessly acquire the necessary data within the HTML environment. Consequently, our sole requirement is to compose the requisite functions to facilitate the straightforward implementation of regular expressions.
= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"'.match(/\b\d{3}\b/g))</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0})
The implemented requirement is to obtain the network request status code. It’s understood that it comprises three digits, so the condition is: /\b\d{3}\b/.
It’s observable that the sample data contains either multiple or singular status codes, dispersed across various locations, and includes erroneous data such as “32145”. This can be accurately sifted through the use of regular expressions. If M script is employed, functions like Text.Contains, Text.StartsWith, Text.EndsWith may be required. Particularly when the data distribution is inconsistent, Text.Replace might also be utilized, making this filtering process challenging to implement.
Summary:
By exploring, we can use the general script to get the required data:
Web.Page("<script>Your Fuction</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0}
Functions that are either unavailable natively in M script or challenging to implement can be effectively executed using JavaScript.
Document reference
Regular expressions - JavaScript | MDN (mozilla.org)
Web.Page - PowerQuery M | Microsoft Learn
Author: Tianyi Chang
Reviewer: Ula and Kerry
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.