Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

v-xinruzhu-msft

Explore Web.Page in power query: Execute regular expressions with JavaScript

Scenario: 

In Power Query, when we encounter intricate data structures and the native functions in M script are insufficient, we can resort to other languages for implementation. Given that Web.Page is compatible with HTML, I will utilize JavaScript for the execution of regular expressions, which is a more straightforward approach compared to using M script. 

  

How: 

1.Understanding What Web.Page Returns 

Initially, we employ Web.Page to retrieve the data. It’s observable that the DOM data is procured, which encompasses additional arrays. 

vxinruzhumsft_0-1708416392172.png

Immediately after that we can continue to look for data down the line 

 

 

= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}) ​

 

 

This step we get data such as the request type: 

vxinruzhumsft_0-1708416462494.png

 

 

= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0}) ​

 

 

Request header and request body: 

vxinruzhumsft_1-1708416476894.png

For Request Header 

 

 

= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0}[Children]{0})​

 

 

vxinruzhumsft_2-1708416495962.png

For request body 

 

 

= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0}[Children]{1}) ​

 

 

It can be noticed that the request body returns the data we need. 

vxinruzhumsft_3-1708416511364.png

 

 

= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"')</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0}) ​

 

 

We can output the returned data 

vxinruzhumsft_4-1708416522469.png

2.Using JavaScript Regular Expressions 

By following the aforementioned procedures, we can effortlessly acquire the necessary data within the HTML environment. Consequently, our sole requirement is to compose the requisite functions to facilitate the straightforward implementation of regular expressions.  

 

 

= Table.AddColumn(#"Changed Type", "Custom", each Web.Page("<script>document.write('"&[RequestInfo]&"'.match(/\b\d{3}\b/g))</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0}) ​

 

 

vxinruzhumsft_5-1708416548380.png

The implemented requirement is to obtain the network request status code. It’s understood that it comprises three digits, so the condition is/\b\d{3}\b/. 

  

It’s observable that the sample data contains either multiple or singular status codes, dispersed across various locations, and includes erroneous data such as “32145”. This can be accurately sifted through the use of regular expressions. If M script is employed, functions like Text.ContainsText.StartsWithText.EndsWith may be required. Particularly when the data distribution is inconsistent, Text.Replace might also be utilized, making this filtering process challenging to implement. 

 

Summary:     

By exploring, we can use the general script  to get the required data: 

 

 

Web.Page("<script>Your Fuction</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0} ​

 

 

Functions that are either unavailable natively in M script or challenging to implement can be effectively executed using JavaScript. 

  

Document reference 

Regular expressions - JavaScript | MDN (mozilla.org) 

Web.Page - PowerQuery M | Microsoft Learn 

  

 

Author: Tianyi Chang 

Reviewer: Ula and Kerry