Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
jianfajun
Advocate II
Advocate II

Regular Expression Functions (REGEX) in Power Query

Hello, I made some regular expression functions in Power Query to enhance text processing capabilities.


Background:

Currently, Power Query primarily relies on basic text functions such as Text.Contains() and Text.Select(), which do not support complex text cleansing requirements. Although users can implement regular expression functionality via Python or custom functions, this approach increases complexity and reduces both readability and maintainability.

 

It is worth noting that Excel has already added three regular expression functions for Microsoft 365 users in 2024: REGEXTEST, REGEXEXTRACT, and REGEXREPLACE
https://techcommunity.microsoft.com/blog/microsoft365insiderblog/new-regular-expression-regex-functi...

Checking whether the strings in column C contain numerical digits, usi.png

 

Suggestions:

Based on Excel's implementation, I have already finished custom functions in Power Query for your reference:

 

Text.RegexTest

Text.RegexTest(text as text, pattern as text, optional case_sensitivity as logical) as logical

Screenshot - 2025-03-14 11.08.27.png

Tests if the specified text matches a given regular expression pattern. Returns true if it matches, otherwise returns false. case_sensitivity is optional, used to indicate whether to be case-sensitive.

let
    // Implementation of Text.RegexTest function to check if a string matches a given regex pattern
    Text.RegexTest = (text as text, pattern as text, optional caseSensitivity as number) =>
    let
        // Set default caseSensitive to 0 (case-insensitive) if not provided
        caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
        
        // Set regex modifier based on case sensitivity setting (0 = case-sensitive, otherwise case-insensitive)
        regexModifier = if caseSensitivityValue = 0 then "" else "i",
        
        // Construct the JavaScript regex expression
        regexExpression = "/" & pattern & "/" & regexModifier,
        
        // Create HTML with embedded JavaScript to evaluate the regex pattern
        htmlContent = 
            "<script>" &
            "var matches = """ & text & """.match(" & regexExpression & ");" &
            "document.write(matches ? 'true' : 'false');" &
            "</script>",
        
        // Execute the JavaScript using Web.Page and extract the result
        resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0},
        
        // Convert string result to a logical value
        resultValue = if resultText = "true" then true else false
    in
        resultValue
in
    Text.RegexTest

 

Text.RegexExtract

Text.RegexExtract(text as text, pattern as text, optional return_mode as text, optional case_sensitivity as logical) as text

Screenshot - 2025-03-14 11.08.34.png

Extracts substrings from text that match the regular expression pattern. return_mode is optional, used to specify the return mode (e.g., return first match, complete match, or capturing groups), case_sensitivity is optional, used to control whether to be case-sensitive.

let
    // Text.RegexExtract function to extract content matching a given regex pattern
    Text.RegexExtract = (
        text as text, 
        pattern as text, 
        optional returnMode as number,
        optional caseSensitivity as number
    ) =>
    let
        // Set default return mode (0 = first match only) if not provided
        returnModeValue = if returnMode is null then 0 else returnMode,
        
        // Set default case sensitivity (0 = case-sensitive) if not provided
        caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
        
        // Set regex modifier based on case sensitivity (g = global, i = case-insensitive)
        regexModifier = if caseSensitivityValue = 1 then "g" else "gi",
        
        // Construct the JavaScript regex expression
        regexExpression = "/" & pattern & "/" & regexModifier,
        
        // Generate JavaScript logic based on return mode:
        // 0 = first match only
        // 1 = all matches as array
        // other = capturing groups
        javascriptLogic = if returnModeValue = 0 then 
            "var match = str.match(regex); var res = match ? match[0] : null;" 
        else if returnModeValue = 1 then 
            "var res = str.match(regex);"
        else 
            "var match = regex.exec(str); res = match ? match.slice(1) : null;",
        
        // Build HTML content with embedded JavaScript
        htmlContent = Text.Combine({
            "<script>",
            "var regex = ", regexExpression, ";",
            "var str = """, text, """;",
            javascriptLogic,
            "document.write(res)",
            "</script>"
        }),
        
        // Execute JavaScript via Web.Page and extract the result
        resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
    in
        // Return extracted text
        resultText
in
    Text.RegexExtract

 

Text.RegexReplace

Text.RegexReplace(text as text, pattern as text, replacement as text, optional occurrence as number, optional case_sensitivity as logical) as text

Screenshot - 2025-03-14 11.08.40.png

Replaces parts of the text that match the regular expression pattern with the specified replacement string. occurrence is optional, specifies the index of the match to replace (if omitted, replaces all matches by default), case_sensitivity is optional, used to indicate whether to be case-sensitive.

let
    // Text.RegexReplace function to replace content matching a regex pattern with specified text
    Text.RegexReplace = (
        text as text,             // Original text
        pattern as text,          // Regex pattern (e.g., "[0-9]{3}-")
        replacement as text,      // Replacement text
        optional occurrence as number,     // Which occurrence to replace (0 = all, default)
        optional caseSensitivity as number   // Case sensitivity (0 = case-sensitive, default)
    ) =>
    let
        // Handle default parameters
        occurrenceValue = if occurrence = null then 0 else occurrence,
        caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
        
        // Generate JavaScript regex with appropriate modifiers
        regexModifier = if caseSensitivityValue = 0 then "g" else "gi",
        regexExpression = "/" & pattern & "/" & regexModifier, 
        
        // Build JavaScript replacement logic
        javascriptCode = 
            "var regex = " & regexExpression & ";" & 
            "var str = '" & text & "';" &
            "var replacement = '" & replacement & "';" &
            "var occ = " & Text.From(occurrenceValue) & ";" &
            "
                var matches = [];
                var match;
                while ((match = regex.exec(str)) !== null) {
                    matches.push({
                        index: match.index,
                        length: match[0].length
                    });
                }
                if (occ === 0) {
                    // Replace all occurrences
                    document.write(str.replace(regex, replacement));
                } else {
                    // Calculate target position
                    var targetIndex;
                    if (occ > 0) {
                        targetIndex = occ - 1;
                    } else {
                        targetIndex = matches.length + occ;
                    }
                    if (targetIndex < 0 || targetIndex >= matches.length) {
                        document.write(str);
                    } else {
                        var target = matches[targetIndex];
                        var res = str.substring(0, target.index) + 
                                  replacement + 
                                  str.substring(target.index + target.length);
                        document.write(res);
                    }
                }
            ",
        // Execute JavaScript and get result
        htmlContent = "<script>" & javascriptCode & "</script>",
        resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
    in
        resultText
in
    Text.RegexReplace

 

I think these custom functions will enable Power Query users to handle complex text pattern matching tasks more efficiently. I look forward to any further discussion with you!

3 ACCEPTED SOLUTIONS
lbendlin
Super User
Super User

Please consider putting this in the Blog section instead.

Fabric community blogs - Microsoft Fabric Community

 

View solution in original post

v-pnaroju-msft
Community Support
Community Support

Thank you, @lbendlin , for your response and valuable advice, which will greatly benefit the members of the Microsoft Fabric Community.

Hi @jianfajun,

We sincerely appreciate your efforts in sharing this valuable information with the community. Your contribution in providing regular expression functions in Power Query to enhance text processing capabilities is highly commendable. This will undoubtedly benefit other community members seeking similar guidance.


Thank you.

View solution in original post

v-pnaroju-msft
Community Support
Community Support

Hi jianfajun,

To enhance the visibility of this discussion within the forum, we kindly request you to mark this response as the accepted solution and extend kudos. This will assist other members encountering similar queries in easily finding the relevant information.

Thank you.

View solution in original post

6 REPLIES 6
RHOU
Advocate III
Advocate III

These are great, but performance appears to be a bit slow if this is used on a table with many rows to calculate a new column due to having to launch a new instance of Web.Page for every line. I really hope Microsoft will make these functions natively supported in Power Query soon. 

JayBabcock
Frequent Visitor

Oh my!  Absolute life-save you are!

I've been trying all day to get this to work through a Python script, but have trouble getting the dataframe to return only the current row.  I'm going to use this for validator and extraction.  AWESOME!

v-pnaroju-msft
Community Support
Community Support

Hi jianfajun,

To enhance the visibility of this discussion within the forum, we kindly request you to mark this response as the accepted solution and extend kudos. This will assist other members encountering similar queries in easily finding the relevant information.

Thank you.

v-pnaroju-msft
Community Support
Community Support

Thank you, @lbendlin , for your response and valuable advice, which will greatly benefit the members of the Microsoft Fabric Community.

Hi @jianfajun,

We sincerely appreciate your efforts in sharing this valuable information with the community. Your contribution in providing regular expression functions in Power Query to enhance text processing capabilities is highly commendable. This will undoubtedly benefit other community members seeking similar guidance.


Thank you.

lbendlin
Super User
Super User

Please consider putting this in the Blog section instead.

Fabric community blogs - Microsoft Fabric Community

 

Thanks! I just could not find the post button in the Blog section. Maybe level restriction?

Helpful resources

Announcements
July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.