Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hello, I made some regular expression functions in Power Query to enhance text processing capabilities.
Currently, Power Query primarily relies on basic text functions such as Text.Contains() and Text.Select(), which do not support complex text cleansing requirements. Although users can implement regular expression functionality via Python or custom functions, this approach increases complexity and reduces both readability and maintainability.
It is worth noting that Excel has already added three regular expression functions for Microsoft 365 users in 2024: REGEXTEST, REGEXEXTRACT, and REGEXREPLACE
https://techcommunity.microsoft.com/blog/microsoft365insiderblog/new-regular-expression-regex-functi...
Based on Excel's implementation, I have already finished custom functions in Power Query for your reference:
Text.RegexTest
Text.RegexTest(text as text, pattern as text, optional case_sensitivity as logical) as logical
Tests if the specified text matches a given regular expression pattern. Returns true if it matches, otherwise returns false. case_sensitivity is optional, used to indicate whether to be case-sensitive.
let
// Implementation of Text.RegexTest function to check if a string matches a given regex pattern
Text.RegexTest = (text as text, pattern as text, optional caseSensitivity as number) =>
let
// Set default caseSensitive to 0 (case-insensitive) if not provided
caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
// Set regex modifier based on case sensitivity setting (0 = case-sensitive, otherwise case-insensitive)
regexModifier = if caseSensitivityValue = 0 then "" else "i",
// Construct the JavaScript regex expression
regexExpression = "/" & pattern & "/" & regexModifier,
// Create HTML with embedded JavaScript to evaluate the regex pattern
htmlContent =
"<script>" &
"var matches = """ & text & """.match(" & regexExpression & ");" &
"document.write(matches ? 'true' : 'false');" &
"</script>",
// Execute the JavaScript using Web.Page and extract the result
resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0},
// Convert string result to a logical value
resultValue = if resultText = "true" then true else false
in
resultValue
in
Text.RegexTest
Text.RegexExtract
Text.RegexExtract(text as text, pattern as text, optional return_mode as text, optional case_sensitivity as logical) as text
Extracts substrings from text that match the regular expression pattern. return_mode is optional, used to specify the return mode (e.g., return first match, complete match, or capturing groups), case_sensitivity is optional, used to control whether to be case-sensitive.
let
// Text.RegexExtract function to extract content matching a given regex pattern
Text.RegexExtract = (
text as text,
pattern as text,
optional returnMode as number,
optional caseSensitivity as number
) =>
let
// Set default return mode (0 = first match only) if not provided
returnModeValue = if returnMode is null then 0 else returnMode,
// Set default case sensitivity (0 = case-sensitive) if not provided
caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
// Set regex modifier based on case sensitivity (g = global, i = case-insensitive)
regexModifier = if caseSensitivityValue = 1 then "g" else "gi",
// Construct the JavaScript regex expression
regexExpression = "/" & pattern & "/" & regexModifier,
// Generate JavaScript logic based on return mode:
// 0 = first match only
// 1 = all matches as array
// other = capturing groups
javascriptLogic = if returnModeValue = 0 then
"var match = str.match(regex); var res = match ? match[0] : null;"
else if returnModeValue = 1 then
"var res = str.match(regex);"
else
"var match = regex.exec(str); res = match ? match.slice(1) : null;",
// Build HTML content with embedded JavaScript
htmlContent = Text.Combine({
"<script>",
"var regex = ", regexExpression, ";",
"var str = """, text, """;",
javascriptLogic,
"document.write(res)",
"</script>"
}),
// Execute JavaScript via Web.Page and extract the result
resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in
// Return extracted text
resultText
in
Text.RegexExtract
Text.RegexReplace
Text.RegexReplace(text as text, pattern as text, replacement as text, optional occurrence as number, optional case_sensitivity as logical) as text
Replaces parts of the text that match the regular expression pattern with the specified replacement string. occurrence is optional, specifies the index of the match to replace (if omitted, replaces all matches by default), case_sensitivity is optional, used to indicate whether to be case-sensitive.
let
// Text.RegexReplace function to replace content matching a regex pattern with specified text
Text.RegexReplace = (
text as text, // Original text
pattern as text, // Regex pattern (e.g., "[0-9]{3}-")
replacement as text, // Replacement text
optional occurrence as number, // Which occurrence to replace (0 = all, default)
optional caseSensitivity as number // Case sensitivity (0 = case-sensitive, default)
) =>
let
// Handle default parameters
occurrenceValue = if occurrence = null then 0 else occurrence,
caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
// Generate JavaScript regex with appropriate modifiers
regexModifier = if caseSensitivityValue = 0 then "g" else "gi",
regexExpression = "/" & pattern & "/" & regexModifier,
// Build JavaScript replacement logic
javascriptCode =
"var regex = " & regexExpression & ";" &
"var str = '" & text & "';" &
"var replacement = '" & replacement & "';" &
"var occ = " & Text.From(occurrenceValue) & ";" &
"
var matches = [];
var match;
while ((match = regex.exec(str)) !== null) {
matches.push({
index: match.index,
length: match[0].length
});
}
if (occ === 0) {
// Replace all occurrences
document.write(str.replace(regex, replacement));
} else {
// Calculate target position
var targetIndex;
if (occ > 0) {
targetIndex = occ - 1;
} else {
targetIndex = matches.length + occ;
}
if (targetIndex < 0 || targetIndex >= matches.length) {
document.write(str);
} else {
var target = matches[targetIndex];
var res = str.substring(0, target.index) +
replacement +
str.substring(target.index + target.length);
document.write(res);
}
}
",
// Execute JavaScript and get result
htmlContent = "<script>" & javascriptCode & "</script>",
resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in
resultText
in
Text.RegexReplace
I think these custom functions will enable Power Query users to handle complex text pattern matching tasks more efficiently. I look forward to any further discussion with you!
Solved! Go to Solution.
Please consider putting this in the Blog section instead.
Fabric community blogs - Microsoft Fabric Community
Thank you, @lbendlin , for your response and valuable advice, which will greatly benefit the members of the Microsoft Fabric Community.
Hi @jianfajun,
We sincerely appreciate your efforts in sharing this valuable information with the community. Your contribution in providing regular expression functions in Power Query to enhance text processing capabilities is highly commendable. This will undoubtedly benefit other community members seeking similar guidance.
Thank you.
Hi jianfajun,
To enhance the visibility of this discussion within the forum, we kindly request you to mark this response as the accepted solution and extend kudos. This will assist other members encountering similar queries in easily finding the relevant information.
Thank you.
These are great, but performance appears to be a bit slow if this is used on a table with many rows to calculate a new column due to having to launch a new instance of Web.Page for every line. I really hope Microsoft will make these functions natively supported in Power Query soon.
Oh my! Absolute life-save you are!
I've been trying all day to get this to work through a Python script, but have trouble getting the dataframe to return only the current row. I'm going to use this for validator and extraction. AWESOME!
Hi jianfajun,
To enhance the visibility of this discussion within the forum, we kindly request you to mark this response as the accepted solution and extend kudos. This will assist other members encountering similar queries in easily finding the relevant information.
Thank you.
Thank you, @lbendlin , for your response and valuable advice, which will greatly benefit the members of the Microsoft Fabric Community.
Hi @jianfajun,
We sincerely appreciate your efforts in sharing this valuable information with the community. Your contribution in providing regular expression functions in Power Query to enhance text processing capabilities is highly commendable. This will undoubtedly benefit other community members seeking similar guidance.
Thank you.
Please consider putting this in the Blog section instead.
Fabric community blogs - Microsoft Fabric Community
Thanks! I just could not find the post button in the Blog section. Maybe level restriction?
Check out the July 2025 Power BI update to learn about new features.
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
User | Count |
---|---|
63 | |
62 | |
52 | |
39 | |
24 |
User | Count |
---|---|
84 | |
57 | |
45 | |
42 | |
38 |