Ezay
New Member

Three identical regex queries in an R script, three different results

Hello, this is my first topic here, and I'm glad to share my "regex with R" issue with you.
This is also my first calculated column built with an R script; I needed R in order to test data against regular expressions.

This is my script:

 

 

 

 

 

# 'dataset' contains the input data for this script

Regex_apply <- function(x, y) { grepl(x, y) }
output <- within(dataset, { ValidRegex = Regex_apply(dataset$REGEX_simple, dataset$VAT) })

 

 

 

It is built to apply the regex stored in the REGEX_simple column to the VAT code in the VAT column and to return a boolean TRUE/FALSE.
I'm working on a table of companies in around 30 different countries, and I want to check whether each VAT code is valid.
My script works well when I reduce the scope to a single country (for example, Germany): it correctly separates the valid codes from the invalid ones.
When I add a second country to the scope, the results are fine for one country and completely wrong for the other.
When I run the full scope, each country's results are either all correct or all wrong, and I cannot find a pattern that explains why one country is OK and another is not.

Ezay_1-1641920420342.png

 

Ezay_0-1641920374387.png

 

 



(I had to hide some data, but there were 9 digits after DE and 12 after SE, so the result should be TRUE for every row in every table.)
Instead, we get:

Number of countries   DE   SE
1                     OK   OK
2                     OK   KO
full                  KO   OK


I think I have missed something, and I would be really happy if someone could help me find what it is.

Icey
Community Support

Hi @Ezay ,

 

This happens because the 'pattern' argument of grepl() accepts a single pattern: when it is given a vector of patterns, only the first element is used.
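The behaviour can be reproduced outside Power BI with a minimal sketch. The two patterns and VAT codes below are made-up stand-ins for the REGEX_simple and VAT columns, not the original data:

```r
# Hypothetical sample data: one pattern per row, as in the REGEX_simple column.
patterns <- c("^DE[0-9]{9}$", "^SE[0-9]{12}$")
vat      <- c("DE123456789", "SE123456789012")

# Passing the whole vector as 'pattern': R warns and applies only patterns[1]
# to every VAT code (the warning is silenced here to keep the output short).
vector_call <- suppressWarnings(grepl(patterns, vat))

# The same call written explicitly with the first pattern only.
first_only <- grepl(patterns[1], vat)

print(vector_call)
print(identical(vector_call, first_only))
```

If this is what happens in the calculated column, every row is tested against the regex of the first row, which would fit the results reported above: one country comes out right and the others come out wrong.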

Icey_2-1642148398506.png

Reference: grep: Pattern Matching and Replacement (rdrr.io)

 

Icey_1-1642148255744.png

 

 

I then referred to this post for a workaround:

 

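# 'dataset' holds the input data for this script
library(purrr)

# Test a single VAT code (y) against a single regex (x).
Regex_apply <- function(x, y) {
  if (grepl(x, y)) {
    check <- TRUE
  } else {
    check <- FALSE
  }
  check
}

# map2() calls Regex_apply once per row and returns a list, one element per row.
output <- within(dataset, { ValidRegex = map2(dataset$REGEX_simple, dataset$VAT, Regex_apply) })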

 

Icey_0-1642148206509.png

 

However, this won't work in Power BI. 
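One direction that might be worth trying is a base-R variant with no package dependency. This is only a sketch and has not been tested in Power BI. It assumes the failure comes either from purrr not being available or from map2() returning a list instead of a plain logical vector, and the thread confirms neither. The sample 'dataset' below is a made-up stand-in so the script runs on its own.

```r
# In Power BI, 'dataset' is supplied automatically; this stand-in is an
# assumption for illustration only and should be removed there.
dataset <- data.frame(
  REGEX_simple = c("^DE[0-9]{9}$", "^SE[0-9]{12}$", "^DE[0-9]{9}$"),
  VAT          = c("DE123456789", "SE123456789012", "DE12"),
  stringsAsFactors = FALSE
)

output <- dataset

# mapply() calls grepl(pattern, x) once per row, pairing each regex with the
# VAT code on the same row, and simplifies the result to a logical vector.
output$ValidRegex <- unname(mapply(
  grepl,
  as.character(dataset$REGEX_simple),
  as.character(dataset$VAT)
))

print(output)
```

The as.character() calls guard against the columns arriving as factors, and unname() drops the names that mapply() would otherwise copy from the pattern vector.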

Icey_0-1642148950880.png

 

Please give me some time to research. Once there is a solution, I will post it as soon as possible.

 

 

Best Regards,

Icey

 

If this post helps, please consider accepting it as the solution so that other members can find it more quickly.
