Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Anonymous
Not applicable

R script regex with spaces

Hi,

 

I tried creating small R script that is able to parse incorrect JSON column. The data is poor quality as it lacks quotations. Here is my code:

# 'dataset' holds the input data for this script
pattern <- "(:)(?!null)(?!true)(?!false)(?!\"[a-z0-9A-Z.:-[[:space:]]]+\")([a-z0-9A-Z.:-[[:space:]]]+)(,|})"
replacer <- function(x) { gsub(pattern, "\\1_TEST_\\2_TEST_\\3", as.character(x), perl = TRUE) }
output <- within(dataset, { PARSED=replacer(dataset$WORKSTATUSES) } )

What is happening the regex matches incorrect data in R runner but doesn't in Power BI. The problem are spaces. I tried it all; \\s, [:blank:], [[:blank:]], [:space:], [[:space:]] as well as inserting the space character itself into my patter. Some gave errors, some did nothing (regex must had not match). Can you tell me how to do it properly?

1 ACCEPTED SOLUTION
Anonymous
Not applicable

It turned out the problem is my script didn't handle all cases of string-type values that occure in JSON I parse. Also some cells contain empty strings (not null) and therefore aren't valid for parsing (produce Errors as well). Filtering empty rows out and then filtering rows with error let me investigate what the problem really was. 

View solution in original post

3 REPLIES 3
v-juanli-msft
Community Support
Community Support

Hi @Anonymous 

First, could you show me the error in Power BI when running the R script?

 

What is your purpose to use this code?

Some requiremnets can be directly achieved by Power Query, if your requirement is clear, we can make a test for it.

 

Best Regards
Maggie
Community Support Team _ Maggie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hi,

 

What I needed was parsing improper JSON from mongoDB. JSON I get is missing quotations around keys and some of values. I managed to work this out through the weekend and the code that works looks like this:

'dataset' holds the input data for this script
pattern_key <- "(\"?)([a-zA-Z_-]+)(\"?:)"
pattern_value <- "(:\"?)(?!null|true|false)([a-zA-Z0-9\\s,-_:.]+)(,|})"
key_replacer <- function(x) { gsub(pattern_key, "\"\\2\":", x, perl = TRUE) }
value_replacer <- function(x) { gsub(pattern_value, "\\1\"\\2\"\\3", x, perl = TRUE) }
output <- within(dataset, { PARSED=value_replacer(key_replacer(dataset$WORKSTATUSES)) } )

Although this may lack some characters that should be treated as part of keys or values and as the result some quotations aren't being inserted in the right places. The error I get now says there are some unexpected characters at the end of JSON file. What's interesting I am able to parse JSON with my R script to new column, then parse JSON into Lists, further into separate rows and then expand as columns and it works fine in Query Editor. Once I click 'Apply changes' on the yellow bar in the main window it loads the data from remote server, tries to bend them to my needs and then this error occures. So I managed to make some progress in this area but it still doesn't work.

Anonymous
Not applicable

It turned out the problem is my script didn't handle all cases of string-type values that occure in JSON I parse. Also some cells contain empty strings (not null) and therefore aren't valid for parsing (produce Errors as well). Filtering empty rows out and then filtering rows with error let me investigate what the problem really was. 

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.