Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hi,
I tried creating small R script that is able to parse incorrect JSON column. The data is poor quality as it lacks quotations. Here is my code:
# 'dataset' holds the input data for this script pattern <- "(:)(?!null)(?!true)(?!false)(?!\"[a-z0-9A-Z.:-[[:space:]]]+\")([a-z0-9A-Z.:-[[:space:]]]+)(,|})" replacer <- function(x) { gsub(pattern, "\\1_TEST_\\2_TEST_\\3", as.character(x), perl = TRUE) } output <- within(dataset, { PARSED=replacer(dataset$WORKSTATUSES) } )
What is happening the regex matches incorrect data in R runner but doesn't in Power BI. The problem are spaces. I tried it all; \\s, [:blank:], [[:blank:]], [:space:], [[:space:]] as well as inserting the space character itself into my patter. Some gave errors, some did nothing (regex must had not match). Can you tell me how to do it properly?
Solved! Go to Solution.
It turned out the problem is my script didn't handle all cases of string-type values that occure in JSON I parse. Also some cells contain empty strings (not null) and therefore aren't valid for parsing (produce Errors as well). Filtering empty rows out and then filtering rows with error let me investigate what the problem really was.
Hi @Anonymous
First, could you show me the error in Power BI when running the R script?
What is your purpose to use this code?
Some requiremnets can be directly achieved by Power Query, if your requirement is clear, we can make a test for it.
Hi,
What I needed was parsing improper JSON from mongoDB. JSON I get is missing quotations around keys and some of values. I managed to work this out through the weekend and the code that works looks like this:
'dataset' holds the input data for this script pattern_key <- "(\"?)([a-zA-Z_-]+)(\"?:)" pattern_value <- "(:\"?)(?!null|true|false)([a-zA-Z0-9\\s,-_:.]+)(,|})" key_replacer <- function(x) { gsub(pattern_key, "\"\\2\":", x, perl = TRUE) } value_replacer <- function(x) { gsub(pattern_value, "\\1\"\\2\"\\3", x, perl = TRUE) } output <- within(dataset, { PARSED=value_replacer(key_replacer(dataset$WORKSTATUSES)) } )
Although this may lack some characters that should be treated as part of keys or values and as the result some quotations aren't being inserted in the right places. The error I get now says there are some unexpected characters at the end of JSON file. What's interesting I am able to parse JSON with my R script to new column, then parse JSON into Lists, further into separate rows and then expand as columns and it works fine in Query Editor. Once I click 'Apply changes' on the yellow bar in the main window it loads the data from remote server, tries to bend them to my needs and then this error occures. So I managed to make some progress in this area but it still doesn't work.
It turned out the problem is my script didn't handle all cases of string-type values that occure in JSON I parse. Also some cells contain empty strings (not null) and therefore aren't valid for parsing (produce Errors as well). Filtering empty rows out and then filtering rows with error let me investigate what the problem really was.
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
72 | |
71 | |
57 | |
38 | |
36 |
User | Count |
---|---|
82 | |
67 | |
61 | |
46 | |
45 |