Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
russell80
Helper III
Helper III

Parsing JSON Using a Schema

I have a table in an SQL database that holds the records of submitted forms. The form fields aren't held in columns in the database, but instead are stored in a JSON string in a column. To get the data that's been submitted into columns for Power BI, I can use the Parse JSON function in Power Query on the JSON column and get the records and expand them out into columns.

 

The trouble I run into is I am relying on Power Query to identify the column names from the JSON, and this doesn't always work because not all fields are filled out all the time, so some get missed by Power Query, e.g. when a new form is submitted with a field that wasn't previously completed, this gets missed by Power Query as it's not listed in the query.

 

The SQL database also holds the schema for the JSON and I was hoping to use this to define the fields but I couldn't find any examples online how to do this? Is there an easy way to do this?

 

One way I have found to do this is to use one of the functions in the database called "parseJSON". Applying this to the schema column, I can create a list of all the fields in the schema and then use this query in the original query on the forms data in place of the column names when expanding the columns of the parsed JSON. This works but I was wondering if I am missing something where there is a better way to do what I'm trying to achieve?

1 ACCEPTED SOLUTION

You will probably need to have a version that you can make minor modifications to for each schema.  You can create a master schema map with an alignment to the required transformations.

View solution in original post

5 REPLIES 5
russell80
Helper III
Helper III

@jennratten  Thanks for sharing this pbix. I've had a look through but can't see where you are using a JSON schema to parse a JSON string into columns. Not sure if that's because I don't have credentials for some of the datasources.

 

In my example I have a table with column "Schema", this holds the strings of the JSON schemas for all the forms that are used. A simple example of a string is like this:

[
  {
    "type": "Date",
    "icon": "calendar",
    "colour": "#84aced",
    "label": "",
    "description": "",
    "readOnly": false,
    "required": false,
    "placeholder": "",
    "hideExpression": "",
    "settings": {
      "minDate": "",
      "maxDate": ""
    },
    "id": "Date1",
    "valid": true
  },
  {
    "type": "Location",
    "icon": "street-view",
    "colour": "#fca52a",
    "label": "",
    "description": "",
    "readOnly": false,
    "required": false,
    "placeholder": "",
    "hideExpression": "",
    "settings": {
      "options": ""
    },
    "id": "Location1",
    "valid": true
  }
]

 

In another table, I have the records of all the forms submitted, and the values submitted are represented as a JSON string following the schema of the form used and are stored in a column. The results string could look like this for the schema above:

 

{
  "Date1": "2019-01-23T00:00:00+00:00"
}

 

Notice how it doesn't contain the Location, as that's not a mandatory field in the form. So if I used PQs Parse JSON function, it would only parse the "Date1" field, and if a new form got submitted with the "Location1" field, the query wouldn't catch it when the dataset is refreshed in the service.

 

I'm looking for a way to extract all the fields from the schema (they all have the 'id' tag) and use them as column names when parsing the JSON results. Hope that makes sense!

Hello again - Yes, this makes sense.  Thanks for providing some sample data.  Please add the script below as a new blank query.  You can grab the list of fields expected from the schema and the use it to expand the records returned in the forms json like this.

 

SCRIPT

let
    // JSON Schema
    Source = 
        {
            [
                type = "Date",
                settings = [
                minDate = "",
                maxDate = ""
                ],
                id = "Date1",
                valid = true
            ],
            [
                type = "Location",
                settings = [
                options = ""
                ],
                id = "Location1",
                valid = true
            ]
        },
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"type", "icon", "colour", "label", "description", "readOnly", "required", "placeholder", "hideExpression", "settings", "id", "valid"}, {"type", "icon", "colour", "label", "description", "readOnly", "required", "placeholder", "hideExpression", "settings", "id", "valid"}),
    #"Expanded settings" = Table.ExpandRecordColumn(#"Expanded Column1", "settings", {"minDate", "maxDate", "options"}, {"settings.minDate", "settings.maxDate", "settings.options"}),
    Schema = Table.TransformColumnTypes(#"Expanded settings",{{"type", type text}, {"icon", type text}, {"colour", type text}, {"label", type any}, {"description", type any}, {"readOnly", type logical}, {"required", type logical}, {"placeholder", type any}, {"hideExpression", type any}, {"settings.minDate", type any}, {"settings.maxDate", type any}, {"settings.options", type any}, {"id", type text}, {"valid", type logical}}),
    // Sample Data
        Data = 
        {
            [
                Date1 = "2019-01-23T00:00:00+00:00"
            ],
            [
                Date1 = "2019-01-23T00:00:00+00:00"
            ],
            [
                Date1 = "2019-01-23T00:00:00+00:00"
            ],
            [
                Date1 = "2019-01-23T00:00:00+00:00",
                Location1 = "ABC"
            ]
        },
    #"Converted to Table1" = Table.FromList(Data, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    ExpandRecords = Table.ExpandRecordColumn ( 
        #"Converted to Table1",
        "Column1",
        // List of fields to expand
        Schema[id]
    )
    
in
    ExpandRecords

RESULT

jennratten_0-1692624693787.png

 

 

Thanks @jennratten . This looks like a good solution for my example. The trouble I have is, in the real database, there are hundreds of different form templates, all with different levels of nesting and different field names. So I won't be able to type those all out in a query. I think given the "id" field is consistent across all form templates, that should make it easier. But what i'm not sure about is how to handle all the different nesting.

You will probably need to have a version that you can make minor modifications to for each schema.  You can create a master schema map with an alignment to the required transformations.

jennratten
Super User
Super User

Hello - I have worked with simiar data structures and in my opinion, the most reliable solution uses the schema.  Depending on the depth of the json it can get pretty lucrative.  I have attached a sample for you that include many examples.

Helpful resources

Announcements
July 2024 Power BI Update

Power BI Monthly Update - July 2024

Check out the July 2024 Power BI update to learn about new features.

PBI_Carousel_NL_June

Fabric Community Update - June 2024

Get the latest Fabric updates from Build 2024, key Skills Challenge voucher deadlines, top blogs, forum posts, and product ideas.

Top Solution Authors
Top Kudoed Authors