Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
ElliotP
Post Prodigy
Post Prodigy

Nested JSON and never end Records

Afternoon,

 

I have a json file (https://1drv.ms/u/s!At8Q-ZbRnAj8hkJLL1cyU4t_hoHC) which has many nested arrays and I'm unsure of how to extract all of the records into powerbi.

 

I've watch the guyinacube video (incredibly helpful), but I get to a point where I can't expand my columns any further yet I need the data inside that record (it's actually the data I want).

 

First step: https://gyazo.com/86e411d5de9d7f2f52813f6f66cb3bf9

Convert to table, easy it converts out to this.

 

Second Step: https://gyazo.com/546206f9a6b0459a4a09b2c865c3d902

Expand out, again comfortable. Expand again.

 

Third Step (issue): https://gyazo.com/ffe36ce86fa91f4ae0a435c524d026e5

The column on the far right is the colum which contains the data that i want, but I'm unable to retrieve the data from that column without individually clicking on the links.

 

Any ideas as to how I would be to produce rows or tables from this?

1 ACCEPTED SOLUTION

Here is the final result that you can follow

 

let
    json= Json.Document(File.Contents("D:\Downloads\Xero Datapowerbiforum")),
    json_tab = Table.FromList(json, Splitter.SplitByNothing()),
    expand_1 = Table.ExpandRecordColumn(json_tab, "Column1", {"JournalID", "JournalDate", "JournalNumber", "CreatedDateUTC", "SourceID", "SourceType", "JournalLines"}),
    expand_2 = Table.ExpandRecordColumn(expand_1, "JournalLines", {"JournalLine"}),
    journal_line_transform = Table.TransformColumns(expand_2, {"JournalLine", each if _ is record then {_}  else _}),
    expand_3 = Table.ExpandListColumn(journal_line_transform, "JournalLine"),
    expand_4 = Table.ExpandRecordColumn(expand_3, "JournalLine", {"JournalLineID", "AccountID", "AccountCode", "AccountType", "AccountName", "Description", "NetAmount", "GrossAmount", "TaxAmount", "TaxType", "TaxName"}, {"JournalLineID", "AccountID", "AccountCode", "AccountType", "AccountName", "Description", "NetAmount", "GrossAmount", "TaxAmount", "TaxType", "TaxName"})
in
    expand_4

 

View solution in original post

10 REPLIES 10
astoz
Frequent Visitor

Hello, this post is really useful thanks, It helped me solving an issue I had

I'd like to check my understanding of this piece of code in the solution:

if _ is record then {_} else _}

 

Am I right in assuming that here it says if current value is record then store it in an object else go to the next value? I'm asking cause I'm not yet entirely clear on what _ (underscore) means in Power Query

 

ImkeF
Super User
Super User

Yes, normally you would create a function that opens and transforms one (sample) record and apply this in for each cell by calling it from a new column.

But this would only work, if all items have the same structure. This doesn't seem to be the case for your data here, as there are records and lists in them. So you would at least need two different functions and apply them using a conditional statement in the new column.

Or do you know in advance that some of these rows will not be needed and can then filter them out before the expansion?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

I know all of the records will have the same structure (columns names and data types). Would it be best to somehow create their own table and expand if possible somehow there?

Hi @ElliotP,

yes, it's very unlikely that it makes sense to expand 2 different table structures into one column.

So you can add a column to your current query that you can use as a filter wit this:  

 

Value.Is([Column1.JournalLines.JournalLine], type table)

 

This checks if the type of the value is table and returns true or false.

Then you reference this table once and set a filter on true, and in a second query a filter on false.

 

Check out @hugoberry's function, it is very promising.

 

@hugoberry: Kudos to your function repository! Yesterday I've discovered your Matrix-multiplication - that's a real beauty! How do you do the indentation in your code? Manually or do you use a program?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

hugoberry
Responsive Resident
Responsive Resident

Hey @ImkeF thanks for mentioning that 😉

More posts/queries are coming. I do my identation manually, because I still can't make my visual studio code extension to work properlly. Once I get one in a more via ble stage I'll give you a heads up 😉

Thx @hugoberry can't wait (for both) 😉

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

@hugoberryYou are a superstar. Thank you so much, I can't thank you enough, this is amazing.

 

@ImkeFThank you for your help, I really the time and the ideas. I liked our ideas of breaking the tables apart and linking.

hugoberry
Responsive Resident
Responsive Resident

Hi @ElliotP, my initial reply was quite generic. Only now I had a chance to look at your JSON.

I've managed to drill down to the data that you were after. If I understand right the format of your data, at the step where the column becomes either a list or a record you have to apply a transofrmation of cell contents and cast them into a list, and then use standard expand procedures to expand the list values in the table.

Here is the step that will help:

= Table.TransformColumns(step, {"Column1.JournalLines.JournalLine", each if _ is record then {_}  else _})

Here is the final result that you can follow

 

let
    json= Json.Document(File.Contents("D:\Downloads\Xero Datapowerbiforum")),
    json_tab = Table.FromList(json, Splitter.SplitByNothing()),
    expand_1 = Table.ExpandRecordColumn(json_tab, "Column1", {"JournalID", "JournalDate", "JournalNumber", "CreatedDateUTC", "SourceID", "SourceType", "JournalLines"}),
    expand_2 = Table.ExpandRecordColumn(expand_1, "JournalLines", {"JournalLine"}),
    journal_line_transform = Table.TransformColumns(expand_2, {"JournalLine", each if _ is record then {_}  else _}),
    expand_3 = Table.ExpandListColumn(journal_line_transform, "JournalLine"),
    expand_4 = Table.ExpandRecordColumn(expand_3, "JournalLine", {"JournalLineID", "AccountID", "AccountCode", "AccountType", "AccountName", "Description", "NetAmount", "GrossAmount", "TaxAmount", "TaxType", "TaxName"}, {"JournalLineID", "AccountID", "AccountCode", "AccountType", "AccountName", "Description", "NetAmount", "GrossAmount", "TaxAmount", "TaxType", "TaxName"})
in
    expand_4

 

hugoberry
Responsive Resident
Responsive Resident

Hi @ElliotP I've tried to come up with an algortihm to do the extraction automatically for a JSON object. See maybe you can use this function for your particular JSON

https://gist.github.com/Hugoberry/4ad49f4301edf47fffe2ef06aed61513

 

 

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.