iBusinessBI
Kudo Collector

How to save a "List" column for later use?

I am starting to work with Dataflow Gen 2.

I get data from an API and I want to save it in a destination for later use.

One of my columns contains "List"-type values, and I want to KEEP IT COLLAPSED AS-IS, without expanding it, for later consumption by another dataflow I'll build.

How can I achieve this?

[screenshot: iBusinessBI_0-1703608688629.png]

 


19 REPLIES
ppm1
Solution Sage

@miguel is right. The best approach is to use Json.FromValue. You can use Text.FromBinary(Json.FromValue([ListColumn])) to store it as a JSON text string. Your consumers can then use Json.Document([JSONasText]) to parse the JSON and get the lists, records, etc. back.
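For example, here's a minimal sketch of that round trip (the table and column names are made up for illustration):

let
  // Sample table with a nested list column (illustrative data)
  Source = #table({"ID", "ListColumn"}, {{1, {1, 2, 3}}, {2, {"a", "b"}}}),
  // Serialize each list to a JSON text string so it can be stored in a destination
  Serialized = Table.TransformColumns(Source, {{"ListColumn", each Text.FromBinary(Json.FromValue(_)), type text}}),
  // A later dataflow can parse the JSON text back into lists/records
  Deserialized = Table.TransformColumns(Serialized, {{"ListColumn", each Json.Document(_)}})
in
  Deserialized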

 

[screenshot: ppm1_0-1704421736932.png]

Pat

Microsoft Employee
cpwebb
Microsoft Employee

This blog post has a lot of examples of how you can use Json.FromValue to do this: https://blog.crossjoin.co.uk/2016/10/07/generating-json-in-power-bi-and-power-query/

Anonymous
Not applicable

Hi @iBusinessBI ,

Thank you for using Fabric Community.

 

Here are some methods to save a "List" column for later use in MS Fabric Dataflow Gen2:

  1. Serialize to JSON:
    Within the Dataflow: use the toJSON() function to convert the "List" column into a JSON string, and create a new column to store it.
    Destination sink: choose a sink that accepts JSON strings, such as Data Lake Storage Gen2 or Azure SQL Database.
    Later use: in the subsequent Dataflow, use fromJSON() to deserialize the JSON string back into a list.

  2. Custom sink (if necessary):
    Development: create a custom sink specifically designed to handle the "List" column in its desired format.
    Integration: integrate this custom sink into your Dataflow pipeline.

 

Remember: There is no one-size-fits-all solution, and the best approach will depend on your specific data, transformations, and preferences.

Hope this is helpful. Please let me know in case of any queries.

Hi @Anonymous 
I've tried to use toJSON([My List Field]) but I can't find such a function at all in Power Query / Dataflows.

What am I missing?
Did you try this function yourself?
I just get an error:
"The import toJSON matches no exports. Did you miss a module reference?

Details
Reason = Expression.Error"
Anonymous
Not applicable

Hi @iBusinessBI ,

Apologies for the confusion; I've re-checked your ask.
Could you please check whether the steps below help?

Scenario -

[screenshot: vgchennamsft_1-1703838033329.png]


You can convert the list of values to comma-separated values in a new column:

[screenshot: vgchennamsft_2-1703838136649.png]

Text.Combine([Column Name], ",")


After saving, you will be able to convert the list into comma-separated values and store it in any destination.

[screenshot: vgchennamsft_3-1703838225347.png]


Final Destination -

[screenshot: vgchennamsft_4-1703838305906.png]
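A minimal sketch of the same idea, with an assumed table shape (note that Text.Combine requires the list items to already be text):

let
  // Illustrative table with a simple list column
  Source = #table({"ID", "Items"}, {{1, {"a", "b", "c"}}, {2, {"x", "y"}}}),
  // Add a text column that joins each list with commas
  Combined = Table.AddColumn(Source, "ItemsText", each Text.Combine([Items], ","), type text)
in
  Combined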


Hope this is helpful. Please let me know in case of any further queries.

Sorry, no, it doesn't help.

Look at my original screenshot. My List is not composed of comma-separated text strings; it is a COMPLEX List, which contains RECORDS that contain Lists, which contain more Records, etc.

So Text.Combine doesn't work.

Hey! Please try the function below:

https://learn.microsoft.com/en-us/powerquery-m/json-fromvalue

The concept is the same as the one mentioned before: transform a complex value into something that can be stored in a destination, such as a text string.

 

Below is sample code to demonstrate how this might work:

let
  // A sample nested list to serialize
  SampleList = {1..20, {1..2}},
  // Serialize the list to JSON binary, then decode it into text lines
  Transformation = Lines.FromBinary(Json.FromValue(SampleList))
in
  Transformation

Note: One situation you might encounter is text strings that are too large for the fields of your destination: if a field in a table of your data destination can only hold a maximum of X characters and your string has more than X, that could cause issues. It might also be quite resource-intensive to fully evaluate a complex value and transform it into a text string, because Power Query evaluates lazily: values are only computed when needed.
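As a rough guard against such truncation, you could measure the serialized length before loading; a sketch along these lines, where the JsonText column is hypothetical:

let
  // Hypothetical table that already holds serialized JSON text
  Source = #table({"JsonText"}, {{Text.Repeat("x", 9000)}}),
  // Measure each string so rows at risk of truncation can be spotted
  WithLength = Table.AddColumn(Source, "JsonLength", each Text.Length([JsonText]), Int64.Type)
in
  WithLength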

Anonymous
Not applicable

Hi @iBusinessBI ,

At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.

Thank you

Anonymous
Not applicable

Hi @iBusinessBI ,

We haven't heard from you since the last response and were just checking back to see if you have a resolution yet. If you have found one, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.

I have managed to create a string field in the Dataflow, but when I save it to a Lakehouse or a Warehouse it gets truncated (limited to 7,600 characters), while in the Dataflow I see the whole text (19,000 characters). How can I save the whole text?

If you're hitting a limit on the maximum length of a text value somewhere, then I guess the only alternative is to split the text across multiple rows (something like this: https://blog.crossjoin.co.uk/2019/05/19/storing-large-images-in-power-bi-datasets/). This will cause complications downstream, though. Why do you need to retain the nested structures? I think at this point it would be easier to expand them in the dataflow.
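A minimal sketch of that splitting idea, assuming a 7,000-character chunk size is safe for the destination:

let
  // A long string standing in for the serialized JSON (20,000 characters here)
  LongText = Text.Repeat("abcdefghij", 2000),
  ChunkSize = 7000,
  ChunkCount = Number.RoundUp(Text.Length(LongText) / ChunkSize),
  // One record per chunk; the Index lets a consumer reassemble the parts in order
  Chunks = List.Transform({0..ChunkCount - 1}, each [Index = _, Part = Text.Middle(LongText, _ * ChunkSize, ChunkSize)]),
  AsTable = Table.FromRecords(Chunks, type table [Index = Int64.Type, Part = text])
in
  AsTable

A consumer could then sort by Index and reassemble the text with Text.Combine.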

I am working with an API source. I want to run it once a day and save the results in a Data Warehouse. The day after, I will run the API for a different day....
Dataflow itself doesn't allow me to work incrementally with an API source, so I need to save the results in a Warehouse / Lakehouse. How can I store the whole text in a Warehouse / Lakehouse?

Yes - but my point is, why don't you expand the nested structures (by clicking on the icon with the two arrows in the column header) and avoid the problem that way?

Two reasons:

1. When I expand the columns (I have a lot), the Dataflow sends multiple API requests to my API endpoint, and the API has a rate limit: it only allows X calls a minute, and I don't have a convenient way to manage this in Dataflow. So I need to be very careful to make as few requests as I can.

2. The second reason: I want to save ALL the information I get from the API call for later use. I don't want to limit myself to only the columns I expand; maybe later I'd like to expand more columns (I have tens of columns there to expand - a lot of information).

So, ideally, I should be able to run this API call once a day and save the whole result as a simple JSON file with a timestamp into a Lakehouse. Then I could connect to all these files from another Dataflow and expand all the fields I want without being afraid of going over the rate limit of the API source.
The first problem is that Dataflow currently does not allow saving JSON files into a Lakehouse, only tables.
And the second problem, for tables: there is apparently a length limit on text columns.
So I am kind of stuck here.
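For reference, the "gathering" query could be as small as this sketch (the endpoint URL is hypothetical, and authentication/gateway details are omitted):

let
  // Hypothetical endpoint; a single call per refresh stays within the rate limit
  Raw = Web.Contents("https://api.example.com/data"),
  // One row per run: load timestamp plus the raw payload kept as text
  Result = #table(type table [LoadedAt = datetime, RawJson = text], {{DateTime.LocalNow(), Text.FromBinary(Raw)}})
in
  Result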

1) Does the dataflow send multiple requests when you have the Query Editor open, or when you refresh the dataflow? If it's just when the Query Editor is open, that's something you may be able to work with.

2) It is possible to write code to expand all the columns (this is an old post but you get the idea: https://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/), and if the issue in (1) doesn't happen when you refresh, it could be worthwhile to do this.
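In that spirit, here is a much-simplified sketch of a recursive "expand all" helper (unlike the linked post's fuller version, this one only handles record columns, and it assumes expanded field names don't collide with existing column names):

let
  // Sample nested data, an assumed shape for illustration
  Source = #table({"ID", "Rec"}, {{1, [a = 1, b = [c = 2]]}, {2, [a = 3, b = [c = 4]]}}),
  // Recursively expand record columns until none remain
  ExpandAll = (t as table) as table =>
    let
      FirstRow = Table.First(t),
      RecordCols = List.Select(Table.ColumnNames(t), each Value.Is(Record.Field(FirstRow, _), type record)),
      Result = if List.IsEmpty(RecordCols) then t
               else @ExpandAll(Table.ExpandRecordColumn(t, RecordCols{0}, Record.FieldNames(Record.Field(FirstRow, RecordCols{0}))))
    in
      Result
in
  ExpandAll(Source)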

Thanks, but I do not want to expand the columns at this stage.
Just to get the whole JSON text from the collapsed List and save it as such - text.

I just want to deal with all the expands later. This is to separate the "data gathering" stage from the "data transformation" stage.

This separation is critical:

1. At the "Gathering Satge" I just need a simple API call that gets me everything as a "Raw" JSON List column. I don't want any transformation (apart of expracting the whole JSON as a text). I want as little transformations because the API is very very very sensitive to Rate Limit.

So I just want to "dump" everything AS-IS to Lakehouse.

2. Later, I'll connect to this Raw data and transform it as i please.

I think my scenario is very common, actually.

I tried to do it with the Fabric Pipeline API action, but it is far from being as friendly as Dataflow, and it also doesn't support Gateways (which my API requires).

So either the Dataflow should allow saving JSON files to the Lakehouse, and not just tables.

Or tables in the Lakehouse should be able to store text columns without a character limit....

Perhaps what you're looking for is to save the full response from the API to the lakehouse as a binary, and then consume the binaries through a different dataflow.
If this is in line with your expectation, I'd suggest raising a new idea using the link below:

 https://aka.ms/FabricIdeas

 

You could also post an idea for the lakehouse team to increase the maximum string size. As mentioned before, this is not a limitation of the Dataflow engine, but rather a limitation of the destination, which cannot store a string beyond a certain number of characters.

All of the previously shared solutions are the ways that you can handle things in Dataflows Gen2 today:

  • Expand the complex values, as those data types are not supported in the destination that you're mentioning.
  • Transform the complex values to a text string and save them as a string.
    • Furthermore, you could encode the string so it becomes shorter and save it that way. Encoding could also make the number of characters expand tremendously, so it might not be a one-size-fits-all solution.

So this is a limitation on the Lakehouse/Warehouse end.

I've opened an idea asking them to increase the text field size.

https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=fd452033-4bb2-ee11-92bd-000d3a0d7518

Please vote

Anonymous
Not applicable

Hi @iBusinessBI ,

We haven't heard from you since the last response and were just checking back to see if you have a resolution yet. Otherwise, we will respond with more details and try to help.
