I am starting to work with Dataflow Gen 2.
I get data from an API and I want to save it in a destination for later use.
One of my columns contains "List"-type values, and I want to KEEP IT COLLAPSED AS-IS, without expanding it, for later consumption by another dataflow I'll build later.
How can I achieve this?
@miguel is right. The best approach is to use Json.FromValue. You can use Text.FromBinary(Json.FromValue([ListColumn])) to store it as a JSON text string. Your consumers can then use Json.Document([JSONasText]) to parse the JSON and get the lists, records, etc. back.
Pat
This blog post has a lot of examples on how you can use Json.FromValue to do this: https://blog.crossjoin.co.uk/2016/10/07/generating-json-in-power-bi-and-power-query/
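For a concrete illustration, here is a minimal, self-contained sketch of that round trip (the sample nested value and step names below are invented for the example, not taken from the thread):

let
    // A hypothetical nested value, similar to a collapsed "List" column returned by an API
    SampleList = { [id = 1, tags = {"a", "b"}], [id = 2, tags = {"c"}] },
    // Serialize the nested value into a JSON text string that a table destination can store
    JSONasText = Text.FromBinary(Json.FromValue(SampleList)),
    // In the consuming dataflow, parse the text back into lists and records
    ParsedBack = Json.Document(JSONasText)
in
    ParsedBack

In a real query you would typically wrap the serialization in Table.AddColumn, for example Table.AddColumn(Source, "JSONasText", each Text.FromBinary(Json.FromValue([ListColumn]))), where Source and ListColumn stand in for your own query and column name.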
Hi @iBusinessBI ,
Thank you for using Fabric Community
There are several ways to save a "List" column for later use in a Fabric Dataflow Gen2 destination.
Remember: There is no one-size-fits-all solution, and the best approach will depend on your specific data, transformations, and preferences.
Hope this is helpful. Please let me know in case of any queries.
Hi @Anonymous
I've tried to use toJSON([My List Field]), but I can't find such a function at all in Power Query / Dataflows.
What am I missing?
Did you try this function yourself?
I just get an error:
"The import toJSON matches no exports. Did you miss a module reference?"
Hi @iBusinessBI ,
Apologies for the confusion, I rechecked your ask once again.
Could you please check whether the steps below help you?
Scenario: you can convert the list of values into comma-separated text in a new column, for example:
Text.Combine([Column Name], ",")
After this step the list is stored as comma-separated text, which you can then save to any destination.
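As a rough sketch of that suggestion, assuming the list holds scalar values (the table and column names below are invented for the example):

let
    // Hypothetical source table with a simple list column
    Source = Table.FromRecords({
        [Id = 1, Tags = {"red", "green"}],
        [Id = 2, Tags = {"blue"}]
    }),
    // Add a text column by joining each list with commas;
    // Text.From converts numbers or other scalars to text before combining
    AddedText = Table.AddColumn(
        Source,
        "TagsText",
        each Text.Combine(List.Transform([Tags], Text.From), ","),
        type text)
in
    AddedText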
Hope this is helpful. Please let me know in case of any further queries.
Sorry, no, it doesn't help.
Look at my original screenshot. My list is not composed of comma-separated text strings; it is a COMPLEX list, which contains RECORDS that contain lists, which contain more records, and so on...
So Text.Combine doesn't work.
Hey! Please try the function below:
https://learn.microsoft.com/en-us/powerquery-m/json-fromvalue
The concept is the same as the one mentioned before: transform a complex value into something that can be stored in a destination, such as a text string.
Below is some sample code to demonstrate how this might work:
let
    // A sample list containing both scalar values and a nested list
    SampleList = {1..20, {1..2}},
    // Serialize the list to JSON and read the result back as lines of text
    Transformation = Lines.FromBinary(Json.FromValue(SampleList))
in
    Transformation
Note: the main issue you might run into is that the resulting text string can be too large for the fields of your destination; if a column in the destination table can only hold a maximum of X characters and your string is longer than that, the value will cause issues. It can also be quite resource intensive to evaluate a complex value and convert it into a text string, because Power Query evaluates values lazily (only when they are needed), and serializing forces the whole value to be evaluated.
Hi @iBusinessBI ,
At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.
Thank you
Hi @iBusinessBI ,
We haven't heard back from you on the last response and wanted to check whether you have found a resolution yet. If you have, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.
I have managed to create a string field in the Dataflow, but when I save it to a Lakehouse or a Warehouse it gets truncated (limited to 7600 characters), while in the Dataflow I can see the whole text (19000 characters). How can I save the whole text?
If you're hitting a limit on the maximum length of a text value somewhere, then I guess the only alternative is to split the text across multiple rows (something like this https://blog.crossjoin.co.uk/2019/05/19/storing-large-images-in-power-bi-datasets/). This will cause complications downstream though. Why do you need to retain the nested structures? I think at this point it would be easier to expand them in the dataflow.
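A minimal sketch of that splitting idea (the 4000-character chunk size and the sample text below are assumptions for illustration, chosen to stay under the observed destination limit):

let
    // Stand-in for the long JSON string built earlier in the query
    JsonText = Text.Repeat("x", 19000),
    // Chunk size assumed to fit comfortably within the destination's text limit
    ChunkSize = 4000,
    ChunkCount = Number.RoundUp(Text.Length(JsonText) / ChunkSize),
    // One record per chunk; Text.Middle simply returns fewer characters for the final chunk
    Chunks = List.Transform(
        {0 .. ChunkCount - 1},
        (i) => [ChunkIndex = i, ChunkText = Text.Middle(JsonText, i * ChunkSize, ChunkSize)]),
    Result = Table.FromRecords(Chunks)
in
    Result

The consuming dataflow can then sort by ChunkIndex, reassemble the value with Text.Combine, and parse it with Json.Document.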
I am working with an API source. I want to run the query once a day and save the results in a Data Warehouse; the next day, I will call the API again for that day's data...
A dataflow by itself doesn't allow me to work incrementally with an API source, so I need to save the results in a Warehouse / Lakehouse. How can I store the whole text in the Warehouse / Lakehouse?
Yes - but my point is, why don't you expand the nested structures (by clicking on the icon with the two arrows in the column header) and avoid the problem that way?
Two reasons:
1. When I expand the columns (I have a lot), the Dataflow sends multiple API requests to my API endpoint, and the API has a rate limit: it only allows X calls a minute, and I don't have a convenient way to manage this in a Dataflow. So I need to be very careful to make as few requests as I can.
2. The second reason: I want to save ALL the information I get from the API call for later use. I don't want to limit myself to only the columns I expand; maybe later I'd like to expand more columns (I have tens of columns there to expand, a lot of information).
So, ideally, I should be able to run this API call once a day and save the whole result as a simple JSON file with a timestamp into a Lakehouse. Then I could connect to all these files from another Dataflow and expand all the fields I want without being afraid of going over the API source's rate limit.
The first problem is that a Dataflow currently does not allow saving JSON files into a Lakehouse, only tables.
And the second problem, for tables, is that there is apparently a limit on the size of text columns.
So I am kind of stuck here.
1) Does the dataflow send multiple requests when you have the Query Editor open or when you refresh the dataflow? If it's just when the Query Editor is open that's something you may be able to work with.
2) It is possible to write code to expand all the columns (this is an old post, but you get the idea: https://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/), and if the issue in (1) doesn't happen when you refresh, it could be worthwhile to do this; a rough sketch of that idea follows below.
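A minimal sketch of that kind of helper, assuming the nested values are records (the helper name is invented; real data with nested lists would also need Table.ExpandListColumn):

let
    // Hypothetical helper in the spirit of the linked post: repeatedly expand the first
    // record-typed column until none remain. Assumes the table is non-empty and that the
    // first row is representative of each column's contents.
    ExpandAllRecords = (t as table) as table =>
        let
            FirstRow = Table.First(t),
            RecordColumns = List.Select(
                Table.ColumnNames(t),
                each Value.Is(Record.Field(FirstRow, _), type record)),
            Result =
                if List.IsEmpty(RecordColumns) then
                    t
                else
                    let
                        ColumnName = List.First(RecordColumns),
                        FieldNames = Record.FieldNames(Record.Field(FirstRow, ColumnName)),
                        NewNames = List.Transform(FieldNames, each ColumnName & "." & _),
                        Expanded = Table.ExpandRecordColumn(t, ColumnName, FieldNames, NewNames)
                    in
                        @ExpandAllRecords(Expanded)
        in
            Result,
    // Tiny usage example with one nested record column
    Sample = Table.FromRecords({ [Id = 1, Detail = [Name = "a", Score = 10]] }),
    Output = ExpandAllRecords(Sample)
in
    Output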
Thanks, but I do not want to expand the columns at this stage.
Just to get the whole JSON Text from the collapsed List and save it as such - Text.
I just want to deal with all the expands later. This is to separate the "data gathering" stage from the "data transformation" stage.
This separation is critical:
1. At the "Gathering Stage" I just need a simple API call that gets me everything as a "raw" JSON List column. I don't want any transformation (apart from extracting the whole JSON as text). I want as few transformations as possible because the API is very, very sensitive to its rate limit.
So I just want to "dump" everything AS-IS to the Lakehouse.
2. Later, I'll connect to this raw data and transform it as I please.
I think my scenario is actually very common.
I tried to do it with the Fabric Pipeline API action, but it is far from being as friendly as a Dataflow, and it also doesn't support gateways (which my API requires).
So either the Dataflow should allow saving JSON files to the Lakehouse, and not just tables, or tables in the Lakehouse should be able to store text columns without a character limit...
Perhaps what you're looking for is to save the full response from the API as a binary, save it to the lakehouse as a binary and then consume the binaries through a different dataflow.
If this is in line with your expectation, I'd suggest raising a new idea on the Fabric Ideas site.
You could also post an idea for the Lakehouse team to increase the maximum string size. As mentioned before, this is not a limitation of the Dataflow engine, but rather a limitation of the destination, which cannot store a string longer than a certain number of characters.
All of the previously shared solutions are the ways that you can handle things in Dataflows Gen2 today.
So this is a limitation on the Lakehouse/Warehouse end.
I've opened an idea so they increase the Text field size.
https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=fd452033-4bb2-ee11-92bd-000d3a0d7518
Please vote
Hi @iBusinessBI ,
We haven't heard back from you on the last response and wanted to check whether you have found a resolution yet. If not, please reply and we will follow up with more details and try to help.