1. I want to connect to Salesforce and get the Opportunities table AS-IS.
2. I want to connect to the table from step 1 and do all kinds of complex transformations with Power Query.
My task is to make the API request to Salesforce JUST ONCE. The rest of the load should be handled inside Microsoft capacity, so I want to separate the two steps as efficiently as possible.
(When I do everything inside ONE Dataflow, the second step sends API requests to Salesforce again and again, so that is not an option.)
I have created 2 dataflows, and the second one refers to the first (this created a "linked" entity with enhanced mode, whatever that means). I have also created a Power Automate flow so that the second dataflow refreshes after the first one has finished.
But the second one still takes a long time to run, and I wonder whether it still propagates the API calls to Salesforce because of the "linked entity".
I am thinking of testing an alternative:
Create a Gen2 Dataflow and save the initial "AS-IS" table in a Warehouse or Lakehouse.
Then connect the second Dataflow to it, and schedule everything with a Pipeline.
Should there be differences in this approach?
Or is it more or less the same as just separating two Gen1 dataflows?
What does the theory say?
Hi @iBusinessBI
Glad that your query got resolved. Please continue using Fabric Community for any help regarding your queries.
Hi BusinessBI!
My name is Jeroen Luitwieler and I am a Senior Product Manager on the Dataflow/data integration team.
Looking at your case you may want to consider the following:
Steps:
Would this cover your use case? Do you have any questions or concerns about this solution?
Thanks, @LuitwielerMSFT Jeroen,
That's interesting.... Some thoughts:
1. Where does the Web activity save its output? (What is the source for the "Ingest to lakehouse" DF?)
2. What is the advantage of saving the intermediate results to a Lakehouse?
I could just build 2 regular Gen1 Dataflows:
- the first connects to Salesforce and brings in the whole table in raw format, as-is.
- the second DF connects to the first and does all the necessary transformations.
So what advantage do I get from working with the Lakehouse and Gen2 DF, instead of just 2 regular Gen1 DFs?
Thanks
Thanks for the follow up!
Hope this clarifies 😁
The theory tells you that the 2nd way of working is good. It is common for data to be loaded into a DWH as-is and only transformed afterwards. The goal of this approach is often to avoid holding the connection to the source open for too long and to make debugging easier.
But isn't that exactly what two separate Gen1 Dataflows can achieve, without the need for a DWH?
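To make the "load as-is, transform afterwards" idea concrete, here is a minimal Python sketch of the pattern. It is not Fabric or Salesforce code: `fetch_opportunities` is a hypothetical stand-in for the API request, a local JSON file stands in for the Lakehouse/Warehouse table, and the sample rows and the two transformations are invented for illustration. The point is only that every transformation reads the staged copy, so the source is hit once.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for the Salesforce API; counts how often the source is hit.
source_calls = 0


def fetch_opportunities():
    """Pretend API request returning the raw Opportunities rows (made-up data)."""
    global source_calls
    source_calls += 1
    return [
        {"Id": "001", "Amount": "1500.50", "StageName": "Closed Won"},
        {"Id": "002", "Amount": "300.00", "StageName": "Prospecting"},
        {"Id": "003", "Amount": "700.25", "StageName": "Closed Won"},
    ]


def ingest_raw(staging_file):
    """Step 1: land the table as-is. This is the only step that touches the source."""
    staging_file.write_text(json.dumps(fetch_opportunities()))


def load_staged(staging_file):
    """Read the staged copy; this never calls the source."""
    return json.loads(staging_file.read_text())


def won_total(staging_file):
    """Step 2a: one transformation, computed from the staged copy."""
    rows = load_staged(staging_file)
    return sum(float(r["Amount"]) for r in rows if r["StageName"] == "Closed Won")


def count_by_stage(staging_file):
    """Step 2b: another transformation, also computed from the staged copy."""
    counts = {}
    for r in load_staged(staging_file):
        counts[r["StageName"]] = counts.get(r["StageName"], 0) + 1
    return counts


# The temporary file plays the role of the Lakehouse/Warehouse table.
staging = Path(tempfile.mkdtemp()) / "opportunities_raw.json"
ingest_raw(staging)
total = won_total(staging)
stages = count_by_stage(staging)
print(total, stages, source_calls)
```

However many transformations are added on top of the staged file, `source_calls` stays at 1, and the raw file can be inspected on its own when a transformation misbehaves.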
Hi @iBusinessBI
Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.
Thanks