I have multiple dataflows as part of an ETL chain.
I'm following this great pattern from @MatthewRoche here: https://ssbipolar.com/2019/10/07/quick-tip-factoring-your-dataflow-entities/
For each entity, I have at least 3 dataflows: Ingest, Cleanse, and Final.
This is a great setup, because it gives me injection points if I need to add new or change data midstream.
At this point, Ingest simply ingests. I have a few Ingest dataflows with Incremental Refresh enabled.
Cleanse converts data types: my data source stores some numerical IDs as strings instead of integers, so I convert those here.
Final (at this point) simply renames the columns to business-friendly names.
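For reference, the Cleanse and Final steps described above amount to roughly the following in M. This is only a sketch: the `Ingested` table, its column names (`cust_id`, `cust_nm`), and the sample rows are hypothetical stand-ins for whatever the Ingest dataflow actually exposes.

```powerquery
let
    // Hypothetical stand-in for the output of the Ingest dataflow;
    // column names and values are made up for illustration.
    Ingested = #table(
        type table [cust_id = text, cust_nm = text],
        {{"1001", "Contoso"}, {"1002", "Fabrikam"}}
    ),
    // Cleanse: the source stores numeric IDs as text, so cast them to whole numbers.
    Cleansed = Table.TransformColumnTypes(Ingested, {{"cust_id", Int64.Type}}),
    // Final: rename the columns to business-friendly names.
    Final = Table.RenameColumns(
        Cleansed,
        {{"cust_id", "Customer ID"}, {"cust_nm", "Customer Name"}}
    )
in
    Final
```

Written this way, both transformations are two steps in a single query, which is the "one dataflow" alternative the question below asks about.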
This is a great pattern--but I wonder if my minimal transformations require this many steps--or am I sacrificing performance by generating 3 different computed entities in this chain?
Should Ingest always just ingest, or if I'm simply casting and renaming columns--should I just use one dataflow? Does anyone have any recommendations in this space? Thanks.
EDIT:
I think I answered my question on Ingest with Incremental Refresh. When I enable Incremental Refresh, additional steps and queries are added ("_Canary", "RangeStart", and "RangeEnd") and a Table.SelectRows() step is added to my main query. This step is added last, so any other transformations will have to be performed first, meaning the Table.SelectRows() will not fold and all records will have to be downloaded before they can be filtered.
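To make the generated filter concrete, here is a minimal sketch of a date-range filter of that kind, written with `Table.SelectRows` in M. The `modified_on` column, the sample rows, and the half-open comparison are assumptions for illustration; in a real dataflow `RangeStart` and `RangeEnd` are separate parameter queries populated by the service, not local variables, and the exact generated expression may differ.

```powerquery
let
    // Local stand-ins for the RangeStart / RangeEnd parameters (assumed values).
    RangeStart = #datetime(2024, 1, 1, 0, 0, 0),
    RangeEnd = #datetime(2024, 2, 1, 0, 0, 0),
    // Hypothetical source table; a real query would read from the data source here.
    Source = #table(
        type table [cust_id = text, modified_on = datetime],
        {
            {"1001", #datetime(2023, 12, 31, 23, 0, 0)},
            {"1002", #datetime(2024, 1, 15, 8, 30, 0)}
        }
    ),
    // Keep only rows whose timestamp falls inside the refresh window.
    Filtered = Table.SelectRows(
        Source,
        each [modified_on] >= RangeStart and [modified_on] < RangeEnd
    )
in
    Filtered
```

Whether a filter like this is pushed down to the source depends on the connector and on which steps come before it, so it is worth checking the folding indicators in the query editor rather than assuming either way.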
EDIT2:
Although, I could have two queries in my dataflow: Customers_Ingest and Customers_Cleanse, where _Ingest is untransformed and has incremental refresh enabled, and _Cleanse references it and holds the transformations. Since these transformations are happening within the same dataflow though, I assume I wouldn't get the benefit of the enhanced compute engine.
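As a sketch of that two-query layout, the dataflow's M document could look something like the following. The query names come from the post; the `cust_id` column and the inline sample table are hypothetical placeholders for the real source.

```powerquery
section Section1;

// Untransformed query; incremental refresh would be configured on this one.
// The inline table is a placeholder for the real data source call.
shared Customers_Ingest = let
    Source = #table(type table [cust_id = text], {{"1001"}, {"1002"}})
in
    Source;

// References the query above and applies the type conversion.
shared Customers_Cleanse = let
    Source = Customers_Ingest,
    Typed = Table.TransformColumnTypes(Source, {{"cust_id", Int64.Type}})
in
    Typed;
```

The point of the split is that `Customers_Ingest` stays free of extra steps, so the service-generated range filter is the only transformation applied to it.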
Hi @jeffshieldsdev ,
If you need timely help, you could create a support ticket to get dedicated support from Microsoft. You can refer to the blog post on how to create one. I don't have much experience with ETL, so I'm sorry I couldn't help more.
Best Regards,
Xue Ding