PaulKn
Advocate II

Joining linked entities together from another workspace in a dataflow

I am trying to join two linked entities together in a dataflow. Entity A is 19 million rows, entity B is 30,000.

 

The dataflow refresh initially kept failing with what I assumed were memory issues: it worked with smaller volumes, but when I increased them I got the error message "There was a problem refreshing your dataflow". After I turned on the enhanced compute engine, it worked OK. At this testing phase, both linked entities were in the same workspace as the dataflow.

 

I am now trying to do the same thing but with entity B in a different workspace. Now it takes ages before failing again. Why does it perform differently when one of the linked entities is sourced from another workspace? Is it no longer using the enhanced compute engine for some reason? I can't see why it would work differently.
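
For reference, a join of the kind described in this question could be sketched in Power Query M roughly as follows. This is only an illustration under stated assumptions: the table contents and the column names (ActivityId, CustomerKey, CustomerName) are hypothetical, and the inline #table values stand in for the two linked entities, which in a real dataflow would be referenced by their query names instead. It does not show how the enhanced compute engine decides to handle the join.

```powerquery
let
    // Hypothetical stand-in for linked entity A (the large, 19-million-row entity)
    EntityA = #table(
        type table [ActivityId = Int64.Type, CustomerKey = Int64.Type],
        {{1, 10}, {2, 20}, {3, 10}}
    ),
    // Hypothetical stand-in for linked entity B (the small, 30,000-row entity)
    EntityB = #table(
        type table [CustomerKey = Int64.Type, CustomerName = text],
        {{10, "Contoso"}, {20, "Fabrikam"}}
    ),
    // Left outer join: keep every row of A and attach matching rows of B as a nested table
    Joined = Table.NestedJoin(
        EntityA, {"CustomerKey"},
        EntityB, {"CustomerKey"},
        "B", JoinKind.LeftOuter
    ),
    // Expand only the columns that are actually needed from B
    Expanded = Table.ExpandTableColumn(Joined, "B", {"CustomerName"})
in
    Expanded
```

In an actual computed entity, EntityA and EntityB would be replaced by references to the linked entity queries, so that the join runs against the already-loaded data rather than inline sample tables.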

3 REPLIES
jeffshieldsdev
Solution Sage

I'm literally working on this today.

I have some dataflow entities with 5m-151m records. These are activity records. When an "activity" is performed multiple times, my analyst users often only want the first activity, or the most recent activity, in their analysis. I'd like to create "unique" activity dataflows for these use cases.
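
As a rough sketch of what a "unique" activity entity might compute, the Power Query M below groups activity records and keeps the earliest and latest date per user and activity. The table contents and column names (UserId, Activity, ActivityDate) are hypothetical, and the inline #table stands in for a linked entity; this is one possible approach, not necessarily the one used in the dataflows described here.

```powerquery
let
    // Hypothetical stand-in for a linked "activity" entity
    Activities = #table(
        type table [UserId = Int64.Type, Activity = text, ActivityDate = date],
        {
            {1, "Login", #date(2024, 1, 5)},
            {1, "Login", #date(2024, 1, 2)},
            {2, "Login", #date(2024, 1, 3)}
        }
    ),
    // One row per user and activity, with the first and most recent occurrence
    Grouped = Table.Group(
        Activities,
        {"UserId", "Activity"},
        {
            {"FirstActivityDate", each List.Min([ActivityDate]), type date},
            {"MostRecentActivityDate", each List.Max([ActivityDate]), type date}
        }
    )
in
    Grouped
```

Keeping only the first or only the most recent date is a matter of dropping the aggregation that is not needed.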

I'm doing this now, within one Workspace. I have an "ingest" dataflow, a Linked Entity "final" dataflow, and an "unlinked" entity (uses Linked Entities but with "Load Enabled" unchecked) "unique" dataflow.

If I chain my "unique" dataflow by enabling load on the entities, "unique" will have to complete before "ingest" and "final" can finish. If "unique" fails, "ingest" will fail, and my data will have to be re-fetched from the data source, even if that step actually succeeded. This is why "unique" isn't chained.

Since "unique" isn't chained, I believe I'm not leveraging the Enhanced Dataflows Compute Engine (EDCE).

So today I'm going to try creating my "unique" dataflows in a separate Workspace where I can leave "Load Enabled" checked, hopefully engaging EDCE without making it a dependency on the ingest.

I will report back!

 

I have "Load enabled" ticked on all my linked entities. If you untick it then you no longer get the icon to indicate it is a linked entity and my understanding is that it does not use the enhanced compute engine. I did try unticking it at first as a way to hide my staging/transformation entities from the user, but the performance was very poor. It's a bit frustrating as it means not only do you have to separate out your ingestion into a separate dataflow but also the joining/transformation into a separate dataflow as well. 

 

If my linked entity is from another workspace I do still get the "linked entity" icon.

 

With regard to Source = PowerBI.Dataflows() vs Source = PowerBI.Dataflows([IncludeGroups = false, SourceDataflowId = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"]), I don't believe that makes a difference. When I first add a linked entity it produces the latter M code, but when I save/close and re-open the dataflow, it reverts to the former code.

Also, does your Power Query M look like this:

let
    Source = PowerBI.Dataflows(),
    ...

Or like this:

let
    Source = PowerBI.Dataflows([IncludeGroups = false, SourceDataflowId = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"]),
    ...

I'm not sure, but I've wondered if the latter has an impact on the Linked Entity functionality.
