I've been experimenting with what happens when you have a dataflow in workspace A and a dataflow in workspace B that contains a linked table referencing the dataflow in workspace A. What I've done and what I've found is:
The documentation on linked tables clearly states that 'linked tables simply point to the tables in other dataflows, and don't copy or duplicate the data'. However, the referencing dataflow in workspace B returns an older version of the data until that dataflow is refreshed, so we can surmise one of two things is happening:
I understand that downstream queries might need a refresh, but based on the documentation I would expect linked tables to simply show the updates in the source. There are no efficiency benefits and, as other posts have pointed out, using linked tables requires dataflow B users to have access to dataflow A as well (preventing us from creating a 'presentation layer' of dataflows). So why would anyone use linked entities in a different workspace, rather than simply disabling the load and creating a query like 'let Source = LinkedTable in Source' to provide that data?
Your observations raise some interesting points about the behavior of linked tables in Power BI dataflows across different workspaces. Let's address your findings and concerns:
1. **Documentation Accuracy**: It's possible that the documentation on linked tables may not fully capture the nuances of their behavior, particularly when used across workspaces. Microsoft's documentation is generally reliable, but there may be scenarios or edge cases where the behavior deviates from what's described.
2. **Data Copy vs. Pointer**: The behavior you observed could suggest that dataflow B might be maintaining its own copy of the data from dataflow A, rather than just pointing to it. This would contradict the documentation's statement that linked tables don't copy or duplicate the data. However, without access to the underlying implementation details of Power BI's dataflows, it's challenging to conclusively determine the exact mechanism at play.
3. **Versioning and Refresh**: Another possibility is that dataflow B is indeed using a pointer to dataflow A, but it's referencing a specific version or snapshot of dataflow A. This would explain why refreshing dataflow A doesn't automatically propagate changes to dataflow B until the latter is refreshed explicitly.
4. **Usage and Best Practices**: Given the complexities and potential limitations of linked tables, it's worth considering whether they're the most appropriate solution for your scenario. As you noted, there may be little efficiency benefit compared to directly querying the source dataflow. Additionally, managing access permissions across multiple workspaces can add complexity.
5. **Alternative Approaches**: Disabling the load of linked entities and creating custom queries to reference dataflows directly may indeed offer more flexibility and control over data access and refresh behavior. This approach allows you to explicitly manage the flow of data and updates without relying on the behavior of linked tables.
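To make the reasoning in points 2 and 3 concrete, here is a minimal toy model in Python. It is not how Power BI implements linked tables; the class names (`SourceDataflow`, `LivePointer`, `CopyOnRefresh`, `VersionPointer`) and the `observe` helper are hypothetical and exist only to compare what each hypothesis would predict for the experiment described in the question.

```python
from dataclasses import dataclass, field


@dataclass
class SourceDataflow:
    """Toy stand-in for dataflow A: each refresh stores a new version of the table."""
    versions: list = field(default_factory=list)

    def refresh(self, rows):
        self.versions.append(list(rows))

    def latest(self):
        return self.versions[-1]


class LivePointer:
    """Hypothesis: the linked table always reads A's latest data."""
    def __init__(self, source):
        self.source = source

    def refresh(self):
        pass  # nothing is stored on the B side

    def read(self):
        return self.source.latest()


class CopyOnRefresh:
    """Hypothesis: B keeps its own copy of the rows, taken when B refreshes."""
    def __init__(self, source):
        self.source = source
        self.rows = []

    def refresh(self):
        self.rows = list(self.source.latest())

    def read(self):
        return self.rows


class VersionPointer:
    """Hypothesis: B stores no rows, only which version of A it points at."""
    def __init__(self, source):
        self.source = source
        self.version = None

    def refresh(self):
        self.version = len(self.source.versions) - 1

    def read(self):
        return self.source.versions[self.version]


def observe(linked_cls):
    """Replay the experiment: refresh A, refresh B, refresh A again, read B, refresh B, read B."""
    a = SourceDataflow()
    a.refresh(["old"])
    b = linked_cls(a)
    b.refresh()
    a.refresh(["new"])
    before_b_refresh = b.read()
    b.refresh()
    after_b_refresh = b.read()
    return before_b_refresh, after_b_refresh


for cls in (LivePointer, CopyOnRefresh, VersionPointer):
    print(cls.__name__, observe(cls))
```

In this toy model, only `LivePointer` disagrees with the stale-until-refreshed behavior reported in the question. `CopyOnRefresh` and `VersionPointer` give identical results from the outside, which is why the experiment alone cannot tell a copy from a pointer to a snapshot.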
In conclusion, while linked tables provide a convenient way to reference data across different workspaces in Power BI dataflows, their behavior and limitations warrant careful consideration. It's essential to understand how they operate in your specific scenario and evaluate alternative approaches to ensure optimal data management and performance.
Did I answer your question? Mark my post as a solution! Appreciate your Kudos !!
Thanks for your response @johnbasha33, it helps me know I'm not going mad. I would imagine the most likely scenario is that the data is being copied and the docs should clarify this; otherwise we might have seen some functionality around versioning of dataflows.
My main reason for asking this question was to help one of our users with an architectural decision. It seems the conclusion is that there simply isn't any reason to use linked tables across workspaces, or any advantage in doing so, unless it happens to be in Premium anyway. There are no performance benefits and no additional functionality.