Hello,
Our project’s data architecture includes multiple notebooks created in Fabric, one for each generated table. I have grouped the notebooks by domain and created one pipeline per domain. I then created a main pipeline that invokes these domain pipelines and adds further notebooks and semantic model refreshes.
This main pipeline is scheduled to run every hour to deliver fresh data to users. However, it consumes so much capacity that it is not sustainable. Unfortunately, I cannot change the data architecture.
I’ve noticed that each notebook takes at least 7 minutes to launch because a new session is created each time. I am using high concurrency mode and have tagged the notebook activities within the domain pipelines so that the same session is reused. This has improved speed somewhat, but I’m not sure what else I can do to achieve better performance. Would using the same session tag across the invoked pipelines, and applying it to the notebooks run directly from the main pipeline, help improve performance?
I’m not sure if I’m taking the right approach. I would appreciate advice on what else I can do to improve performance.
Hi @anlebonny,
Thank you for posting your query in the Microsoft Fabric Community Forum, and thanks to @andrewsommer & @nilendraFabric for sharing valuable insights.
Could you please confirm if your query has been resolved by the provided solution? If so, please mark it as the solution. This will help other community members solve similar problems faster.
Thank you.
Using consistent session tags can improve performance by enabling session reuse, which reduces startup overhead. Fewer new compute instances are provisioned, and that provisioning can be a cause of the capacity spikes you are seeing. However, tagging alone may not be enough.
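For reference, the session tag is set per notebook activity in the pipeline definition, and reuse only happens when activities share the same tag value. A sketch of what an exported pipeline activity might look like is below; the activity type and property names are illustrative assumptions, so compare against the JSON your own pipeline exports:

```json
{
  "name": "Run sales notebook",
  "type": "TridentNotebook",
  "typeProperties": {
    "notebookId": "<notebook-guid>",
    "workspaceId": "<workspace-guid>",
    "sessionTag": "domain-sales"
  }
}
```

Using one tag value per domain (e.g. `domain-sales` for every notebook in the sales pipeline) keeps those notebooks on one shared session instead of provisioning a new one each time.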
Are you updating everything each hour, or only what has changed? Explore partitioning your tables and using Delta Lake's MERGE with filtered conditions so each run processes only new or modified rows.
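To make the incremental idea concrete, here is a minimal plain-Python sketch of the merge-with-watermark semantics (table and column names are hypothetical). In an actual Fabric notebook this logic would be a Delta Lake `MERGE INTO` statement restricted to rows changed since the last run:

```python
# Delta Lake equivalent (sketch):
#   MERGE INTO dim_customer AS t
#   USING (SELECT * FROM updates WHERE modified > '<last-run-watermark>') AS u
#   ON t.customer_id = u.customer_id
#   WHEN MATCHED THEN UPDATE SET *
#   WHEN NOT MATCHED THEN INSERT *
from datetime import datetime

def incremental_merge(target, source, watermark):
    """Merge only source rows modified after `watermark` into `target`.

    target: dict keyed by id -> row dict (the "table")
    source: list of row dicts with 'id' and 'modified' keys
    Returns the merged target and how many rows were processed.
    """
    # Filtered condition: skip rows that have not changed since the last run.
    changed = [r for r in source if r["modified"] > watermark]
    for row in changed:
        # Matched -> update; not matched -> insert.
        target[row["id"]] = row
    return target, len(changed)

# Usage: only the row modified after the watermark is touched.
target = {1: {"id": 1, "value": "old", "modified": datetime(2024, 1, 1)}}
source = [
    {"id": 1, "value": "new", "modified": datetime(2024, 6, 1)},
    {"id": 2, "value": "stale", "modified": datetime(2023, 12, 1)},
]
merged, n = incremental_merge(target, source, datetime(2024, 1, 1))
```

The point is that the hourly run pays only for changed rows, not a full rewrite of every table.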
Also, check whether running the domain pipelines in parallel within the parent pipeline causes resource contention; if so, stagger their execution.
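Staggering can be expressed with activity dependencies in the main pipeline, so one Invoke Pipeline activity starts only after another succeeds. A hypothetical fragment is below; the activity type name and shape follow the Data Factory-style JSON and may differ slightly in your exported Fabric pipeline:

```json
{
  "activities": [
    { "name": "Invoke Sales pipeline", "type": "ExecutePipeline" },
    {
      "name": "Invoke Finance pipeline",
      "type": "ExecutePipeline",
      "dependsOn": [
        {
          "activity": "Invoke Sales pipeline",
          "dependencyConditions": [ "Succeeded" ]
        }
      ]
    }
  ]
}
```

Chaining the heaviest domains this way trades some wall-clock time for a lower peak capacity draw, which is usually the right trade when the capacity itself is the bottleneck.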
Please mark this post as the solution if it helps you. Kudos appreciated.