jcampbell474
Helper IV

Shared Dataset with Direct Query Dataflows

Wondering if anyone has used a shared dataset (feeding multiple reports via DirectQuery) that itself uses DirectQuery against multiple dataflows? Basically, we want to avoid a large imported dataset; DirectQuery across the dataflows distributes the overall size across the entities. Large storage format with the enhanced compute engine turned on.

 

If anyone has done this, how was/is the performance? Initially, I don't know how I should feel about users navigating across multiple reports using DirectQuery on the shared dataset, which will then send multiple DirectQuery statements to the (imported) dataflows.

 

Maybe the strategy is solid and only limited by our P1 SKU cores and the number of concurrent users? Maybe it'll be a resource-intensive architecture? Just looking for real-world testimonials and/or thoughts & opinions.
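One thing I may do before committing either way is time a representative DAX query against the shared dataset through the REST ExecuteQueries endpoint, once with the Import model and once with DirectQuery over the dataflows. A rough Python sketch of what I mean is below; the workspace/dataset GUIDs, the token, and the 'Sales' table name are placeholders, not our actual model:

```python
# Rough sketch: time a representative DAX query against the shared dataset
# through the Power BI ExecuteQueries REST API. Workspace ID, dataset ID,
# the AAD access token, and the 'Sales' table are placeholders.
import time
import requests

WORKSPACE_ID = "<workspace-guid>"    # placeholder
DATASET_ID = "<dataset-guid>"        # placeholder
ACCESS_TOKEN = "<aad-access-token>"  # placeholder (e.g. acquired via MSAL)

URL = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/executeQueries"
)

# A simple DAX query that forces a round trip to the DirectQuery source (dataflow).
DAX_QUERY = "EVALUATE ROW(\"RowCount\", COUNTROWS('Sales'))"  # 'Sales' is a placeholder table

def time_query() -> float:
    """Run the DAX query once and return elapsed seconds."""
    start = time.perf_counter()
    resp = requests.post(
        URL,
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        json={"queries": [{"query": DAX_QUERY}],
              "serializerSettings": {"includeNulls": True}},
        timeout=120,
    )
    resp.raise_for_status()
    return time.perf_counter() - start

if __name__ == "__main__":
    # A handful of sequential runs gives a feel for cold vs. warm timings.
    for i in range(5):
        print(f"run {i + 1}: {time_query():.2f}s")
```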

5 REPLIES
bcdobbs
Community Champion

I have a similar setup, but the dataflows only contain a small amount of data that comes in from an hourly API refresh. It works, but my gut feeling (no hard evidence) is that it wouldn't scale very well. If I'm honest, I only went for that route because it was simple and required less orchestration (laziness).

 

Do you mind me asking why you're trying to avoid an imported dataset? A large dataset is not necessarily an issue (bear in mind the vast compression that VertiPaq can achieve as well).



Ben Dobbs

LinkedIn | Twitter | Blog


Sure. We only have a P1, and one of the several tables is 7 GB. Simply refreshing a 7 GB table will often result in eviction. Avoiding the import allows us to avoid the refresh 🙂

 

Couple of questions for you, if you don't mind:

- Do you have many concurrent users? If so, is performance noticeably impacted?

- Have you looked at the usage metrics or capacity metrics apps to see if there is any correlation between report usage and CPU spikes? (Rough sketch of how I'd line the two up below.)
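For reference, this is roughly how I'd pull report views to compare against the CPU charts in the capacity metrics app. It's only a sketch using the admin activity events API; the admin token and the date are placeholders:

```python
# Sketch (requires a Power BI admin token): pull one day of activity events
# and bucket ViewReport events by hour, to line up against capacity CPU.
from collections import Counter
import requests

ADMIN_TOKEN = "<aad-admin-access-token>"  # placeholder
DAY = "2024-05-01"                        # placeholder date (single UTC day)

url = (
    "https://api.powerbi.com/v1.0/myorg/admin/activityevents"
    f"?startDateTime='{DAY}T00:00:00'&endDateTime='{DAY}T23:59:59'"
)
headers = {"Authorization": f"Bearer {ADMIN_TOKEN}"}

views_per_hour: Counter = Counter()
while url:
    payload = requests.get(url, headers=headers, timeout=60).json()
    for event in payload.get("activityEventEntities", []):
        if event.get("Activity") == "ViewReport":
            views_per_hour[event["CreationTime"][:13]] += 1  # YYYY-MM-DDTHH
    url = payload.get("continuationUri")  # None once all pages are consumed

for hour, count in sorted(views_per_hour.items()):
    print(hour, count)
```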

I'll see if I have anything of use. It's a small bespoke dataset only used by a few people, so I'm not sure it's going to help you, I'm afraid.

 

On alternative solutions: would using incremental refresh on your dataset, or building custom partitions in Tabular Editor and designing a refresh strategy through either XMLA or the API with a script, be an option?
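To sketch the API-with-a-script idea (Python; the GUIDs, token, and the "Sales"/"Sales-2024Q4" names are placeholders for whatever you'd actually partition on), the enhanced refresh endpoint lets you process just the hot partition rather than the whole table:

```python
# Minimal sketch: kick off an enhanced refresh of one specific partition
# via the Power BI REST API. All identifiers below are placeholders.
import requests

WORKSPACE_ID = "<workspace-guid>"    # placeholder
DATASET_ID = "<dataset-guid>"        # placeholder
ACCESS_TOKEN = "<aad-access-token>"  # placeholder

url = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/refreshes"
)

body = {
    "type": "full",                  # refresh type for the targeted objects
    "commitMode": "transactional",
    "objects": [
        {"table": "Sales", "partition": "Sales-2024Q4"}  # only the hot partition
    ],
    "applyRefreshPolicy": False,
}

resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json=body,
    timeout=60,
)
resp.raise_for_status()
# 202 Accepted means the refresh was queued; poll GET .../refreshes to track it.
print(resp.status_code)
```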




Capacity limits prohibit importing all of the data needed for the dataset, so that rules out incremental refresh, custom partitions, etc.

 

Thank you for sharing your thoughts.

 

Would be interested in how you get on!

 

If you go for it and start to find performance issues, you might look at building aggregate tables that are imported in order to handle the "popular" report queries.



