Wondering if anyone has used a shared dataset, feeding multiple reports via DirectQuery, that itself uses DirectQuery against multiple dataflows? Basically, we want to avoid one large imported dataset; DirectQuery across the dataflows distributes the overall size across the entities. We have large storage format enabled with the enhanced compute engine turned on.
If anyone has done this, how is the performance? Initially, I'm not sure how to feel about users navigating across multiple reports using DirectQuery on the shared dataset, which will then send multiple DirectQuery statements to the multiple (imported) dataflows.
Maybe the strategy is solid and only limited by our P1 SKU cores and the number of concurrent users? Maybe it'll be a resource-intensive architecture? Just looking for real-world testimonials and/or thoughts and opinions.
I have a similar setup, but the dataflows only contain a small amount of data that comes in from an hourly API refresh. It works, but my gut feeling (no hard evidence) is that it wouldn't scale very well. If I'm honest, I only went that route because it was simple and required less orchestration (laziness).
Do you mind me asking why you're trying to avoid an import dataset? A large dataset is not necessarily an issue (bear in mind the vast compression that VertiPaq can achieve as well).
Sure. We only have a P1, and one of the several tables is 7 GB. Simply refreshing a 7 GB table will often result in eviction. Avoiding the import allows us to avoid the refresh 🙂
A couple of questions for you, if you don't mind:
- Do you have many concurrent users? If so, is performance noticeably impacted?
- Have you looked at usage/capacity metrics app to see if there is any correlation between report usage and CPU spikes?
I'll see if I have anything of use. It's a small bespoke dataset only used by a few people so not sure it's going to help you I'm afraid.
On alternative solutions: would using incremental refresh on your dataset, or building custom partitions in Tabular Editor and designing a refresh strategy through either XMLA or the API with a script, be an option?
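For the API route, a script could target individual partitions via the enhanced refresh REST endpoint rather than refreshing the whole dataset. A rough sketch of building that request body — table and partition names here are just placeholders:

```python
# Sketch: build the JSON body for a partition-level enhanced refresh.
# "FactSales" / "FactSales-2024" are placeholder names, not anything
# from the thread above.
import json

def build_refresh_body(objects, refresh_type="full"):
    """Body for POST .../datasets/{datasetId}/refreshes targeting
    specific table partitions instead of the whole dataset."""
    return {
        "type": refresh_type,
        "objects": [{"table": t, "partition": p} for t, p in objects],
    }

body = build_refresh_body([("FactSales", "FactSales-2024")])
print(json.dumps(body))
# An actual run would POST this (with a bearer token) to:
# https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/refreshes
```

Refreshing one partition at a time keeps the memory footprint of each refresh small, which is the usual workaround for eviction on a constrained capacity.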
Capacity limits prohibit importing all of the data needed for the dataset, so that rules out incremental refresh, custom partitions, etc.
Thank you for sharing your thoughts.
Would be interested in how you get on!
If you go for it and start to find performance issues, you might look at building aggregate tables that are imported in order to handle the "popular" report queries.
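To illustrate what the imported aggregate table would hold: detail rows stay in DirectQuery, while a small pre-grouped table answers the common query grain. Table and column names below are purely illustrative:

```python
# Illustrative pre-aggregation: roll detail rows (which would stay in
# DirectQuery) up to the (date, product) grain that an imported
# aggregate table would serve. Names and values are made up.
from collections import defaultdict

detail_rows = [
    ("2024-01-01", "Widget", 100.0),
    ("2024-01-01", "Widget", 50.0),
    ("2024-01-01", "Gadget", 75.0),
    ("2024-01-02", "Widget", 25.0),
]

# Sum amounts at the coarser grain the "popular" reports query.
agg = defaultdict(float)
for date, product, amount in detail_rows:
    agg[(date, product)] += amount

for key, total in sorted(agg.items()):
    print(key, total)
```

In the model, the agg table would be imported with its aggregation mappings configured, so popular queries hit memory and only drill-downs fall through to DirectQuery.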