Wondering if anyone has used a shared dataset, feeding multiple reports via DirectQuery, that itself uses DirectQuery against multiple dataflows? Basically, we want to avoid one large imported dataset; DirectQuery across the dataflows distributes the overall size across the entities. We have large storage format enabled with the enhanced compute engine turned on.
If anyone has done this, how is the performance? Initially, I'm not sure how to feel about users navigating across multiple reports using DirectQuery on the shared dataset, which will then send multiple DirectQuery statements to the multiple (imported) dataflows.
Maybe the strategy is solid and only limited by our P1 SKU cores and the number of concurrent users? Maybe it'll be a resource-intensive architecture? Just looking for real-world testimonials and/or thoughts and opinions.
I have a similar setup, but the dataflows only contain a small amount of data that comes in from an hourly API refresh. It works, but my gut feeling (no hard evidence) is that it wouldn't scale very well. If I'm honest, I only went that route because it was simple and required less orchestration (laziness).
Do you mind me asking why you're trying to avoid an import dataset? A large dataset is not necessarily an issue (bear in mind the vast compression that VertiPaq can achieve as well).
Sure. We only have a P1, and one of the several tables is 7 GB. Simply refreshing a 7 GB table will often result in eviction. Avoiding the import allows us to avoid the refresh 🙂
A couple of questions for you, if you don't mind:
- Do you have many concurrent users? If so, is performance noticeably impacted?
- Have you looked at usage/capacity metrics app to see if there is any correlation between report usage and CPU spikes?
I'll see if I have anything of use. It's a small bespoke dataset only used by a few people so not sure it's going to help you I'm afraid.
On alternative solutions: would using incremental refresh on your dataset, or building custom partitions in Tabular Editor and designing a refresh strategy through either XMLA or the API with a script, be an option?
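For the API route, a script could target individual partitions via the enhanced refresh REST endpoint rather than refreshing the whole dataset. A rough sketch of building that request body — table and partition names here are just placeholders:

```python
# Sketch: build the JSON body for a partition-level enhanced refresh.
# "FactSales" / "FactSales-2024" are placeholder names, not anything
# from the thread above.
import json

def build_refresh_body(objects, refresh_type="full"):
    """Body for POST .../datasets/{datasetId}/refreshes targeting
    specific table partitions instead of the whole dataset."""
    return {
        "type": refresh_type,
        "objects": [{"table": t, "partition": p} for t, p in objects],
    }

body = build_refresh_body([("FactSales", "FactSales-2024")])
print(json.dumps(body))
# An actual run would POST this (with a bearer token) to:
# https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/refreshes
```

Refreshing one partition at a time keeps the memory footprint of each refresh small, which is the usual workaround for eviction on a constrained capacity.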
Capacity limits prohibit importing all of the data needed for the dataset, so that rules out incremental refresh, custom partitions, etc.
Thank you for sharing your thoughts.
Would be interested in how you get on!
If you go for it and start to find performance issues, you might look at building aggregate tables that are imported in order to handle the "popular" report queries.
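To illustrate what the imported aggregate table would hold: detail rows stay in DirectQuery, while a small pre-grouped table answers the common query grain. Table and column names below are purely illustrative:

```python
# Illustrative pre-aggregation: roll detail rows (which would stay in
# DirectQuery) up to the (date, product) grain that an imported
# aggregate table would serve. Names and values are made up.
from collections import defaultdict

detail_rows = [
    ("2024-01-01", "Widget", 100.0),
    ("2024-01-01", "Widget", 50.0),
    ("2024-01-01", "Gadget", 75.0),
    ("2024-01-02", "Widget", 25.0),
]

# Sum amounts at the coarser grain the "popular" reports query.
agg = defaultdict(float)
for date, product, amount in detail_rows:
    agg[(date, product)] += amount

for key, total in sorted(agg.items()):
    print(key, total)
```

In the model, the agg table would be imported with its aggregation mappings configured, so popular queries hit memory and only drill-downs fall through to DirectQuery.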