Dataflows Direct Query

Dataflows would be much more useful if you could connect to them via DirectQuery. That would let us query enormous dataflows that are not practical to load into the data model, and build aggregate dataflows that we do load into the data model. Unsure whether the architecture of a dataflow allows this, but it would be quite handy if it does.
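To illustrate the aggregate-dataflow pattern the idea describes (detail rows stay in the large source and are reached via DirectQuery, while only a small summary is imported into the model), here is a minimal sketch in Python. The function and column names are hypothetical, chosen for illustration only; this is not Power BI code.

```python
from collections import defaultdict

def aggregate_sales(rows):
    """Collapse (region, amount) detail rows into per-region totals.

    In the pattern discussed here, the millions of detail rows would stay
    in the big dataflow (queried via DirectQuery), and only this small
    aggregate would be imported into the data model.
    """
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)

# Hypothetical detail rows standing in for a huge dataflow.
detail_rows = [("East", 100.0), ("West", 250.0), ("East", 50.0)]
print(aggregate_sales(detail_rows))  # {'East': 150.0, 'West': 250.0}
```

The point of the sketch: the aggregate is tiny regardless of how large the detail set grows, which is exactly why it is the part worth importing.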
Status: Completed
Comments
lizg
New Member
AGREE!!
pbiideas1
New Member
this is a must have
james_wheeler
New Member
This functionality is basically essential. All of the data intelligence tools need to work the following way:
1) Incremental refresh to pull large raw (or close to raw) data into the service efficiently. Incremental refresh needs to be much more intelligent and robust than it is currently.
2) Transform / query against the raw data already in the service. Do NOT copy the output someplace else.
3) Must NOT require a Premium node.
At a high level, data ingest should be super efficient: never ingest the same data twice. This means not only detecting new data based on a date column, but also giving the data source owner some mechanism to specify how to detect which data has been updated. Once data is in the service, do not make a copy of it anyplace. Only make a copy if the owner of the transform specifically indicates that the transform can be run once and its output "cached", because he/she is certain the results will never change or is OK with stale results. The default behavior should always be something like a live connection to the data imported by the data owner.
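The "never ingest the same data twice" behaviour described above is essentially a watermark check. A minimal, illustrative sketch in Python (hypothetical names; not the actual Power BI incremental-refresh implementation, which is policy-driven):

```python
from datetime import datetime

def incremental_fetch(source_rows, watermark):
    """Return only rows newer than the watermark, plus the new watermark.

    source_rows: iterable of (timestamp, payload) pairs.
    Rows at or before the watermark were already ingested and are skipped,
    so the same data is never pulled from the source twice.
    """
    new_rows = [(ts, p) for ts, p in source_rows if ts > watermark]
    new_watermark = max((ts for ts, _ in new_rows), default=watermark)
    return new_rows, new_watermark

rows = [
    (datetime(2020, 1, 1), "a"),
    (datetime(2020, 1, 2), "b"),
    (datetime(2020, 1, 3), "c"),
]
new, wm = incremental_fetch(rows, datetime(2020, 1, 1))
print(len(new), wm)  # 2 2020-01-03 00:00:00
```

Detecting *updated* (not just new) rows, as the comment asks for, would need a source-owner-supplied change marker such as a last-modified column; a plain date watermark cannot see in-place updates.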
james_wheeler
New Member
The reality is that there are two classes of Power BI users. First, there are the software engineers / data experts who typically own the data sources that feed Power BI. These people have the expertise to write complex queries and do other calculations against the source data, and they need a way to put data from their source into the service efficiently for the other type of user. They never want to pull the same data from their source more than once, for obvious reasons. The other user is the business analyst, who typically does not have a background in SQL or programming. Analysts need to be able to use the graphical interface to efficiently transform the raw data that the data source owner has already brought into Power BI. You can't expect the data source owner to do all the aggregation, filtering, etc. against the data before bringing it into the Power BI service. The whole point of Power BI, it would seem, is to enable people who don't have backgrounds in software engineering or SQL to analyze data.
fbcideas_migusr
New Member
I totally support this idea. Right now, I have a dataflow with 55 million records. Every time I need to use this dataflow to create a new report, I am forced to download between 5 and 10 GB of data to my desktop. The initial understanding was that datasets would establish a live connection to dataflows; I was disappointed that they don't! Here are some screenshots and scenarios I listed: https://community.powerbi.com/t5/Service/Dataflow-Dataset-NEED-more-Clarity/m-p/764619
Alisher
New Member
Fully support the idea of enabling DirectQuery to dataflows. It must be included in the Pro license to be widely adopted. Otherwise, analysts have no option but to duplicate the same raw data in datasets.
fbcideas_migusr
New Member
This would be great for me. I have a massive table I want to keep in the source DB and build aggregate tables for, and I want to store some of the shared dimensions in dataflows. But because we can't DirectQuery dataflows or set them to Dual storage mode, I don't have much of a use case for them.
ageraci1
New Member
Cannot fathom how this isn't currently an option. You have DirectQuery, but it is only available in certain instances? What? Perhaps someone can tell me if I am doing something wrong. I have the data in an Azure SQL data warehouse and want to make it available through dataflows via DirectQuery (not a scheduled refresh) so users can generate reports against the preset data.
pbiideas1
New Member
The ability to DirectQuery your own ADLS Gen2 storage should be included as well. We load our ADLS Gen2 with ADF and Databricks and just want a dataflow to reference that content without requiring a scheduled or manual refresh, and to DirectQuery Parquet, ORC, or Avro files.
pbiideas1
New Member
Is there a reason why this cannot be done?