Hello everyone
I've looked everywhere for an answer to this and couldn't find a working example.
I have a very large Power BI Dataflow set up, with all Sales history over 10 years. The report I build in Power BI Desktop only needs the Sales of the current year.
Right now, using the Power BI Dataflow connector, I can't find a place to write a SQL statement like "SELECT order_id FROM Sales WHERE Sales_date = 2021".
The M code in the Advanced Editor is this:
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data]
in
#"Sales"
Right now, Power BI is pulling the whole history and all the columns of the Sales table, and that takes almost an hour to perform the joins in the subsequent steps.
Isn't there a way to amend the M code in order to pass a SQL statement and limit the amount of data transferred?
Thanks a lot,
A
You should be able to add a filter step in the Query Editor to select just the 2021 data. I don't think the dataflow necessarily uses SQL, but that shouldn't matter.
Your query will look something like this with the new step.
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data],
#"Filtered Rows" = Table.SelectRows(#"Sales", each [sales_date] >= #date(2021, 1, 1))
in
#"Filtered Rows"
In order to run SQL statements, you need a database engine to execute them.
A Dataflow does not store data in a SQL database. You can see that when you export your Dataflow to a JSON file and look at the tag "ppdf:outputFileFormat".
Dataflow Gen1 uses CSV files.
Dataflow Gen2 uses Apache Parquet files.
If you can figure out a way to connect to the files generated by Gen2 Dataflows, you can run SQL statements on the Parquet files by using Apache Spark SQL.
Thanks @amitchandak
Does that mean it's impossible to send a complex query with JOINs, etc., to be executed on the Dataflow side before the data is sent back to Power BI?
@Anonymous , you can use an M parameter, but that will only limit the data, and the parameter can only be changed in the deployment pipeline.
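A minimal sketch of that approach; YearFilter is a hypothetical parameter name, and in Power BI Desktop it would normally be created through Manage Parameters rather than defined inline:

let
    // Hypothetical parameter, defined inline here so the sketch is self-contained
    YearFilter = 2021,
    Source = PowerBI.Dataflows([]),
    Workspace = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
    Dataflow = Workspace{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
    Sales = Dataflow{[entity="Sales"]}[Data],
    // Filter on the parameter instead of a hard-coded year
    #"Filtered Rows" = Table.SelectRows(Sales, each Date.Year([sales_date]) = YearFilter)
in
    #"Filtered Rows"

Note that the filter still runs as an M step, not as SQL pushed down to a database engine; the parameter just makes the cutoff configurable per deployment stage.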