Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hello everyone
I've looked everywhere for an answer to this, I couldn't find a working example.
I have a very large Power BI Dataflow set up, will all Sales history over 10 years. My report, that I build on PBI desktop, only looks for the Sales of the current year.
Right now, using the Power Bi Dataflow connector, I can't find the place to write my SQL statement like "SELECT order_id FROM Sales WHERE Sales_date = 2021"
The M Code in Advanced Editor is this :
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data]
Right now, PBI is pulling the whole history and all the columns of the table Sales Orders, and that takes almost 1 hour to perform Join in the subsequent steps.
Isn't there a way to amend the M code in order to pass a SQL statement to limit the amount of data transfered ?
Thanks a lot,
A
Solved! Go to Solution.
You should be able to add a filter step in your query editor to select just 2021 data. I don't think the dataflow necessarily uses SQL but that shouldn't matter.
Your query will look something like this with the new step.
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data],
#"Filtered Rows" = Table.SelectRows(#"Sales", each [sales_date] >= #date(2021, 1, 1))
in
#"Filtered Rows"
In order to run SQL statements, you need a database engine to run those statements.
A Dataflow is not storing data in a SQL database. You can see that when you eexport your Dataflow in a JSON file on tag "ppdf:outputFileFormat".
Dataflow Gen1 is using csv files.
Dataglow Gen2 is using Apache Parquet files.
If you can figure out a way to connect to the files generated by Gen2 Dataflows, You can run SQL statement on Parquet files by using Apache Spark SQL.
You should be able to add a filter step in your query editor to select just 2021 data. I don't think the dataflow necessarily uses SQL but that shouldn't matter.
Your query will look something like this with the new step.
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data],
#"Filtered Rows" = Table.SelectRows(#"Sales", each [sales_date] >= #date(2021, 1, 1))
in
#"Filtered Rows"
Thanks @amitchandak
Does that mean it's impossible to send a complex query with JOIN etc, to be executed on the Dataflows side before the data is sent back to Power BI ?
@Anonymous , You can use M parameter but that will limit data, can only be changed in the deployment pipeline
refer
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
65 | |
63 | |
52 | |
37 | |
36 |
User | Count |
---|---|
82 | |
66 | |
61 | |
46 | |
45 |