Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Hello everyone
I've looked everywhere for an answer to this, I couldn't find a working example.
I have a very large Power BI Dataflow set up, will all Sales history over 10 years. My report, that I build on PBI desktop, only looks for the Sales of the current year.
Right now, using the Power Bi Dataflow connector, I can't find the place to write my SQL statement like "SELECT order_id FROM Sales WHERE Sales_date = 2021"
The M Code in Advanced Editor is this :
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data]
Right now, PBI is pulling the whole history and all the columns of the table Sales Orders, and that takes almost 1 hour to perform Join in the subsequent steps.
Isn't there a way to amend the M code in order to pass a SQL statement to limit the amount of data transfered ?
Thanks a lot,
A
Solved! Go to Solution.
You should be able to add a filter step in your query editor to select just 2021 data. I don't think the dataflow necessarily uses SQL but that shouldn't matter.
Your query will look something like this with the new step.
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data],
#"Filtered Rows" = Table.SelectRows(#"Sales", each [sales_date] >= #date(2021, 1, 1))
in
#"Filtered Rows"
In order to run SQL statements, you need a database engine to run those statements.
A Dataflow is not storing data in a SQL database. You can see that when you eexport your Dataflow in a JSON file on tag "ppdf:outputFileFormat".
Dataflow Gen1 is using csv files.
Dataglow Gen2 is using Apache Parquet files.
If you can figure out a way to connect to the files generated by Gen2 Dataflows, You can run SQL statement on Parquet files by using Apache Spark SQL.
You should be able to add a filter step in your query editor to select just 2021 data. I don't think the dataflow necessarily uses SQL but that shouldn't matter.
Your query will look something like this with the new step.
let
Source = PowerBI.Dataflows([]),
#"xxxxxxxxxxxxxxxx" = Source{[workspaceId="xxxxxxxxxxxxxxxx"]}[Data],
#"yyyyyyyyyyyyyyyy" = #"xxxxxxxxxxxxxxxx"{[dataflowId="yyyyyyyyyyyyyyyy"]}[Data],
#"Sales" = #"yyyyyyyyyyyyyyyy"{[entity="Sales"]}[Data],
#"Filtered Rows" = Table.SelectRows(#"Sales", each [sales_date] >= #date(2021, 1, 1))
in
#"Filtered Rows"
Thanks @amitchandak
Does that mean it's impossible to send a complex query with JOIN etc, to be executed on the Dataflows side before the data is sent back to Power BI ?
@Anonymous , You can use M parameter but that will limit data, can only be changed in the deployment pipeline
refer
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
119 | |
78 | |
58 | |
52 | |
48 |
User | Count |
---|---|
171 | |
117 | |
61 | |
59 | |
53 |