Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Next up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now

Reply
jasbro
Regular Visitor

Staging query is getting executed for every sub query

Hello,

 

 I'm using Excel to run Power Query. I followed Ken's design approach to use a Staging query that builds multiple re-shaped tables. I have one staging sql query (connection only) feeding 4 re-shape queries (load to table).

 

 When I refesh all, I notice that every sub query is contacting the database. Is that normal? Is there a way to point my sub queries to the staging query's result instead?

 

Or am I not understanding something? I thought the stagging query would only be executed once, then other queries would execute from those results.

 

 My queries are something like this:

SQL_Staging_Qry

let

  src=SQL.Database("srvnm", "catalog", [Query="SELECT * FROM foo"])

in

  src

 

Sub_Qry

let

  sql_src=SQL,

  // shape data with Table.SelectRows(sql_src, ...), Table.AddColumn(sql_src, ...), etc

  result = reshaped_data

in

  result

1 ACCEPTED SOLUTION

Hi, I think I understand. You're point comes from the perspective of what M language does and how the compiler interprets duplicate vs reference. 

 

My perspective comes from SQL DB optimization and reducing the total number of queries to the database. As over the wire latency maters much more than processor time.

 

What I found is... that Power Query does in fact send multiple SQL queries per sub query when the stagging query is set to 'Connection Only'.

Proof / Repro

  1. Set your stagging query to 'Connection only' as Ken recomended.
  2. Use a SQL connection or some other remote data source. I used SQL for this test (and easier to test as you'll see).
  3. Create multiple sub queries as 'reference' queries. It actually doesn't matter what these queries do.
  4. Open up network monitor of your choice. I used WireShark. Set your capture filter to your remote destination.
  5. Hit Refresh all in your Excel spreadsheet and inspect the wire.

I have detected exactly the same number of SQL Select statements hit the wire as there are sub queries.

 

Potential Workaround Found

  1. Set your stagging query to 'Load to Table'
  2. Create a new query that loads from the table you just created
    let
        Source = Excel.CurrentWorkbook(){[Name="yourTableNameHere"]}[Content]
    in
        Source
  3. Set your sub queries to 'reference' the newly created query in step 2.
  4. Setup WireShark as before, and Hit Refresh All

Only one SQL query will hit the wire and all other sub queries will now wait for completion.

 

Further Discusion

Ideally, resultant set should be used, but M Language cannot tell where the data comes from. Is it being cached, is it dynamically created, or is it coming from a remote system? Without that, M Language cannot effectively provide step optimization techniques as claimed with 'duplicate' vs 'reference'. The combination of missunderstanding of the capabilities with M Language optimziation and bad architectual advice lead me to this state. Stanging Query concept only works if you 'cache' the resultant set somewhere. Either in the DataModel or in a table.

View solution in original post

5 REPLIES 5
Thejeswar
Super User
Super User

Hi @jasbro,

I think your Staging Query will be run as many times as your sub-query, if the staging query is used as a duplicate on top of your sub-query.

 

To resolve trying using your Staging Query as a reference before building your sub-query on top of it

 

Hi @Thejeswar,

 

What do you mean, "try using your Stagging Query as a reference"?

 

I found another Ken article about how to Reference other Power Queries, but when I followed along, all it did was create a new query with syntax similar to:
  let

    Source = SQL_Staging_Qry

  in

    Source

 

This is what I already have.

 

 However, I did come across another article, difference between Reference & Duplicate. This makes me think that maybe the UI isn't showing me what is really going on.

 

 When I hit Ctrl+Alt+F5, I get the following

Untitled.png

 You see, each sub query is displaying a "Waiting for "{DB server name redacted}  So my question is, am I hitting the database 3 times or am I hitting it once with my 'reference query'?

Hi @jasbro,

Yes as the post that you showed says, the Duplicate will re-run the code while reference will just re-use the result set

 

FYR, I am just giving some sample M Code for Duplicate and Reference

 

M-Code of my Source:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dZBLCsIwEECvErIuNjOTpFmLay9QulC0KtIO1N4fU5oPjLhICO89wiR9r0+X9a4bfeYDj+rK/H7Nj08EKq5/7piOalx4UhPP61OtrG5bPjS9hhZaNBBiimjTZT9w2/cas7DOl1pAH0yqKQuyrtQCgkGXclsNlVxAIB9S7rLxrk4uIBrElPtsgGouIEJ8wZ532YSuvlRAMpRvD3VMLLmARF38x+EL", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Input = _t, #" " = _t, Output = _t, #"(blank)" = _t, #"(blank).1" = _t, #"(blank).2" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Input", type text}, {" ", type text}, {"Output", type text}, {"(blank)", type text}, {"(blank).1", type text}, {"(blank).2", type text}}),
in
    #"Changed Type"

M-Code for Duplicated Query:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dZBLCsIwEECvErIuNjOTpFmLay9QulC0KtIO1N4fU5oPjLhICO89wiR9r0+X9a4bfeYDj+rK/H7Nj08EKq5/7piOalx4UhPP61OtrG5bPjS9hhZaNBBiimjTZT9w2/cas7DOl1pAH0yqKQuyrtQCgkGXclsNlVxAIB9S7rLxrk4uIBrElPtsgGouIEJ8wZ532YSuvlRAMpRvD3VMLLmARF38x+EL", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Input = _t, #" " = _t, Output = _t, #"(blank)" = _t, #"(blank).1" = _t, #"(blank).2" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Input", type text}, {" ", type text}, {"Output", type text}, {"(blank)", type text}, {"(blank).1", type text}, {"(blank).2", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1)
in
    #"Added Index"

M-Code for Referenced Query:

let
    Source = Table1,
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1)
in
    #"Added Index"

As you can see, the duplicate Query runs through the source a second time apart from the actual Query. But in case of reference, it is just going to do the actions on top of the Source. So it is more likely that your duplicate query will run a separate query against your DB while you reference query will use the output of your source directly.

 

But when you do a manual Report Refresh, all your queries will be getting loading

 

The Other important thing that you can see when using a reference query is, when you make changes to your source query, Only your source query will be loaded again on giving apply changes, all the referenced queries will automatically reflect the changes. But here if you are using a duplicate query, you will have to make the changes again in all queries manually

 

 

Hi, I think I understand. You're point comes from the perspective of what M language does and how the compiler interprets duplicate vs reference. 

 

My perspective comes from SQL DB optimization and reducing the total number of queries to the database. As over the wire latency maters much more than processor time.

 

What I found is... that Power Query does in fact send multiple SQL queries per sub query when the stagging query is set to 'Connection Only'.

Proof / Repro

  1. Set your stagging query to 'Connection only' as Ken recomended.
  2. Use a SQL connection or some other remote data source. I used SQL for this test (and easier to test as you'll see).
  3. Create multiple sub queries as 'reference' queries. It actually doesn't matter what these queries do.
  4. Open up network monitor of your choice. I used WireShark. Set your capture filter to your remote destination.
  5. Hit Refresh all in your Excel spreadsheet and inspect the wire.

I have detected exactly the same number of SQL Select statements hit the wire as there are sub queries.

 

Potential Workaround Found

  1. Set your stagging query to 'Load to Table'
  2. Create a new query that loads from the table you just created
    let
        Source = Excel.CurrentWorkbook(){[Name="yourTableNameHere"]}[Content]
    in
        Source
  3. Set your sub queries to 'reference' the newly created query in step 2.
  4. Setup WireShark as before, and Hit Refresh All

Only one SQL query will hit the wire and all other sub queries will now wait for completion.

 

Further Discusion

Ideally, resultant set should be used, but M Language cannot tell where the data comes from. Is it being cached, is it dynamically created, or is it coming from a remote system? Without that, M Language cannot effectively provide step optimization techniques as claimed with 'duplicate' vs 'reference'. The combination of missunderstanding of the capabilities with M Language optimziation and bad architectual advice lead me to this state. Stanging Query concept only works if you 'cache' the resultant set somewhere. Either in the DataModel or in a table.

Nice Observation and Explanation Smiley Happy

Helpful resources

Announcements
New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

March Power BI Update Carousel

Power BI Community Update - March 2026

Check out the March 2026 Power BI update to learn about new features.