Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
frithjof_v
Super User
Super User

Using variable vs. referencing another query - Excel source

Hi,

 

I am wondering if there is some obvious conceptual difference (which may impact performance) between referencing a variable (i.e. referencing another step in the same query), or referencing the output of another query, when doing a merge operation which doesn't fold?

 

An example:

 

Let's say I have two Excel sources.

Both Excel sources are very simple tables, with an ID column with increasing values from 1 to 1 000 000 and also a column containing random values.

 

Excel A

frithjof_v_0-1720965683444.png

 

Excel B

frithjof_v_1-1720965764515.png

 

In Power Query, I want to merge these sources by using the ID column.

 

Does it make a difference if I reference a variable (step) of the same query:

frithjof_v_3-1720965938104.png

 

 

let
    Source = Excel.Workbook(File.Contents("...\Excel A.xlsx"), null, true),
    Ark1_Sheet = Source{[Item="Ark1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Ark1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Random", Int64.Type}}),
    Source_B = Excel.Workbook(File.Contents("...\Excel B.xlsx"), null, true),
    Ark1_Sheet_B = Source_B{[Item="Ark1",Kind="Sheet"]}[Data],
    #"Promoted Headers_B" = Table.PromoteHeaders(Ark1_Sheet_B, [PromoteAllScalars=true]),
    #"Changed Type_B" = Table.TransformColumnTypes(#"Promoted Headers_B",{{"ID", Int64.Type}, {"Random", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"Changed Type_B", {"ID"}, "MergeResult", JoinKind.LeftOuter),
    #"Expanded MergeResult" = Table.ExpandTableColumn(#"Merged Queries", "MergeResult", {"Random"}, {"B.Random"}),
    #"Merged Queries2" = Table.NestedJoin(#"Expanded MergeResult", {"ID"}, #"Changed Type_B", {"ID"}, "MergeResult2", JoinKind.LeftOuter),
    #"Expanded MergeResult2" = Table.ExpandTableColumn(#"Merged Queries2", "MergeResult2", {"Random"}, {"B.Random2"})
in
    #"Expanded MergeResult2"

 

 

 

 

Or if I reference another query:

frithjof_v_2-1720965893830.png

 

 

let
    Source = Excel.Workbook(File.Contents("...\Excel A.xlsx"), null, true),
    Ark1_Sheet = Source{[Item="Ark1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Ark1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Random", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID"}, #"ExcelSource QueryB", {"ID"}, "ExcelSource B", JoinKind.LeftOuter),
    #"Renamed Columns" = Table.RenameColumns(#"Merged Queries",{{"ExcelSource B", "MergeResult"}}),
    #"Expanded MergeResult" = Table.ExpandTableColumn(#"Renamed Columns", "MergeResult", {"Random"}, {"B.Random"}),
    #"Merged Queries1" = Table.NestedJoin(#"Expanded MergeResult", {"ID"}, #"ExcelSource QueryB", {"ID"}, "MergeResult2", JoinKind.LeftOuter),
    #"Expanded MergeResult2" = Table.ExpandTableColumn(#"Merged Queries1", "MergeResult2", {"Random"}, {"MergeResult2.Random"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Expanded MergeResult2",{{"MergeResult2.Random", "B.Random2"}})
in
    #"Renamed Columns1"

 

 

 

The referenced query, ExcelSource QueryB, is created as simple as this:

 

 

let
    Source = Excel.Workbook(File.Contents("...\Excel B.xlsx"), null, true),
    Ark1_Sheet = Source{[Item="Ark1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Ark1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Random", Int64.Type}})
in
    #"Changed Type"

 

 

 

This is an example, however I am also generally interested in understanding if there is a conceptual difference (performance-wise) between using a variable (a step inside the same query) or using the output of another query? 

 

E.g. if I wanted to append the two Excel sources. 

Also if I wanted to reuse source B multiple times within query A.

Is there a conceptual difference (performance-wise) between using a variable from within the same query, or referencing another query? 

 

 

By connecting SQL Server Profiler to Power BI Desktop, and refreshing each of the two queries, I cannot see a noticeable difference between the two approaches:

frithjof_v_0-1720967036317.png

I haven't tested this with Power BI Service (powerbi.com), only with Power BI Desktop.

 

My foundational question is, are there performance benefits of using variables in Power Query (M)?

10 REPLIES 10
PwerQueryKees
Super User
Super User

And I almost forgot. I did a bit of testing today. Calling a function defined as a variable in the same query or as a separate independent query takes so little time, I couldn't measure either by taking timestamps before and after the evaluation. 

I could measure a 80 ms function, but had to do a sum of a 1 million element list to get to that time range and I could not distinguish the execution times of the internal from the external call.

To test it properly I think I would need to do a list.acxumulate of a million items list to find the difference if there is any.

My guess is that it does not make a difference, because once the function is compiled and in memory, they would be indistinguishable. And the compiling happens only once, so would not be noticeable anyway.

Thank you @PwerQueryKees ,

 

It also seems to me that it doesn't make a difference (or maybe, see * below) if we reference another query, or reference a step (variable) inside the same query.

(If this is not true, I hope someone will correct me! 😀)

 

I did some testing, with inspiration from the example shown in this video: Inside Power Query reference queries for Power BI and Excel (youtube.com)

 

I created a Power Automate flow which gets triggered by someone making a HTTP GET call, and the Power Automate flow returns a JSON with some dummy data to the caller:

 

 frithjof_v_0-1721118049969.png

 

I then used Power BI desktop (and later also Power BI service) to make HTTP calls to this URL endpoint. I tried some different query configurations in Power Query.

 

 

1. Merging multiple queries (not loaded) into a main query (loaded).

 

frithjof_v_3-1721119114766.png

 

The main query (which gets loaded into the semantic model) has this M code:

 

 

 

let
    Source = Json.Document(Web.Contents("<URL Endpoint>")),
    #"Converted to Table" = Table.FromRecords({Source}),
    #"Expanded transactions" = Table.ExpandListColumn(#"Converted to Table", "transactions"),
    #"Expanded transactions1" = Table.ExpandRecordColumn(#"Expanded transactions", "transactions", {"transaction_id", "customer_id", "order_date", "items", "total_amount"}, {"transactions.transaction_id", "transactions.customer_id", "transactions.order_date", "transactions.items", "transactions.total_amount"}),
    #"Expanded transactions.items" = Table.ExpandListColumn(#"Expanded transactions1", "transactions.items"),
    #"Expanded transactions.items1" = Table.ExpandRecordColumn(#"Expanded transactions.items", "transactions.items", {"item_id", "description", "quantity", "price_per_unit"}, {"transactions.items.item_id", "transactions.items.description", "transactions.items.quantity", "transactions.items.price_per_unit"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded transactions.items1",{{"transactions.transaction_id", type text}, {"transactions.customer_id", type text}, {"transactions.order_date", type date}, {"transactions.items.item_id", type text}, {"transactions.items.description", type text}, {"transactions.items.quantity", Int64.Type}, {"transactions.items.price_per_unit", Int64.Type}, {"transactions.total_amount", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"transactions.transaction_id", "transactions.items.item_id"}, #"APISource QueryA_1", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult", JoinKind.LeftOuter),
    #"Expanded MergeResult" = Table.ExpandTableColumn(#"Merged Queries", "MergeResult", {"transactions.total_amount"}, {"B.transactions.total_amount"}),
    #"Merged Queries2" = Table.NestedJoin(#"Expanded MergeResult", {"transactions.transaction_id", "transactions.items.item_id"}, #"APISource QueryA_2", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult2", JoinKind.LeftOuter),
    #"Expanded MergeResult2" = Table.ExpandTableColumn(#"Merged Queries2", "MergeResult2", {"transactions.total_amount"}, {"B.transactions.total_amount2"}),
    #"Merged Queries3" = Table.NestedJoin(#"Expanded MergeResult2", {"transactions.transaction_id", "transactions.items.item_id"}, #"APISource QueryA_3", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult3", JoinKind.LeftOuter),
    #"Expanded MergeResult3" = Table.ExpandTableColumn(#"Merged Queries3", "MergeResult3", {"transactions.total_amount"}, {"B.transactions.total_amount3"}),
    #"Merged Queries4" = Table.NestedJoin(#"Expanded MergeResult3", {"transactions.transaction_id", "transactions.items.item_id"}, #"APISource QueryA_4", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult4", JoinKind.LeftOuter),
    #"Expanded MergeResult4" = Table.ExpandTableColumn(#"Merged Queries4", "MergeResult4", {"transactions.total_amount"}, {"B.transactions.total_amount4"})
in
    #"Expanded MergeResult4"

 

 

 

Each of the queries A_1, A_2, A_3, A_4 (which don't get loaded to the semantic model, but gets merged into the main query) basically just call the API (same URL Endpoint as in the main query):

 

 

let
    Source = Json.Document(Web.Contents("<URL Endpoint>")),
    #"Converted to Table" = Table.FromRecords({Source}),
    #"Expanded transactions" = Table.ExpandListColumn(#"Converted to Table", "transactions"),
    #"Expanded transactions1" = Table.ExpandRecordColumn(#"Expanded transactions", "transactions", {"transaction_id", "customer_id", "order_date", "items", "total_amount"}, {"transactions.transaction_id", "transactions.customer_id", "transactions.order_date", "transactions.items", "transactions.total_amount"}),
    #"Expanded transactions.items" = Table.ExpandListColumn(#"Expanded transactions1", "transactions.items"),
    #"Expanded transactions.items1" = Table.ExpandRecordColumn(#"Expanded transactions.items", "transactions.items", {"item_id", "description", "quantity", "price_per_unit"}, {"transactions.items.item_id", "transactions.items.description", "transactions.items.quantity", "transactions.items.price_per_unit"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded transactions.items1",{{"transactions.transaction_id", type text}, {"transactions.customer_id", type text}, {"transactions.order_date", type date}, {"transactions.items.item_id", type text}, {"transactions.items.description", type text}, {"transactions.items.quantity", Int64.Type}, {"transactions.items.price_per_unit", Int64.Type}, {"transactions.total_amount", Int64.Type}})
in
    #"Changed Type"

 

 

 

 

Refreshing that PBIX file results in 1 run* of the Power Automate flow.
So it seems the Power Query engine evaluates the main (loaded) query and all the referenced queries, and determines that the source is the same, and therefore can call the source once and then cache the source data. So this (referencing other queries) seems to be just as performant as referencing an existing variable inside the main query.

 

frithjof_v_4-1721119233624.png

 

*Sometimes 2 runs actually (see the Refresh PBIX on the bottom).
I don't know why, maybe one of the runs is some "test connection" which Power Query makes.
Perhaps this is a difference between this option 1. and another option 3. (see below) which references a variable inside the main query.
However I need to try to do more runs to see if also option 3. will produce 2 runs sometimes.

 

 

 

2. Loading multiple queries.

The only difference from option 1. is that in option 2. I have also enabled load on all the queries.

 

Now, because I have enabled load on all the queries, one refresh of this PBIX file results in multiple runs of the Power Automate flow. I think it is because Power Query cannot share cached results between different loaded queries.

 

frithjof_v_5-1721119379786.png

 

frithjof_v_6-1721119540987.png

 

 

3. Merging with a variable defined inside the main query.

Here, I only have one query, and I am merging a variable inside the query (which calls the API) multiple times into the "main branch" of the query.

 

frithjof_v_7-1721120357520.png

 

 

let
    Source = Json.Document(Web.Contents("<URL Endpoint>")),
    #"Converted to Table" = Table.FromRecords({Source}),
    #"Expanded transactions" = Table.ExpandListColumn(#"Converted to Table", "transactions"),
    #"Expanded transactions1" = Table.ExpandRecordColumn(#"Expanded transactions", "transactions", {"transaction_id", "customer_id", "order_date", "items", "total_amount"}, {"transactions.transaction_id", "transactions.customer_id", "transactions.order_date", "transactions.items", "transactions.total_amount"}),
    #"Expanded transactions.items" = Table.ExpandListColumn(#"Expanded transactions1", "transactions.items"),
    #"Expanded transactions.items1" = Table.ExpandRecordColumn(#"Expanded transactions.items", "transactions.items", {"item_id", "description", "quantity", "price_per_unit"}, {"transactions.items.item_id", "transactions.items.description", "transactions.items.quantity", "transactions.items.price_per_unit"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded transactions.items1",{{"transactions.transaction_id", type text}, {"transactions.customer_id", type text}, {"transactions.order_date", type date}, {"transactions.items.item_id", type text}, {"transactions.items.description", type text}, {"transactions.items.quantity", Int64.Type}, {"transactions.items.price_per_unit", Int64.Type}, {"transactions.total_amount", Int64.Type}}),
    Source_B = Json.Document(Web.Contents("<URL Endpoint>")),
    #"Converted to Table_B" = Table.FromRecords({Source_B}),
    #"Expanded transactions_B" = Table.ExpandListColumn(#"Converted to Table_B", "transactions"),
    #"Expanded transactions1_B" = Table.ExpandRecordColumn(#"Expanded transactions_B", "transactions", {"transaction_id", "customer_id", "order_date", "items", "total_amount"}, {"transactions.transaction_id", "transactions.customer_id", "transactions.order_date", "transactions.items", "transactions.total_amount"}),
    #"Expanded transactions.items_B" = Table.ExpandListColumn(#"Expanded transactions1_B", "transactions.items"),
    #"Expanded transactions.items1_B" = Table.ExpandRecordColumn(#"Expanded transactions.items_B", "transactions.items", {"item_id", "description", "quantity", "price_per_unit"}, {"transactions.items.item_id", "transactions.items.description", "transactions.items.quantity", "transactions.items.price_per_unit"}),
    #"Changed Type_B" = Table.TransformColumnTypes(#"Expanded transactions.items1_B",{{"transactions.transaction_id", type text}, {"transactions.customer_id", type text}, {"transactions.order_date", type date}, {"transactions.items.item_id", type text}, {"transactions.items.description", type text}, {"transactions.items.quantity", Int64.Type}, {"transactions.items.price_per_unit", Int64.Type}, {"transactions.total_amount", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"transactions.transaction_id", "transactions.items.item_id"}, #"Changed Type_B", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult", JoinKind.LeftOuter),
    #"Expanded MergeResult" = Table.ExpandTableColumn(#"Merged Queries", "MergeResult", {"transactions.total_amount"}, {"B.transactions.total_amount"}),
    #"Merged Queries2" = Table.NestedJoin(#"Expanded MergeResult", {"transactions.transaction_id", "transactions.items.item_id"}, #"Changed Type_B", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult2", JoinKind.LeftOuter),
    #"Expanded MergeResult2" = Table.ExpandTableColumn(#"Merged Queries2", "MergeResult2", {"transactions.total_amount"}, {"B.transactions.total_amount2"}),
    #"Merged Queries3" = Table.NestedJoin(#"Expanded MergeResult2", {"transactions.transaction_id", "transactions.items.item_id"}, #"Changed Type_B", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult3", JoinKind.LeftOuter),
    #"Expanded MergeResult3" = Table.ExpandTableColumn(#"Merged Queries3", "MergeResult3", {"transactions.total_amount"}, {"B.transactions.total_amount3"}),
    #"Merged Queries4" = Table.NestedJoin(#"Expanded MergeResult3", {"transactions.transaction_id", "transactions.items.item_id"}, #"Changed Type_B", {"transactions.transaction_id", "transactions.items.item_id"}, "MergeResult4", JoinKind.LeftOuter),
    #"Expanded MergeResult4" = Table.ExpandTableColumn(#"Merged Queries4", "MergeResult4", {"transactions.total_amount"}, {"B.transactions.total_amount4"})
in
    #"Expanded MergeResult4"

 

 

 

frithjof_v_0-1721121681419.png

 

This also results in only 1 run of the Power Automate flow each time I refresh the PBIX.
* However, as mentioned above, I need to test more to see if this sometimes results in 2 runs of the Power Automate flow, as option 1. does. 

 

 

Just for reference, the JSON returned by the API is just some dummy data which ChatGPT generated for me for the sake of this experiment:

 

 

{
  "transactions": [
    {
      "transaction_id": "T1001",
      "customer_id": "C1001",
      "order_date": "2024-07-01",
      "items": [
        {
          "item_id": "I1001",
          "description": "Laptop",
          "quantity": 1,
          "price_per_unit": 1200
        },
        {
          "item_id": "I1002",
          "description": "Mouse",
          "quantity": 2,
          "price_per_unit": 25
        }
      ],
      "total_amount": 1250
    },
    {
      "transaction_id": "T1002",
      "customer_id": "C1002",
      "order_date": "2024-07-03",
      "items": [
        {
          "item_id": "I1003",
          "description": "Smartphone",
          "quantity": 1,
          "price_per_unit": 800
        },
        {
          "item_id": "I1004",
          "description": "Earphones",
          "quantity": 1,
          "price_per_unit": 50
        }
      ],
      "total_amount": 850
    },
    {
      "transaction_id": "T1003",
      "customer_id": "C1001",
      "order_date": "2024-07-05",
      "items": [
        {
          "item_id": "I1005",
          "description": "Tablet",
          "quantity": 1,
          "price_per_unit": 600
        }
      ],
      "total_amount": 600
    },
    {
      "transaction_id": "T1004",
      "customer_id": "C1003",
      "order_date": "2024-07-07",
      "items": [
        {
          "item_id": "I1006",
          "description": "Monitor",
          "quantity": 2,
          "price_per_unit": 200
        }
      ],
      "total_amount": 400
    },
    {
      "transaction_id": "T1005",
      "customer_id": "C1004",
      "order_date": "2024-07-08",
      "items": [
        {
          "item_id": "I1007",
          "description": "Keyboard",
          "quantity": 1,
          "price_per_unit": 100
        },
        {
          "item_id": "I1002",
          "description": "Mouse",
          "quantity": 1,
          "price_per_unit": 25
        }
      ],
      "total_amount": 125
    },
    {
      "transaction_id": "T1006",
      "customer_id": "C1005",
      "order_date": "2024-07-09",
      "items": [
        {
          "item_id": "I1008",
          "description": "Headphones",
          "quantity": 1,
          "price_per_unit": 150
        }
      ],
      "total_amount": 150
    },
    {
      "transaction_id": "T1007",
      "customer_id": "C1006",
      "order_date": "2024-07-10",
      "items": [
        {
          "item_id": "I1009",
          "description": "Printer",
          "quantity": 1,
          "price_per_unit": 300
        }
      ],
      "total_amount": 300
    },
    {
      "transaction_id": "T1008",
      "customer_id": "C1007",
      "order_date": "2024-07-11",
      "items": [
        {
          "item_id": "I1010",
          "description": "Webcam",
          "quantity": 1,
          "price_per_unit": 80
        }
      ],
      "total_amount": 80
    },
    {
      "transaction_id": "T1009",
      "customer_id": "C1008",
      "order_date": "2024-07-12",
      "items": [
        {
          "item_id": "I1011",
          "description": "External Hard Drive",
          "quantity": 1,
          "price_per_unit": 120
        }
      ],
      "total_amount": 120
    },
    {
      "transaction_id": "T1010",
      "customer_id": "C1009",
      "order_date": "2024-07-13",
      "items": [
        {
          "item_id": "I1012",
          "description": "USB-C Hub",
          "quantity": 1,
          "price_per_unit": 50
        },
        {
          "item_id": "I1013",
          "description": "Charging Cable",
          "quantity": 2,
          "price_per_unit": 10
        }
      ],
      "total_amount": 70
    },
    {
      "transaction_id": "T1011",
      "customer_id": "C1010",
      "order_date": "2024-07-14",
      "items": [
        {
          "item_id": "I1014",
          "description": "Wireless Charger",
          "quantity": 1,
          "price_per_unit": 40
        }
      ],
      "total_amount": 40
    },
    {
      "transaction_id": "T1012",
      "customer_id": "C1011",
      "order_date": "2024-07-15",
      "items": [
        {
          "item_id": "I1015",
          "description": "Bluetooth Speaker",
          "quantity": 1,
          "price_per_unit": 60
        }
      ],
      "total_amount": 60
    },
    {
      "transaction_id": "T1013",
      "customer_id": "C1012",
      "order_date": "2024-07-16",
      "items": [
        {
          "item_id": "I1016",
          "description": "Smartwatch",
          "quantity": 1,
          "price_per_unit": 200
        }
      ],
      "total_amount": 200
    },
    {
      "transaction_id": "T1014",
      "customer_id": "C1013",
      "order_date": "2024-07-17",
      "items": [
        {
          "item_id": "I1017",
          "description": "Fitness Tracker",
          "quantity": 1,
          "price_per_unit": 100
        }
      ],
      "total_amount": 100
    },
    {
      "transaction_id": "T1015",
      "customer_id": "C1014",
      "order_date": "2024-07-18",
      "items": [
        {
          "item_id": "I1018",
          "description": "VR Headset",
          "quantity": 1,
          "price_per_unit": 400
        }
      ],
      "total_amount": 400
    },
    {
      "transaction_id": "T1016",
      "customer_id": "C1015",
      "order_date": "2024-07-19",
      "items": [
        {
          "item_id": "I1019",
          "description": "Drone",
          "quantity": 1,
          "price_per_unit": 500
        }
      ],
      "total_amount": 500
    },
    {
      "transaction_id": "T1017",
      "customer_id": "C1016",
      "order_date": "2024-07-20",
      "items": [
        {
          "item_id": "I1020",
          "description": "Gaming Console",
          "quantity": 1,
          "price_per_unit": 450
        }
      ],
      "total_amount": 450
    },
    {
      "transaction_id": "T1018",
      "customer_id": "C1017",
      "order_date": "2024-07-21",
      "items": [
        {
          "item_id": "I1021",
          "description": "Action Camera",
          "quantity": 1,
          "price_per_unit": 300
        }
      ],
      "total_amount": 300
    },
    {
      "transaction_id": "T1019",
      "customer_id": "C1018",
      "order_date": "2024-07-22",
      "items": [
        {
          "item_id": "I1022",
          "description": "E-Reader",
          "quantity": 1,
          "price_per_unit": 120
        }
      ],
      "total_amount": 120
    }
  ]
}

 

 

 

Nice testing! Thanks for the new insights...

PwerQueryKees
Super User
Super User

Ok. Potatoe potatoe i guess. The only point I was trying to make is that the variables only get a value at the moment you reference them. Unlike in mist other popular languages where the assignment is executed as statements in the program flow. 

This concept is also key in understanding M execution flow. It has also performance consequences because M will to a large extent only compute what is actually needed, potentially retrieving a lot less data than you would think if you regard the various variable assignments as sequential steps.

And I looked it up. The language reference calls them variables.

I stand corrected... 😬

PwerQueryKees
Super User
Super User

I don't know if there is a performance difference. I never noticed it, but that doesn' t mean anything.

 

BUT: Talking about concepts: What you call variables, really aren't! Conceptually they are named references to another formula.

M is a functional programming language. It starts at the end en then evaluates all names it encounters as needed. Implying that names not needed, are never evaluated!

Thank you @PwerQueryKees!

 

By reading these pages, I get the impression that each step ("named identifier") in the M code is a separate variable:

 

M Language basic concepts - PowerQuery M | Microsoft Learn

Power Query M Primer (Part 4): Variables & Identifiers | Ben Gribaudo

 

So when I was referring to variables, I simply mean a step ("named identifier") in the M code.

 

Isn't that correct?

 

I'm interested to learn more about the nuances here 😀

I'm struggling to see the difference between a variable and a step in the M code.

 

Isn't each step in the M code a variable?

 

(And I guess each variable can only be accessed locally inside the M query where the variable (i.e. step) was created?
Meaning, I cannot access a variable (step) created inside one M query from another M query?).

 

Thanks!

Calling them variables suggests that you are storing a value there.

In M the steps are not executed sequentially.

If you execute a query, power query looks at what is after the in keyword. And the it works back.

The names are much more like a function call than a variable. It work a little bit like Excel itself does.

Therefore you can add a name in a query for free, as long as you don't reference it!

And no, you can't use a name inside another query. You can't even use a name inside another let statement within the same query, unless you let statement is 'inside' it.

I know it is a non intuitive way of thinking about queries in power query. And the UI representation as steps doesn't help. Also the preview is misleading as it suggests all steps are always evaluated...

Hope this helps...

Thanks @PwerQueryKees ! This is helpful and great food for thought 💡

 

I know that the let expression works in such a way that it starts with the output - namely, the variable(s) mentioned after the in keyword - and then only evaluates the steps (variables) in the let expression which are required to produce the output (i.e. working it's way backwards when preparing the query execution plan).

 

However it still seems to me that each step (named identifier) in the let expression is regarded as a variable in the M language (regardless of whether the variable is being used or not):

 

https://learn.microsoft.com/en-us/powerquery-m/m-spec-basic-concepts#environments-and-variables

 

"Expressions are evaluated within a given environment. An environment is a set of named values, called variables. Each variable in an environment has a unique name within the environment called an identifier."

 

Ref. also https://bengribaudo.com/blog/2017/11/17/4107/power-query-m-primer-part1-introduction-simple-expressi...

 

"However, let lets us (pun intended) define intermediate expressions whose results are assigned to variables.

 

Breaking a large expression into intermediate components with names assigned to each makes our code easier to read. Doing this also allows us to reference those intermediate components multiple times as we build toward the let expression’s final return value.

 

let

  Multiplicand = 10,

  Multiplier = 20

in

  Multiplicand * Multiplier

 

In the first part of this let expression, variables are defined by name, each followed by an equals sign then the expression producing the value to be assigned to the variable. The variable definitions are separated by commas. The in part of the expression is simply an expression that defines what let returns.

 

Since the purpose of let is to allow defining variables, it makes sense that at least one of them would be used in the in expression. However, this isn’t mandatory. in can ignore all of the just-defined variables and return something else:

 

let

  Multiplicand = 10,

  Multiplier = 20

in

  2 + 3"

 

(Credit to Ben Gribaudo for the above).

 

Even if the variables assigned in the examples above are just number values, I think the output value of any step in a let expression is regarded as a variable in the M language (i.e. the value assigned to a variable in M can be a table, a list, a record, a json, a binary file, or simply a number, datetime value, text value, etc.). Isn't that correct?

 

Basically, the way I understand it, all of these named identifiers (highlighted with yellow color) are regarded as variables in M:

frithjof_v_0-1721030214435.png

 

Are there any named steps inside a let expression which cannot be regarded as a variable?

 

Again, I am eager to learn about any nuances here 😀 Thanks!

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.