Hi all, thanks in advance.
I need help calculating Lag and Lead values in Power Query.
Sample data is attached; your valuable replies are most appreciated.
| SITE CALL NAME | UNIQUE ID | FLOW RATE | VOL | AVG PRES | AVG TEMP | LAG SITE CALL NAME | LAG FLOW RATE | LAG UNIQUE ID | LEAD SITE CALL NAME | LEAD FLOW RATE | LEAD UNIQUE ID |
|---|---|---|---|---|---|---|---|---|---|---|---|
| EP | 9005411 | 662 | 39462 | 4.7 | 4.15 | Null | Null | Null | EP | 663 | 9005411 |
| EP | 9005411 | 663 | 600.87701 | 4.7 | 4.15 | EP | 662 | 9005411 | EP | 675 | 9005411 |
| EP | 9005411 | 675 | 1.9871558 | 4.7 | 4.15 | EP | 663 | 9005411 | EP | 677 | 9005411 |
| EP | 9005411 | 677 | 2.9995337 | 4.7 | 4.15 | EP | 675 | 9005411 | EP | 664 | 9005411 |
| EP | 9005411 | 664 | 4.5045409 | 4.7 | 4.15 | EP | 677 | 9005411 | EP | 668 | 9005411 |
| EP | 9005411 | 668 | 4.9998097 | 4.7 | 4.15 | EP | 664 | 9005411 | EP | 670 | 9005411 |
| EP | 9005411 | 670 | 6.0068946 | 4.7 | 4.15 | EP | 668 | 9005411 | EP | 668 | 9005411 |
| EP | 9005411 | 668 | 6.9999108 | 4.7 | 4.15 | EP | 670 | 9005411 | EP | 665 | 9005411 |
| EP | 9005411 | 665 | 9.9994822 | 4.7 | 4.15 | EP | 668 | 9005411 | EP | 677 | 9005411 |
| EP | 9005411 | 677 | 10.999551 | 4.7 | 4.15 | EP | 665 | 9005411 | EP | 675 | 9005411 |
Thanks @Greg_Deckler @ImkeF @Stachu for your valuable suggestions.
@Stachu, the logic you suggested produces exactly the result I'm expecting, but my main goal is better performance. Your approach takes a long time because it validates rows for each custom column created for the individual Lag and Lead columns.
Since there seems to be no alternative, I'm keeping my old logic: create two index columns, one starting at 0 and the other at 1, then left-join the table to itself. For Lag, join the 0-based index column to the 1-based one; for Lead, reverse the condition and join the 1-based index column to the 0-based one.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jZExDsMwDAP/4jkRKFuyxQd0LNA9yJBn5PdVO3UqNImDDgTJ42iPV9ta5/687l0jNQE31VSDNnvemCGrndvf5wlIrAWtAiqMpe5RBbqQ9DFWFTBxmBtYB9IhwLLDFGBG1lQH0oGKcmh+AIte3kHxrcl/djjf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [SITE = _t, TIME = _t, #"UNIQUE ID" = _t, VOL = _t, #"AVG DENS" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"SITE", type text}, {"TIME", type date}, {"UNIQUE ID", Int64.Type}, {"VOL", type number}, {"AVG DENS", type number}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
#"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 1, 1),
#"Lag Merged Queries" = Table.NestedJoin(#"Added Index1",{"Index"},#"Added Index1",{"Index.1"},"Added Index1",JoinKind.LeftOuter),
#"Lag merge" = Table.ExpandTableColumn(#"Lag Merged Queries", "Added Index1", {"SITE"}, {"Lag.SITE"}),
#"Lead Merged Queries1" = Table.NestedJoin(#"Lag merge",{"Index.1"},#"Lag merge",{"Index"},"Expanded Added Index1",JoinKind.LeftOuter),
#"Expanded Expanded Added Index1" = Table.ExpandTableColumn(#"Lead Merged Queries1", "Expanded Added Index1", {"SITE"}, {"Lead.SITE"})
in
#"Expanded Expanded Added Index1"
Can you be more specific about what you mean by "lag" and "lead" calculations? Is it something like this:
http://www.oakton.edu/user/4/billtong/eas100lab/lagtime.htm
Hi Greg, thanks for your reply!
Lag means the source column values shifted down by one row into the Lag columns.
E.g., please expand the spoiler tag below for sample Lag data.
| SITE | TIME | UNIQUE ID | VOL | AVG DENS | LAG SITE | LAG TIME | LAG UNIQUE ID |
|---|---|---|---|---|---|---|---|
| EP | 29MAY2018 | 9005411 | 39462 | 868.7 | Null | Null | Null |
| EP | 29MAY2018 | 9005411 | 600.87701 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 1.9871558 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 2.9995337 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 4.5045409 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 4.9998097 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 6.0068946 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 6.9999108 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 9.9994822 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 10.999551 | 868.7 | EP | 29MAY2018 | 9005411 |
Lead means the source column values shifted up by one row into the Lead columns.
E.g., please expand the spoiler tag below for sample Lead data.
| SITE | TIME | UNIQUE ID | VOL | AVG DENS | LEAD SITE | LEAD TIME | LEAD UNIQUE ID |
|---|---|---|---|---|---|---|---|
| EP | 29MAY2018 | 9005411 | 39462 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 600.87701 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 1.9871558 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 2.9995337 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 4.5045409 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 4.9998097 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 6.0068946 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 6.9999108 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 9.9994822 | 868.7 | EP | 29MAY2018 | 9005411 |
| EP | 29MAY2018 | 9005411 | 10.999551 | 868.7 | Null | Null | Null |
Oh heck, you just need EARLIER. And I would strongly recommend inserting an Index column in the Query Editor; that will make it super easy. See my article on Mean Time Between Failure (MTBF), which uses EARLIER: http://community.powerbi.com/t5/Community-Blog/Mean-Time-Between-Failure-MTBF-and-Power-BI/ba-p/3395...
Yes, I tried creating two index columns, one starting at 0 and the other at 1, and then merged the table with itself based on the indexes.
That approach works for a small amount of data, but my dataset is huge; the merge takes a long time and fails with out-of-memory errors. I'm looking for an alternative to the index logic.
And I missed that you want to do this in Power Query instead of DAX. Can you post a snippet of the code you are using right now? Also, if you really want to do this in M, then @ImkeF is probably your best bet.
@Greg_Deckler, yes, I'm looking to handle this in M.
@ImkeF could you please help me on this.
Hi @Anonymous,
You have to add three index columns and merge the table with itself twice: once to get the values from the previous row, and once to get the values from the next row.
There are many tutorials out there (search for "Power Query previous row"); maybe start with this one: https://www.excelguru.ca/blog/2018/02/07/extract-data-based-on-the-previous-row/
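The three-index, double-merge recipe can be sketched in M roughly as follows. This is a minimal illustration, not the linked tutorial's exact code: `Source` and the `SITE` column come from the sample data above, and all step and column names are made up for the example.

```m
let
    // "Index" numbers rows from 0; "PrevIndex" runs one ahead, "NextIndex" one behind
    WithIndex = Table.AddIndexColumn(Source, "Index", 0, 1),
    WithPrev  = Table.AddIndexColumn(WithIndex, "PrevIndex", 1, 1),
    WithNext  = Table.AddIndexColumn(WithPrev, "NextIndex", -1, 1),
    // Merge 1 (lag): row r's Index matches row r-1's PrevIndex
    LagJoin   = Table.NestedJoin(WithNext, {"Index"}, WithNext, {"PrevIndex"}, "Prev", JoinKind.LeftOuter),
    WithLag   = Table.ExpandTableColumn(LagJoin, "Prev", {"SITE"}, {"LAG SITE"}),
    // Merge 2 (lead): row r's Index matches row r+1's NextIndex
    LeadJoin  = Table.NestedJoin(WithLag, {"Index"}, WithLag, {"NextIndex"}, "Next", JoinKind.LeftOuter),
    WithLead  = Table.ExpandTableColumn(LeadJoin, "Next", {"SITE"}, {"LEAD SITE"})
in
    WithLead
```

The first row's lag and the last row's lead simply find no match and come back null from the left outer joins.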
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI learning resources here -- Performance Tips for M-queries
@ImkeF I agree with you. I have done that and achieved the expected output, but my concern is performance: I have 5 million records, and after processing the index and merge overnight, the process fails with "connection dropped" / timeout errors.
An alternative approach would be very much appreciated.
I'm not sure about performance here; the idea is to reference a particular record in the table based on the Index column.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jZAxDsQwCAT/4jqyFhsMPOD666P8/xsBX3eKFJqxJVYrmPNsn287mgPCRPFbawSn83656yZJu46H7EwC3VRB73mVIHU3JREr5HM6urvLnFrYh/dUwMLwQv63Q/QbvNCvSHZgWQgq9q/sd0Lh3pV+PPNso+B/+yFsQfLv/7oB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"SITE CALL NAME" = _t, #"UNIQUE ID" = _t, #"FLOW RATE" = _t, VOL = _t, #"AVG PRES" = _t, #"AVG TEMP" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"SITE CALL NAME", type text}, {"UNIQUE ID", Int64.Type}, {"FLOW RATE", Int64.Type}, {"VOL", type number}, {"AVG PRES", type number}, {"AVG TEMP", type number}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
LAG_Site_Add = Table.AddColumn(#"Added Index", "LAG Site Call Name", each #"Added Index"{[Index]-1}[SITE CALL NAME]),
LEAD_Site_Add = Table.AddColumn(LAG_Site_Add, "LEAD Site Call Name", each LAG_Site_Add{[Index]+1}[SITE CALL NAME]),
#"Replaced Errors" = Table.ReplaceErrorValues(LEAD_Site_Add, { {"LAG Site Call Name", null}, {"LEAD Site Call Name", null}})
in
    #"Replaced Errors"

What I want to ask is: why do you need the lag/lead columns? If they will later be used in DAX, then I'm sure there would be some efficient way of achieving the same result without this complexity.
Are you using additional conditions (like: ...if Prev[Site Call Name] <> [Site Call Name] then null else Value...) that would better be replaced by a grouping, as described here?
I'm not using any other logic; the join on the index columns by itself takes a huge amount of time and memory, and that's the main concern for me.
Hi @Stachu,
this new method should be faster: https://www.thebiccountant.com/2018/07/12/fast-and-easy-way-to-reference-previous-or-next-rows-in-po...
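For anyone who can't follow the link, the gist of that faster approach (as I understand it) is to shift whole columns as lists instead of merging the table with itself. A rough sketch against the sample data, with hypothetical step names:

```m
let
    Site     = #"Changed Type"[SITE],
    // shift the column down one row (prepend a null) for lag,
    // and up one row (drop the first value, append a null) for lead
    LagSite  = {null} & List.RemoveLastN(Site, 1),
    LeadSite = List.Skip(Site, 1) & {null},
    Result   = Table.FromColumns(
                   Table.ToColumns(#"Changed Type") & {LagSite, LeadSite},
                   Table.ColumnNames(#"Changed Type") & {"LAG SITE", "LEAD SITE"})
in
    Result
```

Because this only manipulates lists, it avoids both the row-by-row positional lookups and the self-join entirely.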
@ImkeF that's a really cool approach, thanks for the good read 🙂
@Anonymous, have you looked at the code I posted? There is no explicit join there (though I'm not sure how M handles it in the background). Or have you already tested it and the performance is still bad? Have you considered using Table.Buffer?
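To make the Table.Buffer suggestion concrete, here is a hedged variation of the earlier positional-lookup code: the indexed table is buffered once so each `{[Index] - 1}` access reads from an in-memory copy instead of re-evaluating the upstream steps. Whether this actually helps will depend on the data volume and source.

```m
let
    Buffered = Table.Buffer(Table.AddIndexColumn(#"Changed Type", "Index", 0, 1)),
    // out-of-range positional lookups raise errors, so trap them as nulls
    WithLag  = Table.AddColumn(Buffered, "LAG Site Call Name",
                   each try Buffered{[Index] - 1}[SITE CALL NAME] otherwise null),
    WithLead = Table.AddColumn(WithLag, "LEAD Site Call Name",
                   each try Buffered{[Index] + 1}[SITE CALL NAME] otherwise null)
in
    WithLead
```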