March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hello guys, i need help to replicate the same function i have in sql to power query. I need to bring the result of the previous row of the column "ORGANIZATION_UNIT_ID" partitioned by a cluster which in this case is the "GLOBAL_EMPLOYEE_ID" of the person and ordered by the date (LOAD DATE).
here is the expected result:
LOAD_DATE | GLOBAL_EMPLOYEE_ID | ORGANIZATION_UNIT_ID | Last_Organization_Unit_ID |
31/10/2021 | 2000 | 1 | null |
30/11/2021 | 2000 | 1 | 1 |
31/12/2021 | 2000 | 1 | 1 |
31/01/2022 | 2000 | 2 | 1 |
28/02/2022 | 2000 | 2 | 2 |
31/03/2022 | 2000 | 2 | 2 |
30/04/2022 | 2000 | 1 | 2 |
31/05/2022 | 2000 | 1 | 1 |
30/06/2022 | 2000 | 3 | 1 |
31/10/2021 | 1000 | 1 | null |
30/11/2021 | 1000 | 1 | 1 |
31/12/2021 | 1000 | 2 | 1 |
31/01/2022 | 1000 | 2 | 2 |
28/02/2022 | 1000 | 3 | 2 |
31/03/2022 | 1000 | 5 | 3 |
30/04/2022 | 1000 | 5 | 5 |
31/05/2022 | 1000 | 5 | 5 |
30/06/2022 | 1000 | 5 | 5 |
This is the code I use in sql:
SELECT
LOAD_DATE,
ORGANIZATION_UNIT_ID,
GLOBAL_EMPLOYEE_ID,
LEAD(ORGANIZATION_UNIT_ID, 1)
OVER (PARTITION BY GLOBAL_EMPLOYEE_ID
ORDER BY LOAD_DATE DESC
) AS Last_Org_Unit_ID
FROM Table
I've already tried to merge the table with it even using load date -1, but the mass of data I have is very large and makes the process of becoming a data flow unfeasible
If anyone has any ideas, I'd appreciate it.
You can write that SQL right into a dataflow.
--Nate
I always find that writing the LEAD/LAG SQL is faster than any other way of doing previous row stuff. If you are using PBI Desktop, you can add "EnableFolding = true" to the Value.NativeQuery parameter to keep the query folding.
--Nate
Hi @watkinnc. I totally agree with you. In SQL to do this same treatment takes a maximum of 2 minutes. In power query, it takes hours. The problem with this and the reason is that in a few days the base that I do this in sql will be made available to me only through a data flow.
You could also do the classic indexes starting from zero and then starting from one and then merge the table to itself.
--Nate
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dc9BDoAgDETRu7Am6UwR4l2I97+GSAxooSsWLxP6aw2JQohCGWJQAO1huGITCLmXtlFP0Dc6RbvoKdCtPJu0FwgOI/OfvJe2KUbSuHqU0i2lW0p79SilW0pzwSx9Ja+lf/mWLpuyynUD", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [LOAD_DATE = _t, GLOBAL_EMPLOYEE_ID = _t, ORGANIZATION_UNIT_ID = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"LOAD_DATE", type text}, {"GLOBAL_EMPLOYEE_ID", Int64.Type}, {"ORGANIZATION_UNIT_ID", Int64.Type}}),
ListOfIDs = List.Buffer(#"Changed Type"[GLOBAL_EMPLOYEE_ID]),
ListOfValues = List.Buffer(#"Changed Type"[ORGANIZATION_UNIT_ID]),
GenList = List.Generate(()=>[x=null,i=0], each [i]<List.Count(ListOfValues), each [i=[i]+1, x=(if ListOfIDs{i}=ListOfIDs{[i]} then ListOfValues{[i]} else null)], each [x]),
Result = Table.FromColumns(Table.ToColumns(#"Changed Type")&{GenList},Table.ColumnNames(#"Changed Type")&{"Last_Organization_Unit_ID"})
in
Result
Hi @Vijay_A_Verma. Thanks a lot for the solution. It brought the expected result, but not in the required performance. I applied it to my dataset and it's been two hours processing the data flow.
I'm not an expert on the subject, but I've been trying to evolve from a grouping made by "Global ID". I can add an index inside the virtual table. The expected is that it was already possible to create a column referencing the previous result. But according to some tutorials I've seen this is only possible by referencing the previous step. (https://www.youtube.com/watch?v=UPddzZnsf5w)
Due to the complexity of the code you gave me, I'm sure you could do something like that. I believe it can be more performative for the data profile I'm working on (a few million lines) kkk.
here is the code i am using:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dc9BDoAgDETRu7Am6UwR4l2I97+GSAxooSsWLxP6aw2JQohCGWJQAO1huGITCLmXtlFP0Dc6RbvoKdCtPJu0FwgOI/OfvJe2KUbSuHqU0i2lW0p79SilW0pzwSx9Ja+lf/mWLpuyynUD", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [LOAD_DATE = _t, GLOBAL_EMPLOYEE_ID = _t, ORGANIZATION_UNIT_ID = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"LOAD_DATE", type date}, {"GLOBAL_EMPLOYEE_ID", Int64.Type}, {"ORGANIZATION_UNIT_ID", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"GLOBAL_EMPLOYEE_ID"}, {{"Data", each _, type table [LOAD_DATE=nullable date, GLOBAL_EMPLOYEE_ID=nullable number, ORGANIZATION_UNIT_ID=nullable number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "DataIndex", each Table.AddIndexColumn([Data],"Index",1))
in
#"Added Custom"
If I could do something along those lines, that would be great.
in advance, thank you very much
Hi Vijay,
This is great code. Can you explain what's going on in this code?
List.Generate(()=>[x=null,i=0], each [i]<List.Count(ListOfValues), each [i=[i]+1, x=(if ListOfIDs{i}=ListOfIDs{[i]} then ListOfValues{[i]} else null)], each [x])
What I undestood is:
()=>[x=null,i=0] - this creates record object with values x=null and i=0
each [i]<List.Count(ListOfValues) - this checks if iteration is smaller then number of items in list
each [i=[i]+1, x=(if ListOfIDs{i}=ListOfIDs{[i]} - this creates record with column i equal i + 1 and column x - here I don't understand what is happening. ListOfIDs{i} means get element i from list ListOfIDs but what does it does ListOfIDs{[i]} ?
last part returns x calculated earlier.
Thanks in advance for your explanantion.
Artur
i = [i]+1, hence in ListOfIDs{i}=ListOfIDs{[i]}, I can also use ListOfIDs{i}=ListOfIDs{i-1} => I am checking wether current value of list is = previous value of list or not. If it is equal, it means same group continues. If it is unequal, then a new group has started. If a new group is started, then result should be null. If within same group, then value is ListOfValues{[i]} where [i] means i-1. So I could have written it ListOfValues{i-1} as well.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.