Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Hi,
I am new to PowerBI & Powerquery and I need some help.
I have below table as source:
| job_id | step_id | run_date | run_time | duration |
| 09F901AD-0B49-48A7 | 0 | 20240723 | 160000 | 165051 |
| 09F901AD-0B49-48A7 | 1 | 20240723 | 160001 | 338 |
| 09F901AD-0B49-48A7 | 2 | 20240723 | 160339 | 164711 |
| 10O901XT-1234-4PO8 | 0 | 20240730 | 60000 | 29 |
| 10O901XT-1234-4PO8 | 1 | 20240730 | 60000 | 28 |
Table contains all job steps (can be different steps for each job).
when step_id = 0, it's the job itselve, in this case value of duration refers to complete duration including all steps of this schedule.
What is missing in this table is the schedule_id, I have no link between all steps for 1 specific job.
So I would like to create this new column to have below result:
| job_id | step_id | run_date | run_time | duration | schedule_id |
| 09F901AD-0B49-48A7 | 0 | 20240723 | 160000 | 165051 | 20240723160000 |
| 09F901AD-0B49-48A7 | 1 | 20240723 | 160001 | 338 | 20240723160000 |
| 09F901AD-0B49-48A7 | 2 | 20240723 | 160339 | 164711 | 20240723160000 |
| 10O901XT-1234-4PO8 | 0 | 20240730 | 60000 | 29 | 20240730060000 |
| 10O901XT-1234-4PO8 | 1 | 20240730 | 60000 | 28 | 20240730060000 |
Column schedule_id is the run_date & run_time of step_id = 0 of that schedule. So I need to start from records with step_id = 0 and retrieve all steps > 0 for that job_id where run_datetime is between start_rundatetime and end_rundatetime.
=> start_rundatetime = rundate & runtime of step 0
=> end_rundatetime = rundate & runtime + duration of step 0
Thanx,
Regards
Solved! Go to Solution.
Hi, I haven't used your logic, but result is the same as yours. Let me know if it works with real data.
Output:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZLBCsMgEET/xXOE3VlT12NK6TU99FAI+f/fqBFKqKklqyAMwmMeg8viKN0T8XTzdA3JB52iGxzlC0KgCMmRL5RPCSON7NahwfEPbnsT0TaEAySSSgiR/5TVkrpLwiKpPZK6SyaDJKhvycJZJTPUtSTQt2ThzJI4Lsk0Z+719AwJPjxm/ZKULX6+JFIb4Qaip1rY3sLWFiVzS4Wca4G9BVXL+gY=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [job_id = _t, step_id = _t, run_date = _t, run_time = _t, duration = _t]),
ChangedType = Table.TransformColumnTypes(Source,{{"step_id", Int64.Type}, {"run_date", Int64.Type}, {"run_time", Int64.Type}, {"duration", Int64.Type}}),
GroupedRows = Table.Group(ChangedType, {"job_id", "step_id"}, {{"All", each
[ a = Table.SelectRows(_, (x)=> x[step_id] = 0){0}?,
b = Table.AddColumn(_, "schedule_id", (x)=> Text.From(a[run_date]?) & Text.PadStart(Text.From(a[run_time]?), 6, "0"), type text)
][b], type table}}, GroupKind.Local,
(x,y)=> Number.From(y[step_id] = 0) ),
Combined = Table.Combine(GroupedRows[All])
in
Combined
Hi @Johny1981, check this:
Output
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dY49CoAwDEbv0tnClx9tMyriWgcHQbz/NYyKg2gDgUfgkbdtATYZqB8jBrWouU+hCfBlsCKxOFIHnwtatBT2puLRj3feRHJd4o8kYhdoovsZobi3LpFYNOpc8itSTnwa2eoKVRSv2w8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [job_id = _t, step_id = _t, run_date = _t, run_time = _t, duration = _t]),
ChangedType = Table.TransformColumnTypes(Source,{{"step_id", Int64.Type}, {"run_date", Int64.Type}, {"run_time", Int64.Type}, {"duration", Int64.Type}}),
GroupedRows = Table.Group(ChangedType, {"job_id"}, {{"All", each
[ a = Table.SelectRows(_, (x)=> x[step_id] = 0){0}?,
b = Table.AddColumn(_, "schedule_id", (x)=> Text.From(a[run_date]?) & Text.From(a[run_time]?), type text)
][b], type table}}),
Combined = Table.Combine(GroupedRows[All])
in
Combined
where run_datetime is between start_rundatetime and end_rundatetime.
There is no end_rundatetime column in your sample data. Please provide sample data that fully covers your issue.
Please show the expected outcome based on the sample data you provided.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dY49CoAwDEbv0tnClx9tMyriWgcHQbz/NYyKg2gDgUfgkbdtATYZqB8jBrWouU+hCfBlsCKxOFIHnwtatBT2puLRj3feRHJd4o8kYhdoovsZobi3LpFYNOpc8itSTnwa2eoKVRSv2w8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [job_id = _t, step_id = _t, run_date = _t, run_time = _t, duration = _t]),
#"Grouped Rows" = Table.Group(Source, {"job_id"}, {{"Rows", each _, type table [job_id=nullable text, step_id=nullable text, run_date=nullable text, run_time=nullable text, duration=nullable text]}, {"schedule_id", each List.Min([run_date])&List.Min([run_time]), type nullable text}}),
#"Expanded Rows" = Table.ExpandTableColumn(#"Grouped Rows", "Rows", {"step_id", "run_date", "run_time", "duration"}, {"step_id", "run_date", "run_time", "duration"})
in
#"Expanded Rows"
How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.
@lbendlin @dufoq3 thanks for the code. It's working.
In my case my table contains all job runs, my initial example was a bit too simple I see now. Here is another example with multiple runs for a job, I already included the result column schedule_id
| job_id | step_id | run_date | run_time | duration | schedule_id |
| 09F901AD-0B49-48A7 | 0 | 20240723 | 160000 | 165051 | 20240723160000 |
| 09F901AD-0B49-48A7 | 1 | 20240723 | 160001 | 338 | 20240723160000 |
| 09F901AD-0B49-48A7 | 2 | 20240723 | 160339 | 164711 | 20240723160000 |
| 09F901AD-0B49-48A7 | 0 | 20240723 | 180000 | 165251 | 20240723180000 |
| 09F901AD-0B49-48A7 | 1 | 20240723 | 180001 | 338 | 20240723180000 |
| 09F901AD-0B49-48A7 | 2 | 20240723 | 180339 | 164911 | 20240723180000 |
| 09F901AD-0B49-48A7 | 0 | 20240723 | 200000 | 165051 | 20240723200000 |
| 09F901AD-0B49-48A7 | 1 | 20240723 | 200001 | 338 | 20240723200000 |
| 09F901AD-0B49-48A7 | 2 | 20240723 | 200339 | 164711 | 20240723200000 |
| 09F901AD-0B49-48A7 | 0 | 20240723 | 220000 | 165251 | 20240723220000 |
| 09F901AD-0B49-48A7 | 1 | 20240723 | 220001 | 338 | 20240723220000 |
| 09F901AD-0B49-48A7 | 2 | 20240723 | 220339 | 164911 | 20240723220000 |
| 10O901XT-1234-4PO8 | 0 | 20240730 | 60000 | 29 | 20240730060000 |
| 10O901XT-1234-4PO8 | 1 | 20240730 | 60000 | 28 | 20240730060000 |
| 10O901XT-1234-4PO8 | 0 | 20240731 | 60000 | 29 | 20240731060000 |
| 10O901XT-1234-4PO8 | 1 | 20240731 | 60000 | 28 | 20240731060000 |
| 10O901XT-1234-4PO8 | 0 | 20240801 | 60000 | 29 | 20240801060000 |
| 10O901XT-1234-4PO8 | 1 | 20240801 | 60000 | 28 | 20240801060000 |
| 10O901XT-1234-4PO8 | 0 | 20240802 | 60000 | 29 | 20240802060000 |
| 10O901XT-1234-4PO8 | 1 | 20240802 | 60000 | 28 | 20240802060000 |
startrun_time & endrun_time a jobstep can be calculated as follows:
startrun_time => run_date & run_time
endrun_time => run_date & run_time + duration
format of time is HHMMSS
format of duration is HHMMSS (165051 means 16 hours 50 minuntes 51 seconds and 339 means 3 minutes and 39 seconds)
Step 0 is the job itselve so in case of 1st jobrun of 09F901AD-0B49-48A7, step 1 & 2 falls between 20240723160000 (startrun_time step 0) & 20240724085051 (endrun_time step 0).
Do you guys have any idea how to retrieve the schedule_id in these cases?
Hi, I haven't used your logic, but result is the same as yours. Let me know if it works with real data.
Output:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZLBCsMgEET/xXOE3VlT12NK6TU99FAI+f/fqBFKqKklqyAMwmMeg8viKN0T8XTzdA3JB52iGxzlC0KgCMmRL5RPCSON7NahwfEPbnsT0TaEAySSSgiR/5TVkrpLwiKpPZK6SyaDJKhvycJZJTPUtSTQt2ThzJI4Lsk0Z+719AwJPjxm/ZKULX6+JFIb4Qaip1rY3sLWFiVzS4Wca4G9BVXL+gY=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [job_id = _t, step_id = _t, run_date = _t, run_time = _t, duration = _t]),
ChangedType = Table.TransformColumnTypes(Source,{{"step_id", Int64.Type}, {"run_date", Int64.Type}, {"run_time", Int64.Type}, {"duration", Int64.Type}}),
GroupedRows = Table.Group(ChangedType, {"job_id", "step_id"}, {{"All", each
[ a = Table.SelectRows(_, (x)=> x[step_id] = 0){0}?,
b = Table.AddColumn(_, "schedule_id", (x)=> Text.From(a[run_date]?) & Text.PadStart(Text.From(a[run_time]?), 6, "0"), type text)
][b], type table}}, GroupKind.Local,
(x,y)=> Number.From(y[step_id] = 0) ),
Combined = Table.Combine(GroupedRows[All])
in
Combined
I had to adapt 1 small thing: sorting the table before I pasted your code. It seems your code expects the table in a specific order.
Thanx!
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!
| User | Count |
|---|---|
| 19 | |
| 10 | |
| 9 | |
| 7 | |
| 6 |