Hello, this is my first time here and I would be very grateful if someone could help me with this task...
I have an extract of 3 days of alarm logs from different machines; each log line has its StartTime and EndTime columns filled. It's very common for some alarms to overlap others in various ways (complete overlap, partial overlap, or no overlap).
I need to process this data to summarize the total downtime (with no overlaps; I also saw the term "saturation" in other posts) per machine per day:
Type | AEG | AlarmCode | Index | StartTime | EndTime |
ERROR | Machine 1 | 239 | 535 | 2023-11-01T14:24:43.6210000 | 2023-11-01T14:26:33.8710000 |
ERROR | Machine 2 | 81 | 492 | 2023-11-02T08:04:37.2480000 | 2023-11-02T08:10:45.5140000 |
ERROR | Machine 2 | 22 | 493 | 2023-11-02T08:04:42.9820000 | 2023-11-02T08:10:45.5140000 |
ERROR | Machine 2 | 1405 | 486 | 2023-11-02T08:04:45.4760000 | 2023-11-02T08:10:47.5090000 |
ERROR | Machine 2 | 22 | 475 | 2023-11-02T08:22:02.5210000 | 2023-11-02T08:22:23.7340000 |
ERROR | Machine 2 | 1405 | 469 | 2023-11-02T08:22:03.7670000 | 2023-11-02T08:25:33.2080000 |
ERROR | Machine 2 | 22 | 473 | 2023-11-02T08:22:23.9630000 | 2023-11-02T08:25:32.7090000 |
ERROR | Machine 2 | 81 | 436 | 2023-11-02T08:36:05.2700000 | 2023-11-02T09:00:49.0560000 |
ERROR | Machine 2 | 22 | 389 | 2023-11-02T09:32:13.5200000 | 2023-11-02T10:45:16.1370000 |
ERROR | Machine 2 | 1405 | 384 | 2023-11-02T09:32:17.5090000 | 2023-11-02T10:45:16.1370000 |
ERROR | Machine 2 | 81 | 388 | 2023-11-02T09:32:26.2360000 | 2023-11-02T10:45:16.1370000 |
INFO | Machine 2 | 1000024 | 383 | 2023-11-02T09:52:47.2220000 | 2023-11-02T10:45:17.8830000 |
ERROR | Machine 3 | 101 | 356 | 2023-11-02T11:47:39.3940000 | 2023-11-02T12:46:25.8990000 |
INFO | Machine 3 | 1000024 | 359 | 2023-11-02T11:59:04.8150000 | 2023-11-02T12:44:09.4330000 |
ERROR | Machine 4 | 101 | 113 | 2023-11-03T14:23:20.8090000 | 2023-11-03T16:21:58.1280000 |
ERROR | Machine 4 | 22 | 117 | 2023-11-03T14:52:21.4570000 | 2023-11-03T16:21:58.1280000 |
ERROR | Machine 4 | 23 | 116 | 2023-11-03T14:52:21.4570000 | 2023-11-03T16:21:58.1280000 |
ERROR | Machine 4 | 28 | 114 | 2023-11-03T14:52:21.4570000 | 2023-11-03T16:21:58.1280000 |
ERROR | Machine 4 | 27 | 115 | 2023-11-03T14:52:21.4570000 | 2023-11-03T16:21:58.1280000 |
ERROR | Machine 4 | 3022 | 99 | 2023-11-03T14:52:25.6960000 | 2023-11-03T16:21:58.1280000 |
INFO | Machine 4 | 1000024 | 92 | 2023-11-03T16:15:12.7890000 | 2023-11-03T16:21:58.7270000 |
ERROR | Machine 4 | 33 | 89 | 2023-11-03T16:22:04.4610000 | 2023-11-03T16:22:08.7020000 |
ERROR | Machine 4 | 33 | 84 | 2023-11-03T16:22:10.4460000 | 2023-11-03T16:22:13.4380000 |
ERROR | Machine 4 | 33 | 80 | 2023-11-03T16:22:16.4300000 | 2023-11-03T16:22:18.4270000 |
ERROR | Machine 4 | 33 | 74 | 2023-11-03T16:22:22.4140000 | 2023-11-03T16:24:00.6540000 |
INFO | Machine 4 | 1000024 | 73 | 2023-11-03T16:23:47.9280000 | 2023-11-03T16:24:00.8940000 |
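To illustrate the overlap handling with the first three Machine 2 rows above: those alarms run 08:04:37–08:10:45, 08:04:42–08:10:45 and 08:04:45–08:10:47, so together they cover 08:04:37–08:10:47, about 6 minutes 10 seconds of downtime, rather than the roughly 18 minutes you would get by summing the three durations individually.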
The expected output is a table like the one below, ideally using Power Query (but I also accept other suggestions):
Thank you!
Here is an approach proposed by @jpessoa8:
Hours =
// d: one row per minute of each date in the 'Dates' table, keyed as minutes since day zero
VAR d =
    SELECTCOLUMNS(
        ADDCOLUMNS(
            CROSSJOIN( 'Dates', GENERATESERIES( 0, 1439 ) ),
            "ms", 'Dates'[Date] * 1440 + [Value]
        ),
        "Value", [ms]
    )
// p: every minute covered by at least one outage (may contain duplicates; intersecting with d counts each minute once)
VAR p =
    SELECTCOLUMNS(
        GENERATE(
            VALUES( 'Outages'[StartTime] ),
            VAR EndValue = CALCULATE( MAX( 'Outages'[EndTime] ) )
            RETURN GENERATESERIES( INT( 'Outages'[StartTime] * 1440 ), EndValue * 1440 - 1 )
        ),
        "Value", [Value]
    )
RETURN
    // minutes common to both sets, converted to hours
    COUNTROWS( INTERSECT( d, p ) ) / 60
see attached
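As a side note, the measure above assumes a 'Dates' calendar table with a [Date] column spanning the reporting period. A minimal sketch of building one in Power Query for the three days in the sample (the table and column names here are assumptions, not part of the original post):
let
    // one row per calendar day in the sample period (2023-11-01 .. 2023-11-03)
    DayList = List.Dates(#date(2023, 11, 1), 3, #duration(1, 0, 0, 0)),
    AsTable = Table.FromList(DayList, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(AsTable, {{"Date", type date}})
in
    Typed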
And here is a Power Query (M code) solution with output in whole minutes:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("tZZNaxwxDIb/StlzIqwvW/K9hR7awJJb2EMIgfSS/3+sxptss7YnO23ZOQwG4Ucvrz5mHh52X/f7u/3uZvfj8enl1+vzF4wzscdbWZdzIr5FvE14j1JJqjBkwhTPGM2VGawco4ebkU5xtiWFOH28TvfJapLKBUish7copioKivIpnKjBeQYXAjf6D3iEF0vE8hSvICWv4gto8g3aiw7XiWoi0NH19ygxFN6mPfsUH4Bc5nhdikrJtmgffT+q88yrcIJywZhjx/DoOueaFKgJ7+FeU7jukDRfVs7W2+IhrCKH6xN4a5eKGZDLFtfZZIr/0xP/hm/GsNkMThmIJ+04hX//+e2ul74ESVqCvqxelZaOJpqM01uCAma8qp5bhiZfu7oiBrqyA7tM6JE4R9+Aua/J53P56kMCje4QMNR5AqnJQXhdvpzkI56Zw20PcqUENpY2oqE90hsgrQ+UvLclYhng4TwhiA7T+jfw5hDm68CtweU68NLgehU4p+a6+5SukH2YpxV615By1pDnn75GwBiY2IP2WccUWl8GTf1SVOu157beBSQPX49TNNiJNrD7krbbmEBkxZe2QYUvuN7Y89s5bg/79xQ1kC2elKluIhAcFswxGvOfIKtsq2fpN0BeNkCsR6fhV+YD397W2+HwGw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Type = _t, AEG = _t, AlarmCode = _t, Index = _t, StartTime = _t, EndTime = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Type", type text}, {"AEG", type text}, {"AlarmCode", Int64.Type}, {"Index", Int64.Type},
{"StartTime", type datetime}, {"EndTime", type datetime}}),
#"Round to Minute" = Table.TransformColumns(#"Changed Type",{
{"StartTime", each DateTime.From(Number.Round(Number.From(_)*1440,0)/1440), type datetime},
{"EndTime", each DateTime.From(Number.Round(Number.From(_)*1440,0)/1440), type datetime}}),
#"Add Minutes List" = Table.AddColumn(#"Round to Minute","Minutes",
each List.DateTimes([StartTime], Duration.TotalMinutes([EndTime]-[StartTime]),#duration(0,0,1,0))),
#"Grouped Rows" = Table.Group(#"Add Minutes List", {"AEG"}, {
{"all", each _,
type table [Type=nullable text, AEG=nullable text, AlarmCode=nullable number,
Index=nullable number, StartTime=datetime, EndTime=datetime, Minutes=list]}}),
// Combine all per-row minute lists into one distinct list per machine,
// convert each minute to its date so we can count minutes per date,
// then build a table and pivot the dates into columns
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Minutes", each
let
datesList =
List.Transform(
List.Distinct(
List.Combine([all][Minutes])),
each Date.From(_)),
tbl = Table.FromColumns({datesList},{"Dates"}),
#"Dates to Text" = Table.TransformColumns(tbl,{"Dates", each Date.ToText(_,"yyyy-MM-dd")}),
addCounter = Table.AddColumn(#"Dates to Text","Counter", each 1, Int64.Type),
pivot = Table.Pivot(addCounter, List.Distinct(addCounter[Dates]),"Dates","Counter",List.Sum)
in
pivot),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"all"}),
#"ColumnNames in Pivot" = List.Distinct(List.Combine(List.Transform(#"Removed Columns"[Minutes], each Table.ColumnNames(_)))),
#"Expand Table" = Table.ExpandTableColumn(#"Removed Columns","Minutes",#"ColumnNames in Pivot")
in
#"Expand Table"
You can refactor that to:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("tZZNaxwxDIb/StlzIqwvW/K9hR7awJJb2EMIgfSS/3+sxptss7YnO23ZOQwG4Ucvrz5mHh52X/f7u/3uZvfj8enl1+vzF4wzscdbWZdzIr5FvE14j1JJqjBkwhTPGM2VGawco4ebkU5xtiWFOH28TvfJapLKBUish7copioKivIpnKjBeQYXAjf6D3iEF0vE8hSvICWv4gto8g3aiw7XiWoi0NH19ygxFN6mPfsUH4Bc5nhdikrJtmgffT+q88yrcIJywZhjx/DoOueaFKgJ7+FeU7jukDRfVs7W2+IhrCKH6xN4a5eKGZDLFtfZZIr/0xP/hm/GsNkMThmIJ+04hX//+e2ul74ESVqCvqxelZaOJpqM01uCAma8qp5bhiZfu7oiBrqyA7tM6JE4R9+Aua/J53P56kMCje4QMNR5AqnJQXhdvpzkI56Zw20PcqUENpY2oqE90hsgrQ+UvLclYhng4TwhiA7T+jfw5hDm68CtweU68NLgehU4p+a6+5SukH2YpxV615By1pDnn75GwBiY2IP2WccUWl8GTf1SVOu157beBSQPX49TNNiJNrD7krbbmEBkxZe2QYUvuN7Y89s5bg/79xQ1kC2elKluIhAcFswxGvOfIKtsq2fpN0BeNkCsR6fhV+YD397W2+HwGw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Type = _t, AEG = _t, AlarmCode = _t, Index = _t, StartTime = _t, EndTime = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Type", type text}, {"AEG", type text}, {"AlarmCode", Int64.Type}, {"Index", Int64.Type}, {"StartTime", type datetime}, {"EndTime", type datetime}}),
#"Start Date" = Table.AddColumn(#"Changed Type", "Date", each Date.From([StartTime]),type date),
Interval = Table.AddColumn(#"Start Date", "Interval", each { Int64.From(Number.From([StartTime])*1440)..Int64.From(Number.From([EndTime])*1440-1)}),
#"Grouped Rows" = Table.Group(Interval, {"AEG", "Date"}, {{"Rows", each _, type table [Type=nullable text, AEG=nullable text, AlarmCode=nullable number, Index=nullable number, StartTime=nullable datetime, EndTime=nullable datetime, Date=date, Interval=list]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Minutes", each List.Count(List.Distinct(List.Combine([Rows][Interval]))),Int64.Type)
in
#"Added Custom"
That sort of works too, but it seems that if an alarm lasts more than 24 hours, all of the time is allocated to the first date.
You are correct; it would also be an issue if the outage simply stretched across two days. Here's the adjusted code.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("tZZNaxwxDIb/StlzIqwvW/K9hR7awJJb2EMIgfSS/3+sxptss7YnO23ZOQwG4Ucvrz5mHh52X/f7u/3uZvfj8enl1+vzF4wzscdbWZdzIr5FvE14j1JJqjBkwhTPGM2VGawco4ebkU5xtiWFOH28TvfJapLKBUish7copioKivIpnKjBeQYXAjf6D3iEF0vE8hSvICWv4gto8g3aiw7XiWoi0NH19ygxFN6mPfsUH4Bc5nhdikrJtmgffT+q88yrcIJywZhjx/DoOueaFKgJ7+FeU7jukDRfVs7W2+IhrCKH6xN4a5eKGZDLFtfZZIr/0xP/hm/GsNkMThmIJ+04hX//+e2ul74ESVqCvqxelZaOJpqM01uCAma8qp5bhiZfu7oiBrqyA7tM6JE4R9+Aua/J53P56kMCje4QMNR5AqnJQXhdvpzkI56Zw20PcqUENpY2oqE90hsgrQ+UvLclYhng4TwhiA7T+jfw5hDm68CtweU68NLgehU4p+a6+5SukH2YpxV615By1pDnn75GwBiY2IP2WccUWl8GTf1SVOu157beBSQPX49TNNiJNrD7krbbmEBkxZe2QYUvuN7Y89s5bg/79xQ1kC2elKluIhAcFswxGvOfIKtsq2fpN0BeNkCsR6fhV+YD397W2+HwGw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Type = _t, AEG = _t, AlarmCode = _t, Index = _t, StartTime = _t, EndTime = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"StartTime", type datetime}, {"EndTime", type datetime}}),
#"Outage Interval" = Table.AddColumn(#"Changed Type", "Interval", each { Int64.From(Number.From([StartTime])*1440)..Int64.From(Number.From([EndTime])*1440-1)}),
#"Grouped Rows by Machine" = Table.Group(#"Outage Interval", {"AEG"}, {{"Rows", each List.Distinct(List.Combine(_[Interval])), type table [ Interval=list]}}),
#"Added Dates" = Table.AddColumn(#"Grouped Rows by Machine", "Date", each {Number.RoundDown(Number.From(List.Min(#"Changed Type"[StartTime])),0)..Number.RoundDown(Number.From(List.Max(#"Changed Type"[EndTime])),0)}),
#"Expanded Dates" = Table.ExpandListColumn(#"Added Dates", "Date"),
#"Date Interval" = Table.AddColumn(#"Expanded Dates", "DateInterval", each {[Date]*1440..([Date]+1)*1440-1}),
#"Added Custom" = Table.AddColumn(#"Date Interval", "Minutes", each List.Count(List.Intersect({[Rows],[DateInterval]})),Int64.Type),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Date", type date}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type1",{"AEG", "Date", "Minutes"})
in
#"Removed Other Columns"
That does return the same results as mine. What do you think the advantages/disadvantages are of the two approaches?
You would have to throw both at a substantially larger amount of data and see how they perform. The date range mask, for example, could be customized to cover only the min/max dates for each machine rather than the entire range.
Keep in mind that Power Query runs on disk, so it will nearly always be substantially slower than DAX (which runs in memory). A List.Buffer here and there might mitigate that.
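For illustration, here is one hypothetical way to rewrite the two grouping steps of the adjusted query above so the date mask only spans each machine's own first and last outage day, with List.Buffer applied to the combined minute list (step and column names follow that query; this is a sketch, not tested at volume):
// replaces #"Grouped Rows by Machine" and #"Added Dates" in the adjusted query above
#"Grouped Rows by Machine" = Table.Group(#"Outage Interval", {"AEG"}, {
    {"Rows", each List.Buffer(List.Distinct(List.Combine(_[Interval])))},
    {"MinDate", each Number.RoundDown(Number.From(List.Min(_[StartTime])), 0)},
    {"MaxDate", each Number.RoundDown(Number.From(List.Max(_[EndTime])), 0)}}),
#"Added Dates" = Table.AddColumn(#"Grouped Rows by Machine", "Date", each {[MinDate]..[MaxDate]}),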
@lbendlin Oh, I wasn't asking about a comparison with DAX; rather I was asking about a comparison with your M method.
You can use Query diagnostics to compare them, but the real test is at volume. Can they handle millions of rows and hundreds of dates?
Here's a graphical approach using Deneb.
Your data is too detailed. You have millisecond-level timestamps, which would result in 86,400,000 rows per day for a mapping table. Can you reduce the granularity to second or minute level?
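For context on that figure: one day is 24 × 60 × 60 × 1000 = 86,400,000 milliseconds, whereas at minute grain it is only 24 × 60 = 1,440 rows per day, which is why the factor 1440 shows up throughout the solutions above.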
Hi admin! Thanks for helping.
Yeah, minute level is great, since I'm going to have many other entries.