Hi there,
The dataset consists of columns for ID, start time, and end time, with tens of thousands of rows.
To count the number of simultaneous events per minute, I created the following query.
However, the performance is poor.
Please let me know if there is a better way.
#DataSet
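For illustration, here is a hypothetical handful of rows in the expected shape (invented values; the real SampleTime.csv has tens of thousands of rows):

// Hypothetical sample rows matching the Id/StartDate/EndDate schema used below
let
    Sample = #table(
        type table [Id = Int64.Type, StartDate = datetime, EndDate = datetime],
        {
            {1, #datetime(2023, 1, 1, 8, 0, 30), #datetime(2023, 1, 1, 8, 45, 10)},
            {2, #datetime(2023, 1, 1, 8, 15, 0), #datetime(2023, 1, 1, 9, 5, 0)},
            {3, #datetime(2023, 1, 1, 8, 40, 0), #datetime(2023, 1, 1, 8, 50, 0)}
        })
in
    Sample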
#Query
SampleTime
let
    Source = Csv.Document(File.Contents("C:\SampleTime.csv"), [Delimiter = ",", Columns = 3, Encoding = 65001, QuoteStyle = QuoteStyle.None]),
    Header = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    ConvertType = Table.TransformColumnTypes(Header, {{"Id", Int64.Type}, {"StartDate", type datetime}, {"EndDate", type datetime}}),
    // Snap the start down and the end up to whole-minute boundaries
    Custom = Table.AddColumn(ConvertType, "Round_StartDate", each DateTime.From(Number.RoundDown(Number.From(DateTime.From([StartDate])) * (24 * 60)) / (24 * 60))),
    Custom1 = Table.AddColumn(Custom, "Round_EndDate", each DateTime.From(Number.RoundUp(Number.From(DateTime.From([EndDate])) * (24 * 60)) / (24 * 60)))
in
    Custom1
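As a quick check of the rounding expression, a minimal standalone sketch with a hypothetical input: Number.From turns a datetime into a fractional day count, so multiplying by 24*60 converts it to minutes before rounding.

// Hypothetical input: 08:30:45 snaps down to the 08:30:00 minute boundary
let
    Raw = #datetime(2023, 1, 1, 8, 30, 45),
    Rounded = DateTime.From(Number.RoundDown(Number.From(Raw) * (24 * 60)) / (24 * 60))
in
    Rounded  // #datetime(2023, 1, 1, 8, 30, 0)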
Concurrent
let
    // Build a one-minute grid covering the whole data range
    StartDateTime = DateTime.From(Date.From(List.Min(SampleTime[StartDate]))),
    EndDateTime = Date.AddDays(DateTime.From(Date.From(List.Max(SampleTime[EndDate]))), 1),
    Source = List.DateTimes(StartDateTime, Duration.TotalMinutes(EndDateTime - StartDateTime), #duration(0, 0, 1, 0)),
    ConvertToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    RenameColumn = Table.RenameColumns(ConvertToTable, {{"Column1", "DateTime"}}),
    ConvertType = Table.TransformColumnTypes(RenameColumn, {{"DateTime", type datetime}}),
    TableBuffer = Table.Buffer(SampleTime),
    // For each minute, count the events whose rounded interval covers it
    Custom = Table.AddColumn(ConvertType, "Concurrent", each Table.RowCount(Table.SelectRows(TableBuffer, (x) => x[Round_StartDate] <= [DateTime] and x[Round_EndDate] >= [DateTime])))
in
    Custom
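For reference, a minimal sketch (with a hypothetical three-minute window) of the grid that List.DateTimes builds here; note that because the second argument is a count, the end boundary itself is not generated:

// Hypothetical window: generates 08:00, 08:01, 08:02 but not 08:03
let
    Start = #datetime(2023, 1, 1, 8, 0, 0),
    End = #datetime(2023, 1, 1, 8, 3, 0),
    Grid = List.DateTimes(Start, Duration.TotalMinutes(End - Start), #duration(0, 0, 1, 0))
in
    Grid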
Your Custom step is iterating through the entire TableBuffer and filtering it once for each of the 100k+ rows of ConvertType; with tens of thousands of buffered events, that is on the order of a billion row comparisons. Not terribly efficient.
I propose the following redefinition of Concurrent:
let
    Source = SampleTime,
    // One list entry per minute each event is active (the rounded end minute itself is excluded)
    #"Added Custom" = Table.AddColumn(Source, "Custom", each
        List.DateTimes(
            [Round_StartDate],
            Duration.TotalMinutes([Round_EndDate] - [Round_StartDate]),
            #duration(0, 0, 1, 0))),
    #"Removed Columns" = Table.SelectColumns(#"Added Custom", {"Id", "Custom"}),
    // Expand to one row per (Id, minute) pair, then count distinct rows per minute
    #"Expanded Custom" = Table.ExpandListColumn(#"Removed Columns", "Custom"),
    #"Grouped Rows" = Table.Group(#"Expanded Custom", {"Custom"}, {{"IdCount", each Table.RowCount(Table.Distinct(_)), Int64.Type}})
in
    #"Grouped Rows"
This evaluates much faster.
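If you also want the output keyed and ordered like the original Concurrent table, a small optional follow-up query is sketched below (the step names and the reference to the redefined Concurrent are my own):

// Optional: rename the grouped minute column and sort chronologically
let
    Source = Concurrent,
    #"Renamed Columns" = Table.RenameColumns(Source, {{"Custom", "DateTime"}}),
    #"Sorted Rows" = Table.Sort(#"Renamed Columns", {{"DateTime", Order.Ascending}})
in
    #"Sorted Rows"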
Thank you for your advice!
The queries are much faster now!