Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hello. I am trying to convert a text field to a duration, but my data comes in like this:
I'd prefer to do this in Power Query Editor (M), rather than a calculated column (DAX). Can anyone help? Thanks!
Solved! Go to Solution.
@Anonymous Ah well, couple folks beat me to the fun. I personally like @Jakinta's solution mainly because it doesn't use regex which is like, why is that still a thing? Honestly though, @CNENFRNL it is pretty cool that you can use regex in Power Query it's just I thought regex went the way of Perl and that everyone was happy about that fact.
Anyway, not to be outdone and left with a hacky DAX solution, I created one that is just as functional and flexible as the elegant solutions by @Jakinta and @CNENFRNL:
Duration =
VAR __Separator = " "
VAR __SearchText = MAX('Table'[Text Duration])
VAR __Len = LEN(__SearchText)
VAR __Count = __Len - LEN(SUBSTITUTE(__SearchText,__Separator,"")) + 1
VAR __Table =
ADDCOLUMNS(
ADDCOLUMNS(
GENERATESERIES(1,__Count,1),
"__Word",
VAR __Text = SUBSTITUTE(__SearchText,__Separator,"|",IF([Value]=1,1,[Value]-1))
VAR __Start =
SWITCH(TRUE(),
__Count = 1,1,
[Value] = 1,1,
FIND("|",__Text)+1
)
VAR __End =
SWITCH(TRUE(),
__Count = 1,__Len,
[Value] = 1,FIND("|",__Text) - 1,
[Value] = __Count,__Len,
FIND(__Separator,__Text,__Start)-1
)
VAR __Word = MID(__Text,__Start,__End - __Start + 1)
RETURN __Word
),
"__Key",
SWITCH(TRUE(),
SEARCH("day",[__Word],,0)>0,"day",
SEARCH("hour",[__Word],,0)>0,"hour",
SEARCH("minute",[__Word],,0)>0,"minute",
SEARCH("second",[__Word],,0)>0,"second",
BLANK()
)
)
VAR __Days = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="day"),[Value]) - 1),[__Word])+0
VAR __Hours = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="hour"),[Value]) - 1),[__Word])+0
VAR __Minutes = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="minute"),[Value]) - 1),[__Word])+0
VAR __Seconds = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="second"),[Value]) - 1),[__Word])+0
RETURN
__Days * 1000000 + __Hours * 10000 + __Minutes * 100 + __Seconds
Most of the code is just Text to Table, which I should have thought about to begin with. And posted to the gallery here: Text Duration Conversion - Microsoft Power BI Community
Format text as HH:MM:SS and then use Duration.FromText
Thanks everyone!! Unfortunetly, I couldn't get the M solutions to work, but that is probably just user error. I did get the DAX solution provided by @Greg_Deckler to work (although I did modify it a little to convert the duration to hours).
Thanks again!
@Anonymous Ah well, couple folks beat me to the fun. I personally like @Jakinta's solution mainly because it doesn't use regex which is like, why is that still a thing? Honestly though, @CNENFRNL it is pretty cool that you can use regex in Power Query it's just I thought regex went the way of Perl and that everyone was happy about that fact.
Anyway, not to be outdone and left with a hacky DAX solution, I created one that is just as functional and flexible as the elegant solutions by @Jakinta and @CNENFRNL:
Duration =
VAR __Separator = " "
VAR __SearchText = MAX('Table'[Text Duration])
VAR __Len = LEN(__SearchText)
VAR __Count = __Len - LEN(SUBSTITUTE(__SearchText,__Separator,"")) + 1
VAR __Table =
ADDCOLUMNS(
ADDCOLUMNS(
GENERATESERIES(1,__Count,1),
"__Word",
VAR __Text = SUBSTITUTE(__SearchText,__Separator,"|",IF([Value]=1,1,[Value]-1))
VAR __Start =
SWITCH(TRUE(),
__Count = 1,1,
[Value] = 1,1,
FIND("|",__Text)+1
)
VAR __End =
SWITCH(TRUE(),
__Count = 1,__Len,
[Value] = 1,FIND("|",__Text) - 1,
[Value] = __Count,__Len,
FIND(__Separator,__Text,__Start)-1
)
VAR __Word = MID(__Text,__Start,__End - __Start + 1)
RETURN __Word
),
"__Key",
SWITCH(TRUE(),
SEARCH("day",[__Word],,0)>0,"day",
SEARCH("hour",[__Word],,0)>0,"hour",
SEARCH("minute",[__Word],,0)>0,"minute",
SEARCH("second",[__Word],,0)>0,"second",
BLANK()
)
)
VAR __Days = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="day"),[Value]) - 1),[__Word])+0
VAR __Hours = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="hour"),[Value]) - 1),[__Word])+0
VAR __Minutes = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="minute"),[Value]) - 1),[__Word])+0
VAR __Seconds = MAXX(FILTER(__Table,[Value] = MAXX(FILTER(__Table,[__Key]="second"),[Value]) - 1),[__Word])+0
RETURN
__Days * 1000000 + __Hours * 10000 + __Minutes * 100 + __Seconds
Most of the code is just Text to Table, which I should have thought about to begin with. And posted to the gallery here: Text Duration Conversion - Microsoft Power BI Community
A generic solution regardless of single/plural forms or lower/upper cases
let
RE = (regex as text, str as text) =>
let
html = "<script>var regex = " & regex & "; var str = """ & str & """; var res = str.match(regex); document.write(res)</script>",
res = Web.Page(html)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in res,
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VYyxDkAwFEV/5aazwWtV2Q0Wk1E6CBIGJMrg71FN6XrPuadpGKFoT2Qo12MzEDGqaTn2wYBLmKFbl94wHd2e8CRJAyI/oFD/gHBRqbyRh00rQCjMjlMSCLk/Eg/SnGO0bXLzu6Jvz2ezWXJVpvUF", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Text_Dur = _t]),
RegEx = Table.AddColumn(Source, "Num_Dur", each RE("/\d+ [A-Z]/gi", [Text_Dur])),
#"Converted Duration" = Table.TransformColumns(RegEx, {"Num_Dur", each let pair = List.Zip(List.Transform(Text.Split(_, ","), each Text.Split(_, " "))), rd = Record.FromList(List.Transform(pair{0}, Number.From), List.Transform(pair{1}, Text.Upper)) in #duration(Record.FieldOrDefault(rd, "D", 0), Record.FieldOrDefault(rd, "H", 0), Record.FieldOrDefault(rd, "M", 0), Record.FieldOrDefault(rd, "S", 0))})
in
#"Converted Duration"
A generic solution regardless of single/plural forms or lower/upper cases
let
RE = (regex as text, str as text) =>
let
html = "<script>var regex = " & regex & "; var str = """ & str & """; var res = str.match(regex); document.write(res)</script>",
res = Web.Page(html)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in res,
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VYyxDkAwFEV/5aazwWtV2Q0Wk1E6CBIGJMrg71FN6XrPuadpGKFoT2Qo12MzEDGqaTn2wYBLmKFbl94wHd2e8CRJAyI/oFD/gHBRqbyRh00rQCjMjlMSCLk/Eg/SnGO0bXLzu6Jvz2ezWXJVpvUF", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Text_Dur = _t]),
RegEx = Table.AddColumn(Source, "Num_Dur", each RE("/\d+ [A-Z]/gi", [Text_Dur])),
#"Converted Duration" = Table.TransformColumns(RegEx, {"Num_Dur", each let pair = List.Zip(List.Transform(Text.Split(_, ","), each Text.Split(_, " "))), rd = Record.FromList(List.Transform(pair{0}, Number.From), List.Transform(pair{1}, Text.Upper)) in #duration(Record.FieldOrDefault(rd, "D", 0), Record.FieldOrDefault(rd, "H", 0), Record.FieldOrDefault(rd, "M", 0), Record.FieldOrDefault(rd, "S", 0)), type duration})
in
#"Converted Duration"
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjFV8M3MKy1JVTA0UghOTc7PS1GK1YlWMlLwyC8tUjCHyZoiSxoquCRWKhgZwSXNkGVNIVqNkcXMwToMITJwO6HKQVLGBnCxWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TextDuration = _t]),
#"Added Custom" = Table.AddColumn(Source, "Duration", each let l=Text.Split([TextDuration], " "), d=List.PositionOf(l,"Day")-1, h=List.PositionOf(l,"Hour")-1, m=List.PositionOf(l,"Minute")-1,s=List.PositionOf(l,"Second")-1, x=List.Transform( {d,h,m,s}, each try Number.From(l{_}) otherwise 0) in #duration ( x{0}, x{1}, x{2}, x{3}), type duration )
in
#"Added Custom"
@Anonymous I realize this is DAX but you should be able to use the same pattern replacing the DAX functions with equivalent Power Query functions. I will work on the conversion if I find the time. Also, could you paste that data as text?
Column =
VAR __Duration = [Ticket Duration]
VAR __DayLocation = SEARCH("day",__Duration,,0)
VAR __HourLocation = SEARCH("hour",__Duration,,0)
VAR __MinuteLocation = SEARCH("minute",__Duration,,0)
VAR __Days = IF(__DayLocation,VALUE(LEFT(__Duration,__DayLocation - 1)),0)
VAR __DaySeparator = IF(__DayLocation<>0,SEARCH(" ",__Duration,__DayLocation,0),1)
VAR __HourSeparator =
SWITCH(TRUE(),
__DayLocation<>0,SEARCH(" ",__Duration,__DaySeparator+1,1),
__HourLocation=0,1,
SEARCH(" ",__Duration)
)
VAR __Hours = IF(__HourLocation=0,0,VALUE(MID(__Duration,__DaySeparator,__HourSeparator - __DaySeparator + 1)))
VAR __MinuteSeparator = IF(__HourSeparator=1,1,SEARCH(" ",__Duration,__HourSeparator+1,0))
VAR __LastSeparator = SEARCH(" ",__Duration,__MinuteSeparator+1,0)
VAR __Minutes = VALUE(MID(__Duration,__MinuteSeparator,__LastSeparator - __MinuteSeparator + 1))
RETURN
__Days * 24 + __Hours + __Minutes/100
Also, this was for day, hour and minute but not seconds so there will need to be some adjustments.
Hi @Greg_Deckler, thanks for your help. I'll await your response. As requested, a sample in text format:
1 Day 8 Hours 30 Minutes 25 seconds
13 Minutes 46 seconds
5 Minutes 47 seconds
3 Hours 57 Minutes 9 seconds
1 Hour 37 Minutes 14 seconds
9 Minutes 12 seconds