Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
thne123
Helper II
Helper II

Summarizing inconsistent text format in column

I have a data source that provides "time spent" in various statuses, however it comes with delimited for all times it changes status, meaning one row can look like this

"1M 0w 1d 20h 0m,1w 1d 20h 14m,1w 1d 1h 32m,4w 0d 3h 50m,1w 0d 1h 59m,1d 22h 8m,1w 1d 0h 8m,1h 37m"

I'm wondering if there's a way to create a custom power query column that somehow summarizes this into just one section?

 

Another tricky part is that of course not all rows have the same amount of delimitors, nor the same set of time components. 

One row could have just days, or just hours. while some may have a combination of above etc.

 

Example:

 

KeyStatus 1
10102w 3d 11h 41m
10351w 5d 23h 24m,4d 23h 17m
10484M 4w 0d 22h 18m
10821w 5d 20h 49m,2w 5d 23h 56m,3d 22h 50m,6d 19h 30m
11461M 1w 1d 22h 35m
11612M 0w 2d 22h 36m
11721M 1w 4d 22h 18m,2M 2w 2d 2h 16m
11813w 0d 18h 42m
12073d 23h 1m
13431w 0d 0h 54m,2d 23h 15m
13621w 0d 17h 50m,1d 16h 51m,1w 6d 19h 43m,1d 1h 8m
13853w 1d 19h 4m
14372d 18h 47m
164722h 8m
16912d 17h 51m
16932d 17h 48m
5361M 0w 1d 20h 0m,1w 1d 20h 14m,1w 1d 1h 32m,4w 0d 3h 50m,1w 0d 1h 59m,1d 22h 8m,1w 1d 0h 8m,1h 37m
6071w 1d 17h 0m
8102d 8h 2m,3d 18h 30m
1 ACCEPTED SOLUTION
ThxAlot
Super User
Super User

Simple enough,

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RVI5EsQgDPuKJzUFPjjyiH3BznYp0qTO91dgk3S2JYFk+H43zpy3tMlNehDzScbX9ksD0AKAbyoHiZ4kdiXzktviWAfHPmQ3ZWACrC+sy6vPOHi/kjynlXoldUXJV6q4fD9Jc4jZ6hB/CHp2mpaFVR6OP5RvksDqwpo8OnsMJZDFyegfbh/n6HTOHQYlAMltAJE1hmrqccBGmoJtSDCWMa3yULh5MJjnipJR3hQxTR04aS1Le3EvHIyYmw4rEgbX2qvNqbz6unPw2rxsTfWdWnCLxmqzrxZh8jQXDdvq4E8Fjz4DaeTxdGj2mcFNhCB7DVk4rXOTcVobF81x9z93EELJ/Acj3nz83x8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Key = _t, Status = _t]),
    Replacement = {{"M", "*30*1440"}, {"w", "*7*1440"}, {"d", "*1440"}, {"h", "*60"}, {"m", ""}, {" ", "+"}, {",", "+"}},
    #"Transformed minutes" = Table.TransformColumns(Source, {"Status", each Expression.Evaluate(List.Accumulate(Replacement, _, (s,c) => Text.Replace(s, c{0}, c{1})))})
in
    #"Transformed minutes"

ThxAlot_0-1732869030111.png



Expertise = List.Accumulate(


        {Days as from Today},


        {Skills and Knowledge},


        (Current, Everyday) => Current & Day.LeanAndPractise(Everyday)


)



View solution in original post

9 REPLIES 9
ThxAlot
Super User
Super User

Simple enough,

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RVI5EsQgDPuKJzUFPjjyiH3BznYp0qTO91dgk3S2JYFk+H43zpy3tMlNehDzScbX9ksD0AKAbyoHiZ4kdiXzktviWAfHPmQ3ZWACrC+sy6vPOHi/kjynlXoldUXJV6q4fD9Jc4jZ6hB/CHp2mpaFVR6OP5RvksDqwpo8OnsMJZDFyegfbh/n6HTOHQYlAMltAJE1hmrqccBGmoJtSDCWMa3yULh5MJjnipJR3hQxTR04aS1Le3EvHIyYmw4rEgbX2qvNqbz6unPw2rxsTfWdWnCLxmqzrxZh8jQXDdvq4E8Fjz4DaeTxdGj2mcFNhCB7DVk4rXOTcVobF81x9z93EELJ/Acj3nz83x8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Key = _t, Status = _t]),
    Replacement = {{"M", "*30*1440"}, {"w", "*7*1440"}, {"d", "*1440"}, {"h", "*60"}, {"m", ""}, {" ", "+"}, {",", "+"}},
    #"Transformed minutes" = Table.TransformColumns(Source, {"Status", each Expression.Evaluate(List.Accumulate(Replacement, _, (s,c) => Text.Replace(s, c{0}, c{1})))})
in
    #"Transformed minutes"

ThxAlot_0-1732869030111.png



Expertise = List.Accumulate(


        {Days as from Today},


        {Skills and Knowledge},


        (Current, Everyday) => Current & Day.LeanAndPractise(Everyday)


)



@ThxAlot 

Hello again!

 

I did encounter an issue for some rows that also had a Year (Y) parameter in their string

 

I altered the "Replacement" string in advanced editor to the following: (I want sum of hours)

 

Replacement = {{"Y","*365*24"},{"M", "*30*24"}, {"w", "*7*24"}, {"d", "*24"}, {"h", ""}, {"m", "/60"}, {" ", "+"}, {",", "+"}},

 

But I still get an error with those rows containing year end up as null:

Example:

1y 0M 1w 0d 19h 35m

1y 0M 0w 0d 18h 26m

 

{"y","*365*24"}

 



Expertise = List.Accumulate(


        {Days as from Today},


        {Skills and Knowledge},


        (Current, Everyday) => Current & Day.LeanAndPractise(Everyday)


)



@ThxAlot oh...

Thanks lol 😄

@ThxAlot 

 

Thanks for the assist! This worked great, I had to add something to handle nulls but the rest was great. Thanks! 

Omid_Motamedise
Memorable Member
Memorable Member

Hi @thne123 thanks for sharing your nice challenge

you can copy the below code and paste it into the advance editor
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RVI5EsQgDPuKJzUFPjjyiH3BznYp0qTO91dgk3S2JYFk+H43zpy3tMlNehDzScbX9ksD0AKAbyoHiZ4kdiXzktviWAfHPmQ3ZWACrC+sy6vPOHi/kjynlXoldUXJV6q4fD9Jc4jZ6hB/CHp2mpaFVR6OP5RvksDqwpo8OnsMJZDFyegfbh/n6HTOHQYlAMltAJE1hmrqccBGmoJtSDCWMa3yULh5MJjnipJR3hQxTR04aS1Le3EvHIyYmw4rEgbX2qvNqbz6unPw2rxsTfWdWnCLxmqzrxZh8jQXDdvq4E8Fjz4DaeTxdGj2mcFNhCB7DVk4rXOTcVobF81x9z93EELJ/Acj3nz83x8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Key = _t, #"Status 1" = _t]),
L = {{"M","*30*24*60"},{"w","*7*24*60"},{"d","*24*60"},{"h","*60"},{"m","*1"},{",","+"},{" ","+"}},
Convert = Table.AddColumn(Source, "new", each List.Accumulate(L,_[Status 1], (a,b)=> Text.Replace(a,b{0},b{1}))),
#"Added Custom" = Table.AddColumn(Convert, "Total In Minute", each Expression.Evaluate([new])),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom", each [
a=[Total In Minute],
x=Number.IntegerDivide,
y=Text.From,
M=x(a,30*24*60),
w=x(a-M*30*24*60,7*24*60),
d=x(a-M*30*24*60-w*7*24*60,24*60),
h=x(a-M*30*24*60-w*7*24*60-d*24*60,60),
m=a-M*30*24*60-w*7*24*60-d*24*60-h*60,
r=y(M)&"M "&y(w)&"w "&y(d)&"d "&y(h)&"h "& y(m)&"m "
][r])
in
#"Added Custom1"

KriZo
Resolver II
Resolver II

You could use a Python script for it: 

 

 

 

 

 

 

import pandas as pd
import numpy as np

def convert_to_minutes(row):
    # Original string
    time_string = row['Status 1']  

    # Predefined order of suffixes and their conversion factors to minutes
    time_units = {
        "Y": 365 * 24 * 60,  # Years to minutes
        "M": 30 * 24 * 60,   # Months to minutes (approximate, assuming 30 days per month)
        "w": 7 * 24 * 60,    # Weeks to minutes
        "d": 24 * 60,        # Days to minutes
        "h": 60,             # Hours to minutes
        "m": 1               # Minutes
    }

    # Create list for all the converted entries
    converted_times = []

    # Split by commas to handle multiple time components (e.g., "2w, 3d, 11h, 41m")
    for times in time_string.split(","):
        time_in_minutes = 0  # Initialize total time for this entry
        
        # Parse the string and calculate the total minutes for each part
        for part in times.split():
            suffix = part[-1]  # Get the suffix (last character)
            
            # Check if the suffix is valid and exists in the time_units dictionary
            if suffix in time_units:
                value = int(part[:-1])  # Extract the numeric value
                time_in_minutes += value * time_units[suffix]  # Convert and accumulate the total        
        
        # Append the calculated time for this part to the list
        converted_times.append(time_in_minutes)

    # Sum the total times for all parts in the row
    total_time = np.sum(np.asarray(converted_times))

    return total_time

# Apply the function to create a new column for total time
dataset['total time in minuts'] = dataset.apply(convert_to_minutes, axis=1).astype(int)

 

 

 

 

 

 

Resultat in: 

Converted.png

Did I solve your question? Mark my post as a solution! Kudos are appreciated as well as LinkedIn endorsements.
174857.png

Good to see someone incorporating py script.

The script is also simple enough, cheers!

import re
repl = {
    "Y": '* 365 * 24 * 60',  # Years to minutes
    "M": '* 30 * 24 * 60',   # Months to minutes (approximate, assuming 30 days per month)
    "w": '* 7 * 24 * 60',    # Weeks to minutes
    "d": '* 24 * 60',        # Days to minutes
    "h": '* 60',             # Hours to minutes
    "m": '',              # Minutes
    ",": '+',
    " ": '+' 
}
ptn = re.compile('[a-zA-Z, ]')
dataset['Minutes'] = dataset.apply(lambda r: eval(ptn.sub(lambda m: repl[m.group()], r['Status'])), axis=1)

 

ThxAlot_0-1732907479027.png



Expertise = List.Accumulate(


        {Days as from Today},


        {Skills and Knowledge},


        (Current, Everyday) => Current & Day.LeanAndPractise(Everyday)


)



bhanu_gautam
Super User
Super User

@thne123 , try using below steps

 

Load your data into Power Query.
Select the column with the delimited time values.
Go to the "Transform" tab and select "Split Column" -> "By Delimiter".
Choose the delimiter (e.g., comma) and split into rows.

 

Add a custom column to parse the time components. Use the following formula to extract each component and convert it to minutes:

let
Source = [Status 1],
TimeComponents = Text.Split(Source, " "),
Minutes = List.Sum(List.Transform(TimeComponents, each
if Text.EndsWith(_, "M") then Number.FromText(Text.Start(_, Text.Length(_) - 1)) * 43200 else
if Text.EndsWith(_, "w") then Number.FromText(Text.Start(_, Text.Length(_) - 1)) * 10080 else
if Text.EndsWith(_, "d") then Number.FromText(Text.Start(_, Text.Length(_) - 1)) * 1440 else
if Text.EndsWith(_, "h") then Number.FromText(Text.Start(_, Text.Length(_) - 1)) * 60 else
if Text.EndsWith(_, "m") then Number.FromText(Text.Start(_, Text.Length(_) - 1)) else 0
))
in
Minutes

 

Go to the "Transform" tab, select "Group By".
Group by "Key" and add a new column that sums the minutes.

 

Add a custom column to convert the total minutes back to the desired format:

let
TotalMinutes = [TotalMinutes],
Months = Number.IntegerDivide(TotalMinutes, 43200),
RemainingMinutes1 = Number.Mod(TotalMinutes, 43200),
Weeks = Number.IntegerDivide(RemainingMinutes1, 10080),
RemainingMinutes2 = Number.Mod(RemainingMinutes1, 10080),
Days = Number.IntegerDivide(RemainingMinutes2, 1440),
RemainingMinutes3 = Number.Mod(RemainingMinutes2, 1440),
Hours = Number.IntegerDivide(RemainingMinutes3, 60),
Minutes = Number.Mod(RemainingMinutes3, 60),
Result = Text.Combine(
List.Select(
{
if Months > 0 then Text.From(Months) & "M" else null,
if Weeks > 0 then Text.From(Weeks) & "w" else null,
if Days > 0 then Text.From(Days) & "d" else null,
if Hours > 0 then Text.From(Hours) & "h" else null,
if Minutes > 0 then Text.From(Minutes) & "m" else null
},
each _ <> null
),
" "
)
in
Result

 




Did I answer your question? Mark my post as a solution! And Kudos are appreciated

Proud to be a Super User!




LinkedIn






Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.