strachi
Regular Visitor

Moving average over non-numeric values (correct errors, fill missing values)

Hi,

How can I smooth string values in a column?

 

I have time series data (timestamp; string) with some errors or gaps in it:

 

timestamp;string

1521585642;a

1521585643;a

1521585644;

1521585645;a

1521585646;a

1521585647;x

1521585648;a

1521585649;a

1521585650;a

 

I would like to fill the gap and replace the error ("x") with the most frequent value in its neighbourhood (let's say, looking at the previous 2 and next 2 values). You could call this a moving average over strings. In this simple example, the result would be "a" in every row of the string column.

 

I feel like this comes close, but MAXA does not work with strings of course:

Smooth = 
CALCULATE (
    CALCULATE (
        MAXA( 'timeseries'[string] );
        'timeseries'[Datetime]
            >= VALUES ( 'timeseries'[Datetime] ) - 4 ;
        'timeseries'[Datetime] <= VALUES ( 'timeseries'[Datetime] )
    );
    ALLEXCEPT ( 'timeseries'; 'timeseries'[Tag];'timeseries'[Logfile];'timeseries'[Datetime] )
)

Any ideas would be greatly appreciated. I was not able to find a solution.

 

 

5 REPLIES
v-jiascu-msft
Microsoft Employee

Hi @strachi,

 

Can you share a complete sample please? I can't convert the "timestamp" into a time or a date.

 

 

Best Regards,

Dale

Community Support Team _ Dale
If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Hi @v-jiascu-msft, thanks for your reply.

 

In fact, we can simplify further. The "timestamp" does not matter here. The first column just indicates the order of the time series data. You can think of it as an ordered index.

 

Source:

timestamp;string

1;a

2;a

3;(blank)

4;a

5;a

6;x

7;a

8;a

9;a

 

Result I am looking for:

timestamp;string

1;a

2;a

3;a

4;a

5;a

6;a

7;a

8;a

9;a

 

The "blank" and the "x" are errors to be identified by looking at the previous and following values in the series. They should be replaced by the most frequent value "in the neighbourhood". 

 

Thank you for giving it another thought.

Sorry to push here... any ideas? @v-jiascu-msft

 

I am trying to use this to narrow down the strings in proximity to the data gap... 

FILTER(Table1;Table1[Index]<=EARLIER(Table1[Index])+1 && Table1[Index]>=EARLIER(Table1[Index])-1)

 

I guess this could help me, but I have not managed to put it together in a calculated column:

 

https://community.powerbi.com/t5/Desktop/How-to-obtain-the-most-common-value-from-a-column-and-displ...

 

Most Frequent String = 
FIRSTNONBLANK (
    TOPN (
        1; 
        VALUES ( Table1[string] ); 
        RANKX( ALL( Table1[string] ); COUNTROWS(Table1);;ASC)
    ); 
    1 
)

Anyone?
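
One possible way to combine the two pieces above (the index window from the FILTER expression and the most-frequent-value idea) is a calculated column along the following lines. This is only a sketch under assumptions: it uses the Table1[Index] and Table1[string] columns named in this thread, the column name "Smoothed String" is made up for illustration, the window of 2 rows before and 2 rows after comes from the original question, and the tie-breaking rule (take the alphabetically last value when two strings are equally frequent) is an arbitrary choice. It has not been checked against the actual model. The semicolon separators follow the locale used in the formulas above.

```dax
Smoothed String =
VAR CurrentIndex = Table1[Index]
-- Rows within 2 positions of the current row.
-- The <> "" comparison drops both blanks and empty strings.
VAR Neighbours =
    FILTER (
        Table1;
        Table1[Index] >= CurrentIndex - 2
            && Table1[Index] <= CurrentIndex + 2
            && Table1[string] <> ""
    )
-- One row per distinct string in the window, with its frequency
VAR Counts =
    GROUPBY (
        Neighbours;
        Table1[string];
        "Freq"; SUMX ( CURRENTGROUP (); 1 )
    )
RETURN
    -- TOPN can return several rows on a tie; MAXX picks one of them
    MAXX ( TOPN ( 1; Counts; [Freq]; DESC ); Table1[string] )
```

With the nine-row sample above, the window around index 3 holds four "a" values and no others once the blank is dropped, and the window around index 6 holds four "a" values against one "x", so both rows should come out as "a". If a window contains no non-blank values at all, the expression returns blank.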
