strachi
Regular Visitor

Moving average over non-numeric values (correct errors, fill missing values)

Hi,

how can I smooth string values in a column?

 

I have time series data (timestamp; string) with some errors or gaps in it:

 

timestamp;string

1521585642;a

1521585643;a

1521585644;

1521585645;a

1521585646;a

1521585647;x

1521585648;a

1521585649;a

1521585650;a

 

I would like to fill the gap and replace the error ("x") with the values in proximity (let's say we want the most frequent value among the previous 2 and next 2 values). You could call this a moving average over strings. The result in this simple example would be all "a" in the string column.

 

I feel like this comes close, but MAXA does not work with strings of course:

Smooth = 
CALCULATE (
    CALCULATE (
        MAXA( 'timeseries'[string] );
        'timeseries'[Datetime]
            >= VALUES ( 'timeseries'[Datetime] ) - 4 ;
        'timeseries'[Datetime] <= VALUES ( 'timeseries'[Datetime] )
    );
    ALLEXCEPT ( 'timeseries'; 'timeseries'[Tag];'timeseries'[Logfile];'timeseries'[Datetime] )
)

Any ideas would be greatly appreciated. I was not able to find a solution.

 

 

5 REPLIES
v-jiascu-msft
Microsoft Employee

Hi @strachi,

 

Can you share a complete sample please? I can't convert the "timestamp" into a time or a date.

 

 

Best Regards,

Dale


Hi @v-jiascu-msft, thanks for your reply.

 

In fact, we can simplify further. The "timestamp" does not matter here. The first column only indicates the order of the time series data. You can think of it as an ordered index.

 

Source:

timestamp;string

1;a

2;a

3;(blank)

4;a

5;a

6;x

7;a

8;a

9;a

 

Result I am looking for:

timestamp;string

1;a

2;a

3;a

4;a

5;a

6;a

7;a

8;a

9;a

 

The "blank" and the "x" are errors to be identified by looking at the previous and following values in the series. They should be replaced by the most frequent value "in the neighbourhood". 

 

Thank you for giving it another thought.

Sorry to push here... any ideas? @v-jiascu-msft

 

I am trying to use this to narrow down the strings in proximity to the data gap... 

FILTER(Table1;Table1[Index]<=EARLIER(Table1[Index])+1 && Table1[Index]>=EARLIER(Table1[Index])-1)

 

I guess this could help me, but I have not succeeded in putting it together in a calculated column:

 

https://community.powerbi.com/t5/Desktop/How-to-obtain-the-most-common-value-from-a-column-and-displ...

 

Most Frequent String = 
FIRSTNONBLANK (
    TOPN (
        1; 
        VALUES ( Table1[string] ); 
        RANKX( ALL( Table1[string] ); COUNTROWS(Table1);;ASC)
    ); 
    1 
)

Anyone?
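
One way the two fragments above (the FILTER window and the most-frequent-value pattern) might be combined is a calculated column like the sketch below. It is not a confirmed solution from this thread. It assumes the table is called Table1 with the columns [Index] and [string] as in the snippets above, that the window is the previous 2 and next 2 rows without the current row, that blank values should not count as votes, and that ties are broken alphabetically. The column name "Smoothed String" and the helper column "@Occurrences" are made up for illustration. Semicolons are used as list separators to match the locale of the code in this thread; other locales need commas.

```dax
Smoothed String =
VAR CurrentIndex = Table1[Index]
-- Rows within two positions of the current row, without the row itself
-- and without blank or empty strings (assumption: gaps do not get a vote)
VAR Neighbours =
    FILTER (
        Table1;
        Table1[Index] >= CurrentIndex - 2
            && Table1[Index] <= CurrentIndex + 2
            && Table1[Index] <> CurrentIndex
            && Table1[string] <> ""
    )
-- One row per distinct neighbouring string with its number of occurrences
VAR Counts =
    GROUPBY (
        Neighbours;
        Table1[string];
        "@Occurrences"; SUMX ( CURRENTGROUP (); 1 )
    )
-- Keep the most frequent string; ties are broken alphabetically (assumption)
VAR Winner =
    TOPN ( 1; Counts; [@Occurrences]; DESC; Table1[string]; ASC )
RETURN
    MAXX ( Winner; Table1[string] )
```

With the nine-row sample above, every window contains "a" at least twice and any other non-blank value at most once, so the column should come out as "a" for all rows, including index 3 (blank) and index 6 ("x"). Because the current row is left out of the window, a value that is correct but surrounded by errors would also be overwritten; dropping the `Table1[Index] <> CurrentIndex` condition lets the row vote for itself.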
