Solved: Average

ven853 · ‎12-26-2018

I have null values in my column, identified as 999.9. I need to replace each of them with the average of the row above and the row below. In some cases, as in the example below, the row below is also a null value. In that case the search needs to continue downward until it reaches a value other than 999.9, and then use that value to average with the row above. In this example:

1. I need a formula that for row 286 will average rows 285 and 289 getting 1.6.

2. Then the formula would average row 286 (which is now 1.6) and row 289, getting 2.4 for row 287.

3. Then the formula would average rows 287 (which is now 2.4) and row 289 getting 2.7 for row 288.

Final product would be:

Row 286 = 1.6

Row 287 = 2.4

Row 288 = 2.7
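
The cascade described above can be sketched in a few lines of Python. The values for rows 285 and 289 are not given in the question, so the 0.1 and 3.1 used here are hypothetical, chosen only because they reproduce the rounded figures 1.6, 2.4 and 2.7. The `rows` and `filled` names are also just for illustration.

```python
PLACEHOLDER = 999.9

# Hypothetical values: rows 285 and 289 are not shown in the question.
rows = {285: 0.1, 286: PLACEHOLDER, 287: PLACEHOLDER, 288: PLACEHOLDER, 289: 3.1}

filled = dict(rows)
for row in sorted(filled):
    if filled[row] == PLACEHOLDER:
        # First value below this row that is not the placeholder.
        below = next(v for r, v in sorted(rows.items())
                     if r > row and v != PLACEHOLDER)
        # Average it with the row above, which may itself have just been filled.
        filled[row] = (filled[row - 1] + below) / 2

for row in (286, 287, 288):
    print(row, filled[row])
```

With these assumed inputs the filled values come out at roughly 1.6, 2.35 and 2.725, which match the 1.6, 2.4 and 2.7 above once rounded to one decimal place.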

Any help would be much appreciated.

v-jiascu-msft · ‎12-26-2018

Hi @ven853,

To be honest, this is hard to achieve in Power Query. I would suggest using Python or R instead. I created a demo solution in both, so you can choose whichever suits you best. You can download it from the attachment.

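# Python script. 'dataset' holds the input data for this script

def find_next(excluded, ds):
    # Return the first value in ds that is not the placeholder; 0 if none is left.
    for item in ds:
        if item != excluded:
            return item
    return 0

result = []
column1 = dataset.iloc[:, 0]
for index in range(len(column1)):
    value = column1.iloc[index]
    if value != 999.9:
        result.append(value)
    else:
        # Average the previous (already filled) value with the next valid value below.
        # Assumes the first row is not 999.9; otherwise result[-1] raises IndexError.
        next_value = find_next(999.9, column1.iloc[index + 1:])
        result.append((next_value + result[-1]) / 2)
dataset["new"] = result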
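
The script above fails if the very first row is 999.9 and averages with 0 if no valid value is left below. A minimal sketch of one way to handle those edges follows, working on a plain list rather than the Power BI `dataset`. The function name `fill_placeholders` and the fallback rules (use whichever neighbour exists, leave the placeholder if there is none) are assumptions for illustration, not part of the original solution.

```python
import math

def fill_placeholders(values, placeholder=999.9):
    """Replace placeholders with the mean of the previous filled value and
    the next non-placeholder value. Assumed fallbacks: use the one neighbour
    that exists, or keep the placeholder if there is none."""
    def is_placeholder(x):
        # Tolerant comparison instead of exact float equality.
        return math.isclose(x, placeholder)

    result = []
    for i, value in enumerate(values):
        if not is_placeholder(value):
            result.append(value)
            continue
        prev = result[-1] if result and not is_placeholder(result[-1]) else None
        nxt = next((v for v in values[i + 1:] if not is_placeholder(v)), None)
        if prev is not None and nxt is not None:
            result.append((prev + nxt) / 2)
        elif prev is not None:
            result.append(prev)   # nothing valid below: carry the last value down
        elif nxt is not None:
            result.append(nxt)    # nothing valid above: take the next value
        else:
            result.append(value)  # no valid value anywhere: leave as is
    return result

print(fill_placeholders([999.9, 2.0, 999.9, 4.0, 999.9]))
```

In Power BI the input could be obtained with something like `dataset.iloc[:, 0].tolist()`, assuming the column of interest is the first one, as in the script above.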

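# R script. 'dataset' holds the input data for this script

# Return the first value in the first column of ds that is not the placeholder; 0 if none is left.
find_next <- function(excluded, ds) {
    for (item in ds[, 1]) {
        if (item != excluded) {
            return(item)
        }
    }
    return(0)
}

result <- c()
ds_length <- nrow(dataset)
for (index in seq_len(ds_length)) {
    if (dataset[index, 1] == 999.9) {
        # Average the previous (already filled) value with the next valid value below.
        # tail(dataset, -index) drops the first 'index' rows, leaving only the rows below.
        result[index] <- (tail(result, 1) + find_next(999.9, tail(dataset, -index))) / 2.0
    } else {
        result[index] <- dataset[index, 1]
    }
}
final <- data.frame(result)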
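
Both scripts rescan the remaining rows for every placeholder, which can get slow on long runs of 999.9. A possible variant, sketched here in Python on a plain list, precomputes the next valid value for every position in one backward pass. The name `fill_linear` is hypothetical, and the sketch keeps the same assumptions as the scripts above: the first row is not a placeholder, and 0 is used when nothing valid is left below.

```python
def fill_linear(values, placeholder=999.9):
    n = len(values)
    # next_valid[i] is the first non-placeholder value at or after position i,
    # or None if there is none. Built by scanning from the bottom up.
    next_valid = [None] * (n + 1)
    for i in range(n - 1, -1, -1):
        if values[i] != placeholder:
            next_valid[i] = values[i]
        else:
            next_valid[i] = next_valid[i + 1]

    result = []
    for i, value in enumerate(values):
        if value != placeholder:
            result.append(value)
        else:
            # Same fallback as the scripts above: 0 when nothing valid is below.
            below = next_valid[i + 1] if next_valid[i + 1] is not None else 0
            # Assumes the first row is not a placeholder, as the scripts above do.
            result.append((result[-1] + below) / 2)
    return result

print(fill_linear([1.0, 999.9, 999.9, 3.0]))
```

Each row is visited twice in total, so the work grows linearly with the number of rows instead of with rows times the length of the placeholder runs.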

Best Regards,
Dale

Community Support Team _ Dale
If this post helps, please consider accepting it as the solution to help other members find it more quickly.
