Data Derivatives in Power BI: A Python-based Inter...

technolog · ‎06-30-2023

Screenshot 2023-06-30 at 19.31.16.png

This article focuses on exploring the use of interpolation and approximation of data sets in Power BI utilizing Python.

Given a data set, as depicted in the following figure:

x = np.array([15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75])
y1 = np.array([243, 265, 287, 313, 343, 373, 401, 435, 475, 519, 571, 630, 699])
y2 = np.array([244, 269, 291, 320, 350, 379, 406, 442, 482, 526, 578, 633, 703])
y3 = np.array([371, 379, 395, 411, 427, 445, 463, 485, 509, 533, 555, 583, 615])
y4 = np.array([374, 382, 399, 416, 433, 457, 482, 509, 533, 561, 587, 615, 645])
y5 = np.array([370, 382, 390, 398, 408, 420, 432, 446, 460, 476, 494, 514, 538])
y6 = np.array([370, 388, 398, 410, 424, 438, 452, 466, 482, 500, 518, 540, 564])
y7 = np.array([348, 378, 408, 454, 508, 566, 638, 720, 0  , 0  , 0  , 0  , 0  ])

An important aspect to be considered is that a portion of our y7 data contains zeroes, which will be crucial to bear in mind during further calculations.

The aim is to construct derivatives from the data for all values of x and y, inclusive of y2, y3, etc. To facilitate this, we commence by building an interpolation based on these data, from which we will then derive.

import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate


x = np.array([15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75])
y1 = np.array([243, 265, 287, 313, 343, 373, 401, 435, 475, 519, 571, 630, 699])
y2 = np.array([244, 269, 291, 320, 350, 379, 406, 442, 482, 526, 578, 633, 703])
y3 = np.array([371, 379, 395, 411, 427, 445, 463, 485, 509, 533, 555, 583, 615])
y4 = np.array([374, 382, 399, 416, 433, 457, 482, 509, 533, 561, 587, 615, 645])
y5 = np.array([370, 382, 390, 398, 408, 420, 432, 446, 460, 476, 494, 514, 538])
y6 = np.array([370, 388, 398, 410, 424, 438, 452, 466, 482, 500, 518, 540, 564])
y7 = np.array([348, 378, 408, 454, 508, 566, 638, 720, 0  , 0  , 0  , 0  , 0  ])
y8 = np.array([351, 388, 430, 478, 528, 588, 660, 740, 0  , 0  , 0  , 0  , 0  ])
y9 = np.array([358, 378, 400, 424, 463, 502, 538, 584, 628, 680, 0  , 0  , 0  ])
y10 = np.array([363, 383, 402, 430, 469, 507, 545, 590, 633, 685, 0  , 0  , 0  ])
y11 = np.array([243, 269, 295, 320, 350, 383, 416, 453, 497, 545, 593, 662, 722])
y12 = np.array([243, 264, 287, 313, 339, 372, 405, 445, 486, 534, 582, 649, 711])

y_arr = np.concatenate((y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12), axis = 0)
y_arr = np.reshape(y_arr, (12, 13))

i = 1

for y_list in y_arr:
    x = np.array([15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75])
    y_list = y_list[y_list > 0]
    
    x = np.resize(x, len(y_list))
    
    minx = np.min(x)
    maxx = np.max(x)
        
    
    tck = interpolate.splrep(x, y_list, s=0)

    xnew = np.arange(minx, maxx, 0.5)
    ynew = interpolate.splev(xnew, tck, der=0)
    yder2 = interpolate.splev(xnew, tck, der=1)
    
    fig, ax1 = plt.subplots()

    ax2 = ax1.twinx()
  
    ax1.plot(x, y_list, 'o', xnew, ynew, x, y_list, 'b')
    plt.title('Cubic-spline interpolation for ' + str(i) + ' line')

    ax2.plot(xnew, yder2)

    plt.show()
    i = i+1

Following the implementation of the aforementioned method, we are presented with the subsequent graph. This illustrates the original data points along with the derivative values:

An observation of the derivative reveals a considerable irregularity, which is primarily due to the non-uniform arrangement of data points. This irregularity may impose challenges for any conclusions to be drawn from these data in the future.

Hence, we aim to improve the initial code to perform data approximation instead of interpolation, and then compute the derivative from the resulting function.

import scipy as sp
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate

x = np.array([15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75])
y = np.array([243, 265, 287, 313, 343, 373, 401, 435, 475, 519, 571, 630, 699])
y2 = np.array([244, 269, 291, 320, 350, 379, 406, 442, 482, 526, 578, 633, 703])
y3 = np.array([371, 379, 395, 411, 427, 445, 463, 485, 509, 533, 555, 583, 615])
y4 = np.array([374, 382, 399, 416, 433, 457, 482, 509, 533, 561, 587, 615, 645])
y5 = np.array([370, 382, 390, 398, 408, 420, 432, 446, 460, 476, 494, 514, 538])
y6 = np.array([370, 388, 398, 410, 424, 438, 452, 466, 482, 500, 518, 540, 564])
y7 = np.array([348, 378, 408, 454, 508, 566, 638, 720, 0  , 0  , 0  , 0  , 0  ])
y8 = np.array([351, 388, 430, 478, 528, 588, 660, 740, 0  , 0  , 0  , 0  , 0  ])
y9 = np.array([358, 378, 400, 424, 463, 502, 538, 584, 628, 680, 0  , 0  , 0  ])
y10 = np.array([363, 383, 402, 430, 469, 507, 545, 590, 633, 685, 0  , 0  , 0  ])
y11 = np.array([243, 269, 295, 320, 350, 383, 416, 453, 497, 545, 593, 662, 722])
y12 = np.array([243, 264, 287, 313, 339, 372, 405, 445, 486, 534, 582, 649, 711])

y_arr = np.concatenate((y,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12), axis = 0)
y_arr = np.reshape(y_arr, (12, 13))

i = 1

for y_list in y_arr:
    x = np.array([15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75])
    y = y_list
    
    y = y[y > 0]
    
    x = np.resize(x, len(y))
    
    
    minx = np.min(x)
    maxx = np.max(x)
    
    d=3 # степень полинома
    fp, residuals, rank, sv, rcond = np.polyfit(x, y, d, full=True) # Модель
    f = sp.poly1d(fp) # аппроксимирующая функция
    y1=[fp[0]*x[i]**2+fp[1]*x[i]+fp[2] for i in range(0,len(x))] # значения функции a*x**2+b*x+c
    so=round(sum([abs(y[i]-y1[i]) for i in range(0,len(x))])/(len(x)*sum(y))*100,4) # средняя ошибка
    fx = np.linspace(x[0], x[-1] + 1, len(x)) # можно установить вместо len(x) большее число для интерполяции

    y_list = f(fx)

    tck = interpolate.splrep(x, y_list, s=0)

    xnew = np.arange(minx, maxx, 0.5)
    ynew = interpolate.splev(xnew, tck, der=0)
    yder2 = interpolate.splev(xnew, tck, der=1)

    fig, ax1 = plt.subplots()

    ax2 = ax1.twinx()

    ax1.plot(x, y_list, 'o', xnew, ynew, x, y_list, 'b')
    plt.title('Approximation for ' + str(i) + ' line')
    ax2.plot(xnew, yder2)

    plt.show()
    i = i + 1

The result is displayed below:

As demonstrated, employing an approximation function results in a significantly smoother derivative of the function.

We proceed by examining the implementation for y8:

Upon inspection, we notice that excessive data along the x-axis is eliminated, with no attempt to transition to zero.

Subsequently, to continue working with the code and set up the necessary visualizations, we transfer the scripts into Power BI, as displayed below:

The contrast between the two approaches - interpolation and approximation - can be vividly discerned here.

Data Derivatives in Power BI: A Python-based Interpolation and Approximation Analysis

Mastering Dynamic Stock Management with Cumulative...

Explore Small Multiples for the New Card Visual in...

Re: Get certified in Microsoft Fabric—for free!

Re: Get certified in Microsoft Fabric—for free!

Re: Get certified in Microsoft Fabric—for free!

Join us at the 2025 Microsoft Fabric Community Conference

Data Derivatives in Power BI: A Python-based Interpolation and Approximation Analysis

Mastering Dynamic Stock Management with Cumulative...

Explore Small Multiples for the New Card Visual in...

Re: Get certified in Microsoft Fabric—for free!

Re: Get certified in Microsoft Fabric—for free!

Re: Get certified in Microsoft Fabric—for free!