Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
Scenario:
Usually, we can forecast values and return a line chart by using R visual in Power BI. But since it can only return image, how can we return the values as a table? And how can we use these values in the subsequent analysis?
Sample data:
The sample data is as follows and the complete data is in the attachment.
Date |
Sales |
2019/1/1 |
2 |
2019/1/2 |
4 |
2019/1/3 |
6 |
… |
… |
2019/5/24 |
288 |
2019/5/25 |
289 |
2019/5/26 |
3 |
2019/5/27 |
6 |
2019/5/28 |
9 |
… |
… |
2019/12/30 |
657 |
2019/12/31 |
660 |
Expected output:
Details:
Since the R visual in Power BI can only return image, what we need to do is to transfer the result data table into an image.
Generally, we can create an ARIMA model to forecast values. So we choose “auto.arima” function in “forecast“ package. It returns the best ARIMA model according to AIC, AICc and BIC value. The function conducts a search over possible models within the order constraints provided.
Before continuing, make some preparations, just like what in my previous blog, and install packages: forecast, gridExtra, odbc and DBI.
# Load "forecast" package after installed
library(forecast)
# Use ts() function to convert a numeric vector into an R time series object
ARIMA_Sales<-ts(dataset$Sales,start=c(1))
# Fit best ARIMA model to univariate time series
Sales_Forecast<-auto.arima(ARIMA_Sales,seasonal=TRUE)
# Forecast time series
# h means number of periods for forecasting, here is 100 days
Predicted_Sales<- forecast(Sales_Forecast, h=100)
# plot an image
plot(Predicted_Sales)
Here is the result:
library(forecast)
ARIMA_Sales<-ts(dataset$Sales,start=c(1))
Sales_Forecast<-auto.arima(ARIMA_Sales,seasonal=TRUE)
# In order to better present the results, we only forecast 20 days
Predicted_Sales<- forecast(Sales_Forecast, h=20)
# Create data frame
Predicted_Sales<-data.frame(Predicted_Sales)
# Create Forecast Date
# max() function is used to get the last date in the dataset
# as.Date() function is used to convert the object into Date type
# nrow() function is used to return the number of rows of an array
# 1:nrow(Predicted_Sales) returns a column from 1 to Predicted_Sales row number
Predicted_Sales$ForecastDate <- as.Date(max(dataset$Date))+1:nrow(Predicted_Sales)
# Load "gridExtra" package
library(gridExtra)
# Display the table as a grid graphic
# [, c(6,1)] is used to specify the output columns and their order
grid.table(Predicted_Sales[,c(6,1)])
Here is the result:
# Use dataset$ForecastDaysValue to dynamically control the forecast period
Predicted_Sales <- forecast(Sales_Forecast, h=dataset$ForecastDaysValue)
Here is the result:
Extension:
Now that the data is displayed in the R vsiual, can it be used in subsequent analysis?
Based on my research, there is no direct way, but I found a relatively feasible solution. We can save the result into SQL Server database, connect to it via DirectQuery and set automatic page refresh with the smallest possible time interval, to get the latest calculated result of the R visual. Here is part of the script:
library(odbc)
library(DBI)
# Connect to a SQL Server database
con <- dbConnect(odbc(),Driver = "SQL Server", Server = "YourServerName",Database = "YourDatabase", Trusted_Connection = "True")
# Drop table if it already exists
if (dbExistsTable(con, "temp_forecast"))
dbRemoveTable(con, "temp_forecast")
# Write the data frame to the database
dbWriteTable(con, name = "temp_forecast", value = Predicted_Sales, row.names = FALSE)
You can open the attached .pbix file and enter your SQL Server name and the database name to test it.
Author: Icey Zhang
Reviewer: Ula Huang, Kerry Wang
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.