Introduction

Recurrent Neural Networks (RNNs) are a special type of neural network suited to learning representations of sequential data, such as text in Natural Language Processing (NLP). This article introduces Keras for building RNNs and walks through a complete end-to-end example of using an RNN for time series prediction, covering data preprocessing, model building, training, evaluation, and visualisation.

Introduction to Keras

Unlike traditional neural networks, which assume that all inputs and outputs are independent of each other, RNNs make use of sequential information, with each output depending on previous computations. For a more detailed introduction to RNNs, see the article RNN Introduction.

Keras, originally developed at Google, is a high-level deep learning API known for being modular, user-friendly, and extensible, which makes implementing neural networks straightforward. TensorFlow, in contrast, is an end-to-end open-source deep learning framework that provides comprehensive support for a wide range of machine learning tasks. Acknowledging the strengths of both, Google merged Keras into TensorFlow as its official high-level API, improving accessibility while retaining support for multiple backend neural network computations. This integration streamlines model development for users.

Below is the basic pipeline for building a model with Keras:

Building a model in Keras. Image from: What is Keras
  1. Define a network: Define different layers in the model and the connections between them. Keras has two main model types: Sequential and Functional.
  2. Compile a network: Convert the code into a machine-understandable format. When compiling, we define the loss function used to measure the model's error, the optimiser used to reduce that loss, and the metrics used to assess model accuracy.
  3. Fit the network: After compiling the model, train it on the data.
  4. Evaluate the network: After fitting the model, evaluate the error in the model.
  5. Make predictions: Make predictions on new data. (A minimal sketch of these five steps follows below.)
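To make these five steps concrete, here is a minimal sketch of the pipeline on toy data. The layer sizes and random inputs are arbitrary and purely illustrative; the real time series example follows below.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

#1. Define a network: a tiny Sequential model
model = Sequential([Dense(8, activation='relu', input_shape=(4,)),
                    Dense(1)])
#2. Compile a network: choose loss, optimiser, and metrics
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
#3. Fit the network: train on toy random data
X, y = np.random.rand(100, 4), np.random.rand(100, 1)
model.fit(X, y, epochs=2, verbose=0)
#4. Evaluate the network: measure error on held-out data
loss, mae = model.evaluate(X[:20], y[:20], verbose=0)
#5. Make predictions on new inputs
preds = model.predict(X[:5], verbose=0)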

A complete application for time series prediction

Set up the environment

We assume that you have Python installed (we used version 3.11.3). You also need the TensorFlow package installed on the system, along with pandas, scikit-learn, and matplotlib for data loading, preprocessing, and plotting. In this tutorial we used TensorFlow version 2.13.0.

Use:

pip install tensorflow
pip install pandas scikit-learn matplotlib
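To confirm the installation, you can check the installed TensorFlow version from Python:

import tensorflow as tf
print(tf.__version__)   #should print 2.13.0 or similar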

Import the necessary packages

from pandas import read_csv
import numpy as np
import math
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

Read data from url

#Read data from the given url and extract the second column
def read_data(url):
    df = read_csv(url, usecols=[1], engine='python')
    data = np.array(df.values.astype('float32'))
    #Normalise data into the (0, 1) range
    scaler = MinMaxScaler(feature_range=(0, 1))
    data = scaler.fit_transform(data).flatten()
    n = len(data)
    return data, n

sunspots_url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-sunspots.csv'
data, n = read_data(sunspots_url)
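One design note: the fitted MinMaxScaler is discarded when read_data returns, so the rest of this tutorial works entirely in scaled units. If you wanted predictions back in raw sunspot counts, a hypothetical variant like the sketch below could also return the scaler, whose inverse_transform undoes the scaling:

#Hypothetical variant that also returns the fitted scaler
def read_data_with_scaler(url):
    df = read_csv(url, usecols=[1], engine='python')
    raw = np.array(df.values.astype('float32'))
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled = scaler.fit_transform(raw).flatten()
    return scaled, len(scaled), scaler
#Later: scaler.inverse_transform(pred.reshape(-1, 1)) recovers raw counts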

Split data into training and test sets

#Splitting data into train and test based on split ratio
def get_train_test(split_percent, data):
    n = len(data)
    split = int(n * split_percent)
    train_data = data[:split]
    test_data = data[split:]
    return train_data, test_data

split_percent = 0.8
train_data, test_data = get_train_test(split_percent, data)

The MinMaxScaler scales and translates each feature individually so that it falls within a given range, here [0, 1]. The data are split into an 80% training set and a 20% test set.
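As a quick illustration of the min-max formula x_scaled = (x - x_min) / (x_max - x_min), applied to a toy array (values chosen arbitrarily):

x = np.array([[10.0], [20.0], [40.0]])
#(10-10)/30 = 0.0, (20-10)/30 = 0.333..., (40-10)/30 = 1.0
print(MinMaxScaler(feature_range=(0, 1)).fit_transform(x).flatten())
#[0.         0.33333333 1.        ]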

Reshaping data for Keras

This step prepares the data for training the Keras model. Reshaping the input data is an important part of preparing it for a neural network, because the network expects a specific input shape. For sequence prediction tasks such as time series forecasting, the expected input shape is (total_samples × time_steps × features): total_samples is the number of samples in the dataset, time_steps is the number of historical observations used to predict the next time step, and features is the number of variables used for prediction.

#Reshape data into input-output pairs with the specified time steps
def get_XY(dat, time_steps):
    #Indices of the target values: every time_steps-th observation
    Y_ind = np.arange(time_steps, len(dat), time_steps)
    Y = dat[Y_ind]
    rows_x = len(Y)
    #Each target is predicted from the preceding time_steps observations
    X = dat[range(time_steps*rows_x)]
    X = np.reshape(X, (rows_x, time_steps, 1))
    return X, Y

time_steps = 12
trainX, trainY = get_XY(train_data, time_steps)
testX, testY = get_XY(test_data, time_steps)
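A quick sanity check on the resulting shapes. The exact counts below assume the full monthly sunspot series (2,820 rows) and the 80/20 split used here:

print(trainX.shape, trainY.shape)   #expected: (187, 12, 1) (187,)
print(testX.shape, testY.shape)     #expected: (46, 12, 1) (46,)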

Create the RNN Model and Train

#Define the RNN model
def create_RNN(units, dense_units, input_shape, activation):
    model = Sequential()
    #The SimpleRNN layer returns only its final hidden state,
    #which the Dense layer maps to a single predicted value
    model.add(SimpleRNN(units, input_shape=input_shape,
                        activation=activation[0]))
    model.add(Dense(dense_units, activation=activation[1]))
    #Compile the model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

model = create_RNN(units=3, dense_units=1, input_shape=(time_steps, 1),
                   activation=['tanh', 'tanh'])
model.fit(trainX, trainY, epochs=10, batch_size=1, verbose=2)

The function above returns a Sequential model with two layers: a SimpleRNN layer with three units and a Dense layer with one unit. The input shape is 12×1 and the tanh activation function is used in both layers. The SimpleRNN layer uses the TensorFlow Keras implementation, in which the hidden state is recurrently fed back as input at the next time step; only the final hidden state is passed on, matching the single target value per sample. The Dense layer is a standard fully connected layer that applies a linear transformation to its input followed by an activation function.
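A useful sanity check on this architecture is its trainable parameter count: the SimpleRNN layer has units × (units + features + 1) = 3 × (3 + 1 + 1) = 15 weights (recurrent, input, and bias terms), and the Dense layer adds 3 × 1 + 1 = 4, for 19 in total:

model.summary()   #should report 19 trainable parameters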

A loss function measures the difference between the predicted output of a model and the actual output, while an optimiser adjusts the model's parameters to minimise that loss. The model is compiled with the mean squared error loss and the Adam gradient descent optimiser.

Model training log output.

The model performed as expected, with the loss continuing to decline in each epoch. This reduction means that the model is learning effectively from the training data and optimising its parameters to fit it better. It is a positive sign that the model is converging towards a solution and improving its predictive capability over time.

Compute and print the Root Mean Square Error (RMSE)

#Compute prediction errors to evaluate the model
def print_error(trainY, testY, train_predict, test_predict):
    train_predict = train_predict.reshape(-1)
    test_predict = test_predict.reshape(-1)
    train_rmse = math.sqrt(mean_squared_error(trainY, train_predict))
    test_rmse = math.sqrt(mean_squared_error(testY, test_predict))
    print('Train RMSE: %.3f' % (train_rmse))
    print('Test RMSE: %.3f' % (test_rmse))

#Make predictions
train_predict = model.predict(trainX)
test_predict = model.predict(testX)
#Compute and print the root mean squared error
print_error(trainY, testY, train_predict, test_predict)
Root Mean Squared Error.

RMSE measures the differences between the values predicted by the model and the actual observed values; lower RMSE values indicate better model performance. According to a commonly used rule of thumb, RMSE values between 0.2 and 0.5 on data scaled to [0, 1] indicate that the model predicts the data reasonably accurately. In the result above, the test RMSE is 0.087. Since this is well below 0.2, the error is relatively low, suggesting that the model performs well on the test set.
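For reference, RMSE is simply the square root of the mean squared residual, so the sklearn call above is equivalent to computing it directly with NumPy:

#RMSE = sqrt(mean((y_true - y_pred)^2))
manual_test_rmse = np.sqrt(np.mean((testY - test_predict.reshape(-1)) ** 2))
print('Test RMSE (manual): %.3f' % manual_test_rmse)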

Visualise the Results

#Plot actual values against predictions
def plot_result(trainY, testY, train_predict, test_predict):
    actual = np.append(trainY, testY)
    predictions = np.append(train_predict, test_predict)
    rows = len(actual)
    plt.figure(figsize=(15, 6), dpi=80)
    plt.plot(range(rows), actual)
    plt.plot(range(rows), predictions)
    #The red vertical line marks the boundary between training and test data
    plt.axvline(x=len(trainY), color='r')
    plt.legend(['Actual', 'Predictions'])
    plt.xlabel('Observation number after given time steps')
    plt.ylabel('Sunspots scaled')
    plt.title('Actual and Predicted Values. The Red Line Separates The Training And Test Examples')
    plt.show()

plot_result(trainY, testY, train_predict, test_predict)
Visualising the original data and resulting predictions.

From the graph, we can see that the predicted values closely match the actual values, even in the test portion, which indicates that the RNN model has learned the underlying patterns in the data well and can make effective predictions.

Conclusion

In this article, we have shown how to implement a simple Recurrent Neural Network model for time series prediction using Keras with the TensorFlow Python package. By following the step-by-step process, beginners can gain a foundational understanding of RNNs and their application in sequential data modelling.
