No time to waste, let's get to it.
Backdrop
The global arena is heavily influenced by crude oil, an indispensable commodity that permeates the economic, political, and technological realms. Given the volatility inherent in the energy resource market, forecasting the trajectory of crude oil prices is of paramount importance: it equips governments and energy departments with the information needed for sound policy-making and paves the way for sustainable economic growth. The ability to accurately predict crude oil prices is therefore invaluable.
Researchers employ two predominant approaches to prediction: one qualitative and the other quantitative, the latter encompassing econometric and statistical models. The majority of scholars opt for the quantitative methodology.
Nevertheless, the crude oil market is strongly non-linear, which complicates the task of forecasting its movements. Neural network techniques offer a promising tool for tackling such nonlinear time series forecasting challenges.
Project Objective
In this endeavor, I shall harness the power of deep learning, utilizing recurrent neural networks (e.g., LSTM, LSTM with dropout, etc.) and feed-forward neural networks (dense layer), to perform time series forecasting for WTI crude oil prices. The predictions generated by these models will offer valuable insights into the crude oil industry.
Data Synopsis
The behavior of crude oil prices is influenced by an intricate web of factors, resulting in a seemingly mysterious dance of price movements. My analysis will take into account energy resource prices and oil-sensitive stock prices, as I believe these two dimensions are inextricably linked to fluctuations in crude oil prices.
I have assembled data from various sources to create a comprehensive dataset:
Energy Resource Price
Crude Oil Prices: WTI (West Texas Intermediate) crude oil daily price data, spanning from the 1980s to 2020, were obtained from the API of the U.S. Energy Information Administration (EIA).
Propane & Natural Gas:
Propane and natural gas are both prominent energy sources.
Propane, a byproduct of crude oil refining and natural gas processing, is heavily influenced by crude oil prices. Natural gas, meanwhile, had a correlation coefficient of roughly 0.25 with crude oil over the study period, indicating a modest positive relationship between the two price series (note that a correlation of 0.25 does not mean 25% of natural gas price changes are explained by oil prices; the share of variance explained would be closer to the square of that figure, about 6%).
Daily price data for propane and natural gas, spanning from the 1990s to 2020, were also sourced from the EIA.
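As a quick sanity check on that relationship, here is a minimal sketch of how the correlation could be verified once combined_quandl_data (the price frame built in the code later in this post) is available; looking at daily returns as well as price levels is my own addition, not something the original analysis depends on.
# Quick check of the WTI / natural gas relationship (run after combined_quandl_data is built below)
print(combined_quandl_data['WTI'].corr(combined_quandl_data['HEN_Nat_GAS']))  # correlation of daily price levels
daily_returns = combined_quandl_data.pct_change().dropna()
print(daily_returns['WTI'].corr(daily_returns['HEN_Nat_GAS']))  # correlation of daily returns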
Oil-sensitive Stocks
Stock prices reflect real-time market information and are not subject to revision, so they hold potential as valuable predictors for crude oil prices. Oil-sensitive stock prices were therefore also included.
Oil Company Stock Prices: Stock prices for the following oil companies, sourced from Yahoo Finance, were used as predictors for oil market volatility:
British Petroleum (BP): A British multinational oil and gas company and the world's sixth-largest.
ExxonMobil (XOM): An American multinational oil company headquartered in Texas.
Chevron (CVX): A global petroleum industry leader, producing an average of 791,000 barrels of net oil-equivalent per day in the U.S. in 2018.
The "adjclose" value was used as the "stock price" in predictors, as it represents the closing price adjusted for splits and dividend distributions. Stock price history spans from 1997 to 2020.
Solar Company Stock Prices: Share prices of renewable energy companies such as NextEra Energy (NEE) are closely correlated with crude oil prices. This U.S. company's stock prices from 1997 to 2020 were included in the final dataset.
Time Series Adjustment: Upon gathering data from all sources, I aligned the time range of the dataset to form a time series spanning from January 8, 1997, to November 3, 2020.
Final Dataset
The resulting dataset comprises 5951 rows × 8 columns (a quick way to inspect it is sketched after the column list below). The multivariate predictor columns include:
Date
BP Stock Price
XOM Stock Price
CVX Stock Price
NEE Stock Price
Propane Price
Natural Gas Price
The target variable: Crude Oil Price
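To double-check the shape and columns yourself, a minimal sketch along these lines should work once the dataset has been saved to CSV as in the code below (the file name here is a placeholder; adjust it to wherever you saved the file).
# Inspect the assembled dataset (file name is a placeholder)
import pandas as pd
data = pd.read_csv("Final_dataset.csv", index_col=0, parse_dates=True)
print(data.shape)             # 5951 rows; 7 value columns once Date becomes the index
print(data.columns.tolist())  # WTI, TX_PROP, HEN_Nat_GAS, XOM, CVX, BP, NEE
print(data.head())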
By leveraging this comprehensive dataset and employing deep learning techniques, this project endeavors to unravel the enigma of crude oil price fluctuations. Through the predictions generated, valuable business insights can be gleaned for the crude oil industry, potentially guiding policy-making and economic development in a world where energy resources remain indispensable.
Let us write the code now:
# Install required libraries (run these before the imports below)
!pip install quandl
!pip install yahoo_fin
!pip install requests_html
# Import required libraries
import pandas as pd
import numpy as np
import quandl
import requests_html  # dependency used by yahoo_fin
from yahoo_fin import stock_info as si
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, classification_report
# Keras building blocks used by the models further down
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN, LSTM, GRU, Bidirectional, Dropout, Conv1D, MaxPooling1D
from tensorflow.keras.callbacks import EarlyStopping
# API configuration to authorize the connection; add your API key in quotes
quandl.ApiConfig.api_key = "XXX-"
# Obtain WTI spot price data, below is the URL of the documentation
# https://www.quandl.com/data/EIA/PET_RWTC_D-Cushing-OK-WTI-Spot-Price-FOB-Daily
wti_data = quandl.get("EIA/PET_RWTC_D", authtoken="XXXX-")
# Pull Texas Propane price data
tx_propane_data = quandl.get("EIA/PET_EER_EPLLPA_PF4_Y44MB_DPG_D", authtoken="XXXXX-")
# Pull Natural Gas price data
natural_gas_data = quandl.get("FRED/DHHNGSP", authtoken="XXXXX-")
# Combine the three datasets and assign proper column names
combined_quandl_data = pd.concat([wti_data, tx_propane_data, natural_gas_data], axis=1)
combined_quandl_data = combined_quandl_data.dropna()
combined_quandl_data.columns = ['WTI', 'TX_PROP', 'HEN_Nat_GAS']
# Assign the ticker list that we want to scrape
tickers_list = ['XOM', 'CVX', 'BP', 'NEE']
# Pull historical daily price data for each stock
dow_prices = {ticker : si.get_data(ticker, start_date='01/08/1997', end_date='11/04/2020', interval='1d') for ticker in tickers_list}
# Create a dataframe with stock prices
prep_data = pd.DataFrame(dow_prices['XOM']['adjclose']).rename(columns={"adjclose": "XOM"})
for i in tickers_list[1:]:
    prep_data[i] = dow_prices[i]['adjclose']
# Combine the stock prices dataframe with the quandl data
final_dataset = pd.concat([combined_quandl_data, prep_data], axis=1)
final_dataset = final_dataset.dropna()
# Save the data
final_dataset.to_csv("/content/drive/My Drive/Colab Notebooks/Final_dataset.csv")
# Create return features for each ticker, use a pct_change as the return
return_data = final_dataset.pct_change()
return_data.dropna(inplace=True)
# Create a dataframe with the percentage changes for every series and a column for whether WTI increased or decreased
df = return_data.copy()
df['Increase'] = np.where(df['WTI'] > 0, 1, 0)
# Drop the last row as a safeguard against an incomplete final observation
df.drop(df.tail(1).index, inplace=True)
# Plot the data
df.plot(subplots=True, grid=True, layout=(3,4), figsize=(15,15))
plt.show()
# Correlation analysis
df_corr = df.corr()
fig, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(df_corr, annot=True)
# Prepare the data for training and testing
y = df['Increase']
X = df.drop(['Increase', 'WTI'], axis=1)
# Split the data into train and test partitions
# Use 80% of the data for training
# and the remaining 20% for testing
train_pct_index = int(0.8 * len(X))
X_train, X_test = X[:train_pct_index], X[train_pct_index:]
y_train, y_test = y[:train_pct_index], y[train_pct_index:]
#Scale the data using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
#Convert the arrays back to dataframes
X_train = pd.DataFrame(X_train)
X_test = pd.DataFrame(X_test)
y_train = pd.DataFrame(y_train)
y_test = pd.DataFrame(y_test)
#Print the shapes of the train and test data
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
# Reset the index on everything to avoid any issues with dates and integers
X_train.reset_index(inplace=True, drop=True)
X_test.reset_index(inplace=True, drop=True)
y_train.reset_index(inplace=True, drop=True)
y_test.reset_index(inplace=True, drop=True)
# Put X_train and y_train together (did not scale Y)
# Put X_test and y_test together (again, did not scale Y before)
df_train = pd.concat([X_train, y_train], axis=1)
df_test = pd.concat([X_test, y_test], axis=1)
# Print the shapes of the train and test data
print(df.shape, df_train.shape, df_test.shape)
df_train.head()
# Print the count of each value of Increase column
print(df['Increase'].value_counts())
# Split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = [], []
    for i in range(0, len(sequences) - n_steps):
        # Find the end of this pattern
        end_ix = i + n_steps
        # Gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
# Setting time look back of 10
n_steps = 10
X_train, y_train = split_sequences(np.array(df_train), n_steps)
X_test, y_test = split_sequences(np.array(df_test), n_steps)
# Check the shape of the train and test data
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
# Verify no NaN values
print(np.isnan(X_train).sum())
print(np.isnan(y_train).sum())
print(np.isnan(X_test).sum())
print(np.isnan(y_test).sum())
# Print the first element of X_train
print(X_train[0])
# Print the first element of y_train and the first 10 elements of df
print(y_train[0], df.head(10))
# Confirm the correct features and steps
n_steps = X_train.shape[1]
n_features = X_train.shape[2]
print(n_steps, n_features)
# Now let's build a model
# Binary classification setup: sigmoid output with binary cross-entropy loss
# Define the number of steps and features
n_steps = X_train.shape[1]
n_features = X_train.shape[2]
# Define the model
model = Sequential()
model.add(SimpleRNN(60, input_shape=(n_steps, n_features), activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
# Set up early stopping
es = EarlyStopping(monitor='val_acc', mode='max', patience=10, verbose=1, restore_best_weights=True)
# Fit the model
model.fit(X_train, y_train, epochs=500, batch_size=5, validation_split=0.2, verbose=1, callbacks=[es], shuffle=True)
# Make a prediction
pred = model.predict(X_test)
print(pred)
# Round the predictions to 0 or 1
pred = np.round(pred, 0)
pred
# Import confusion matrix and classification report from sklearn metrics
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
# Print the confusion matrix and classification report
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))
# Plot the test results
plt.figure(figsize=(20, 10))
plt.plot(np.arange(X_test[900:].shape[0]), y_test[900:], color='blue') # actual data
plt.plot(np.arange(X_test[900:].shape[0]), pred[900:], color='grey') # predicted data
plt.suptitle('Test Results')
plt.xlabel('Time')
plt.ylabel('Increase?')
plt.show()
# Now let's build an LSTM model
# n_steps and n_features were already defined above from X_train's shape
# Define the model
model = Sequential()
model.add(LSTM(30, input_shape=(n_steps, n_features), activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
model.summary()
# Set up early stopping
es = EarlyStopping(monitor='val_acc', mode='max', patience=10, verbose=1, restore_best_weights=True)
# Fit the model
model.fit(X_train, y_train, epochs=500, batch_size=5, validation_split=0.2, verbose=1, callbacks=[es], shuffle=True)
# Make a prediction
pred = model.predict(X_test)
print(pred)
# Round the predictions to 0 or 1
pred = np.round(pred, 0)
print(pred)
# Print the confusion matrix and classification report
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))
# Plot the timeseries of actual vs. predicted values on the test data
plt.figure(figsize=(20, 10))
plt.plot(np.arange(X_test.shape[0]), y_test, color='blue') # actual data
plt.plot(np.arange(X_test.shape[0]), pred, color='red') # predicted data
plt.suptitle('Test Results')
plt.xlabel('Time')
plt.ylabel('Increase?')
plt.show()
# Zoom in to see last few values
plt.figure(figsize=(20, 10))
plt.plot(np.arange(X_test[900:].shape[0]), y_test[900:], color='blue') # actual data
plt.plot(np.arange(X_test[900:].shape[0]), pred[900:], color='grey') # predicted data
plt.suptitle('Test Results')
plt.xlabel('Time')
plt.ylabel('Increase?')
plt.show()
# Define the number of steps and features
n_steps = X_train.shape[1]
n_features = X_train.shape[2]
# Define the model
model = Sequential()
model.add(LSTM(64, return_sequences=True, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dropout(0.1))
model.add(Bidirectional(LSTM(32, activation='relu')))
model.add(Dropout(0.1))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(1, activation='sigmoid'))
model.summary()
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# early stopping
es = EarlyStopping(monitor='val_accuracy',
mode='max',
patience=20,
verbose=1,
restore_best_weights=True)
# Fit the model (results will vary from run to run unless you fix the random seed)
model.fit(X_train, y_train,
epochs=100,
batch_size=5,
validation_split=0.2,
verbose=1,
callbacks=[es],
shuffle=True)
The EarlyStopping callback is created with monitor='val_accuracy', which means it monitors the validation accuracy for improvements. The fit method is then called on the model with the training data (X_train, y_train), epochs=100, batch_size=5, and validation_split=0.2, so 20% of the training data is used for validation during training. The es callback is also passed in to stop training early, restoring the best weights, if validation accuracy does not improve for 20 epochs.
# Make a prediction on the test set
pred = model.predict(X_test)
# Round the predicted probabilities to 0 or 1
pred = np.round(pred, 0)
# Confusion matrix and classification report (these imports are also included at the top)
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))
plt.figure(figsize=(20,10))
# Show the timeseries plot of actual vs. predicted values on the test data
plt.plot(np.arange(X_test.shape[0]), y_test, color='blue') # actual data
plt.plot(np.arange(X_test.shape[0]), pred, color='red') # predicted data
plt.suptitle('Test Results')
plt.xlabel('Time')
plt.ylabel('Increase?')
plt.show()
#Zooming in to see last few values
plt.figure(figsize=(20,10))
plt.plot(np.arange(X_test[900:].shape[0]), y_test[900:], color='blue') # actual data
plt.plot(np.arange(X_test[900:].shape[0]), pred[900:], color='grey') # predicted data
plt.suptitle('Test Results')
plt.xlabel('Time')
plt.ylabel('Increase?')
plt.show()
# Finally, a hybrid model combining a 1D convolution with recurrent layers
# (n_steps and n_features were defined above from X_train's shape)
# Define the model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, input_shape=(n_steps, n_features)))  # input shape goes in the first layer
model.add(MaxPooling1D(2))
model.add(Bidirectional(LSTM(30,
                             return_sequences=True,  # when stacking recurrent layers, return sequences
                             activation='relu',
                             recurrent_dropout=0.2)))
model.add(GRU(20, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(1, activation='sigmoid'))
model.summary()
# Compile the model (the earlier compile call applied to the previous model object)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
# set up early stopping
es = EarlyStopping(monitor='val_acc', mode='max', patience=10, verbose=1, restore_best_weights=True)
# fit the model (using early stopping)
model.fit(X_train, y_train, epochs=500, batch_size=5, validation_split=0.2, verbose=1, callbacks=[es], shuffle=True)
# make predictions
pred = model.predict(X_test)
pred = np.round(pred, 0)
# calculate and print confusion matrix and classification report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))
# show timeseries plot on the test data
plt.figure(figsize=(20, 10))
plt.plot(np.arange(X_test.shape[0]), y_test, color='blue')
plt.plot(np.arange(X_test.shape[0]), pred, color='grey')
plt.suptitle('Test Results')
plt.xlabel('Time')
plt.ylabel('Increase?')
plt.show()
# zoom in to see last few values
plt.figure(figsize=(20, 10))
plt.plot(np.arange(X_test[900:].shape[0]), y_test[900:], color='blue')
plt.plot(np.arange(X_test[900:].shape[0]), pred[900:], color='grey')
plt.suptitle('Test Results')
plt.xlabel('Time')
plt.ylabel('Increase?')
plt.show()
In this code, we predict the direction of WTI crude oil price movements (an increase or not) using several types of neural networks: a simple recurrent neural network (RNN), a Long Short-Term Memory (LSTM) network, a stacked bidirectional LSTM with dropout, and a hybrid model combining a 1D convolution with recurrent layers. These networks are designed to model sequential data, making them well suited to time series problems like this one.
First, we acquire the crude oil, propane, and natural gas prices from the Quandl API and the stock prices from Yahoo Finance. We then preprocess the data: converting prices to daily percentage changes, scaling the predictors, and reshaping them into overlapping windows that can be fed into the networks.
Next, we create the models. The LSTM variants are designed to better capture long-term dependencies in time series data, and the dropout layers aim to reduce overfitting so the models generalize better to unseen data.
Once the models are built, we train them on the historical data, with early stopping on validation accuracy so that each model keeps the weights from its best epoch.
After training, we evaluate each model on the held-out test set using a confusion matrix and classification report, and we plot the predicted against the actual increase/decrease labels. These visualizations make it easy to compare the models and assess how well each captured the directional behavior of the WTI series.
In summary, this code demonstrates how different types of neural networks can be used to predict the direction of crude oil price movements, providing insight into potential trends in the market. By understanding these trends, one can make more informed decisions regarding investments, trading, or policy-making in the energy sector.
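The code above evaluates each model separately right after it is trained. If you would rather see a single side-by-side comparison, a minimal sketch could look like the following; note that the model variable names here (rnn_model, lstm_model, bilstm_model, cnn_rnn_model) are hypothetical, since the code above reuses the name model for each architecture, so you would need to keep a separate reference to each fitted model.
# Hypothetical side-by-side comparison of test accuracy (assumes each fitted model was kept under its own name)
from sklearn.metrics import accuracy_score
models = {
    'SimpleRNN': rnn_model,          # hypothetical reference to the fitted SimpleRNN
    'LSTM': lstm_model,              # hypothetical reference to the fitted LSTM
    'Stacked BiLSTM': bilstm_model,  # hypothetical reference to the stacked bidirectional LSTM
    'CNN + RNN': cnn_rnn_model,      # hypothetical reference to the hybrid model
}
for name, m in models.items():
    preds = np.round(m.predict(X_test)).flatten()
    print(f"{name}: test accuracy = {accuracy_score(y_test, preds):.3f}")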
If you are reading this post and thinking “What the Hell did I just read?”, that is okay. Learning Python can seem like a daunting task at the beginning, but you can work your way through it.
Fortunately, there are plenty of free courses for learning Python online:
https://www.udemy.com/course/python-hackcc/
https://www.freecodecamp.org/learn/data-analysis-with-python/
https://www.freecodecamp.org/learn/machine-learning-with-python/
https://www.coursera.org/learn/python
Learning Python in and of itself will not make you a good trader. There are many amazing coders who cannot trade and many great traders who cannot code, and coding is not the only way to get good at trading. However, the more you automate your trading, the less likely you are to “tilt” or trade based on emotions.
The weekly plan will be out, per usual, before the Globex open.
-Fin