Using CNN for financial time series prediction
Last Updated on November 20, 2023
Convolutional neural networks have their roots in image processing. They were first popularized by LeNet to recognize the MNIST handwritten digits. However, convolutional neural networks are not limited to handling images.
In this tutorial, we are going to look at an example of using CNN for time series prediction with an application from financial markets. By way of this example, we are going to explore some techniques in using Keras for model training as well.
After finishing this tutorial, you will know:
- What a typical multidimensional financial data series looks like
- How CNN can be applied to time series in a classification problem
- How to use generators to feed data to train a Keras model
- How to provide a custom metric for evaluating a Keras model
Let's get started.

Using CNN for financial time series prediction
Photo by Aron Visuals, some rights reserved.
Tutorial overview
This tutorial is divided into 7 parts; they are:
- Background of the idea
- Preprocessing of data
- Data generator
- The model
- Training, validation, and test
- Extensions
- Does it work?
Background of the idea
In this tutorial, we are following the paper titled "CNNpred: CNN-based stock market prediction using a diverse set of variables" by Ehsan Hoseinzade and Saman Haratizadeh. The data file and sample code from the author can be found on GitHub:
The goal of the paper is simple: to predict the next day's direction of the stock market (i.e., up or down compared with today); hence it is a binary classification problem. However, it is interesting to see how this problem is formulated and solved.
We have seen examples of using CNN for sequence prediction. If we consider the Dow Jones Industrial Average (DJIA) as an example, we may build a CNN with 1D convolution for prediction. This makes sense because a 1D convolution on a time series is roughly computing its moving average or, in digital signal processing terms, applying a filter to the time series. It should provide some clues about the trend.
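As a quick illustration of that point, a 1D convolution with a uniform kernel is exactly a moving average. A minimal numpy sketch:

```python
import numpy as np

series = np.array([1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0])
kernel = np.ones(3) / 3   # a uniform filter of window size 3

# "valid" mode slides the kernel over the series without padding;
# the result is the 3-point moving average of the series
moving_average = np.convolve(series, kernel, mode="valid")
print(moving_average)     # [2.333..., 3.0, 4.0, 5.0, 6.0]
```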
However, when we look at financial time series, it is quite common sense that some derived signals are useful for predictions too. For example, price and volume together can provide a better clue. Other technical indicators, such as moving averages of different window lengths, are useful too. If we put all of these together, we will have a table of data in which each time instance has multiple features, and the goal is still to predict the direction of one time series.
In the CNNpred paper, 82 such features are prepared for the DJIA time series:

Excerpt from the CNNpred paper showing the list of features used.
Unlike LSTM, in which there is an explicit concept of time steps, we present data as a matrix in CNN models. As shown in the table below, the features across multiple time steps are presented as a 2D array.
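To make this concrete, here is a small sketch (with synthetic data, not the actual CNNpred features) of how a window of N rows from a feature table becomes one 2D input sample:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
m = 82                                    # number of features per day
df = pd.DataFrame(rng.normal(size=(100, m)),
                  index=pd.date_range("2020-01-01", periods=100))

N = 60                                    # window of time steps
window = df.iloc[0:N].values              # one 2D sample: time steps x features
print(window.shape)                       # (60, 82)
```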
Preprocessing of data
In the following, we try to implement the idea of CNNpred from scratch using TensorFlow's Keras API. While there is a reference implementation from the author in the GitHub link above, we reimplement it differently to illustrate some Keras techniques.
Firstly, the data are five CSV files, each for a different market index, under the Dataset directory from the GitHub repository above, or we can also get a copy here:
- CNNpred-data.zip
The input data has a date column and a name column to identify the ticker symbol of the market index. We can keep the date column as the time index and remove the name column. The rest are all numerical.
As we are going to predict the market direction, we first try to create the classification label. The market direction is defined as the closing index of tomorrow compared with today. If we have read the data into a pandas DataFrame, we can use X["Close"].pct_change() to find the percentage change, where a positive change means the market goes up. Then we can shift this back by one time step to use as our label:
```python
...
X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
```
The line of code above computes the percentage change of the closing index and aligns the data with the previous day. It then converts the data into either 1 or 0 depending on whether the percentage change is positive.
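As a small worked example of this label construction, with made-up closing prices:

```python
import pandas as pd

X = pd.DataFrame({"Close": [100.0, 101.0, 99.0, 103.0]})
X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
print(X)
#    Close  Target
# 0  100.0       1   <- close goes up tomorrow (100 -> 101)
# 1  101.0       0   <- close goes down tomorrow (101 -> 99)
# 2   99.0       1   <- close goes up tomorrow (99 -> 103)
# 3  103.0       0   <- tomorrow unknown; NaN > 0 evaluates to False
```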
For the five data files in the directory, we read each of them as a separate pandas DataFrame and keep them in a Python dictionary:
```python
...
data = {}
for filename in os.listdir(DATADIR):
    if not filename.lower().endswith(".csv"):
        continue  # read only the CSV files
    filepath = os.path.join(DATADIR, filename)
    X = pd.read_csv(filepath, index_col="Date", parse_dates=True)
    # basic preprocessing: get the name, the classification
    # Save the target variable as a column in the dataframe for easier dropna()
    name = X["Name"][0]
    del X["Name"]
    cols = X.columns
    X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
    X.dropna(inplace=True)
    # Fit the standard scaler using the training dataset
    index = X.index[X.index < TRAIN_TEST_CUTOFF]
    index = index[:int(len(index) * TRAIN_VALID_RATIO)]
    scaler = StandardScaler().fit(X.loc[index, cols])
    # Save scale transformed dataframe
    X[cols] = scaler.transform(X[cols])
    data[name] = X
```
The result of the above code is a DataFrame for each index, in which the classification label is the column "Target" while all other columns are input features. We also normalize the data with a standard scaler.
In time series problems, it is generally reasonable not to split the data into training and test sets randomly, but to set up a cutoff point such that the data before the cutoff is the training set and the data afterwards is the test set. The scaling above is based on the training set but applied to the entire dataset.
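A minimal, self-contained sketch of this chronological split and leakage-free scaling (with a synthetic price series and an assumed cutoff date):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

dates = pd.date_range("2015-01-01", periods=500)
X = pd.DataFrame({"Close": np.cumsum(np.random.randn(500)) + 100}, index=dates)

cutoff = "2016-04-21"
train_rows = X.index < cutoff                   # training set is strictly before the cutoff
scaler = StandardScaler().fit(X.loc[train_rows, ["Close"]])
X[["Close"]] = scaler.transform(X[["Close"]])   # transform train and test alike; no test leakage into the fit
```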
Data generator
We are not going to use all time steps at once; instead, we use a fixed length of N time steps to predict the market direction at step N+1. In this design, the window of N time steps can start from anywhere. We could simply create many DataFrames with a large amount of overlap with one another. To save memory, we will build a data generator for training and validation, as follows:
```python
...
TRAIN_TEST_CUTOFF = '2016-04-21'
TRAIN_VALID_RATIO = 0.75

def datagen(data, seq_len, batch_size, targetcol, kind):
    "As a generator to produce samples for Keras model"
    batch = []
    while True:
        # Pick one dataframe from the pool
        key = random.choice(list(data.keys()))
        df = data[key]
        input_cols = [c for c in df.columns if c != targetcol]
        index = df.index[df.index < TRAIN_TEST_CUTOFF]
        split = int(len(index) * TRAIN_VALID_RATIO)
        if kind == 'train':
            index = index[:split]   # range for the training set
        elif kind == 'valid':
            index = index[split:]   # range for the validation set
        # Pick one position, then clip a sequence length
        while True:
            t = random.choice(index)      # pick one time step
            n = (df.index == t).argmax()  # find its position in the dataframe
            if n - seq_len + 1 < 0:
                continue  # cannot get enough data for one sequence length
            frame = df.iloc[n - seq_len + 1:n + 1]
            batch.append([frame[input_cols].values, df.loc[t, targetcol]])
            break
        # if we get enough for a batch, dispatch
        if len(batch) == batch_size:
            X, y = zip(*batch)
            X, y = np.expand_dims(np.array(X), 3), np.array(y)
            yield X, y
            batch = []
```
A generator is a special kind of function in Python that does not return a value but yields in iterations, such that a sequence of data is produced from it. For a generator to be used in Keras training, it is expected to yield a batch of input data and targets. This generator is supposed to run indefinitely. Hence the generator function above is created with an infinite loop that starts with while True.
In each iteration, it randomly picks one DataFrame from the Python dictionary; then, within the range of time steps of the training set (i.e., the beginning portion), it starts from a random point and takes N time steps using the pandas iloc[start:end] syntax to create one input under the variable frame. This DataFrame will be a 2D array. The target label is that of the last time step. The input data and the label are then appended to the list batch. Once we have accumulated one batch's size, we dispatch it from the generator.
The last four lines of the code snippet above dispatch a batch for training or validation. We collect the list of input data (each a 2D array) as well as a list of target labels into the variables X and y, then convert them into numpy arrays so they can work with our Keras model. We need to add one more dimension to the numpy array X using np.expand_dims() because of the design of the network model, as explained below.
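As a quick check (assuming the data dictionary built in the preprocessing step above), we can pull one batch from the generator and verify the shapes:

```python
# Fetch one batch from the generator and inspect its shape
gen = datagen(data, 60, 128, "Target", "train")
X_batch, y_batch = next(gen)
print(X_batch.shape)  # (128, 60, 82, 1): batch, time steps, features, channel
print(y_batch.shape)  # (128,)
```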
The model
The 2D CNN model presented in the original paper accepts an input tensor of shape $N\times m\times 1$, for $N$ the number of time steps and $m$ the number of features in each time step. The paper assumes $N=60$ and $m=82$.
The model consists of three convolutional layers, as described as follows:
```python
...
def cnnpred_2d(seq_len=60, n_features=82, n_filters=(8,8,8), droprate=0.1):
    "2D-CNNpred model according to the paper"
    model = Sequential([
        Input(shape=(seq_len, n_features, 1)),
        Conv2D(n_filters[0], kernel_size=(1, n_features), activation="relu"),
        Conv2D(n_filters[1], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Conv2D(n_filters[2], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Flatten(),
        Dropout(droprate),
        Dense(1, activation="sigmoid")
    ])
    return model
```
and the model summary is as follows:
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 60, 1, 8)          664
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 58, 1, 8)          200
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 29, 1, 8)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 27, 1, 8)          200
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 1, 8)          0
_________________________________________________________________
flatten (Flatten)            (None, 104)               0
_________________________________________________________________
dropout (Dropout)            (None, 104)               0
_________________________________________________________________
dense (Dense)                (None, 1)                 105
=================================================================
Total params: 1,169
Trainable params: 1,169
Non-trainable params: 0
```
The first convolutional layer has 8 units and is applied across all features in each time step. It is followed by a second convolutional layer that considers three consecutive days at once, since it is a common belief that three days can make a trend in the stock market. The result is then applied to a max pooling layer and another convolutional layer before it is flattened into a one-dimensional array and passed to a fully-connected layer with sigmoid activation for binary classification.
Training, validation, and test
That's it for the model. The paper used MAE as the loss metric and also monitored accuracy and F1 score to determine the quality of the model. We should point out that the F1 score depends on the precision and recall ratios, which both consider the positive classification. The paper, however, considers the average of the F1 scores from the positive and the negative classification. Explicitly, it is the F1-macro metric:
$$
F_1 = \frac{1}{2}\left(
\frac{2\cdot \frac{TP}{TP+FP} \cdot \frac{TP}{TP+FN}}{\frac{TP}{TP+FP} + \frac{TP}{TP+FN}}
+
\frac{2\cdot \frac{TN}{TN+FN} \cdot \frac{TN}{TN+FP}}{\frac{TN}{TN+FN} + \frac{TN}{TN+FP}}
\right)
$$
The fraction $\frac{TP}{TP+FP}$ is the precision, with $TP$ and $FP$ the numbers of true positives and false positives. Similarly, $\frac{TP}{TP+FN}$ is the recall. The first term in the big parentheses above is the normal F1 metric, which considers the positive classifications. The second term is the reverse, which considers the negative classifications.
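As a quick sanity check, a small sketch with plain numpy shows this formula reproduces scikit-learn's macro-averaged F1:

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_true == 1) & (y_pred == 1))   # 3 true positives
fp = np.sum((y_true == 0) & (y_pred == 1))   # 1 false positive
fn = np.sum((y_true == 1) & (y_pred == 0))   # 1 false negative
tn = np.sum((y_true == 0) & (y_pred == 0))   # 3 true negatives

f1_pos = 2 * (tp/(tp+fp)) * (tp/(tp+fn)) / (tp/(tp+fp) + tp/(tp+fn))
f1_neg = 2 * (tn/(tn+fn)) * (tn/(tn+fp)) / (tn/(tn+fn) + tn/(tn+fp))
print((f1_pos + f1_neg) / 2)                      # 0.75
print(f1_score(y_true, y_pred, average="macro"))  # 0.75
```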
While this metric is available in scikit-learn as sklearn.metrics.f1_score(), there is no equivalent in Keras. Hence we create our own by borrowing code from this Stack Exchange question:
The training process can take hours to complete. Hence we want to save the model in the middle of the training so that we may interrupt and resume it. We can make use of the checkpoint features in Keras:
```python
checkpoint_path = "./cp2d-{epoch}-{val_f1macro:.2f}.h5"
callbacks = [
    ModelCheckpoint(checkpoint_path,
                    monitor='val_f1macro', mode="max",
                    verbose=0, save_best_only=True, save_weights_only=False, save_freq="epoch")
]
```
We set up a filename template checkpoint_path and ask Keras to fill in the epoch number, as well as the validation F1 score, into the filename. We save the model by monitoring the validation's F1 metric, and this metric is supposed to increase as the model gets better. Hence we pass in mode="max" to it.
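If training is interrupted, a saved checkpoint can be reloaded to resume. Note that the custom metric must be registered via custom_objects, or Keras cannot deserialize the model. A sketch, with a hypothetical checkpoint filename:

```python
from tensorflow.keras.models import load_model

# Hypothetical filename produced by the checkpoint template above
model = load_model("./cp2d-7-0.56.h5", custom_objects={"f1macro": f1macro})
```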
It should now be trivial to train our model, as follows:
```python
seq_len = 60
batch_size = 128
n_epochs = 20
n_features = 82

model = cnnpred_2d(seq_len, n_features)
model.compile(optimizer="adam", loss="mae", metrics=["acc", f1macro])
model.fit(datagen(data, seq_len, batch_size, "Target", "train"),
          validation_data=datagen(data, seq_len, batch_size, "Target", "valid"),
          epochs=n_epochs, steps_per_epoch=400, validation_steps=10,
          verbose=1, callbacks=callbacks)
```
There are two points to note in the above snippets. We supplied "acc" for the accuracy, as well as the function f1macro defined above, as the metrics parameter to the compile() function. Hence these two metrics will be monitored during training. Because the function is named f1macro, we refer to this metric in the checkpoint's monitor parameter as val_f1macro.
Separately, in the fit() function, we provided the input data through the datagen() generator as defined above. Calling this function produces a generator, from which batches are fetched one after another during the training loop. Similarly, the validation data are also provided by a generator.
Because the nature of a generator is to dispatch data indefinitely, we need to tell the training process how to define an epoch. Recall that in Keras terms, a batch is one iteration of the gradient descent update. An epoch is supposed to be one cycle through all the data in the dataset. The end of an epoch is the time to run validation. It is also the occasion for running the checkpoint we defined above. As Keras has no way to infer the size of the dataset from a generator, we need to tell it how many batches it should process in one epoch using the steps_per_epoch parameter. Similarly, the validation_steps parameter tells it how many batches are used in each validation step. The validation does not affect the training, but it reports the metrics we are interested in. Below is a screenshot of what we will see in the middle of training, where we can see that the metrics for the training set are updated on each batch, but those for the validation set are provided only at the end of an epoch:
```
Epoch 1/20
400/400 [==============================] - 43s 106ms/step - loss: 0.4062 - acc: 0.6184 - f1macro: 0.5237 - val_loss: 0.4958 - val_acc: 0.4969 - val_f1macro: 0.4297
Epoch 2/20
400/400 [==============================] - 44s 111ms/step - loss: 0.2760 - acc: 0.7489 - f1macro: 0.7304 - val_loss: 0.5007 - val_acc: 0.4984 - val_f1macro: 0.4833
Epoch 3/20
 60/400 [===>..........................] - ETA: 39s - loss: 0.2399 - acc: 0.7783 - f1macro: 0.7643
```
After the model finishes training, we can test it with unseen data, i.e., the test set. Instead of generating the test set randomly, we create it from the dataset in a deterministic way:
```python
def testgen(data, seq_len, targetcol):
    "Return array of all test samples"
    batch = []
    for key, df in data.items():
        input_cols = [c for c in df.columns if c != targetcol]
        # find the start of the test sample
        t = df.index[df.index >= TRAIN_TEST_CUTOFF][0]
        n = (df.index == t).argmax()
        for i in range(n+1, len(df)+1):
            frame = df.iloc[i-seq_len:i]
            batch.append([frame[input_cols].values, frame[targetcol][-1]])
    X, y = zip(*batch)
    return np.expand_dims(np.array(X), 3), np.array(y)

# Prepare test data
test_data, test_target = testgen(data, seq_len, "Target")

# Test the model
test_out = model.predict(test_data)
test_pred = (test_out > 0.5).astype(int)
print("accuracy:", accuracy_score(test_pred, test_target))
print("MAE:", mean_absolute_error(test_pred, test_target))
print("F1:", f1_score(test_pred, test_target))
```
The structure of the function testgen() resembles that of the datagen() we defined above, except that in datagen() the output data's first dimension is the number of samples in a batch, while in testgen() it is the entire set of test samples.
Using the model for prediction will produce a floating point number between 0 and 1, as we are using the sigmoid activation function. We convert this into 0 or 1 by applying a threshold at 0.5. Then we use the functions from scikit-learn to compute the accuracy, mean absolute error, and F1 score (for which the accuracy is just one minus the MAE).
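That parenthetical claim is easy to verify: with 0/1 labels and hard 0/1 predictions, every wrong prediction contributes exactly 1 to the absolute error, so the MAE equals the misclassification rate:

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_absolute_error

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 1, 1, 0, 0])
print(accuracy_score(y_true, y_pred))       # 0.6
print(mean_absolute_error(y_true, y_pred))  # 0.4 = 1 - 0.6
```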
Tying all of these together, the complete code is as follows:
```python
import os
import random

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D, Input
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.callbacks import ModelCheckpoint
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, f1_score, mean_absolute_error

DATADIR = "./Dataset"
TRAIN_TEST_CUTOFF = '2016-04-21'
TRAIN_VALID_RATIO = 0.75

# https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model
# to implement F1 score for validation in a batch
def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

def f1macro(y_true, y_pred):
    f_pos = f1_m(y_true, y_pred)
    # negative version of the data and prediction
    f_neg = f1_m(1-y_true, 1-K.clip(y_pred,0,1))
    return (f_pos + f_neg)/2

def cnnpred_2d(seq_len=60, n_features=82, n_filters=(8,8,8), droprate=0.1):
    "2D-CNNpred model according to the paper"
    model = Sequential([
        Input(shape=(seq_len, n_features, 1)),
        Conv2D(n_filters[0], kernel_size=(1, n_features), activation="relu"),
        Conv2D(n_filters[1], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Conv2D(n_filters[2], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Flatten(),
        Dropout(droprate),
        Dense(1, activation="sigmoid")
    ])
    return model

def datagen(data, seq_len, batch_size, targetcol, kind):
    "As a generator to produce samples for Keras model"
    batch = []
    while True:
        # Pick one dataframe from the pool
        key = random.choice(list(data.keys()))
        df = data[key]
        input_cols = [c for c in df.columns if c != targetcol]
        index = df.index[df.index < TRAIN_TEST_CUTOFF]
        split = int(len(index) * TRAIN_VALID_RATIO)
        assert split > seq_len, "Training data too small for sequence length {}".format(seq_len)
        if kind == 'train':
            index = index[:split]   # range for the training set
        elif kind == 'valid':
            index = index[split:]   # range for the validation set
        else:
            raise NotImplementedError
        # Pick one position, then clip a sequence length
        while True:
            t = random.choice(index)      # pick one time step
            n = (df.index == t).argmax()  # find its position in the dataframe
            if n-seq_len+1 < 0:
                continue  # this sample is not enough for one sequence length
            frame = df.iloc[n-seq_len+1:n+1]
            batch.append([frame[input_cols].values, df.loc[t, targetcol]])
            break
        # if we get enough for a batch, dispatch
        if len(batch) == batch_size:
            X, y = zip(*batch)
            X, y = np.expand_dims(np.array(X), 3), np.array(y)
            yield X, y
            batch = []

def testgen(data, seq_len, targetcol):
    "Return array of all test samples"
    batch = []
    for key, df in data.items():
        input_cols = [c for c in df.columns if c != targetcol]
        # find the start of the test sample
        t = df.index[df.index >= TRAIN_TEST_CUTOFF][0]
        n = (df.index == t).argmax()
        # extract sample using a sliding window
        for i in range(n+1, len(df)+1):
            frame = df.iloc[i-seq_len:i]
            batch.append([frame[input_cols].values, frame[targetcol][-1]])
    X, y = zip(*batch)
    return np.expand_dims(np.array(X), 3), np.array(y)

# Read data into pandas DataFrames
data = {}
for filename in os.listdir(DATADIR):
    if not filename.lower().endswith(".csv"):
        continue  # read only the CSV files
    filepath = os.path.join(DATADIR, filename)
    X = pd.read_csv(filepath, index_col="Date", parse_dates=True)
    # basic preprocessing: get the name, the classification
    # Save the target variable as a column in the dataframe for easier dropna()
    name = X["Name"][0]
    del X["Name"]
    cols = X.columns
    X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
    X.dropna(inplace=True)
    # Fit the standard scaler using the training dataset
    index = X.index[X.index < TRAIN_TEST_CUTOFF]
    index = index[:int(len(index) * TRAIN_VALID_RATIO)]
    scaler = StandardScaler().fit(X.loc[index, cols])
    # Save scale transformed dataframe
    X[cols] = scaler.transform(X[cols])
    data[name] = X

seq_len = 60
batch_size = 128
n_epochs = 20
n_features = 82

# Produce CNNpred as a binary classification problem
model = cnnpred_2d(seq_len, n_features)
model.compile(optimizer="adam", loss="mae", metrics=["acc", f1macro])
model.summary()  # print model structure to console

# Set up callbacks and fit the model
# We use custom validation score f1macro() and hence monitor for "val_f1macro"
checkpoint_path = "./cp2d-{epoch}-{val_f1macro:.2f}.h5"
callbacks = [
    ModelCheckpoint(checkpoint_path,
                    monitor='val_f1macro', mode="max",
                    verbose=0, save_best_only=True, save_weights_only=False, save_freq="epoch")
]
model.fit(datagen(data, seq_len, batch_size, "Target", "train"),
          validation_data=datagen(data, seq_len, batch_size, "Target", "valid"),
          epochs=n_epochs, steps_per_epoch=400, validation_steps=10,
          verbose=1, callbacks=callbacks)

# Prepare test data
test_data, test_target = testgen(data, seq_len, "Target")

# Test the model
test_out = model.predict(test_data)
test_pred = (test_out > 0.5).astype(int)
print("accuracy:", accuracy_score(test_pred, test_target))
print("MAE:", mean_absolute_error(test_pred, test_target))
print("F1:", f1_score(test_pred, test_target))
```
Extensions
The original paper calls the above model "2D-CNNpred", and there is a version called "3D-CNNpred". The idea is to consider not only the many features of one stock market index but to cross-compare with many market indices to help the prediction on one index. Referring to the table of features and time steps above, the data for one market index is presented as a 2D array. If we stack up multiple such data from different indices, we construct a 3D array. While the target label stays the same, looking at different markets may provide some additional information to help prediction.
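Conceptually, each training sample for 3D-CNNpred stacks one 2D window per market index. A minimal numpy sketch of the resulting shape (random placeholders instead of real features):

```python
import numpy as np

n_stocks, seq_len, n_features = 5, 60, 82
# one (seq_len, n_features) window per market index
frames = [np.random.randn(seq_len, n_features) for _ in range(n_stocks)]
sample = np.stack(frames)   # shape (5, 60, 82): one 3D-CNNpred input sample
print(sample.shape)
```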
Because the shape of the data has changed, the convolutional network is also defined slightly differently, and the data generators need some modifications accordingly. Below is the complete code of the 3D version, in which the changes from the previous 2D version should be self-explanatory:
```python
import os
import random

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D, Input
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.callbacks import ModelCheckpoint
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, f1_score, mean_absolute_error

DATADIR = "./Dataset"
TRAIN_TEST_CUTOFF = '2016-04-21'
TRAIN_VALID_RATIO = 0.75

# https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model
# to implement F1 score for validation in a batch
def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

def f1macro(y_true, y_pred):
    f_pos = f1_m(y_true, y_pred)
    # negative version of the data and prediction
    f_neg = f1_m(1-y_true, 1-K.clip(y_pred,0,1))
    return (f_pos + f_neg)/2

def cnnpred_3d(seq_len=60, n_stocks=5, n_features=82, n_filters=(8,8,8), droprate=0.1):
    "3D-CNNpred model according to the paper"
    model = Sequential([
        Input(shape=(n_stocks, seq_len, n_features)),
        Conv2D(n_filters[0], kernel_size=(1,1), activation="relu", data_format="channels_last"),
        Conv2D(n_filters[1], kernel_size=(n_stocks,3), activation="relu"),
        MaxPool2D(pool_size=(1,2)),
        Conv2D(n_filters[2], kernel_size=(1,3), activation="relu"),
        MaxPool2D(pool_size=(1,2)),
        Flatten(),
        Dropout(droprate),
        Dense(1, activation="sigmoid")
    ])
    return model

def datagen(data, seq_len, batch_size, target_index, targetcol, kind):
    "As a generator to produce samples for Keras model"
    # Learn about the data's features and time axis
    input_cols = [c for c in data.columns if c[0] != targetcol]
    tickers = sorted(set(c for _, c in input_cols))
    n_features = len(input_cols) // len(tickers)
    index = data.index[data.index < TRAIN_TEST_CUTOFF]
    split = int(len(index) * TRAIN_VALID_RATIO)
    assert split > seq_len, "Training data too small for sequence length {}".format(seq_len)
    if kind == "train":
        index = index[:split]   # range for the training set
    elif kind == "valid":
        index = index[split:]   # range for the validation set
    else:
        raise NotImplementedError
    # Infinite loop to generate a batch
    batch = []
    while True:
        # Pick one position, then clip a sequence length
        while True:
            t = random.choice(index)
            n = (data.index == t).argmax()
            if n-seq_len+1 < 0:
                continue  # this sample is not enough for one sequence length
            frame = data.iloc[n-seq_len+1:n+1][input_cols]
            # convert frame with two levels of indices into 3D array
            shape = (len(tickers), len(frame), n_features)
            X = np.full(shape, np.nan)
            for i, ticker in enumerate(tickers):
                X[i] = frame.xs(ticker, axis=1, level=1).values
            batch.append([X, data[targetcol][target_index][t]])
            break
        # if we get enough for a batch, dispatch
        if len(batch) == batch_size:
            X, y = zip(*batch)
            yield np.array(X), np.array(y)
            batch = []

def testgen(data, seq_len, target_index, targetcol):
    "Return array of all test samples"
    input_cols = [c for c in data.columns if c[0] != targetcol]
    tickers = sorted(set(c for _, c in input_cols))
    n_features = len(input_cols) // len(tickers)
    t = data.index[data.index >= TRAIN_TEST_CUTOFF][0]
    n = (data.index == t).argmax()
    batch = []
    for i in range(n+1, len(data)+1):
        # Clip a window of seq_len that ends at row position i-1
        frame = data.iloc[i-seq_len:i]
        target = frame[targetcol][target_index][-1]
        frame = frame[input_cols]
        # convert frame with two levels of indices into 3D array
        shape = (len(tickers), len(frame), n_features)
        X = np.full(shape, np.nan)
        for j, ticker in enumerate(tickers):
            X[j] = frame.xs(ticker, axis=1, level=1).values
        batch.append([X, target])
    X, y = zip(*batch)
    return np.array(X), np.array(y)

# Read data into pandas DataFrames
data = {}
for filename in os.listdir(DATADIR):
    if not filename.lower().endswith(".csv"):
        continue  # read only the CSV files
    filepath = os.path.join(DATADIR, filename)
    X = pd.read_csv(filepath, index_col="Date", parse_dates=True)
    # basic preprocessing: get the name, the classification
    # Save the target variable as a column in the dataframe for easier dropna()
    name = X["Name"][0]
    del X["Name"]
    cols = X.columns
    X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
    X.dropna(inplace=True)
    # Fit the standard scaler using the training dataset
    index = X.index[X.index < TRAIN_TEST_CUTOFF]
    index = index[:int(len(index) * TRAIN_VALID_RATIO)]
    scaler = StandardScaler().fit(X.loc[index, cols])
    # Save scale transformed dataframe
    X[cols] = scaler.transform(X[cols])
    data[name] = X

# Transform data into 3D dataframe (multilevel columns)
for key, df in data.items():
    df.columns = pd.MultiIndex.from_product([df.columns, [key]])
data = pd.concat(data.values(), axis=1)

seq_len = 60
batch_size = 128
n_epochs = 20
n_features = 82
n_stocks = 5

# Produce CNNpred as a binary classification problem
model = cnnpred_3d(seq_len, n_stocks, n_features)
model.compile(optimizer="adam", loss="mae", metrics=["acc", f1macro])
model.summary()  # print model structure to console

# Set up callbacks and fit the model
# We use custom validation score f1macro() and hence monitor for "val_f1macro"
checkpoint_path = "./cp3d-{epoch}-{val_f1macro:.2f}.h5"
callbacks = [
    ModelCheckpoint(checkpoint_path,
                    monitor='val_f1macro', mode="max",
                    verbose=0, save_best_only=True, save_weights_only=False, save_freq="epoch")
]
model.fit(datagen(data, seq_len, batch_size, "DJI", "Target", "train"),
          validation_data=datagen(data, seq_len, batch_size, "DJI", "Target", "valid"),
          epochs=n_epochs, steps_per_epoch=400, validation_steps=10,
          verbose=1, callbacks=callbacks)

# Prepare test data
test_data, test_target = testgen(data, seq_len, "DJI", "Target")

# Test the model
test_out = model.predict(test_data)
test_pred = (test_out > 0.5).astype(int)
print("accuracy:", accuracy_score(test_pred, test_target))
print("MAE:", mean_absolute_error(test_pred, test_target))
print("F1:", f1_score(test_pred, test_target))
```
While the model above is for next-step prediction, it does not stop you from making predictions for k steps ahead if you replace the target label with a different calculation. This may be an exercise for you; a possible starting point is sketched below.
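One possible relabeling (a sketch only; the choice of k, the comparison, and the file path are assumptions for illustration) compares the closing index k days ahead against today:

```python
import pandas as pd

k = 5  # predict the market direction k days ahead instead of the next day
# assumed filename from the CNNpred dataset directory
X = pd.read_csv("Dataset/Processed_DJI.csv", index_col="Date", parse_dates=True)
X["Target"] = (X["Close"].shift(-k) > X["Close"]).astype(int)
```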
Does it work?
As with all prediction projects in the financial market, it is always unrealistic to expect high accuracy. The training parameters in the code above can produce slightly better than 50% accuracy on the test set. While the number of epochs and the batch size are deliberately set small to save time, there is not much room for improvement.
In the original paper, it is reported that the 3D-CNNpred performed better than the 2D-CNNpred, but only achieved an F1 score of less than 0.6. This already does better than the three baseline models mentioned in the paper. It may be of some use, but it is not magic that will help you make money quickly.
From the perspective of machine learning technique, here we classify a panel of data as to whether the market direction is up or down the next day. Hence, while the data is not an image, it resembles one, since both are presented in the form of a 2D array. The technique of convolutional layers can therefore be applied, but we may use a different filter size to match the intuition we usually have for financial time series.
Further reading
The original paper is available at:
- "CNNpred: CNN-based stock market prediction using a diverse set of variables", by Ehsan Hoseinzade and Saman Haratizadeh, 2019.
(https://arxiv.org/abs/1810.08923)
If you are new to finance applications and want to build the connection between machine learning techniques and finance, you may find this book useful:
- Machine Learning in Finance: From Theory to Practice, by Matthew F. Dixon, Igor Halperin, and Paul Bilokon, 2020.
(https://www.amazon.com/dp/3030410676/)
On a similar topic, we have a previous post on using CNN for time series, but using 1D convolutional layers:
- How to develop convolutional neural network models for time series forecasting
You may also find the following documentation helpful to explain some syntax we used above:
- Pandas user guide: https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html
- Keras model training API: https://keras.io/api/models/model_training_apis/
- Keras callbacks API: https://keras.io/api/callbacks/
Summary
In this tutorial, you discovered how a CNN model can be built for prediction on financial time series.
Specifically, you learned:
- How to create 2D convolutional layers to process the time series
- How to present the time series data in a multidimensional array so that the convolutional layers can be applied
- What a data generator for Keras model training is and how to use it
- How to monitor the performance of model training with a custom metric
- What to expect in predicting the financial market