Neural Network Models for Combined Classification and Regression
Some prediction problems require predicting both a numeric value and a class label for the same input.
A straightforward approach is to develop both regression and classification predictive models on the same data and use the models sequentially.
An alternative and often simpler approach is to develop a single neural network model that can predict both a numeric value and a class label from the same input. This is called a multi-output model and can be relatively easy to develop and evaluate using modern deep learning libraries such as Keras and TensorFlow.
In this tutorial, you will discover how to develop a neural network for combined regression and classification predictions.
After completing this tutorial, you will know:
- Some prediction problems require predicting both numeric and class label values for each input example.
- How to develop separate regression and classification models for problems that require multiple outputs.
- How to develop and evaluate a neural network model capable of making simultaneous regression and classification predictions.
Let’s get started.

Develop Neural Network for Combined Classification and Regression
Photo by Sang Trinh, some rights reserved.
Tutorial Overview
This tutorial is divided into three parts; they are:
- Single Model for Regression and Classification
- Separate Regression and Classification Models
- Abalone Dataset
- Regression Model
- Classification Model
- Combined Regression and Classification Models
Single Model for Regression and Classification
It is common to develop a deep learning neural network model for a regression or classification problem, but on some predictive modeling tasks we may want to develop a single model that can make both regression and classification predictions.
Regression refers to predictive modeling problems that involve predicting a numeric value given an input.
Classification refers to predictive modeling problems that involve predicting a class label or a probability of class labels for a given input.
For more on the difference between classification and regression, see the tutorial:
- Difference Between Classification and Regression in Machine Learning
There may be some problems where we want to predict both a numerical value and a classification value.
One approach to solving this problem is to develop a separate model for each prediction that is required.
The problem with this approach is that the predictions made by the separate models may diverge.
An alternative approach that can be used with neural network models is to develop a single model capable of making separate predictions for a numeric and a class output from the same input.
This is called a multi-output neural network model.
The benefit of this type of model is that we have a single model to develop and maintain instead of two, and that training and updating the model on both output types at the same time may offer more consistency between the predictions of the two output types.
We will develop a multi-output neural network model capable of making regression and classification predictions at the same time.
First, let’s select a dataset where this requirement makes sense and start by developing separate models for both regression and classification predictions.
Separate Regression and Classification Models
In this section, we will start by selecting a real dataset where we might want regression and classification predictions at the same time, then develop separate models for each type of prediction.
Abalone Dataset
We will use the “abalone” dataset.
Determining the age of an abalone is a time-consuming process, and it is desirable to be able to determine the age from physical details alone.
This is a dataset that describes the physical details of abalone and requires predicting the number of rings of the abalone, which is a proxy for the age of the creature.
You can learn more about the dataset here:
- Abalone Dataset (abalone.csv)
- Abalone Dataset Details (abalone.names)
The “age” can be predicted as either a numerical value (in years) or a class label (ordinal year as a class).
No need to download the dataset, as we will download it automatically as part of the worked examples.
The dataset provides an example of a problem where we may want both a numerical and a classification prediction for an input.
First, let’s develop an example to download and summarize the dataset.
# load and summarize the abalone dataset
from pandas import read_csv
from matplotlib import pyplot
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
# summarize shape
print(dataframe.shape)
# summarize first few lines
print(dataframe.head())
Running the example first downloads the dataset and summarizes its shape.
We can see that there are 4,177 examples (rows) that we can use to train and evaluate a model and 9 features (columns), including the target variable.
We can see that all input variables are numeric except the first, which is a string value.
To keep data preparation simple, we will drop the first column from our models and focus on modeling the numeric input values.
(4177, 9)
   0      1      2      3       4       5       6      7   8
0  M  0.455  0.365  0.095  0.5140  0.2245  0.1010  0.150  15
1  M  0.350  0.265  0.090  0.2255  0.0995  0.0485  0.070   7
2  F  0.530  0.420  0.135  0.6770  0.2565  0.1415  0.210   9
3  M  0.440  0.365  0.125  0.5160  0.2155  0.1140  0.155  10
4  I  0.330  0.255  0.080  0.2050  0.0895  0.0395  0.055   7
We can use this data as the basis for developing separate regression and classification Multilayer Perceptron (MLP) neural network models.
Note: we are not trying to develop an optimal model for this dataset; instead, we are demonstrating a specific technique: developing a model that can make both regression and classification predictions.
Regression Model
In this section, we will develop a regression MLP model for the abalone dataset.
First, we must separate the columns into input and output elements and drop the first column, which contains string values.
We will also force all loaded columns to have a float type (expected by neural network models) and record the number of input features, which will need to be known by the model later.
...
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
Next, we can split the dataset into train and test sets.
We will use a 67% random sample to train the model and the remaining 33% to evaluate it.
...
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
We can then define an MLP neural network model.
The model will have two hidden layers, the first with 20 nodes and the second with 10 nodes, both using ReLU activation and “he normal” weight initialization (a good practice). The number of layers and nodes was chosen arbitrarily.
The output layer will have a single node for predicting a numeric value and a linear activation function.
...
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='linear'))
The model will be trained to minimize the mean squared error (MSE) loss function using the efficient Adam version of stochastic gradient descent.
...
# compile the keras model
model.compile(loss='mse', optimizer='adam')
We will train the model for 150 epochs with a mini-batch size of 32 samples, again chosen arbitrarily.
...
# fit the keras model on the dataset
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=2)
Finally, after the model is trained, we will evaluate it on the holdout test dataset and report the mean absolute error (MAE).
...
# evaluate on test set
yhat = model.predict(X_test)
error = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % error)
Tying this all together, the complete example of an MLP neural network for the abalone dataset framed as a regression problem is listed below.
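The listing simply assembles the snippets above into a single runnable script, adding the imports and data-loading code that they assume; no new logic is introduced.

# regression mlp model for the abalone dataset
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
dataset = dataframe.values
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='linear'))
# compile the keras model
model.compile(loss='mse', optimizer='adam')
# fit the keras model on the dataset
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=2)
# evaluate on test set
yhat = model.predict(X_test)
error = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % error)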
Running the example will prepare the dataset, fit the model, and report an estimate of model error.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
In this case, we can see that the model achieved an error of about 1.5 (rings).
...
Epoch 145/150
88/88 - 0s - loss: 4.6130
Epoch 146/150
88/88 - 0s - loss: 4.6182
Epoch 147/150
88/88 - 0s - loss: 4.6277
Epoch 148/150
88/88 - 0s - loss: 4.6437
Epoch 149/150
88/88 - 0s - loss: 4.6166
Epoch 150/150
88/88 - 0s - loss: 4.6132
MAE: 1.554
So far so good.
Next, let’s look at developing a similar model for classification.
Classification Model
The abalone dataset can be framed as a classification problem where each “ring” integer is taken as a separate class label.
The example and model are much the same as the regression example above, with a few important changes.
This requires first assigning a separate integer to each “ring” value, starting at 0 and ending at the total number of “classes” minus one.
This can be achieved using the LabelEncoder.
We can also record the total number of classes as the total number of unique encoded class values, which will be needed by the model later.
...
# encode strings to integer
y = LabelEncoder().fit_transform(y)
n_class = len(unique(y))
After splitting the data into train and test sets as before, we can define the model, change the number of outputs from the model to equal the number of classes, and use the softmax activation function, common for multi-class classification.
...
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(n_class, activation='softmax'))
Given that we have encoded class labels as integer values, we can fit the model by minimizing the sparse categorical cross-entropy loss function, appropriate for multi-class classification tasks with integer-encoded class labels.
...
# compile the keras model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
After the model is fit on the training dataset as before, we can evaluate its performance by calculating the classification accuracy on the hold-out test set.
...
# evaluate on test set
yhat = model.predict(X_test)
yhat = argmax(yhat, axis=-1).astype('int')
acc = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % acc)
Tying this all together, the complete example of an MLP neural network for the abalone dataset framed as a classification problem is listed below.
# classification mlp model for the abalone dataset
from numpy import unique
from numpy import argmax
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
dataset = dataframe.values
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
# encode strings to integer
y = LabelEncoder().fit_transform(y)
n_class = len(unique(y))
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(n_class, activation='softmax'))
# compile the keras model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
# fit the keras model on the dataset
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=2)
# evaluate on test set
yhat = model.predict(X_test)
yhat = argmax(yhat, axis=-1).astype('int')
acc = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % acc)
Running the example will prepare the dataset, fit the model, and report an estimate of model accuracy.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
In this case, we can see that the model achieved an accuracy of about 27%.
...
Epoch 145/150
88/88 - 0s - loss: 1.9271
Epoch 146/150
88/88 - 0s - loss: 1.9265
Epoch 147/150
88/88 - 0s - loss: 1.9265
Epoch 148/150
88/88 - 0s - loss: 1.9271
Epoch 149/150
88/88 - 0s - loss: 1.9262
Epoch 150/150
88/88 - 0s - loss: 1.9260
Accuracy: 0.274
So far so good.
Next, let’s look at developing a combined model capable of both regression and classification predictions.
Combined Regression and Classification Models
In this section, we will develop a single MLP neural network model that can make both regression and classification predictions for a single input.
This is called a multi-output model and can be developed using the functional Keras API.
For more on the functional API, which can be tricky for beginners, see the tutorials:
- TensorFlow 2 Tutorial: Get Started in Deep Learning With tf.keras
- How to Use the Keras Functional API for Deep Learning
First, the dataset must be prepared.
We can prepare the dataset as we did before for classification, although we should save the encoded target variable under a separate name to distinguish it from the raw target variable values.
...
# encode strings to integer
y_class = LabelEncoder().fit_transform(y)
n_class = len(unique(y_class))
We can then split the input, raw output, and encoded output variables into train and test sets.
...
# split data into train and test sets
X_train, X_test, y_train, y_test, y_train_class, y_test_class = train_test_split(X, y, y_class, test_size=0.33, random_state=1)
Next, we can define the model using the functional API.
The model takes the same number of inputs as the standalone models before and uses two hidden layers configured in the same way.
...
# input
visible = Input(shape=(n_features,))
hidden1 = Dense(20, activation='relu', kernel_initializer='he_normal')(visible)
hidden2 = Dense(10, activation='relu', kernel_initializer='he_normal')(hidden1)
We can then define two separate output layers that connect to the second hidden layer of the model.
The first is a regression output layer that has a single node and a linear activation function.
...
# regression output
out_reg = Dense(1, activation='linear')(hidden2)
The second is a classification output layer that has one node for each class being predicted and uses a softmax activation function.
...
# classification output
out_clas = Dense(n_class, activation='softmax')(hidden2)
We can then define the model with a single input layer and two output layers.
...
# define model
model = Model(inputs=visible, outputs=[out_reg, out_clas])
Given the two output layers, we can compile the model with two loss functions: mean squared error loss for the first (regression) output layer and sparse categorical cross-entropy for the second (classification) output layer.
...
# compile the keras model
model.compile(loss=['mse', 'sparse_categorical_crossentropy'], optimizer='adam')
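By default, Keras simply sums the two losses to form the total loss that is minimized. If one output comes to dominate training, the loss_weights argument to compile() can be used to rebalance them; the weights in the sketch below are illustrative values, not tuned for this dataset.

...
# optionally weight each output's contribution to the total loss
# (the 1.0 and 0.5 weights here are illustrative, not tuned values)
model.compile(loss=['mse', 'sparse_categorical_crossentropy'], loss_weights=[1.0, 0.5], optimizer='adam')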
We can also create a plot of the model for reference.
This requires that pydot and pygraphviz are installed. If this is a problem, you can comment out this line and the import statement for the plot_model() function.
...
# plot graph of model
plot_model(model, to_file='model.png', show_shapes=True)
Each time the model makes a prediction, it will predict two values.
Similarly, when training the model, it will need one target variable per sample for each output.
As such, we can train the model, carefully providing both the regression target and the classification target data for each output of the model.
...
# fit the keras model on the dataset
model.fit(X_train, [y_train, y_train_class], epochs=150, batch_size=32, verbose=2)
The fit model can then make a regression and a classification prediction for each example in the hold-out test set.
...
# make predictions on test set
yhat1, yhat2 = model.predict(X_test)
The first array can be used to evaluate the regression predictions via mean absolute error.
...
# calculate error for regression model
error = mean_absolute_error(y_test, yhat1)
print('MAE: %.3f' % error)
The second array can be used to evaluate the classification predictions via classification accuracy.
...
# evaluate accuracy for classification model
yhat2 = argmax(yhat2, axis=-1).astype('int')
acc = accuracy_score(y_test_class, yhat2)
print('Accuracy: %.3f' % acc)
And that’s it.
Tying this together, the complete example of training and evaluating a multi-output model for combined regression and classification predictions on the abalone dataset is listed below.
# mlp for combined regression and classification predictions on the abalone dataset
from numpy import unique
from numpy import argmax
from pandas import read_csv
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import plot_model
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
dataset = dataframe.values
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
# encode strings to integer
y_class = LabelEncoder().fit_transform(y)
n_class = len(unique(y_class))
# split data into train and test sets
X_train, X_test, y_train, y_test, y_train_class, y_test_class = train_test_split(X, y, y_class, test_size=0.33, random_state=1)
# input
visible = Input(shape=(n_features,))
hidden1 = Dense(20, activation='relu', kernel_initializer='he_normal')(visible)
hidden2 = Dense(10, activation='relu', kernel_initializer='he_normal')(hidden1)
# regression output
out_reg = Dense(1, activation='linear')(hidden2)
# classification output
out_clas = Dense(n_class, activation='softmax')(hidden2)
# define model
model = Model(inputs=visible, outputs=[out_reg, out_clas])
# compile the keras model
model.compile(loss=['mse', 'sparse_categorical_crossentropy'], optimizer='adam')
# plot graph of model
plot_model(model, to_file='model.png', show_shapes=True)
# fit the keras model on the dataset
model.fit(X_train, [y_train, y_train_class], epochs=150, batch_size=32, verbose=2)
# make predictions on test set
yhat1, yhat2 = model.predict(X_test)
# calculate error for regression model
error = mean_absolute_error(y_test, yhat1)
print('MAE: %.3f' % error)
# evaluate accuracy for classification model
yhat2 = argmax(yhat2, axis=-1).astype('int')
acc = accuracy_score(y_test_class, yhat2)
print('Accuracy: %.3f' % acc)
Running the example will prepare the dataset, fit the model, and report an estimate of model error and accuracy.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
A plot of the multi-output model is created, clearly showing the regression (left) and classification (right) output layers connected to the second hidden layer of the model.

Plot of the Multi-Output Model for Combined Regression and Classification Predictions
In this case, we can see that the model achieved both a reasonable error of about 1.495 (rings) and an accuracy similar to before of about 25.6%.
...
Epoch 145/150
88/88 - 0s - loss: 6.5707 - dense_2_loss: 4.5396 - dense_3_loss: 2.0311
Epoch 146/150
88/88 - 0s - loss: 6.5753 - dense_2_loss: 4.5466 - dense_3_loss: 2.0287
Epoch 147/150
88/88 - 0s - loss: 6.5970 - dense_2_loss: 4.5723 - dense_3_loss: 2.0247
Epoch 148/150
88/88 - 0s - loss: 6.5640 - dense_2_loss: 4.5389 - dense_3_loss: 2.0251
Epoch 149/150
88/88 - 0s - loss: 6.6053 - dense_2_loss: 4.5827 - dense_3_loss: 2.0226
Epoch 150/150
88/88 - 0s - loss: 6.5754 - dense_2_loss: 4.5524 - dense_3_loss: 2.0230
MAE: 1.495
Accuracy: 0.256
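As a final sketch of how the fit model might be used, the snippet below makes both predictions for a single new input. It assumes the model from the listing above is still in memory, and the row values are illustrative (taken from the first row of the dataset), not a real new abalone.

...
# example: predict both outputs for one (illustrative) input row
from numpy import asarray
row = asarray([[0.455, 0.365, 0.095, 0.5140, 0.2245, 0.1010, 0.150]])
reg_pred, clas_pred = model.predict(row)
print('Predicted rings (regression): %.3f' % reg_pred[0, 0])
print('Predicted class (encoded ring label): %d' % argmax(clas_pred, axis=-1)[0])

Note that the classification output is the encoded label; the LabelEncoder’s inverse_transform() function can map it back to the original ring value if needed.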
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Tutorials
- Difference Between Classification and Regression in Machine Learning
- TensorFlow 2 Tutorial: Get Started in Deep Learning With tf.keras
- Best Results for Standard Machine Learning Datasets
- How to Use the Keras Functional API for Deep Learning
Summary
In this tutorial, you discovered how to develop a neural network for combined regression and classification predictions.
Specifically, you learned:
- Some prediction problems require predicting both numeric and class label values for each input example.
- How to develop separate regression and classification models for problems that require multiple outputs.
- How to develop and evaluate a neural network model capable of making simultaneous regression and classification predictions.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.