Imagine you are in a candy store. There are three types of candy: chocolate, gummy bears, and lollipops. You want to teach a robot to figure out what kind of candy it is looking at just from its shape and color. This problem is called multiclass classification.
Now, let's explain how we can use something called logistic regression to help the robot decide. Don't worry about the fancy name: it's just a way of making choices based on some numbers!
What is Logistic Regression?
Think of logistic regression as a way of answering yes-or-no questions. For example, if the robot asks:
- Is this candy chocolate? - it gets an answer that's a number between 0 and 1, like 0.8 (which means it's 80% sure it's chocolate).
- If the robot asks about gummy bears, it might get 0.1, which means it's only 10% sure.
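In logistic regression, the number between 0 and 1 comes from the sigmoid (logistic) function, which squashes any score into that range. Here is a minimal sketch; the function name `sigmoid` and the score values are made up for illustration and do not come from a trained model.

```python
import math

def sigmoid(score):
    """Squash any real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical scores for the question "Is this candy chocolate?"
for score in (-2.0, 0.0, 1.4):
    print(f"score={score:+.1f} -> probability={sigmoid(score):.2f}")
```

A large positive score gives a probability close to 1 (a confident "yes"), a large negative score gives a probability close to 0, and a score of 0 gives exactly 0.5.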
How Does Multiclass Logistic Regression Work?
Instead of asking just one question, the robot asks three:
- Is this candy chocolate?
- Is this candy a gummy bear?
- Is this candy a lollipop?
For each question, the robot gets a probability, for example:
- Chocolate - 0.7 (70%)
- Gummy Bears - 0.2 (20%)
- Lollipops - 0.1 (10%)
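Assuming the robot answers with the candy type that has the highest probability (the usual decision rule for a classifier), the choice can be sketched as follows. The names `probabilities` and `best_guess` are made up for this illustration.

```python
# Probabilities from the example above, one per candy type.
probabilities = {"Chocolate": 0.7, "Gummy Bears": 0.2, "Lollipops": 0.1}

# The three candy types cover every possibility, so the values add up to 1
# (up to floating-point rounding).
total = sum(probabilities.values())

# Pick the candy type with the highest probability.
best_guess = max(probabilities, key=probabilities.get)
print(best_guess)  # Chocolate
```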
The Math Behind Multiclass Logistic Regression
Step 1: Features and weights
The robot uses the following parameters to calculate the probabilities:
- \(x\) - Features of the candy (shape, color)
- \(w\) - Weights corresponding to each feature, which determine their importance
- \(b\) - a bias term to adjust the results.
- \(e\) - Euler's number, a mathematical constant often used in probability and exponential calculations
- \(K\) - the total number of candy types (e.g. 3)
- \(z_j\) - the score for candy \(j\)
- \(w_{j,i}\) - weight for feature \(i\) of candy type \(j\).
- \(x_i\) - value of feature \(i\)
- \(b_j\) - bias for candy type \(j\)
- \(P(y=j)\) - the probability of the candy being type \(j\)
- \(z_j\) - the score for candy \(j\).
- \(\sum_{k=1}^K e^{z_k}\) - the sum of exponential scores for all \(K\) candy types, ensuring the probabilities sum to 1.
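The symbols above fit together in two formulas, written here in the standard softmax form that matches these definitions: the score \(z_j = \sum_i w_{j,i} x_i + b_j\) and the probability \(P(y=j) = \frac{e^{z_j}}{\sum_{k=1}^K e^{z_k}}\). The sketch below computes both with NumPy. The feature values, the weights `W`, and the biases `b` are made-up numbers for illustration, not values learned from data.

```python
import numpy as np

# Features of one candy: x = [shape, color] (example values).
x = np.array([0.0, 1.0])

# Made-up weights w[j, i] and biases b[j] for K = 3 candy types.
W = np.array([
    [0.5, 1.0],    # chocolate
    [1.0, -1.0],   # gummy bears
    [-0.5, 0.5],   # lollipops
])
b = np.array([0.1, 0.0, -0.1])

# Score for each candy type: z_j = sum_i w[j, i] * x[i] + b[j]
z = W @ x + b

# Softmax: P(y = j) = exp(z_j) / sum_k exp(z_k)
# Subtracting max(z) avoids overflow and does not change the result.
exp_z = np.exp(z - z.max())
probs = exp_z / exp_z.sum()

print("scores:", z)
print("probabilities:", probs.round(3))
print("predicted type:", int(np.argmax(probs)))
```

Because every exponential is positive and the denominator is their sum, the three probabilities are always positive and add up to 1, and the candy type with the highest score also has the highest probability.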
Example of multiclass logistic regression in Python
In this example, we will train a logistic regression model on a multiclass dataset. The dataset, called the Candy dataset, is also created in this example.
The first step is to import the required libraries. We will need NumPy (to create the dataset), the LogisticRegression class from the sklearn.linear_model module, and the train_test_split function from the sklearn.model_selection module. Finally, we will use the classification_report function from the sklearn.metrics module.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
The second step is to create the candy data. The X variable contains the candy features (input variables), which are shape and color. A shape value of 0 indicates a round candy, while a value of 1 indicates a square candy. The color has three values: 0 indicates a red candy, 1 a brown candy, and 2 a yellow candy. So the first column in the dataset is shape and the second is color.
# Features: shape and color
X = np.array([
    [0, 0],  # Red round candy
    [1, 1],  # Brown square candy
    [0, 2],  # Yellow round candy
    [1, 0],  # Red square candy
    [0, 1],  # Brown round candy
    [1, 2]   # Yellow square candy
])
The labels y (target variable) contain three values: 0 for chocolate, 1 for gummy bears, and 2 for lollipops.
y = np.array([0, 0, 2, 1, 0, 2])
Now that the dataset is defined, you can split it into train and test datasets using the train_test_split function. To divide the dataset (X, y) in a 70:30 ratio, we set the test_size parameter to 0.3. With only six samples, this yields four training samples and two test samples. The data is shuffled before splitting, and we set random_state=42 so that the shuffle gives the same split on every run.
After splitting, the training data (X_train, y_train) will be used to train the LogisticRegression model. The test data (X_test, y_test) will be used to evaluate the trained model.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Train the model
# With the lbfgs solver, scikit-learn fits a multinomial (softmax) model for
# multiclass targets; the explicit multi_class argument is deprecated in recent releases.
model = LogisticRegression(solver='lbfgs')
model.fit(X_train, y_train)
If you execute the code written so far, nothing will be printed. To show some results, we need to test the model. To do that, we use the model's predict() method, which returns the predicted class for each provided input. The output is stored in the variable y_pred. This variable is then passed to the classification_report function alongside the y_test values to measure the performance of the trained model on unseen data.
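y_pred = model.predict(X_test)
# Print results
# labels lists all three classes so the report still works when a class
# is missing from the test set and the predictions.
print("Classification Report:")
print(classification_report(y_test, y_pred, labels=[0, 1, 2], target_names=["Chocolate", "Gummy Bears", "Lollipops"]))
The classification output is given below.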
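Classification Report:
              precision    recall  f1-score   support
   Chocolate       0.00      0.00      0.00       2.0
 Gummy Bears       0.00      0.00      0.00       0.0
   Lollipops       0.00      0.00      0.00       0.0
    accuracy                           0.00       2.0
   macro avg       0.00      0.00      0.00       2.0
weighted avg       0.00      0.00      0.00       2.0
The results are all 0 because the test dataset contains only two samples, both of which belong to class 0 (chocolate), and the model classified neither of them as chocolate. Gummy bears and lollipops have no test samples (support 0), so their scores are reported as 0 as well.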
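To check this explanation, you can print the test labels next to the predictions. The sketch below repeats the setup from this example so that it runs on its own; it uses the default LogisticRegression settings, which is an assumption of this sketch. No expected output is shown because the exact predictions can depend on the scikit-learn version.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same candy data as in the example: [shape, color] -> candy type.
X = np.array([[0, 0], [1, 1], [0, 2], [1, 0], [0, 1], [1, 2]])
y = np.array([0, 0, 2, 1, 0, 2])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Default settings are used here (an assumption of this sketch).
model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Training labels:", y_train)
print("Test labels:    ", y_test)
print("Predictions:    ", y_pred)
```

Comparing the last two printed lines shows directly which test candies were misclassified, and the first line shows how few examples of each class the model saw during training.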
Finally, we will give the robot a new candy to classify. In this case, we define a new candy sample with a round shape and brown color (0, 1).
new_candy = np.array([[0, 1]])  # Brown round candy
prediction = model.predict(new_candy)
print("Predicted Candy:", prediction)
The output of the previous code block is given below.
Predicted Candy: [2]
So the logistic regression model predicted that the brown round candy belongs to class 2 (lollipops). However, the true value should be 0, since the same sample has the label 0 (chocolate) in the initial dataset. A dataset this small most likely does not give the model enough examples to learn reliable patterns.
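To see how confident the model is in such a prediction, scikit-learn's predict_proba method returns one probability per class instead of only the winning label. The sketch below rebuilds the model with default settings (an assumption of this sketch) so that it runs on its own. No output is shown because the exact values can depend on the scikit-learn version.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same candy data and split as in the example.
X = np.array([[0, 0], [1, 1], [0, 2], [1, 0], [0, 1], [1, 2]])
y = np.array([0, 0, 2, 1, 0, 2])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LogisticRegression()
model.fit(X_train, y_train)

new_candy = np.array([[0, 1]])  # Brown round candy
probabilities = model.predict_proba(new_candy)[0]

# model.classes_ lists the labels seen during training, in column order.
for label, p in zip(model.classes_, probabilities):
    print(f"class {label}: {p:.2f}")
```

If the probabilities are close to each other, the model is unsure, which is expected when it has been trained on only four samples.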