
Building Logistic Regression from Scratch in Python


In this blog post, we'll dive into the logistic regression model, a fundamental algorithm for binary classification. We'll write the logistic function and the optimization routine from scratch in Python, using nothing but NumPy rather than a machine learning library such as scikit-learn. This will give you a deeper understanding of the inner workings of logistic regression.

Logistic Regression Overview


Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome, and the outcome is measured with a dichotomous variable (one with only two possible values). The logistic function is defined as:

σ(z) = 1 / (1 + e^(−z)), where z = β₀ + β₁x₁ + ⋯ + βₙxₙ

It maps any real-valued z to a probability between 0 and 1.
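
To build intuition, here is a quick numeric check (purely illustrative): the function maps z = 0 to exactly 0.5 and saturates toward 0 and 1 as z becomes very negative or very positive.

        import numpy as np

        # Evaluate the logistic curve at a few points to see its S-shape
        for z in (-5.0, 0.0, 5.0):
            print(z, 1 / (1 + np.exp(-z)))  # ~0.0067, 0.5, 0.9933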

Code Implementation

Let's start by implementing the logistic function, followed by the cost function, which we aim to minimize.

    
        import numpy as np

        # Logistic (sigmoid) function: maps a linear combination of the
        # features and coefficients to a probability in (0, 1)
        def logistic_function(x, beta):
            z = np.dot(x, beta[1:]) + beta[0]
            return 1 / (1 + np.exp(-z))

        # Binary cross-entropy cost function, averaged over the m observations
        def cost_function(x, y, beta):
            m = len(y)
            predictions = logistic_function(x, beta)
            total_cost = -(1 / m) * np.sum(
                y * np.log(predictions) + (1 - y) * np.log(1 - predictions))
            return total_cost
    

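As a quick sanity check (with made-up data, purely illustrative): when beta is all zeros, every prediction is 0.5, so the cost reduces to ln 2 ≈ 0.693 regardless of the labels.

        # With beta = 0 every prediction is 0.5, so the cost is ln 2 ≈ 0.6931
        x_check = np.array([[1.0], [2.0], [3.0]])
        y_check = np.array([0, 1, 1])
        print(cost_function(x_check, y_check, np.zeros(2)))  # ~0.6931
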
Now, let's implement the optimization function using Gradient Descent to find the optimal parameters β. On each iteration, every parameter is nudged in the direction that lowers the cost: βⱼ ← βⱼ − (α/m) Σᵢ (ŷᵢ − yᵢ) xᵢⱼ, where ŷᵢ is the predicted probability for observation i, α is the learning rate, and xᵢ₀ is simply 1 for the intercept β₀.

    
        # Gradient Descent Function to minimize the cost function
        def gradient_descent(x, y, beta, learning_rate, iterations):
            m = len(y)
            cost_history = np.zeros(iterations)

            for i in range(iterations):
                # Compute the prediction errors once per iteration so the
                # intercept and the coefficients are updated simultaneously
                errors = logistic_function(x, beta) - y
                beta[0] = beta[0] - (learning_rate / m) * np.sum(errors)
                beta[1:] = beta[1:] - (learning_rate / m) * np.dot(x.T, errors)
                cost_history[i] = cost_function(x, y, beta)

            return beta, cost_history
    
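
Before moving on, here is a minimal smoke test on a tiny, made-up, linearly separable dataset (it assumes the three functions above are defined): the recorded cost should fall from the first iteration to the last.

        # Tiny separable dataset: one feature, class 1 for the larger values
        x_demo = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])
        y_demo = np.array([0, 0, 0, 1, 1, 1])

        beta_demo, costs = gradient_descent(x_demo, y_demo, np.zeros(2), 0.1, 2000)
        print(costs[0], costs[-1])  # the cost should shrink across iterations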

Prediction Function

    
        # Prediction Function
        def predict(x, beta):
            '''
            Returns the probability that each observation belongs to class 1
            '''
            return logistic_function(x, beta)
        
        # Threshold Function
        def classify(predictions, threshold=0.5):
            '''
            Classifies the predictions into class 0 or 1 based on a specified threshold
            '''
            classes = np.zeros_like(predictions)
            classes[predictions >= threshold] = 1
            return classes
    

In the predict function, we use the logistic function with the optimized coefficients to compute the probabilities of belonging to class 1 for new data. In the classify function, we apply a threshold (default is 0.5) to these probabilities to obtain binary class predictions. If the probability is greater than or equal to 0.5, the function classifies the observation as class 1; otherwise, it classifies the observation as class 0.
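
A short usage example with illustrative probability values shows how the threshold changes the hard labels:

        probs = np.array([0.2, 0.5, 0.9])
        print(classify(probs))       # [0. 1. 1.] with the default 0.5 threshold
        print(classify(probs, 0.8))  # [0. 0. 1.] with a stricter threshold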

Let's put it all together and run our logistic regression on some data.

        
        # Training data (x and y)
        # Assume there are 20 data points
        x = np.array([[2], [3], [10], [19], [23], [10], [18], [22], [7], [5],
                      [24], [29], [30], [34], [35], [28], [33], [40], [42], [45]])
        y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])  # 0s and 1s indicating the two classes

        new_x = np.array([[15], [25], [8], [36], [48]])

        # No column of ones is needed: logistic_function adds the intercept
        # beta[0] separately, so x stays a plain feature matrix

        # Initial coefficients (one intercept plus one coefficient per feature)
        beta = np.zeros(x.shape[1] + 1)

        # Set learning rate and number of iterations
        learning_rate = 0.01
        iterations = 1000

        # Define the logistic function, cost function, and gradient descent function as before

        # Run Gradient Descent
        optimized_beta, cost_history = gradient_descent(x, y, beta, learning_rate, iterations)

        # Get probabilities
        probabilities = predict(new_x, optimized_beta)

        # Get binary class predictions
        binary_predictions = classify(probabilities)

        # Output the binary predictions
        print(binary_predictions)

x is a 20x1 matrix, where each row holds the single feature value for one data point. The intercept is handled inside logistic_function through beta[0], so no column of ones is added; instead, beta simply has one extra entry for the intercept.

y is a vector of length 20, where each entry is the class label (0 or 1) for the corresponding data point in x.

new_x is a 5x1 matrix representing new data points we want to make predictions on, formatted like x.

After running the logistic regression training, making predictions, and classifying the new data points, the binary_predictions vector will contain the predicted class labels for the new_x data points.
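
If you want more than the hard 0/1 labels, you can also inspect the fitted coefficients, the raw probabilities, and the final training cost (all of these variables come from the script above):

        print(optimized_beta)              # [intercept, coefficient]
        print(np.round(probabilities, 3))  # probability of class 1 for each new point
        print(cost_history[-1])            # final value of the training cost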

Conclusion

In this blog post, we implemented a logistic regression model from scratch in Python. We defined the logistic function and the cost function, and used gradient descent to optimize the model parameters. This exercise provides a clear understanding of the logistic regression algorithm, which is the foundation for more complex machine learning models.