
ML Logistic Regression

Logistic Regression is a statistical learning method widely used for classification. Despite the word "regression" in its name, it is an algorithm for binary or multi-class classification. It applies the logistic function (also called the Sigmoid function) to the output of a linear model, mapping it into the interval (0, 1) so that it can be read as the probability of an event occurring.

Typical applications include:

* Spam detection (spam/not spam)
* Disease prediction (diseased/not diseased)
* Customer churn prediction (churn/not churn)

### Logistic Regression Model

For a feature vector $x$, weights $w$, and bias $b$, the model predicts the probability of the positive class as:

$$
\hat{y} = \sigma(w^T x + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
$$

### Loss Function

The loss function for Logistic Regression is the Log Loss function. For $m$ training samples with labels $y^{(i)} \in \{0, 1\}$ it has the following form:

$$
L(w, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log \hat{y}^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - \hat{y}^{(i)}\right) \right]
$$

### Gradient Descent Solution

Like linear regression, Logistic Regression typically uses gradient descent to optimize the loss function and solve for the parameters $w$ and $b$. With learning rate $\alpha$, the gradient update rules are as follows:

$$
w := w - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right) x^{(i)}
$$

$$
b := b - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right)
$$

Update $w$ and $b$ repeatedly until the loss function converges.

* * *

## Implementing Logistic Regression with Python

Next, we will use Python and the Scikit-learn library to implement a simple Logistic Regression model.

### 1. Import Necessary Libraries

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
```

### 2. Load Dataset

We will use the Iris dataset that ships with Scikit-learn. It contains 150 samples, each with 4 features, and the goal is to classify the samples into 3 categories. To simplify the problem, we use only the first two features (sepal length and sepal width) and convert the task into binary classification: class 0 versus the other two classes.
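As a quick sanity check of the dataset description above, the following sketch inspects the shape and class counts before and after the binary conversion. The variable names `X_all`, `y_all`, and `y_binary` are illustrative only and are not used elsewhere in this tutorial.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X_all, y_all = iris.data, iris.target

# 150 samples with 4 features each
print("Feature matrix shape:", X_all.shape)

# Three classes with 50 samples each
print("Samples per original class:", np.bincount(y_all))

# Binary target: class 0 stays 0, classes 1 and 2 both become 1
y_binary = (y_all != 0) * 1
print("Samples per binary class:", np.bincount(y_binary))

# The first two columns are the sepal measurements
print("First two features:", iris.feature_names[:2])
```

Note that the binary target is imbalanced (50 samples in class 0 and 100 in class 1), which is worth keeping in mind when reading accuracy figures.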
```python
# Load dataset
iris = load_iris()
X = iris.data[:, :2]           # Only use the first two features
y = (iris.target != 0) * 1     # Convert target to binary classification

# Split training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
```

### 3. Train Logistic Regression Model

```python
# Create Logistic Regression model
model = LogisticRegression()

# Train model
model.fit(X_train, y_train)
```

### 4. Model Evaluation

```python
# Continues from the previous steps: uses model, X_test, and y_test

# Predict on test set
y_pred = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model accuracy: {accuracy:.2f}")

# Confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(conf_matrix)

# Classification report
class_report = classification_report(y_test, y_pred)
print("Classification Report:")
print(class_report)
```

The output is:

```
Model accuracy: 1.00
Confusion Matrix:
[[19  0]
 [ 0 26]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      1.00      1.00        26

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45
```

### 5. Visualize Decision Boundary

```python
# Continues from the previous steps: uses model, X, and y

# Build a grid that covers the feature space with a margin of 1 on each side
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                     np.arange(y_min, y_max, 0.01))

# Predict the class of every grid point and reshape back to the grid
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Draw the predicted regions and overlay the samples
plt.contourf(xx, yy, Z, alpha=0.8)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o')
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.title('Logistic Regression Decision Boundary')
plt.show()
```

The resulting plot shows the two predicted class regions as filled contours, separated by a straight-line decision boundary, with the Iris samples overlaid (x-axis: sepal length, y-axis: sepal width).

### Summary

* Logistic Regression uses the Sigmoid function to convert the output of a linear model into a probability, which makes it suitable for binary classification problems.
* Training optimizes the model parameters by minimizing the log loss function.
* Gradient descent is a commonly used optimization method for updating the model parameters $w$ and $b$.
* The `scikit-learn` library in Python provides a simple interface for Logistic Regression that makes model training, evaluation, and visualization easy.
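To make the gradient descent point in the summary concrete, here is a minimal NumPy sketch of batch gradient descent on the log loss. It is an illustration under stated assumptions, not how scikit-learn fits the model: `LogisticRegression` uses the L-BFGS solver with L2 regularization by default. The function names `sigmoid`, `log_loss`, and `fit_logistic_gd`, the learning rate, the iteration count, and the synthetic toy data are all illustrative choices.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-12):
    """Mean log loss; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_logistic_gd(X, y, lr=0.1, n_iters=2000):
    """Fit w and b by batch gradient descent (hypothetical helper)."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    losses = []
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)       # predicted probabilities
        error = p - y                # prediction error per sample
        w -= lr * (X.T @ error) / m  # gradient step for the weights
        b -= lr * error.mean()       # gradient step for the bias
        losses.append(log_loss(y, sigmoid(X @ w + b)))
    return w, b, losses

# Synthetic two-class toy data (an assumption, not the Iris data)
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-2.0, scale=1.0, size=(50, 2))
X1 = rng.normal(loc=2.0, scale=1.0, size=(50, 2))
X_toy = np.vstack([X0, X1])
y_toy = np.concatenate([np.zeros(50), np.ones(50)])

w, b, losses = fit_logistic_gd(X_toy, y_toy)
pred = (sigmoid(X_toy @ w + b) >= 0.5).astype(int)
train_accuracy = (pred == y_toy).mean()

print(f"Loss after first step: {losses[0]:.4f}")
print(f"Loss after last step:  {losses[-1]:.4f}")
print(f"Training accuracy:     {train_accuracy:.2f}")
```

The recorded losses should shrink as training proceeds, which is the convergence behavior the tutorial refers to. In practice you would normally rely on `LogisticRegression` rather than a hand-written loop.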