Sklearn Model Evaluation

Model evaluation and tuning are key steps in making sure a model generalizes well and predicts accurately. With appropriate evaluation metrics and tuning methods, we can improve model performance and reduce the risk of overfitting or underfitting. This chapter covers cross-validation, grid search, random search, and model evaluation metrics.

* * *

## 1. Cross-Validation

### Introducing the Concept of Cross-Validation

Cross-validation is a technique for evaluating model performance. It divides the dataset into multiple subsets (folds) and trains and tests the model several times, which gives more stable and reliable results than a single train/test split. Cross-validation helps detect overfitting and gives a more accurate estimate of the model's generalization ability.

Common cross-validation methods include:

* **K-fold Cross-Validation**: Divide the data into K folds. Each time, use one fold as the test set and the other K-1 folds as the training set. Repeat K times and average the K results.
* **Leave-One-Out Cross-Validation (LOOCV)**: Hold out a single data point as the test set each time and train on all remaining data. This requires one model fit per sample, so it is very time-consuming, but it can be useful for small datasets.
* **Stratified K-fold Cross-Validation**: A K-fold variant that keeps the class distribution in each fold similar to that of the entire dataset, which makes it suitable for imbalanced classes.

scikit-learn provides cross-validation helpers such as `cross_val_score` and `cross_val_predict` that let us perform cross-validation efficiently.

### Using cross_val_score to Perform K-fold Cross-Validation

The `cross_val_score` function performs K-fold cross-validation and returns the score for each fold, which helps us evaluate the model's stability and performance.
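To make the fold mechanics concrete, here is a minimal sketch of a manual stratified 5-fold loop. It is an illustration only, not scikit-learn's internal implementation; the choice of `LogisticRegression` and the variable names (`splitter`, `fold_scores`) are assumptions made for this example.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

X, y = load_iris(return_X_y=True)

# Illustrative model choice (an assumption for this sketch)
model = LogisticRegression(max_iter=1000)

# Stratified 5-fold split: each fold keeps roughly the same class proportions
splitter = StratifiedKFold(n_splits=5)

fold_scores = []
for train_idx, test_idx in splitter.split(X, y):
    fold_model = clone(model)  # fresh, unfitted copy for every fold
    fold_model.fit(X[train_idx], y[train_idx])
    # score() returns accuracy for classifiers
    fold_scores.append(fold_model.score(X[test_idx], y[test_idx]))

fold_scores = np.array(fold_scores)
print(f"Per-fold accuracy: {fold_scores}")
print(f"Mean accuracy: {fold_scores.mean():.3f}")
```

Each sample is used for testing exactly once and for training K-1 times, which is why the averaged score is less sensitive to one lucky or unlucky split.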
Using `cross_val_score` to perform K-fold cross-validation:

```python
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

# Load data
data = load_iris()
X, y = data.data, data.target

# Create model
model = RandomForestClassifier()

# Perform K-fold cross-validation
scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
print(f"Cross-validation scores: {scores}")
print(f"Mean accuracy: {scores.mean()}")
```

* **`cv=5`**: Performs 5-fold cross-validation. For a classifier, an integer `cv` uses stratified folds.
* **`scores`**: An array holding the score for each fold. The mean of these scores summarizes the model's performance.

The output looks like the following (exact values can vary between runs because the random forest is not seeded):

```
Cross-validation scores: [0.96666667 0.96666667 0.93333333 0.96666667 1.        ]
Mean accuracy: 0.9666666666666668
```

* * *

## 2. Grid Search and Random Search

### Using GridSearchCV for Hyperparameter Tuning

`GridSearchCV` finds the best hyperparameters by exhaustively searching every combination of candidate values. You supply the candidate values for each parameter; `GridSearchCV` evaluates each combination with cross-validation and selects the best one.

**Common parameters of GridSearchCV:**

* **`param_grid`**: The hyperparameter grid to search, usually a dictionary whose keys are parameter names and whose values are lists of candidate values.
* **`cv`**: The number of cross-validation folds, usually set to 5 or 10.
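To see what "exhaustive" means in practice, the following sketch enumerates a grid with `ParameterGrid` from `sklearn.model_selection`. The grid values and the `n_folds` variable are illustrative assumptions; the point is only that the number of model fits grows as the product of the list lengths times the number of folds.

```python
from sklearn.model_selection import ParameterGrid

# Illustrative grid (values are assumptions for this sketch)
param_grid = {'kernel': ['linear', 'rbf'], 'C': [1, 10, 100]}

# ParameterGrid expands the dictionary into every combination
combinations = list(ParameterGrid(param_grid))

n_folds = 5  # hypothetical cv setting
print(f"Candidate combinations: {len(combinations)}")
print(f"Cross-validation fits with cv={n_folds}: {len(combinations) * n_folds}")

for params in combinations:
    print(params)
```

With 2 kernels and 3 values of `C`, there are 6 combinations and 30 cross-validation fits. Adding one more parameter with 4 candidate values would already raise this to 120 fits, which is why large grids become expensive.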
An example of using `GridSearchCV`:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris

# Load data
data = load_iris()
X, y = data.data, data.target

# Create model
model = SVC()

# Define hyperparameter grid
param_grid = {'kernel': ['linear', 'rbf'], 'C': [1, 10, 100]}

# Perform grid search
grid_search = GridSearchCV(model, param_grid, cv=5)
grid_search.fit(X, y)

# Output best parameters and best score
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best score: {grid_search.best_score_}")
```

* **`grid_search.best_params_`**: The best hyperparameter combination found by the grid search.
* **`grid_search.best_score_`**: The mean cross-validation score of the best parameter combination.

The output is as follows:

```
Best parameters: {'C': 1, 'kernel': 'linear'}
Best score: 0.9800000000000001
```

### Using RandomizedSearchCV to Speed Up the Tuning Process

`RandomizedSearchCV` is a cheaper alternative to an exhaustive grid search. Instead of trying every combination, it randomly samples a fixed number of combinations from the hyperparameter space and evaluates only those. This makes it suitable when the hyperparameter space is large, because the computation time is controlled by the number of samples rather than by the size of the space.

**Common parameters of RandomizedSearchCV:**

* **`param_distributions`**: The hyperparameter distributions to sample from, usually a dictionary. Values can be distribution objects (such as those in `scipy.stats`) or lists of discrete values.
* **`n_iter`**: The number of iterations for the random search, i.e., the number of hyperparameter combinations that are sampled and evaluated.
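The sampling step can be looked at in isolation with `ParameterSampler` from `sklearn.model_selection`. This is a minimal sketch; the distributions, `n_iter=5`, and `random_state=0` are illustrative assumptions chosen so the draw is repeatable.

```python
from scipy.stats import uniform
from sklearn.model_selection import ParameterSampler

# Illustrative search space (an assumption for this sketch):
# uniform(loc, scale) samples from [loc, loc + scale], here [0, 10]
param_distributions = {'C': uniform(0, 10), 'kernel': ['linear', 'rbf']}

# Draw n_iter random combinations; random_state makes the draw repeatable
samples = list(ParameterSampler(param_distributions, n_iter=5, random_state=0))

for params in samples:
    print(params)
```

Continuous parameters are drawn from their distribution and list-valued parameters are drawn uniformly from the list, so the cost of the search is fixed by `n_iter` no matter how large the space is.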
An example of using `RandomizedSearchCV`:

```python
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from scipy.stats import uniform

# Load data
data = load_iris()
X, y = data.data, data.target

# Create model
model = SVC()

# Define hyperparameter distribution
# uniform(loc, scale) samples from [loc, loc + scale], here [0, 10]
param_distributions = {'C': uniform(0, 10), 'kernel': ['linear', 'rbf']}

# Perform random search
random_search = RandomizedSearchCV(model, param_distributions, n_iter=10, cv=5)
random_search.fit(X, y)

# Output best parameters and best score
print(f"Best parameters: {random_search.best_params_}")
print(f"Best score: {random_search.best_score_}")
```

* **`random_search.best_params_`**: The best hyperparameter combination found by the random search.
* **`random_search.best_score_`**: The mean cross-validation score of the best parameter combination.

The output looks like the following (the sampled values differ between runs unless `random_state` is set):

```
Best parameters: {'C': 8.355688344706016, 'kernel': 'rbf'}
Best score: 0.9866666666666667
```

* * *

## 3. Model Evaluation

### Using classification_report, confusion_matrix, roc_auc_score

For classification models, we usually use metrics such as accuracy, precision, recall, and F1 score to evaluate model performance. scikit-learn provides many evaluation tools that help us understand the model's performance in depth.

`classification_report` - Reports precision, recall, F1 score, and support (the number of samples in each class) for every class.

Example:

```python
from sklearn.metrics import classification_report

# Assume y_test holds the true labels and y_pred the model's predictions
print(classification_report(y_test, y_pred))
```

`confusion_matrix` - The confusion matrix shows how a classification model performs on each class by counting, for every true class, how many samples were predicted as each class. In the binary case it shows how many positive samples were predicted as negative and vice versa.
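How to read the matrix can be illustrated with a small hand-made binary example. The labels below are made up for illustration and do not come from any model; `y_true` and `y_hat` are hypothetical names. In scikit-learn's layout, rows are true classes and columns are predicted classes.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration only (1 = positive, 0 = negative)
y_true = [0, 0, 1, 1, 1, 0]
y_hat = [0, 1, 1, 1, 0, 0]

cm = confusion_matrix(y_true, y_hat)
print(cm)
# [[2 1]
#  [1 2]]

# For binary labels, ravel() yields the four counts in this order
tn, fp, fn, tp = cm.ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")
```

Here one negative sample was predicted as positive (a false positive) and one positive sample was predicted as negative (a false negative), while the diagonal holds the correctly classified samples.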
Example:

```python
from sklearn.metrics import confusion_matrix

# Assume y_test holds the true labels and y_pred the model's predictions
print(confusion_matrix(y_test, y_pred))
```

`roc_auc_score` - ROC AUC (Area Under the Receiver Operating Characteristic Curve) measures how well a classifier ranks positive samples above negative ones across all decision thresholds. It is often used on imbalanced datasets because it does not depend on a single threshold. The higher the AUC, the better the model: 0.5 corresponds to random guessing and 1.0 to perfect ranking.

Example:

```python
from sklearn.metrics import roc_auc_score

# Assume y_test holds the true labels and y_pred_proba the predicted
# probability of the positive class (binary case),
# e.g. model.predict_proba(X_test)[:, 1]
print(f"ROC AUC Score: {roc_auc_score(y_test, y_pred_proba)}")
```

For multiclass problems, `roc_auc_score` needs the full probability matrix and an explicit `multi_class` setting such as `'ovr'`.

### Regression Model Evaluation: mean_squared_error, r2_score

For regression problems, common evaluation metrics include Mean Squared Error (MSE) and R-squared (R²).

`mean_squared_error` - Mean Squared Error is a common evaluation metric for regression models. It is the mean of the squared differences between predicted and true values, so lower values indicate a better fit.

Example:

```python
from sklearn.metrics import mean_squared_error

# Assume y_test holds the true values and y_pred the predicted values
print(f"Mean Squared Error: {mean_squared_error(y_test, y_pred)}")
```

`r2_score` - The coefficient
