PyTorch torch.nn.CrossEntropyLoss
[PyTorch torch.nn Reference Manual](#)

* * *

`torch.nn.CrossEntropyLoss` is a loss function in PyTorch used for multi-class classification.

It combines `nn.LogSoftmax` and `nn.NLLLoss`, and is commonly used in tasks such as image classification and text classification.

### Function Definition

```python
torch.nn.CrossEntropyLoss(weight=None, ignore_index=-100, reduction='mean', label_smoothing=0.0)
```

**Parameter Description:**

* `weight` (Tensor): A 1-D tensor with one weight per class, used to handle class imbalance.
* `ignore_index` (int): Target value that is excluded from the loss calculation. Default is -100.
* `reduction` (str): Loss aggregation method. Options are `'mean'`, `'sum'`, `'none'`. Default is `'mean'`.
* `label_smoothing` (float): Label smoothing parameter, value between 0 and 1. Default is 0.

### Mathematical Principle

Cross-entropy loss for a single sample, where $x_i$ is the logit for class $i$ and $y$ is the index of the true class:

$$
\text{Loss} = -\log\left(\frac{\exp(x_y)}{\sum_i \exp(x_i)}\right)
$$

That is, the larger the predicted probability of the correct class, the smaller the loss.

* * *

## Usage Examples

### Example 1: Basic Usage

Create and use cross-entropy loss:

```python
import torch
import torch.nn as nn

# Create the loss function
criterion = nn.CrossEntropyLoss()

# Model output logits (unnormalized)
# Shape: (batch_size, num_classes)
outputs = torch.randn(4, 10)

# True labels (class indices)
labels = torch.tensor([2, 5, 1, 7])

# Calculate the loss
loss = criterion(outputs, labels)

print("Model Output (logits):", outputs.tolist())
# labels holds several elements, so use tolist() rather than item()
print("True Labels:", labels.tolist())
print("Cross Entropy Loss:", loss.item())
```

### Example 2: Class Weights

Handling class imbalance:

```python
import torch
import torch.nn as nn

# Class weights: give higher weights to minority classes
# Assume 10 classes, where class 3 and class 7 are more important
weight = torch.ones(10)
weight[3] = 2.0
weight[7] = 2.0

criterion_weighted = nn.CrossEntropyLoss(weight=weight)

outputs = torch.randn(4, 10)
labels = torch.tensor([2, 3, 7, 5])

loss = criterion_weighted(outputs, labels)

print("Weighted Cross Entropy Loss:", loss.item())
```

### Example 3: Label Smoothing

Using label smoothing to reduce overfitting:

```python
import torch
import torch.nn as nn

# Label smoothing: 0.1 moves 10% of the target probability mass
# to a uniform distribution over all classes
criterion_smooth = nn.CrossEntropyLoss(label_smoothing=0.1)

outputs = torch.randn(4, 10)
labels = torch.tensor([2, 5, 1, 7])

loss = criterion_smooth(outputs, labels)
print("Loss with label smoothing:", loss.item())

# Comparison: without label smoothing
criterion = nn.CrossEntropyLoss()
loss_no_smooth = criterion(outputs, labels)
print("Loss without label smoothing:", loss_no_smooth.item())
```

### Example 4: Complete Classification Training Process

Complete model training example:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Simple classification model
class Classifier(nn.Module):
    def __init__(self, input_dim=784, num_classes=10):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes)  # outputs raw logits, no softmax
        )

    def forward(self, x):
        return self.fc(x)

# Initialize model, loss, and optimizer
model = Classifier()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Simulate training data
batch_size = 32
x = torch.randn(batch_size, 784)           # inputs
y = torch.randint(0, 10, (batch_size,))    # labels

# Forward pass
model.train()
outputs = model(x)
loss = criterion(outputs, y)

# Backpropagation
optimizer.zero_grad()
loss.backward()
optimizer.step()

print("Batch Loss:", loss.item())

# Prediction
model.eval()
with torch.no_grad():
    outputs = model(x)
    predictions = outputs.argmax(dim=1)
    accuracy = (predictions == y).float().mean()

print("Prediction Accuracy:", accuracy.item())
```

### Example 5: Using ignore_index

Ignoring specific labels:

```python
import torch
import torch.nn as nn

# Samples whose label is -100 are ignored
criterion = nn.CrossEntropyLoss(ignore_index=-100)

outputs = torch.randn(5, 10)

# Some samples carry the label -100, which marks them as ignored
labels = torch.tensor([2, -100, 5, -100, 7])

loss = criterion(outputs, labels)

print("Loss after ignoring special labels:", loss.item())
```

### Example 6: Different Reduction Methods

Controlling how the loss is aggregated:

```python
import torch
import torch.nn as nn

outputs = torch.randn(4, 10)
labels = torch.tensor([2, 5, 1, 7])

# mean: return the average loss
loss_mean = nn.CrossEntropyLoss(reduction='mean')(outputs, labels)
print("mean:", loss_mean.item())

# sum: return the summed loss
loss_sum = nn.CrossEntropyLoss(reduction='sum')(outputs, labels)
print("sum:", loss_sum.item())

# none: return the loss for each sample
loss_none = nn.CrossEntropyLoss(reduction='none')(outputs, labels)
print("none:", loss_none.tolist())
```

* * *

## Common Questions

### Q1: What is the difference between CrossEntropyLoss and NLLLoss?

CrossEntropyLoss = LogSoftmax + NLLLoss. `NLLLoss` expects log-probabilities as input, while `CrossEntropyLoss` applies log-softmax internally, so you don't need to add it manually.

### Q2: Why shouldn't the model apply softmax to its outputs?

CrossEntropyLoss computes log-softmax internally. Passing raw logits avoids applying softmax twice and improves numerical stability.

### Q3: What scenarios are suitable for label smoothing?

Label smoothing is suitable for tasks with a large number of classes, where it can improve the model's generalization ability.

* * *

## Usage Scenarios

The main application scenarios for `nn.CrossEntropyLoss` include:

* **Image Classification**: Such as CIFAR-10, ImageNet
* **Text Classification**: Sentiment analysis, topic classification
* **Multi-class Tasks**: Any classification task with more than 2 classes

> Note: Labels are normally passed as class indices (0 to num_classes-1) rather than one-hot vectors.

* * *
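The relationship from Q1 (CrossEntropyLoss = LogSoftmax + NLLLoss) and the formula from the Mathematical Principle section can be checked numerically. The snippet below is a minimal sketch, not part of the original examples: it assumes unweighted classes and the default `'mean'` reduction, and the variable names (`logits`, `manual`, `eps`, and so on) are chosen here for illustration. It also writes out the label-smoothing target as a mix of the one-hot label and a uniform distribution over all classes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so repeated runs compare the same values

logits = torch.randn(4, 10)            # (batch_size, num_classes)
labels = torch.tensor([2, 5, 1, 7])    # class indices

# 1) Built-in loss
ce = nn.CrossEntropyLoss()(logits, labels)

# 2) LogSoftmax followed by NLLLoss
log_probs = F.log_softmax(logits, dim=1)
nll = nn.NLLLoss()(log_probs, labels)

# 3) Formula by hand: -log(exp(x[y]) / sum_i exp(x[i])) = logsumexp(x) - x[y]
true_logit = logits[torch.arange(len(labels)), labels]
manual = (torch.logsumexp(logits, dim=1) - true_logit).mean()

# 4) Label smoothing by hand (assumes no class weights):
#    target = (1 - eps) * one_hot + eps / num_classes
eps = 0.1
per_sample_nll = -log_probs[torch.arange(len(labels)), labels]
uniform_term = -log_probs.mean(dim=1)  # average of -log p over all classes
manual_smooth = ((1 - eps) * per_sample_nll + eps * uniform_term).mean()
ce_smooth = nn.CrossEntropyLoss(label_smoothing=eps)(logits, labels)

print("CrossEntropyLoss:     ", ce.item())
print("LogSoftmax + NLLLoss: ", nll.item())
print("Manual formula:       ", manual.item())
print("Smoothed (built-in):  ", ce_smooth.item())
print("Smoothed (manual):    ", manual_smooth.item())
```

The first three printed values should agree up to floating-point rounding, as should the last two. Using `torch.logsumexp` rather than exponentiating and summing directly follows the same stability reasoning given in Q2.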