PyTorch First Neural Network
This chapter shows how to implement a simple feedforward neural network in PyTorch and train it on a binary classification task.
The network consists of an input layer, one hidden layer with a ReLU activation, and an output layer with a Sigmoid activation.
It is trained with a mean squared error loss function and a stochastic gradient descent optimizer.
Each training iteration adjusts the model parameters in four steps: forward propagation, loss calculation, backward propagation, and parameter update.
Example
# Import PyTorch libraries
import torch
import torch.nn as nn
# Define input layer size, hidden layer size, output layer size and batch size
n_in, n_h, n_out, batch_size = 10, 5, 1, 10
# Create virtual input data and target data
x = torch.randn(batch_size, n_in) # Randomly generate input data
y = torch.tensor([[1.0], [0.0], [0.0],
[1.0], [1.0], [1.0], [0.0], [0.0], [1.0], [1.0]]) # Target output data
# Create sequential model containing linear layer, ReLU activation function and Sigmoid activation function
model = nn.Sequential(
    nn.Linear(n_in, n_h),   # Linear transformation from input layer to hidden layer
    nn.ReLU(),              # ReLU activation function of hidden layer
    nn.Linear(n_h, n_out),  # Linear transformation from hidden layer to output layer
    nn.Sigmoid()            # Sigmoid activation function of output layer
)
# Define mean square error loss function and stochastic gradient descent optimizer
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # Learning rate is 0.01
# Execute gradient descent algorithm for model training
for epoch in range(50):  # Iterate 50 times
    y_pred = model(x)  # Forward propagation, calculate prediction value
    loss = criterion(y_pred, y)  # Calculate loss
    print('epoch: ', epoch, 'loss: ', loss.item())  # Print loss value
    optimizer.zero_grad()  # Clear gradients
    loss.backward()  # Backward propagation, calculate gradients
    optimizer.step()  # Update model parameters
The output results are similar to the following:
epoch: 0 loss: 0.2591968774795532
epoch: 1 loss: 0.25902628898620605
epoch: 2 loss: 0.25885599851608276
epoch: 3 loss: 0.25868603587150574
epoch: 4 loss: 0.25851646065711975
...
Define network parameters:
n_in, n_h, n_out, batch_size = 10, 5, 1, 10
n_in: The input layer size is 10, meaning each data point has 10 features.
n_h: The hidden layer size is 5, meaning the hidden layer contains 5 neurons.
n_out: The output layer size is 1, meaning the network outputs a single value between 0 and 1 that represents the binary classification result.
batch_size: Each batch contains 10 samples.
Generate input data and target data:
x = torch.randn(batch_size, n_in) # Randomly generate input data
y = torch.tensor([[1.0], [0.0], [0.0], [1.0], [1.0], [1.0], [0.0], [0.0], [1.0], [1.0]]) # Target output data
x: A randomly generated input matrix with shape (10, 10), representing 10 samples with 10 features each.
y: The target output data (labels), a 10x1 tensor holding the class label (0 or 1) of each input sample.
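Because torch.randn draws new random numbers on every run, the loss values you see will differ from run to run. The following is a minimal sketch of making the input data reproducible, assuming a fixed seed (the value 0 is an arbitrary choice and is not part of the original example):

```python
import torch

# Assumption: the seed value 0 is arbitrary; any fixed integer works.
torch.manual_seed(0)
x_first = torch.randn(10, 10)

# Resetting the seed restarts the random number generator,
# so the same "random" data is generated again.
torch.manual_seed(0)
x_second = torch.randn(10, 10)

same_data = torch.equal(x_first, x_second)
print(same_data)  # True
```

If the seed is set before the model is created, the randomly initialized weights of the nn.Linear layers are fixed as well.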
Define neural network model:
model = nn.Sequential(
    nn.Linear(n_in, n_h),   # Linear transformation from input layer to hidden layer
    nn.ReLU(),              # ReLU activation function of hidden layer
    nn.Linear(n_h, n_out),  # Linear transformation from hidden layer to output layer
    nn.Sigmoid()            # Sigmoid activation function of output layer
)
nn.Sequential is used to define network layers in sequence.
nn.Linear(n_in, n_h): Defines the linear transformation from the input layer to the hidden layer; there are 10 input features and 5 hidden neurons.
nn.ReLU(): Adds a ReLU activation function after the hidden layer, introducing non-linearity.
nn.Linear(n_h, n_out): Defines the linear transformation from the hidden layer to the output layer, which has 1 neuron.
nn.Sigmoid(): Applies a Sigmoid activation function at the output layer, mapping the result to a value between 0 and 1, as needed for binary classification.
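To get a feel for the size of this model, the number of trainable parameters can be worked out by hand. Each nn.Linear layer holds a weight matrix plus, by default, one bias per output, while ReLU and Sigmoid have no parameters. The helper name linear_param_count below is hypothetical and used only for illustration:

```python
def linear_param_count(n_inputs, n_outputs):
    """Parameters of one fully connected layer: weights plus biases."""
    return n_inputs * n_outputs + n_outputs

n_in, n_h, n_out = 10, 5, 1
hidden_params = linear_param_count(n_in, n_h)   # 10*5 + 5 = 55
output_params = linear_param_count(n_h, n_out)  # 5*1 + 1 = 6
total_params = hidden_params + output_params
print(total_params)  # 61
```

The same total should be reported by sum(p.numel() for p in model.parameters()) on the model defined above.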
Define loss function and optimizer:
criterion = torch.nn.MSELoss() # Use mean square error loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # Use stochastic gradient descent optimizer, learning rate is 0.01
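The two pieces above can be illustrated without PyTorch. This is a minimal sketch in plain Python of what the mean squared error measures and of the basic gradient descent update rule (without momentum); the function names and sample numbers are assumptions chosen for illustration. It also suggests why the printed loss starts near 0.25: an untrained network with a Sigmoid output tends to predict values close to 0.5.

```python
def mean_squared_error(predictions, targets):
    """Average of squared differences, as nn.MSELoss computes by default."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)

def sgd_step(weight, gradient, learning_rate=0.01):
    """Plain gradient descent update: move against the gradient."""
    return weight - learning_rate * gradient

targets = [1.0, 0.0, 0.0, 1.0]

# Predictions of 0.5 for every sample give a loss of exactly 0.25.
uncertain = mean_squared_error([0.5, 0.5, 0.5, 0.5], targets)
print(uncertain)  # 0.25

# Predictions closer to the targets give a smaller loss.
better = mean_squared_error([0.75, 0.25, 0.25, 0.75], targets)
print(better)  # 0.0625

# One update of a hypothetical weight 1.0 with gradient 2.0.
updated_weight = sgd_step(1.0, 2.0)
print(updated_weight)
```

With a learning rate of 0.01 each update is small, which is consistent with the loss decreasing only slightly per epoch in the output shown earlier.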
Training loop:
for epoch in range(50):  # Train for 50 rounds
    y_pred = model(x)  # Forward propagation, calculate prediction value
    loss = criterion(y_pred, y)  # Calculate loss
    print('epoch: ', epoch, 'loss: ', loss.item())  # Print loss value
    optimizer.zero_grad()  # Clear gradients
    loss.backward()  # Backward propagation, calculate gradients
    optimizer.step()  # Update model parameters
for epoch in range(50): Perform 50 training iterations.
y_pred = model(x): Perform forward propagation, using the current model parameters to calculate predictions for the input data x.
loss = criterion(y_pred, y): Calculate the loss between the predictions and the target values y.
optimizer.zero_grad(): Clear the gradients left over from the previous iteration.
loss.backward(): Perform backward propagation, calculating the gradients of the loss with respect to the model parameters.
optimizer.step(): Update the model parameters according to the calculated gradients.
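The reason for clearing gradients is that PyTorch adds each newly computed gradient to the one already stored on a parameter. The sketch below shows this on a single parameter; the "loss" 3 * w is a hypothetical stand-in chosen so that the gradient is always 3.

```python
import torch

# A single trainable parameter; the hypothetical "loss" is simply 3 * w,
# so its gradient with respect to w is always 3.
w = torch.tensor(2.0, requires_grad=True)

(3 * w).backward()
first_grad = w.grad.item()        # 3.0

# Without clearing, a second backward pass adds to the stored gradient.
(3 * w).backward()
accumulated_grad = w.grad.item()  # 6.0

# Clearing the gradient restores the expected value on the next pass.
w.grad.zero_()
(3 * w).backward()
fresh_grad = w.grad.item()        # 3.0

print(first_grad, accumulated_grad, fresh_grad)  # 3.0 6.0 3.0
```

In the training loop, optimizer.zero_grad() plays the role of w.grad.zero_() for every parameter the optimizer manages.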
Visualization code:
Example
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
# Define input layer size, hidden layer size, output layer size and batch size
n_in, n_h, n_out, batch_size = 10, 5, 1, 10
# Create virtual input data and target data
x = torch.randn(batch_size, n_in) # Randomly generate input data
y = torch.tensor([[1.0], [0.0], [0.0],
[1.0], [1.0], [1.0], [0.0], [0.0], [1.0], [1.0]]) # Target output data
# Create sequential model containing linear layer, ReLU activation function and Sigmoid activation function
model = nn.Sequential(
    nn.Linear(n_in, n_h),   # Linear transformation from input layer to hidden layer
    nn.ReLU(),              # ReLU activation function of hidden layer
    nn.Linear(n_h, n_out),  # Linear transformation from hidden layer to output layer
    nn.Sigmoid()            # Sigmoid activation function of output layer
)
# Define mean square error loss function and stochastic gradient descent optimizer
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # Learning rate is 0.01
# Used to store loss values for each round
losses = []
# Execute gradient descent algorithm for model training
for epoch in range(50):  # Iterate 50 times
    y_pred = model(x)  # Forward propagation, calculate prediction value
    loss = criterion(y_pred, y)  # Calculate loss
    losses.append(loss.item())  # Record loss value
    print(f'Epoch [{epoch+1}/50], Loss: {loss.item():.4f}')  # Print loss value
    optimizer.zero_grad()  # Clear gradients
    loss.backward()  # Backward propagation, calculate gradients
    optimizer.step()  # Update model parameters
# Visualize loss change curve
plt.figure(figsize=(8,5))
plt.plot(range(1,51), losses, label='Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Over Epochs')
plt.legend()
plt.grid()
plt.show()
# Visualize comparison between predicted results and actual target values
y_pred_final = model(x).detach().numpy() # Final prediction values
y_actual = y.numpy() # Actual values
plt.figure(figsize=(8,5))
plt.plot(range(1, batch_size + 1), y_actual, 'o-', label='Actual', color='blue')
plt.plot(range(1, batch_size + 1), y_pred_final, 'x--', label='Predicted', color='red')
plt.xlabel('Sample Index')
plt.ylabel('Value')
plt.title('Actual vs Predicted Values')
plt.legend()
plt.grid()
plt.show()
Running the code displays two figures: the training loss curve and a comparison of the actual and predicted values.
Another Example
Suppose we have a two-dimensional dataset, and the goal is to classify its points into two categories (for example, red and blue) based on their positions.
The following example uses a neural network to complete this simple binary classification task, laying the foundation for more complex tasks. PyTorch's modular interface makes building, training, and visualizing neural networks straightforward.
1. Data Preparation
First, we generate some simple two-dimensional data:
Example
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
# Generate some random data
n_samples = 100
data = torch.randn(n_samples, 2) # Generate 100 two-dimensional data points
labels = (data[:,0]**2 + data[:,1]**2 < 1).float().unsqueeze(1) # Points inside circle are 1, outside are 0
# Visualize data
plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm')
plt.title("Generated Data")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
Data Description:
data: The input two-dimensional points; each point has two features.
labels: The target classes; points inside the unit circle are labeled 1, all others 0.
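It helps to know how the two classes are balanced before training. For independent standard normal coordinates, the probability that a point falls inside the unit circle is 1 - exp(-1/2), so on average a little under 40 of the 100 samples get label 1. The sketch below computes this value and checks it with a standard-library simulation; the seed and sample count are arbitrary assumptions.

```python
import math
import random

# Probability that x^2 + y^2 < 1 when x and y are independent standard normals.
expected_inside = 1 - math.exp(-0.5)
print(round(expected_inside, 4))  # 0.3935

# Empirical check (assumption: seed 0 and 20000 samples are arbitrary choices).
rng = random.Random(0)
n_points = 20000
inside = sum(
    1 for _ in range(n_points)
    if rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 < 1
)
empirical_inside = inside / n_points
print(abs(empirical_inside - expected_inside) < 0.02)
```

One consequence is that a model which always predicts class 0 would already be right on roughly 60% of the points, which is a useful baseline when judging the trained network.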
The code displays a scatter plot of the generated points, colored by label.
2. Define Neural Network
Create a simple feedforward neural network using PyTorch.
This feedforward network uses one hidden layer and captures nonlinear patterns in the data by combining linear transformations with nonlinear activation functions.
Example
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        # Define layers of neural network
        self.fc1 = nn.Linear(2, 4)  # Input layer has 2 features, hidden layer has 4 neurons
        self.fc2 = nn.Linear(4, 1)  # Hidden layer output to 1 neuron (for binary classification)
        self.sigmoid = nn.Sigmoid()  # Binary classification activation function

    def forward(self, x):
        x = torch.relu(self.fc1(x))  # Use ReLU activation function
        x = self.sigmoid(self.fc2(x))  # Output layer uses Sigmoid activation function
        return x
# Instantiate model
model = SimpleNN()
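What the forward method computes can be traced by hand in plain Python. This is a sketch under stated assumptions: it uses a smaller 2 -> 2 -> 1 network than SimpleNN to stay short, and all weights and biases are hypothetical hand-picked values rather than anything PyTorch would initialize.

```python
import math

def relu(value):
    return max(0.0, value)

def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))

def linear(inputs, weights, biases):
    """Compute one output per row of weights: sum(w * x) + b."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

# Hypothetical hand-picked parameters for a 2 -> 2 -> 1 network.
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, -2.0]]
output_biases = [0.0]

def forward(point):
    # Linear layer followed by ReLU, then linear layer followed by Sigmoid.
    hidden = [relu(h) for h in linear(point, hidden_weights, hidden_biases)]
    (logit,) = linear(hidden, output_weights, output_biases)
    return sigmoid(logit)

probability = forward([1.0, 1.0])
print(probability)  # 0.5
```

For the point (1, 1) both hidden units are zero after ReLU, so the output layer receives 0 and the Sigmoid returns 0.5. Other points give other values, always strictly between 0 and 1.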
3. Define Loss Function and Optimizer
Example
# Define loss function and optimizer for binary classification
criterion = nn.BCELoss() # Binary cross entropy loss
optimizer = optim.SGD(model.parameters(), lr=0.1) # Use stochastic gradient descent optimizer
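Binary cross entropy penalizes confident wrong predictions much more heavily than mean squared error does. The following plain-Python sketch shows the formula that nn.BCELoss averages over the batch by default; the function name and sample values are assumptions for illustration, and unlike PyTorch this version does not guard against probabilities of exactly 0 or 1.

```python
import math

def binary_cross_entropy(probabilities, targets):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    losses = [
        -(t * math.log(p) + (1 - t) * math.log(1 - p))
        for p, t in zip(probabilities, targets)
    ]
    return sum(losses) / len(losses)

# A prediction of 0.5 carries no information: the loss equals ln(2).
uninformed = binary_cross_entropy([0.5, 0.5], [1.0, 0.0])
print(round(uninformed, 4))  # 0.6931

# Confident and correct predictions give a much smaller loss.
confident = binary_cross_entropy([0.9, 0.1], [1.0, 0.0])
print(round(confident, 4))  # 0.1054
```

A loss near 0.69 at the start of training is therefore typical for an untrained network, and values well below that suggest the model has started to separate the classes.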
4. Train Model
Train the model with data so it learns to classify.
Example
# Training
epochs = 100
for epoch in range(epochs):
    # Forward propagation
    outputs = model(data)
    loss = criterion(outputs, labels)
    # Backward propagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Print loss every 10 rounds
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}')
5. Test Model and Visualize Results
We evaluate the trained model on a grid of points and draw the resulting decision boundary over the data.
Example
# Visualize decision boundary
def plot_decision_boundary(model, data):
    x_min, x_max = data[:,0].min() - 1, data[:,0].max() + 1
    y_min, y_max = data[:,1].min() - 1, data[:,1].max() + 1
    xx, yy = torch.meshgrid(torch.arange(x_min, x_max, 0.1), torch.arange(y_min, y_max, 0.1), indexing='ij')
    grid = torch.cat([xx.reshape(-1,1), yy.reshape(-1,1)], dim=1)
    predictions = model(grid).detach().numpy().reshape(xx.shape)
    plt.contourf(xx, yy, predictions, levels=[0,0.5,1], cmap='coolwarm', alpha=0.7)
    plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm', edgecolors='k')
    plt.title("Decision Boundary")
    plt.show()
plot_decision_boundary(model, data)
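The plot separates the two regions at an output of 0.5, and the same threshold can be used to turn the model's probabilities into class labels and measure accuracy. Below is a minimal sketch on plain Python lists; the function name accuracy_from_probabilities and the sample values are hypothetical and not part of the original example.

```python
def accuracy_from_probabilities(probabilities, targets, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the target."""
    predicted = [1.0 if p >= threshold else 0.0 for p in probabilities]
    correct = sum(1 for pred, t in zip(predicted, targets) if pred == t)
    return correct / len(targets)

# Hypothetical model outputs and labels, for illustration only.
example_probabilities = [0.9, 0.2, 0.6, 0.4]
example_targets = [1.0, 0.0, 0.0, 1.0]
accuracy = accuracy_from_probabilities(example_probabilities, example_targets)
print(accuracy)  # 0.5
```

With the trained model, the inputs could be obtained as model(data).squeeze(1).tolist() and labels.squeeze(1).tolist(). Note that this measures accuracy on the training data, so it says little about how the model would do on unseen points.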
6. Complete Code
The complete code is as follows:
Example
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
# Generate some random data
n_samples = 100
data = torch.randn(n_samples, 2) # Generate 100 two-dimensional data points
labels = (data[:,0]**2 + data[:,1]**2 < 1).float().unsqueeze(1) # Points inside circle are 1, outside are 0
# Visualize data
plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm')
plt.title("Generated Data")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
# Define feedforward neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        # Define layers of neural network
        self.fc1 = nn.Linear(2, 4)  # Input layer has 2 features, hidden layer has 4 neurons
        self.fc2 = nn.Linear(4, 1)  # Hidden layer output to 1 neuron (for binary classification)
        self.sigmoid = nn.Sigmoid()  # Binary classification activation function

    def forward(self, x):
        x = torch.relu(self.fc1(x))  # Use ReLU activation function
        x = self.sigmoid(self.fc2(x))  # Output layer uses Sigmoid activation function
        return x
# Instantiate model
model = SimpleNN()
# Define loss function and optimizer
criterion = nn.BCELoss() # Binary cross entropy loss
optimizer = optim.SGD(model.parameters(), lr=0.1) # Use stochastic gradient descent optimizer
# Training
epochs = 100
for epoch in range(epochs):
    # Forward propagation
    outputs = model(data)
    loss = criterion(outputs, labels)
    # Backward propagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Print loss every 10 rounds
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}')
# Visualize decision boundary
def plot_decision_boundary(model, data):
    x_min, x_max = data[:,0].min() - 1, data[:,0].max() + 1
    y_min, y_max = data[:,1].min() - 1, data[:,1].max() + 1
    xx, yy = torch.meshgrid(torch.arange(x_min, x_max, 0.1), torch.arange(y_min, y_max, 0.1), indexing='ij')
    grid = torch.cat([xx.reshape(-1,1), yy.reshape(-1,1)], dim=1)
    predictions = model(grid).detach().numpy().reshape(xx.shape)
    plt.contourf(xx, yy, predictions, levels=[0,0.5,1], cmap='coolwarm', alpha=0.7)
    plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm', edgecolors='k')
    plt.title("Decision Boundary")
    plt.show()
plot_decision_boundary(model, data)