PyTorch First Neural Network

This chapter shows how to implement a simple feedforward neural network in PyTorch and train it on a binary classification task.

The network consists of an input layer, a hidden layer, and an output layer. The hidden layer uses the ReLU activation function and the output layer uses the Sigmoid activation function.

Mean squared error (MSE) is used as the loss function and stochastic gradient descent (SGD) as the optimizer.

Training adjusts the model parameters step by step through forward propagation, loss calculation, backward propagation, and parameter update.

Example

# Import PyTorch libraries

import torch

import torch.nn as nn

# Define input layer size, hidden layer size, output layer size and batch size

n_in, n_h, n_out, batch_size = 10, 5, 1, 10

# Create virtual input data and target data

x = torch.randn(batch_size, n_in) # Randomly generate input data

y = torch.tensor([[1.0], [0.0], [0.0],
                  [1.0], [1.0], [1.0], [0.0], [0.0], [1.0], [1.0]]) # Target output data

# Create sequential model containing linear layer, ReLU activation function and Sigmoid activation function

model = nn.Sequential(
    nn.Linear(n_in, n_h), # Linear transformation from input layer to hidden layer
    nn.ReLU(), # ReLU activation function of hidden layer
    nn.Linear(n_h, n_out), # Linear transformation from hidden layer to output layer
    nn.Sigmoid() # Sigmoid activation function of output layer
)

# Define mean square error loss function and stochastic gradient descent optimizer

criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # Learning rate is 0.01

# Execute gradient descent algorithm for model training

for epoch in range(50): # Iterate 50 times
    y_pred = model(x) # Forward propagation, calculate prediction value
    loss = criterion(y_pred, y) # Calculate loss
    print('epoch: ', epoch, 'loss: ', loss.item()) # Print loss value
    optimizer.zero_grad() # Clear gradients
    loss.backward() # Backward propagation, calculate gradients
    optimizer.step() # Update model parameters

The output results are similar to the following:

epoch: 0 loss: 0.2591968774795532
epoch: 1 loss: 0.25902628898620605
epoch: 2 loss: 0.25885599851608276
epoch: 3 loss: 0.25868603587150574
epoch: 4 loss: 0.25851646065711975
...

Define network parameters:

n_in, n_h, n_out, batch_size = 10, 5, 1, 10
n_in: The input layer size is 10, meaning each data point has 10 features.
n_h: The hidden layer size is 5, meaning the hidden layer contains 5 neurons.
n_out: The output layer size is 1, meaning the network outputs a single value between 0 and 1 per sample. Values of 0.5 or above are usually treated as class 1, lower values as class 0.
batch_size: Each batch contains 10 samples.
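To make these sizes concrete, the following minimal sketch builds the two linear layers on their own and inspects their parameter shapes. The variable names hidden, output and n_params are chosen here for illustration and do not appear in the example above.

```python
import torch.nn as nn

n_in, n_h, n_out = 10, 5, 1

hidden = nn.Linear(n_in, n_h)    # weight shape (5, 10), bias shape (5,)
output = nn.Linear(n_h, n_out)   # weight shape (1, 5),  bias shape (1,)

print(hidden.weight.shape, hidden.bias.shape)
print(output.weight.shape, output.bias.shape)

# Total trainable parameters: 10*5 + 5 + 5*1 + 1 = 61
n_params = sum(p.numel() for layer in (hidden, output) for p in layer.parameters())
print(n_params)  # 61
```

Note that nn.Linear stores its weight as (output features, input features), which is why the first weight matrix is 5 x 10 rather than 10 x 5.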

Generate input data and target data:

x = torch.randn(batch_size, n_in) # Randomly generate input data
y = torch.tensor([[1.0], [0.0], [0.0], [1.0], [1.0], [1.0], [0.0], [0.0], [1.0], [1.0]]) # Target output data

x: A randomly generated input matrix of shape (10, 10), that is, 10 samples with 10 features each.
y: The target output (labels), a 10x1 tensor holding the class label (0 or 1) of each input sample.
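Because x is random, the loss values printed by the example differ from run to run. As a small sketch, assuming you want repeatable results, torch.manual_seed can be called before generating the data. The seed value 0 and the names x_a and x_b are arbitrary choices for this illustration.

```python
import torch

batch_size, n_in = 10, 10

torch.manual_seed(0)                 # arbitrary seed, chosen for illustration
x_a = torch.randn(batch_size, n_in)

torch.manual_seed(0)                 # resetting the seed reproduces the same numbers
x_b = torch.randn(batch_size, n_in)

print(x_a.shape)                     # torch.Size([10, 10])
print(torch.equal(x_a, x_b))         # True
```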

Define neural network model:

model = nn.Sequential(
    nn.Linear(n_in, n_h), # Linear transformation from input layer to hidden layer
    nn.ReLU(), # ReLU activation function of hidden layer
    nn.Linear(n_h, n_out), # Linear transformation from hidden layer to output layer
    nn.Sigmoid() # Sigmoid activation function of output layer
)

nn.Sequential chains the layers in the order given, so the output of each layer becomes the input of the next.

nn.Linear(n_in, n_h): Defines the linear transformation from the input layer to the hidden layer, with 10 input features and 5 hidden neurons.
nn.ReLU(): Applies the ReLU activation function to the hidden layer output, which introduces non-linearity.
nn.Linear(n_h, n_out): Defines the linear transformation from the hidden layer to the output layer, which has 1 neuron.
nn.Sigmoid(): Applies the Sigmoid activation function at the output layer, mapping the result to a value between 0 and 1, as needed for binary classification.
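The effect of the two activation functions can be seen on a few hand-picked numbers. This is only an illustrative sketch; the input tensor z is made up for the demonstration.

```python
import torch
import torch.nn as nn

z = torch.tensor([-2.0, 0.0, 3.0])   # made-up pre-activation values

relu_out = nn.ReLU()(z)              # negative values are replaced by 0
sigmoid_out = nn.Sigmoid()(z)        # every value is squashed into (0, 1)

print(relu_out)                      # tensor([0., 0., 3.])
print(sigmoid_out)                   # the middle entry, sigmoid(0), is 0.5
```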

Define loss function and optimizer:

criterion = torch.nn.MSELoss() # Use mean square error loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # Use stochastic gradient descent optimizer, learning rate is 0.01
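As a quick check of what nn.MSELoss computes, the sketch below compares it with the formula mean((y_pred - y)^2) written out by hand. The prediction and label values are hypothetical.

```python
import torch
import torch.nn as nn

y_pred = torch.tensor([[0.8], [0.3], [0.6]])  # hypothetical model outputs
y_true = torch.tensor([[1.0], [0.0], [1.0]])  # hypothetical labels

# (0.2^2 + 0.3^2 + 0.4^2) / 3 = 0.29 / 3
manual_mse = ((y_pred - y_true) ** 2).mean()
library_mse = nn.MSELoss()(y_pred, y_true)

print(manual_mse.item(), library_mse.item())
```

Both values should agree up to floating-point rounding.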

Training loop:

for epoch in range(50): # Train for 50 rounds
    y_pred = model(x) # Forward propagation, calculate prediction value
    loss = criterion(y_pred, y) # Calculate loss
    print('epoch: ', epoch, 'loss: ', loss.item()) # Print loss value
    optimizer.zero_grad() # Clear gradients
    loss.backward() # Backward propagation, calculate gradients
    optimizer.step() # Update model parameters

for epoch in range(50): Runs 50 training iterations.
y_pred = model(x): Performs forward propagation, using the current model parameters to compute predictions for the input data x.
loss = criterion(y_pred, y): Computes the loss between the predictions and the target values y.
optimizer.zero_grad(): Clears the gradients left over from the previous iteration, because PyTorch accumulates gradients by default.
loss.backward(): Performs backward propagation, computing the gradients of the loss with respect to the model parameters.
optimizer.step(): Updates the model parameters using the computed gradients.
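The gradient-related steps can be checked on a tiny example. The sketch below assumes plain SGD without momentum or weight decay, as in the code above, where one step is parameter = parameter - lr * gradient. The tensors w and w2 and the toy loss are invented for this illustration.

```python
import torch

lr = 0.01
w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=lr)

loss = (w ** 2).sum()            # toy loss; its gradient is 2 * w
optimizer.zero_grad()
loss.backward()

# What a plain SGD step should produce, computed by hand
expected = (w - lr * w.grad).detach().clone()
optimizer.step()
print(torch.allclose(w.detach(), expected))  # True

# Without zero_grad(), gradients from successive backward() calls add up
w2 = torch.tensor([3.0], requires_grad=True)
(w2 ** 2).sum().backward()
first = w2.grad.clone()          # 2 * 3 = 6
(w2 ** 2).sum().backward()       # second call, gradients not cleared
print(first.item(), w2.grad.item())  # 6.0 12.0
```

The second half shows why the training loop calls optimizer.zero_grad() in every iteration.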

Visualization code:

Example

import torch
import torch.nn as nn
import matplotlib.pyplot as plt

# Define input layer size, hidden layer size, output layer size and batch size

n_in, n_h, n_out, batch_size = 10, 5, 1, 10

# Create virtual input data and target data

x = torch.randn(batch_size, n_in) # Randomly generate input data

y = torch.tensor([[1.0], [0.0], [0.0],
                  [1.0], [1.0], [1.0], [0.0], [0.0], [1.0], [1.0]]) # Target output data

# Create sequential model containing linear layer, ReLU activation function and Sigmoid activation function

model = nn.Sequential(
    nn.Linear(n_in, n_h), # Linear transformation from input layer to hidden layer
    nn.ReLU(), # ReLU activation function of hidden layer
    nn.Linear(n_h, n_out), # Linear transformation from hidden layer to output layer
    nn.Sigmoid() # Sigmoid activation function of output layer
)

# Define mean square error loss function and stochastic gradient descent optimizer

criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # Learning rate is 0.01

# Used to store loss values for each round

losses = []

# Execute gradient descent algorithm for model training

for epoch in range(50): # Iterate 50 times
    y_pred = model(x) # Forward propagation, calculate prediction value
    loss = criterion(y_pred, y) # Calculate loss
    losses.append(loss.item()) # Record loss value
    print(f'Epoch [{epoch+1}/50], Loss: {loss.item():.4f}') # Print loss value
    optimizer.zero_grad() # Clear gradients
    loss.backward() # Backward propagation, calculate gradients
    optimizer.step() # Update model parameters

# Visualize loss change curve

plt.figure(figsize=(8,5))
plt.plot(range(1,51), losses, label='Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Over Epochs')
plt.legend()
plt.grid()
plt.show()

# Visualize comparison between predicted results and actual target values

y_pred_final = model(x).detach().numpy() # Final prediction values
y_actual = y.numpy() # Actual values

plt.figure(figsize=(8,5))
plt.plot(range(1, batch_size + 1), y_actual, 'o-', label='Actual', color='blue')
plt.plot(range(1, batch_size + 1), y_pred_final, 'x--', label='Predicted', color='red')
plt.xlabel('Sample Index')
plt.ylabel('Value')
plt.title('Actual vs Predicted Values')
plt.legend()
plt.grid()
plt.show()

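The code produces two plots: the training loss over the 50 epochs ("Training Loss Over Epochs"), and the actual target values against the final predicted values for each sample ("Actual vs Predicted Values").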

Another Example

Suppose we have a two-dimensional dataset and want to classify each point into one of two categories based on its position (for example, red points and blue points).

The following example trains a neural network on this simple binary classification task, which lays the foundation for more complex ones. PyTorch's modular interface keeps building and training the network straightforward, and the results are plotted with matplotlib.

1. Data Preparation

First, we generate some simple two-dimensional data:

Example

import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# Generate some random data

n_samples = 100
data = torch.randn(n_samples, 2) # Generate 100 two-dimensional data points
labels = (data[:,0]**2 + data[:,1]**2 < 1).float().unsqueeze(1) # Points inside circle are 1, outside are 0

# Visualize data

plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm')
plt.title("Generated Data")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

Data Description:

data: The input two-dimensional points. Each point has two features.
labels: The target class. Points inside the unit circle are labeled 1, all others 0.

The scatter plot ("Generated Data") shows the points colored by label.
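Because the points are drawn from a standard normal distribution, the two classes are not evenly balanced. For such points the probability of landing inside the unit circle is 1 - exp(-1/2), roughly 39%. The sketch below compares that value with the fraction observed in one sample. The fixed seed and the names inside_fraction and expected_fraction are assumptions added for this illustration.

```python
import math
import torch

torch.manual_seed(0)  # arbitrary seed, only so the sample is repeatable

n_samples = 100
data = torch.randn(n_samples, 2)
labels = (data[:, 0]**2 + data[:, 1]**2 < 1).float().unsqueeze(1)

inside_fraction = labels.mean().item()    # share of points labeled 1
expected_fraction = 1 - math.exp(-0.5)    # theoretical share, about 0.39

print(labels.shape)                       # torch.Size([100, 1])
print(f'observed: {inside_fraction:.2f}, expected: {expected_fraction:.2f}')
```

With only 100 samples the observed fraction can differ noticeably from the theoretical one.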

2. Define Neural Network

Create a simple feedforward neural network using PyTorch.

The network has a single hidden layer. It captures nonlinear patterns in the data by combining linear transformations with nonlinear activation functions.

Example

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        # Define layers of neural network
        self.fc1 = nn.Linear(2, 4) # Input layer has 2 features, hidden layer has 4 neurons
        self.fc2 = nn.Linear(4, 1) # Hidden layer output to 1 neuron (for binary classification)
        self.sigmoid = nn.Sigmoid() # Binary classification activation function

    def forward(self, x):
        x = torch.relu(self.fc1(x)) # Use ReLU activation function
        x = self.sigmoid(self.fc2(x)) # Output layer uses Sigmoid activation function
        return x

# Instantiate model
model = SimpleNN()
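The class above describes the same kind of architecture as the nn.Sequential model from the first example, only with 2 inputs and 4 hidden neurons. As a sketch, the code below copies the weights of a SimpleNN instance into an equivalent nn.Sequential and checks that both give the same output. The names seq and batch are introduced here for illustration.

```python
import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 4)
        self.fc2 = nn.Linear(4, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.sigmoid(self.fc2(x))

model = SimpleNN()

# Same layers written with nn.Sequential, then given the same weights
seq = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 1), nn.Sigmoid())
seq[0].load_state_dict(model.fc1.state_dict())
seq[2].load_state_dict(model.fc2.state_dict())

batch = torch.randn(3, 2)                  # 3 made-up points with 2 features
with torch.no_grad():
    out_class = model(batch)
    out_seq = seq(batch)

print(out_class.shape)                     # torch.Size([3, 1])
print(torch.allclose(out_class, out_seq))  # True
```

Subclassing nn.Module is more verbose, but it leaves room for custom logic in forward that nn.Sequential cannot express.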

3. Define Loss Function and Optimizer

Example

# Define loss function and optimizer for binary classification
criterion = nn.BCELoss() # Binary cross entropy loss
optimizer = optim.SGD(model.parameters(), lr=0.1) # Use stochastic gradient descent optimizer
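Binary cross entropy can also be written out by hand as -mean(y * log(p) + (1 - y) * log(1 - p)). The sketch below checks this against nn.BCELoss on hypothetical values. It also shows nn.BCEWithLogitsLoss, which takes raw scores (logits) and applies the Sigmoid internally. Using it would require removing the Sigmoid from the model, so it is shown only as an alternative and not as part of this tutorial's code.

```python
import torch
import torch.nn as nn

p = torch.tensor([[0.9], [0.2], [0.6]])  # hypothetical predicted probabilities
y = torch.tensor([[1.0], [0.0], [1.0]])  # hypothetical labels

manual_bce = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()
library_bce = nn.BCELoss()(p, y)
print(manual_bce.item(), library_bce.item())

# Alternative: feed raw scores to BCEWithLogitsLoss instead of probabilities
logits = torch.tensor([[2.0], [-1.0], [0.5]])   # hypothetical raw scores
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), y)
loss_from_logits = nn.BCEWithLogitsLoss()(logits, y)
print(loss_from_probs.item(), loss_from_logits.item())
```

Each pair of values should agree up to floating-point rounding.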

4. Train Model

Train the model with data so it learns to classify.

Example

# Training
epochs = 100
for epoch in range(epochs):
    # Forward propagation
    outputs = model(data)
    loss = criterion(outputs, labels)
    # Backward propagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Print loss every 10 rounds
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}')
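The loss alone does not say how many points are classified correctly. One way to check, sketched below, is to threshold the Sigmoid output at 0.5 and compare with the labels. The threshold, the fixed seed, and the names probs, predicted and accuracy are assumptions made for this illustration. The sketch repeats the data, model, and training steps from above so that it runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # arbitrary seed, only so runs are repeatable

data = torch.randn(100, 2)
labels = (data[:, 0]**2 + data[:, 1]**2 < 1).float().unsqueeze(1)

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 4)
        self.fc2 = nn.Linear(4, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.sigmoid(self.fc2(x))

model = SimpleNN()
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):
    loss = criterion(model(data), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Evaluate without tracking gradients
model.eval()
with torch.no_grad():
    probs = model(data)                    # values between 0 and 1
    predicted = (probs >= 0.5).float()     # 0.5 or above counts as class 1
    accuracy = (predicted == labels).float().mean().item()

print(f'Training accuracy: {accuracy:.2f}')
```

This measures accuracy on the training points themselves, so it is an optimistic estimate. A separate test set would be needed to judge how well the model generalizes.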

5. Test Model and Visualize Results

We evaluate the model on a grid of points and draw the resulting decision boundary over the data.

Example

# Visualize decision boundary
def plot_decision_boundary(model, data, labels):
    # Convert the 0-dim tensors to Python floats before building the grid
    x_min, x_max = data[:,0].min().item() - 1, data[:,0].max().item() + 1
    y_min, y_max = data[:,1].min().item() - 1, data[:,1].max().item() + 1
    xx, yy = torch.meshgrid(torch.arange(x_min, x_max, 0.1), torch.arange(y_min, y_max, 0.1), indexing='ij')
    grid = torch.cat([xx.reshape(-1,1), yy.reshape(-1,1)], dim=1)
    with torch.no_grad(): # No gradients are needed for prediction
        predictions = model(grid).numpy().reshape(xx.shape)
    plt.contourf(xx.numpy(), yy.numpy(), predictions, levels=[0,0.5,1], cmap='coolwarm', alpha=0.7)
    plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm', edgecolors='k')
    plt.title("Decision Boundary")
    plt.show()

plot_decision_boundary(model, data, labels)

6. Complete Code

The complete code is as follows:

Example

import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# Generate some random data

n_samples = 100
data = torch.randn(n_samples, 2) # Generate 100 two-dimensional data points
labels = (data[:,0]**2 + data[:,1]**2 < 1).float().unsqueeze(1) # Points inside circle are 1, outside are 0

# Visualize data

plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm')
plt.title("Generated Data")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

# Define feedforward neural network

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        # Define layers of neural network
        self.fc1 = nn.Linear(2, 4) # Input layer has 2 features, hidden layer has 4 neurons
        self.fc2 = nn.Linear(4, 1) # Hidden layer output to 1 neuron (for binary classification)
        self.sigmoid = nn.Sigmoid() # Binary classification activation function

    def forward(self, x):
        x = torch.relu(self.fc1(x)) # Use ReLU activation function
        x = self.sigmoid(self.fc2(x)) # Output layer uses Sigmoid activation function
        return x

# Instantiate model
model = SimpleNN()

# Define loss function and optimizer
criterion = nn.BCELoss() # Binary cross entropy loss
optimizer = optim.SGD(model.parameters(), lr=0.1) # Use stochastic gradient descent optimizer

# Training
epochs = 100
for epoch in range(epochs):
    # Forward propagation
    outputs = model(data)
    loss = criterion(outputs, labels)
    # Backward propagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Print loss every 10 rounds
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}')

# Visualize decision boundary
def plot_decision_boundary(model, data, labels):
    # Convert the 0-dim tensors to Python floats before building the grid
    x_min, x_max = data[:,0].min().item() - 1, data[:,0].max().item() + 1
    y_min, y_max = data[:,1].min().item() - 1, data[:,1].max().item() + 1
    xx, yy = torch.meshgrid(torch.arange(x_min, x_max, 0.1), torch.arange(y_min, y_max, 0.1), indexing='ij')
    grid = torch.cat([xx.reshape(-1,1), yy.reshape(-1,1)], dim=1)
    with torch.no_grad(): # No gradients are needed for prediction
        predictions = model(grid).numpy().reshape(xx.shape)
    plt.contourf(xx.numpy(), yy.numpy(), predictions, levels=[0,0.5,1], cmap='coolwarm', alpha=0.7)
    plt.scatter(data[:,0], data[:,1], c=labels.squeeze(), cmap='coolwarm', edgecolors='k')
    plt.title("Decision Boundary")
    plt.show()

plot_decision_boundary(model, data, labels)
