
PyTorch_001_Building a Fully Connected Neural Network

codeaddict 2024. 12. 23. 21:49


What is a Fully Connected Neural Network?

A Fully Connected Neural Network (FCNN), also known as a dense neural network, is one of the simplest architectures in deep learning, and still a very capable one. It consists of layers in which every neuron in one layer is connected to every neuron in the next layer. With enough hidden units, these networks can approximate complex functions, which makes them a solid choice for tasks such as classification and regression.

Here’s how it works:

  • The input layer receives data (like images, numbers, or text embeddings).
  • Hidden layers perform computations, transforming the input into higher-level features.
  • The output layer produces the final predictions (e.g., classifying digits, predicting a house price, etc.).
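
To make "every neuron is connected to every neuron" concrete, here is a minimal sketch showing that one fully connected layer is just a matrix multiplication plus a bias. The sizes (4 inputs, 3 outputs, batch of 2) are arbitrary values picked for illustration and are not the ones used later in this tutorial.

```python
import torch
from torch import nn

torch.manual_seed(0)  # Make the random weights reproducible

layer = nn.Linear(in_features=4, out_features=3)  # 4 inputs fully connected to 3 outputs
x = torch.randn(2, 4)  # A batch of 2 samples with 4 features each

# nn.Linear stores its weight with shape (out_features, in_features),
# so the layer computes x @ W^T + b
manual = x @ layer.weight.T + layer.bias
builtin = layer(x)

print(builtin.shape)  # torch.Size([2, 3])
print(torch.allclose(manual, builtin, atol=1e-6))  # True
```

Every output value depends on all four input values, which is exactly what "fully connected" means.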

In this tutorial, you’ll learn how to code a fully connected neural network in PyTorch and train it to classify handwritten digits from the MNIST dataset. We’ll keep things fun and interactive, exploring every step of the way!


What Will You Achieve in This Tutorial?

By the end of this tutorial, you will:

  1. Understand Fully Connected Networks: Learn the logic behind how these layers work and what makes them different from CNNs.
  2. Build Your First Fully Connected Neural Network: Create a custom architecture step by step using PyTorch.
  3. Train the Model to Classify MNIST Digits: Watch your network come to life as it learns to predict handwritten numbers.
  4. Analyze Results: See how well your model performs and visualize how it improves with training.

This is going to be an exciting deep dive! You’ll see how raw numerical data transforms into predictions, and by the end, you’ll feel empowered to build your own custom networks for any task.

Let's dive in!


1. Set Up Your Environment

First, we’ll create a Python virtual environment to keep everything clean and organized.

Step 1.1: Create a Virtual Environment

  1. Open your terminal or command prompt.
  2. Navigate to your project folder or create a new one:
mkdir fully_connected_nn && cd fully_connected_nn

  3. Create a virtual environment:

python -m venv venv

  4. Activate the virtual environment:

  • Windows:
venv\Scripts\activate
  • Mac/Linux:
source venv/bin/activate

🧹 Why? A virtual environment keeps your project's dependencies isolated so they don't interfere with other Python projects.


Step 1.2: Install Required Libraries

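Now, let's install the libraries we need. The exact install command for PyTorch depends on your operating system and on whether you want GPU (CUDA) support, so check the selector on the PyTorch website:

https://pytorch.org/get-started/locally/

For a default setup, run the following inside your activated environment:

pip install torch torchvision tqdm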

  • torch: The main PyTorch library.
  • torchvision: For datasets and data transformations.
  • tqdm: For pretty progress bars.
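
If you want to confirm that the installation worked, a quick check like the one below can help. It is only a sketch: it uses Python's standard importlib module to see whether each package can be found, and the status dictionary is a name made up for this example.

```python
import importlib.util

# Packages installed by the pip command above
packages = ("torch", "torchvision", "tqdm")

# find_spec returns None when a package cannot be found in the environment
status = {name: importlib.util.find_spec(name) is not None for name in packages}

for name, found in status.items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```

If any line says MISSING, make sure the virtual environment is activated and run the pip command again.
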

2. Open Jupyter Notebook

If you’re using Jupyter Notebook, make sure it’s installed. You can install it with:

pip install notebook

Then, start Jupyter Notebook in your project directory:

jupyter notebook

Create a new notebook, and let’s get coding! 🎉


3. Explaining the Code Step by Step

0. Import the Libraries

The first step in your notebook is to import all the libraries we’ll need.

# Imports
import torch
import torch.nn.functional as F  
import torchvision.datasets as datasets  
import torchvision.transforms as transforms  
from torch import optim  
from torch import nn  
from torch.utils.data import DataLoader
from tqdm import tqdm  

🧠 Think of this as bringing your tools to the workbench before starting your project!

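torch: The main PyTorch library for tensor operations.
torch.nn: Contains modules and classes for building neural networks.
torch.nn.functional: Function versions of common operations such as ReLU (imported as F).
torch.optim: Provides optimization algorithms.
torchvision: A library for image processing and datasets.
torch.utils.data.DataLoader: Serves a dataset in mini-batches, optionally shuffled.
tqdm: A library for progress bars in loops.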


1. Defining the Neural Network Architecture

We start by defining our neural network class, NN, which inherits from nn.Module.

class NN(nn.Module):
    def __init__(self, input_size, num_classes):
        super(NN, self).__init__()
        self.fc1 = nn.Linear(input_size, 50)  # First fully connected layer
        self.fc2 = nn.Linear(50, num_classes)  # Second fully connected layer

    def forward(self, x):
        x = F.relu(self.fc1(x))  # Apply ReLU activation after the first layer
        x = self.fc2(x)  # Output layer
        return x

Explanation:

  • __init__ Method: This initializes the network's layers.
  • input_size: The number of input features (for MNIST, it's 784, corresponding to 28x28 pixel images).
  • num_classes: The number of output classes (for MNIST, it's 10, representing digits 0-9).
  • nn.Linear: Fully connected layers that transform inputs to outputs.
      • The first layer reduces the input from 784 to 50 dimensions.
      • The second layer maps the 50 dimensions to 10 output classes.
  • forward Method: Defines how data flows through the network.
  • ReLU Activation: Adds non-linearity to help the network learn complex patterns.
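
A quick way to sanity-check the architecture is to push a dummy batch through it and count the parameters. The sketch below repeats the NN class so it runs on its own; the random dummy_batch is a stand-in for real images and is only there to check shapes.

```python
import torch
import torch.nn.functional as F
from torch import nn


class NN(nn.Module):
    """Same two-layer network as defined above."""

    def __init__(self, input_size, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 50)
        self.fc2 = nn.Linear(50, num_classes)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))


model = NN(input_size=784, num_classes=10)
dummy_batch = torch.randn(64, 784)  # Random stand-in for 64 flattened images
output = model(dummy_batch)
print(output.shape)  # torch.Size([64, 10])

# fc1: 784*50 weights + 50 biases; fc2: 50*10 weights + 10 biases
num_params = sum(p.numel() for p in model.parameters())
print(num_params)  # 39760
```

So each image produces 10 scores, one per digit, and the whole model has fewer than 40,000 trainable numbers.
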

2. Setting Up the Device

We need to specify whether to use a GPU or CPU for computations:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Explanation:

  • torch.cuda.is_available(): Checks whether a CUDA-capable GPU can be used.
  • torch.device: Creates the device object, "cuda" if a GPU was found and "cpu" otherwise. Using a GPU can significantly speed up training.
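
As a small illustration of what the device object is used for, the sketch below moves a tensor onto it. The tensor x is just an example value; the same .to(device) call is applied to the model and to every batch later on.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(2, 3)  # Tensors are created on the CPU by default
x = x.to(device)      # Returns a copy on the target device (or x itself if it is already there)

print(x.device.type)  # "cuda" or "cpu", depending on your machine
```

The model and its input must live on the same device, otherwise PyTorch raises an error during the forward pass.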

3. Defining Hyperparameters

These are the settings that control how the network is trained:

input_size = 784
num_classes = 10
learning_rate = 0.001
batch_size = 64
num_epochs = 3

Explanation:

  • input_size: Number of input features (28x28 flattened into 784).
  • num_classes: Number of output classes (digits 0-9).
  • learning_rate: Controls how much the model updates weights with each step.
  • batch_size: Number of training samples processed in one iteration.
  • num_epochs: Number of complete passes through the dataset.
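
To see how batch_size and num_epochs translate into actual work, here is a bit of arithmetic. It assumes the standard MNIST training split of 60,000 images and the DataLoader default of keeping the last, smaller batch.

```python
import math

num_train_samples = 60_000  # Size of the standard MNIST training split
batch_size = 64
num_epochs = 3

steps_per_epoch = math.ceil(num_train_samples / batch_size)
last_batch_size = num_train_samples - (steps_per_epoch - 1) * batch_size
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)  # 938
print(last_batch_size)  # 32
print(total_steps)      # 2814
```

In other words, the optimizer updates the weights 938 times per epoch and 2,814 times over the whole run.
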

4. Loading the MNIST Dataset

We’ll use PyTorch’s torchvision library to load the MNIST dataset:

train_dataset = datasets.MNIST(
    root="dataset/", train=True, transform=transforms.ToTensor(), download=True
)
test_dataset = datasets.MNIST(
    root="dataset/", train=False, transform=transforms.ToTensor(), download=True
)

train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

Explanation:

  • datasets.MNIST: Downloads and loads the MNIST dataset.
  • transform=transforms.ToTensor(): Converts images into PyTorch tensors of shape 1x28x28 and scales pixel values to the range [0, 1].
  • DataLoader: Creates iterable datasets, enabling us to loop through data in batches.
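
It helps to know what one batch looks like before training. The sketch below uses a small random stand-in dataset (fake_images and fake_labels are hypothetical, not real MNIST) so it runs without a download, but the shapes match what the MNIST loaders produce.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 200 random "images" and random digit labels
fake_images = torch.rand(200, 1, 28, 28)
fake_labels = torch.randint(0, 10, (200,))
fake_dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(fake_dataset, batch_size=64, shuffle=True)

data, targets = next(iter(loader))
print(data.shape)     # torch.Size([64, 1, 28, 28])
print(targets.shape)  # torch.Size([64])
print(len(loader))    # 4 batches: 64 + 64 + 64 + 8
```

Each batch is a 4D tensor (batch, channel, height, width), which is why the images have to be flattened before they reach the first linear layer.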

5. Initializing the Model

Let’s create an instance of our neural network:

model = NN(input_size=input_size, num_classes=num_classes).to(device)

Explanation:

  • Model Initialization: Creates the network based on the architecture defined earlier.
  • .to(device): Moves the model to the specified computation device (GPU or CPU).

6. Defining the Loss Function and Optimizer

We need to measure the model’s error and update its weights:

criterion = nn.CrossEntropyLoss()  # Loss function for multi-class classification
optimizer = optim.Adam(model.parameters(), lr=learning_rate)  # Adam optimizer

Explanation:

  • nn.CrossEntropyLoss: Suitable for multi-class classification problems.
  • optim.Adam: An optimization algorithm that adapts the learning rate for each parameter, making training efficient and fast.
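
One detail worth seeing in code: nn.CrossEntropyLoss expects raw scores (logits) and applies log-softmax internally, which is why the model's forward method has no softmax at the end. The sketch below checks this on made-up scores for 2 samples and 3 classes.

```python
import torch
import torch.nn.functional as F
from torch import nn

criterion = nn.CrossEntropyLoss()

# Made-up logits for 2 samples and 3 classes
scores = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])  # Correct class index for each sample

loss = criterion(scores, targets)

# Equivalent manual computation: log-softmax, then negative log-likelihood
manual = F.nll_loss(F.log_softmax(scores, dim=1), targets)

print(torch.allclose(loss, manual))  # True
```

Note that the targets are plain class indices, not one-hot vectors.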

7. Training the Network

Here’s the training loop where the magic happens:

for epoch in range(num_epochs):
    for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
        data = data.to(device=device)
        targets = targets.to(device=device)
        data = data.reshape(data.shape[0], -1)  # Flatten the images

        # Forward pass
        scores = model(data)
        loss = criterion(scores, targets)

        # Backward pass
        optimizer.zero_grad()  # Clear previous gradients
        loss.backward()  # Backpropagation
        optimizer.step()  # Update weights

Running this cell shows one tqdm progress bar per epoch.

Explanation:

  • Epoch Loop: Runs the training process for a specified number of epochs.
  • Batch Loop: Processes data in batches:
      1. Moves data and targets to the computation device.
      2. Flattens each image into a 1D tensor (28x28 to 784).
      3. Performs a forward pass through the network to get predictions.
      4. Calculates the loss by comparing predictions with the actual labels.
      5. Clears the gradients left over from the previous batch.
      6. Performs backpropagation to calculate new gradients.
      7. Updates the weights using the optimizer.
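
The steps above can be tried out without downloading anything. The sketch below repeats them on one fixed batch of random data, so the numbers are meaningless, but the loss should still go down as the network memorizes the batch. The nn.Sequential model is a stand-in with the same layer sizes as the NN class, and the losses list is added here only to record progress.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Stand-in with the same layer sizes as the NN class above
model = nn.Sequential(nn.Linear(784, 50), nn.ReLU(), nn.Linear(50, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# One fixed batch of random "images" and labels (hypothetical data)
data = torch.rand(64, 1, 28, 28)
targets = torch.randint(0, 10, (64,))
data = data.reshape(data.shape[0], -1)  # (64, 1, 28, 28) -> (64, 784)

losses = []
for step in range(20):
    scores = model(data)               # Forward pass
    loss = criterion(scores, targets)  # Compare predictions with labels

    optimizer.zero_grad()  # Clear previous gradients
    loss.backward()        # Backpropagation
    optimizer.step()       # Update weights

    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Recording loss.item() like this is also a simple way to monitor the real training loop if you want more feedback than the progress bar.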

8. Checking Model Accuracy

After training, we’ll evaluate the model’s performance:

def check_accuracy(loader, model):
    num_correct = 0
    num_samples = 0
    model.eval()  # Set the model to evaluation mode

    with torch.no_grad():  # Disable gradient tracking
        for x, y in loader:
            x = x.to(device=device)
            y = y.to(device=device)
            x = x.reshape(x.shape[0], -1)  # Flatten the images

            scores = model(x)  # Forward pass
            _, predictions = scores.max(1)  # Get the predicted class

            num_correct += (predictions == y).sum().item()  # Count correct predictions as a Python int
            num_samples += predictions.size(0)  # Count total samples

    model.train()  # Set the model back to training mode
    return num_correct / num_samples  # Return accuracy

Explanation:

  • check_accuracy Function: Evaluates the model's accuracy on a given dataset.
  • Evaluation Mode: Disables training-specific behaviors like dropout.
  • No Gradients: Saves memory and speeds up computations during evaluation.
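
To see what scores.max(1) does inside check_accuracy, here is a tiny worked example with made-up scores for 3 samples and 4 classes.

```python
import torch

# Made-up scores for 3 samples and 4 classes
scores = torch.tensor([[0.1, 2.0, -1.0, 0.3],
                       [1.5, 0.2, 0.1, 0.0],
                       [0.0, 0.1, 0.2, 3.0]])
y = torch.tensor([1, 0, 2])  # True labels; the third prediction will be a miss

# max(1) returns the largest value in each row and the index where it occurs
values, predictions = scores.max(1)
num_correct = (predictions == y).sum().item()
accuracy = num_correct / y.size(0)

print(predictions.tolist())  # [1, 0, 3]
print(num_correct)           # 2
print(f"{accuracy:.2%}")     # 66.67%
```

The index of the largest score is the predicted class, so comparing those indices with the labels gives the number of correct predictions.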

9. Displaying Results

Finally, let’s see how well our model performs:

print(f"Accuracy on training set: {check_accuracy(train_loader, model)*100:.2f}%")
print(f"Accuracy on test set: {check_accuracy(test_loader, model)*100:.2f}%")

Output:

Accuracy on training set: 96.22%
Accuracy on test set: 95.70%

Explanation:

  • Training Accuracy: Measures performance on data the model has seen.
  • Test Accuracy: Evaluates performance on unseen data.

10. Testing the Model with a Single Image

After evaluating the overall accuracy, it’s time to test the model on individual images. This section demonstrates how to visualize an image from the test dataset, pass it through the trained model, and view the predicted label compared to the actual label.

import matplotlib.pyplot as plt

# Extract an image from the test dataset
image, label = test_dataset[0]  # Get the first image and its label
image = image.to(device)  # Move image to the same device as the model
image_flattened = image.view(-1, 28*28)  # Flatten the image to match input size

# Display the image
plt.imshow(image.cpu().squeeze(), cmap="gray")
plt.title(f"Actual Label: {label}")
plt.show()

# Pass the image through the model
model.eval()  # Set the model to evaluation mode
with torch.no_grad():
    output = model(image_flattened)  # Forward pass
    predicted_label = output.argmax(dim=1).item()  # Get the predicted label

print(f"Predicted Label: {predicted_label}")

Explanation:

  1. Importing matplotlib: matplotlib.pyplot is used to visualize the image. It is not included in the pip command from the setup section, so install it with pip install matplotlib if the import fails.
  2. Extracting the Image: We retrieve the first image from the test_dataset. This gives us both the image and the true label. The image is a tensor representing the pixel values, and the label is the correct class for that image.
  3. Moving the Image to the Device: We move the image tensor to the same device (GPU or CPU) where the model resides. This ensures the image and model are aligned for processing.
  4. Flattening the Image: Each MNIST image tensor has shape 1x28x28, but our model expects rows of 784 values. image.view(-1, 28*28) reshapes the image into a batch containing a single 784-element vector (shape 1x784).
  5. Displaying the Image: The plt.imshow() function is used to visualize the image. The cmap="gray" argument ensures that the image is displayed in grayscale, as MNIST images are single-channel grayscale images. We use squeeze() to remove the singleton channel dimension (1x28x28 becomes 28x28), and cpu() to move the tensor back to the CPU so that matplotlib can read it.
  6. Evaluating the Image with the Model: We set the model to evaluation mode using model.eval(). This is important because, during evaluation, behaviors like dropout are turned off, ensuring the model performs consistently.
  7. Making a Prediction: Inside a torch.no_grad() context (which disables gradient computation to save memory and computation), we pass the flattened image through the model using model(image_flattened). The output is a tensor of raw scores (logits) for each class. We use output.argmax(dim=1) to get the index of the maximum score, which represents the predicted class.
  8. Displaying the Prediction: Finally, we print the predicted label, which is the class with the highest score (and therefore the highest probability if softmax were applied to the scores).

The output of the code above is as follows: