Autoencoders in Action

Learn how to implement an autoencoder step by step, from image reconstruction to hands-on experimentation in an interactive playground.

We'll cover the following...

Loading and preprocessing the data
Defining the architecture
Training the autoencoder
Reconstructing the image
Playground
- Insights from compression and reconstruction experiment
Conclusion

In this lesson, we'll apply an autoencoder to the MNIST dataset of handwritten digits. An autoencoder consists of two parts: an encoder, which compresses each input image into a compact representation, and a decoder, which reconstructs the image from that representation. Our focus will be on image reconstruction.

Target: Using PyTorch, we'll train an autoencoder that compresses the 784 input values of each image (28 × 28 pixels) into a representation with as few dimensions as possible.

By doing so, we aim to investigate whether this condensed representation is as informative as the original features.
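
To make the 784-to-fewer-values idea concrete, here is a minimal sketch of the encoder/decoder structure. It is only an illustration of the shapes involved, not the architecture defined later in this lesson: the class name TinyAutoencoder, the single linear layer on each side, the activations, and the latent size of 32 are all assumptions chosen for brevity.

```python
import torch
import torch.nn as nn


class TinyAutoencoder(nn.Module):
    """Illustrative autoencoder (hypothetical): 784 -> latent_dim -> 784."""

    def __init__(self, latent_dim: int = 32):
        super().__init__()
        # Encoder: compress a flattened 28x28 image to latent_dim values
        self.encoder = nn.Sequential(nn.Linear(784, latent_dim), nn.ReLU())
        # Decoder: expand the compressed code back to 784 values in [0, 1]
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 784), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


model = TinyAutoencoder(latent_dim=32)  # 32 is an assumed latent size
batch = torch.rand(8, 784)              # 8 random stand-ins for flattened images
code = model.encoder(batch)             # compressed representation, shape (8, 32)
reconstruction = model(batch)           # reconstruction, shape (8, 784)
print(code.shape, reconstruction.shape)
```

Lowering latent_dim forces the encoder to keep less information per image, which is the quantity this lesson sets out to push as low as possible.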

Loading and preprocessing the data

To begin, let's import the necessary libraries for performing image-related tasks and implementing neural networks using PyTorch. We'll use torch for tensor operations and network functionalities, torch.nn for building networks, torch.optim for optimization algorithms, torchvision for datasets and model architectures, and torchvision.transforms for image transformations.

A tensor is a multi-dimensional array that can represent scalars, vectors, matrices, or higher-dimensional data. It's the fundamental data structure in PyTorch for storing and processing inputs, weights, and outputs in deep learning.

To load the MNIST dataset, we can use the torchvision.datasets.MNIST class. Its root argument specifies the directory where the data is stored. To load the training set, we set train=True, and for the test set, we set train=False.

Once the dataset is loaded, we assign the MNIST image data to variables named x_train and x_test, and the corresponding labels to variables named y_train and y_test. To ensure compatibility with our model, we convert the image data to a floating-point type. We then divide the pixel values by 255, which scales them from the range [0, 255] to the range [0, 1].

import os

import matplotlib.pyplot as plt
import numpy as np

# x_train and y_train are the scaled images and labels from the loading step

# Set the grid size for displaying images
num_rows = 2
num_columns = 5
# Create a figure with subplots
fig, axes = plt.subplots(num_rows, num_columns, figsize=(4, 2))
# Loop over each digit category (0-9)
for category in range(10):
    # Get all indices of images belonging to the current category
    category_indices = np.where(y_train == category)[0]

    # Randomly select one image from this category
    random_index = np.random.choice(category_indices)
    image_np = np.array(x_train[random_index])

    # Determine the subplot row and column
    row = category // num_columns
    col = category % num_columns
    ax = axes[row, col]

    # Display the image in the subplot
    ax.imshow(image_np, cmap='gray')
    ax.axis('off')  # Hide the axes (ticks, labels, and frame)

# Adjust spacing between subplots once, after all images are drawn
plt.tight_layout()
# Make sure the output directory exists, then save a high-resolution PNG
os.makedirs("output", exist_ok=True)
plt.savefig("output/plot.png", dpi=300)
