Generate New Fashion Designs with a Generative Adversarial Network (GAN)

19 Mar 2021  
In this article, we show you how to build a Generative Adversarial Network (GAN) for fashion design generation.
Here we load the data, define the generator and discriminator networks, and set up the loss function and optimizers.


The availability of datasets like DeepFashion opens up new possibilities for the fashion industry. In this series of articles, we’ll showcase an AI-powered deep learning system that can revolutionize the fashion design industry by helping us better understand customers’ needs.

In this project, we’ll use:

We are assuming that you are familiar with the concepts of deep learning, as well as with Jupyter Notebooks and TensorFlow. If you’re new to Jupyter Notebooks, start with this tutorial. You are welcome to download the project code.

In the previous article, we evaluated and improved our deep network performance. In this article, we’ll work on building, training, and testing a Generative Adversarial Network (GAN) — the network we’ll then use to generate new clothing images and designs.

The Power of Predicting New Fashion Images

AI can help us to not only predict the category of a clothing item, but also to create computer-generated images of similar-looking items. This can be pretty handy for retailers and fashion designers who strive to create personalized clothes or predict broader fashion trends.

Until the creation of GANs, generating realistic fashion images was a challenging task because of the sheer volume of data in each image. Images tend to be high resolution, resulting in many pixels. Plus, each pixel carries three channel values: red, green, and blue (RGB). GANs provide researchers with a viable method of generating and verifying all this data.
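To make the data volume concrete, here is a small arithmetic sketch. The 64 x 64 size is the training resolution used later in this article; the 1024 x 1024 size is only an illustrative assumption for a high-resolution photo.

```python
def values_per_image(height, width, channels=3):
    """Number of scalar values needed to describe one image."""
    return height * width * channels

# 64 x 64 is the training resolution used later in this article.
small = values_per_image(64, 64)
# 1024 x 1024 is an assumed, illustrative high-resolution size.
large = values_per_image(1024, 1024)

print(small)  # 12288
print(large)  # 3145728
```

Even at the small training resolution, the generator has to produce more than twelve thousand coordinated values per image.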

Building a GAN

A GAN is a popular model for unsupervised machine learning in which two neural networks, a generator and a discriminator, compete with each other. The generator’s role is to generate images from the random noise it takes as input. The discriminator’s task is to detect whether the images it sees are real (taken from the dataset) or fake (produced by the generator). This process continues for several epochs, ideally until the discriminator can no longer reliably tell fake images from real ones. At that point, the generator has become sufficiently skilled in generating images similar to those in the original dataset.

Image 1
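The data flow described above can be sketched with two toy networks. This is only an illustration: toy_generator and toy_discriminator are hypothetical stand-ins made of single linear layers, far smaller than the convolutional networks built later in this article, and the sizes 8 and 16 are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy stand-ins for the real generator and discriminator.
toy_generator = nn.Sequential(nn.Linear(8, 16), nn.Tanh())
toy_discriminator = nn.Sequential(nn.Linear(16, 1), nn.Sigmoid())

noise = torch.randn(4, 8)                  # 4 random latent vectors
fake_samples = toy_generator(noise)        # generator maps noise to "data"
scores = toy_discriminator(fake_samples)   # estimated probability of "real"

print(fake_samples.shape)  # torch.Size([4, 16])
print(scores.shape)        # torch.Size([4, 1])
```

During training, the discriminator is pushed to score real samples near 1 and generated samples near 0, while the generator is pushed to raise the scores of its own samples.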

Building a GAN will include the following stages:

  1. Initializing the network parameters and loading data
  2. Building the generator
  3. Building the discriminator

We’ll use the PyTorch library for building our GAN. This library is fast, and it doesn’t require a lot of computational power.

To install PyTorch with CUDA 10.0 using Conda:

# CUDA 10.0
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
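After installation, a quick sanity check along these lines can confirm that PyTorch imports correctly and whether it can see a GPU. This is a minimal sketch; the version string and the CUDA result depend on your machine.

```python
import torch

# Report the installed version and whether a CUDA device is visible.
version = torch.__version__
has_cuda = torch.cuda.is_available()

print("PyTorch version:", version)
print("CUDA available:", has_cuda)
```

If CUDA is reported as unavailable, the code in this article still runs, because the device selection below falls back to the CPU.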

Initializing GAN Parameters and Loading Data

The two convolutional neural networks (CNNs) that constitute a GAN are built from convolution, batch normalization, and LeakyReLU layers for the discriminator, and transposed convolution (deconvolution), batch normalization, and ReLU layers for the generator.

Before starting to build our generator and discriminator networks, let’s set some parameters and load the fashion image dataset that will be used to train and test the network.

First, we import some dependencies.

from __future__ import print_function
#%matplotlib inline
import argparse
import os
import random
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.optim as optim
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML

Next, we set some random seeds to achieve reproducibility:

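# Set random seed for reproducibility
manualSeed = 999
#manualSeed = random.randint(1, 10000) # use if you want new results
print("Random Seed: ", manualSeed)
# Seed both Python's and PyTorch's random number generators
random.seed(manualSeed)
torch.manual_seed(manualSeed)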
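Choosing a seed value only helps if it is actually passed to the random number generators. The sketch below, which uses a hypothetical helper named sample_with_seed, shows that seeding Python's random module and PyTorch with the same value reproduces the same draws on the CPU.

```python
import random
import torch

def sample_with_seed(seed):
    """Seed both generators, then draw one Python float and one tensor."""
    random.seed(seed)
    torch.manual_seed(seed)
    return random.random(), torch.randn(3)

py_a, t_a = sample_with_seed(999)
py_b, t_b = sample_with_seed(999)

print(py_a == py_b)           # True
print(torch.equal(t_a, t_b))  # True
```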

Then, we set some important parameters, such as the number of feature maps, input image size, batch size, number of epochs, and learning rate.

# Root directory for dataset
dataroot = r"C:\Users\abdul\Desktop\ContentLab\P2\DeepFashion\Train"

# Number of workers for dataloader
workers = 2

# Batch size during training
batch_size = 128

# Spatial size of training images. All images will be resized to this
#   size using a transformer.
image_size = 64

# Number of channels in the training images. For color images this is 3
nc = 3

# Size of z latent vector (i.e. size of generator input)
nz = 100

# Size of feature maps in generator
ngf = 64

# Size of feature maps in discriminator
ndf = 64

# Number of training epochs
num_epochs = 40

# Learning rate for optimizers
lr = 0.0002

# Beta1 hyperparam for Adam optimizers
beta1 = 0.5

# Number of GPUs available. Use 0 for CPU mode.
ngpu = 1

Now we can load our data using dataloader and show a sample of that data.

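# Create the dataset: resize and crop to image_size, convert to tensors,
# then normalize each channel from [0, 1] to [-1, 1]
dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))
# Create the dataloader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)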

# Decide which device we want to run on
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")

# Plot some training images
real_batch = next(iter(dataloader))
plt.title("Training Images")
plt.imshow(np.transpose(vutils.make_grid(real_batch[0].to(device)[:64], padding=2, normalize=True).cpu(),(1,2,0)))

Image 2
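The Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) transform used when loading the data computes (value - mean) / std per channel. Assuming the pixel values have already been scaled to [0, 1] (which is what transforms.ToTensor does for 8-bit images), this maps them to [-1, 1]. A plain-Python sketch of that arithmetic, with hypothetical helper names:

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization: (value - mean) / std."""
    return (value - mean) / std

def denormalize(value, mean=0.5, std=0.5):
    """Inverse mapping, useful when displaying images."""
    return value * std + mean

for v in (0.0, 0.5, 1.0):
    print(v, "->", normalize(v))
# 0.0 -> -1.0
# 0.5 -> 0.0
# 1.0 -> 1.0
```

This is also why the plotting call above passes normalize=True to make_grid: the values have to be shifted back into a displayable range.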

Finally, we’ll use the function below to initialize weights for both the generator and discriminator networks.

# custom weights initialization 
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)
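To see what this kind of initialization does, the sketch below applies an equivalent function (named init_dcgan_weights here to keep the example self-contained) to a small, hypothetical two-layer block and inspects the resulting statistics.

```python
import torch
import torch.nn as nn

def init_dcgan_weights(m):
    """Same logic as weights_init: match layers by class name."""
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)

torch.manual_seed(0)

# Hypothetical block used only to demonstrate the initializer.
block = nn.Sequential(
    nn.Conv2d(3, 64, 4, 2, 1, bias=False),
    nn.BatchNorm2d(64),
)
block.apply(init_dcgan_weights)  # apply() visits every submodule

conv, bn = block[0], block[1]
print("conv weight std:", conv.weight.std().item())    # close to 0.02
print("batchnorm weight mean:", bn.weight.mean().item())  # close to 1.0
print("batchnorm bias sum:", bn.bias.abs().sum().item())  # 0.0
```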

Building a Generator from Scratch

The generator CNN consists of transposed convolutional layers, batch norm layers, and ReLU activations. The input is a latent vector, z, which is drawn from a standard normal distribution, and the output is a 3 x 64 x 64 pixels RGB image.

class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d( nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d( ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d( ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d( ngf, nc, 4, 2, 1, bias=False),
            # Tanh keeps outputs in [-1, 1], matching the normalized inputs
            # (assumed here; it is the usual DCGAN output activation)
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        return self.main(input)
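The "state size" comments in the generator can be checked by hand. For a transposed convolution with dilation 1 and no output padding (the defaults assumed here), the output size along one spatial axis is (in - 1) * stride - 2 * padding + kernel. A short sketch with a hypothetical helper:

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output size of a transposed convolution along one spatial axis,
    assuming dilation=1 and output_padding=0."""
    return (size - 1) * stride - 2 * padding + kernel

# (kernel, stride, padding) of the five ConvTranspose2d layers above
layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

sizes = [1]  # the latent vector z enters as a 1 x 1 map with nz channels
for kernel, stride, padding in layers:
    sizes.append(conv_transpose_out(sizes[-1], kernel, stride, padding))

print(sizes)  # [1, 4, 8, 16, 32, 64]
```

Each stride-2 layer doubles the spatial size, which is how a 1 x 1 latent input grows into a 64 x 64 image.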

Now, we create the netG generator and show its structure.

# Create the generator
netG = Generator(ngpu).to(device)

# Handle multi-gpu if desired
if (device.type == 'cuda') and (ngpu > 1):
    netG = nn.DataParallel(netG, list(range(ngpu)))

# Apply the weights_init function to randomly initialize all weights
#  to mean=0, stdev=0.02.
netG.apply(weights_init)

# Print the model
print(netG)

Image 3

Building a Discriminator from Scratch

Our discriminator will be called netD, and it will be composed of strided convolution layers, LeakyReLU activations, and batch norm layers. Its input will be a 3 x 64 x 64 input image, and its output will be the scalar probability of the input being from the real dataset.

class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            # Sigmoid turns the final score into a probability in (0, 1)
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.main(input)

# Create the Discriminator
netD = Discriminator(ngpu).to(device)

# Handle multi-gpu if desired
if (device.type == 'cuda') and (ngpu > 1):
    netD = nn.DataParallel(netD, list(range(ngpu)))

# Apply the weights_init function to randomly initialize all weights
#  to mean=0, stdev=0.02.
netD.apply(weights_init)

# Print the model
print(netD)

Image 4
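The discriminator mirrors the generator: each strided convolution halves the spatial size until a single value per image remains. For a convolution with dilation 1 (the default assumed here), the output size along one axis is floor((in + 2 * padding - kernel) / stride) + 1. A short sketch with a hypothetical helper:

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution along one spatial axis (dilation=1)."""
    return (size + 2 * padding - kernel) // stride + 1

# (kernel, stride, padding) of the five Conv2d layers in the discriminator
layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

sizes = [64]  # input images are 64 x 64
for kernel, stride, padding in layers:
    sizes.append(conv_out(sizes[-1], kernel, stride, padding))

print(sizes)  # [64, 32, 16, 8, 4, 1]
```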

Initializing Loss and Optimizer for GAN

Before starting to train our GAN, we’ll set up its loss function and optimizers. In GANs, we usually use binary cross-entropy as the loss function because we have two classes in the output: Fake (0) and Real (1). We’ll use Adam optimizers, one per network, with a learning rate of 0.0002 and Beta1 of 0.5.

# Initialize BCELoss function
criterion = nn.BCELoss()

# Create batch of latent vectors that we will use to visualize
#  the progression of the generator
fixed_noise = torch.randn(64, nz, 1, 1, device=device)

# Establish convention for real and fake labels during training
real_label = 1.
fake_label = 0.

# Setup Adam optimizers for both G and D
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
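To get a feel for the binary cross-entropy loss chosen above, the sketch below evaluates the standard formula, -(y * log(p) + (1 - y) * log(1 - p)), for a single prediction in plain Python. The helper name bce is an assumption made for this example; nn.BCELoss averages this same quantity over a batch by default.

```python
import math

def bce(prediction, label):
    """Binary cross-entropy for one prediction strictly between 0 and 1."""
    return -(label * math.log(prediction)
             + (1 - label) * math.log(1 - prediction))

# Discriminator says "90% real" for an image that is real (label 1)
print(round(bce(0.9, 1.0), 4))  # 0.1054
# Discriminator says "90% real" for an image that is fake (label 0)
print(round(bce(0.9, 0.0), 4))  # 2.3026
# Discriminator is undecided
print(round(bce(0.5, 1.0), 4))  # 0.6931
```

A confident correct answer costs little, a confident wrong answer costs a lot, and an undecided discriminator sits at about 0.69 for either label.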

Next Steps

In the next article, we’ll show you how to train the GAN for fashion design generation. Stay tuned!

This article is part of the series 'Deep Learning for Fashion Classification'


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Dr. Helwan is a machine learning and medical image analysis enthusiast.

His research interests include, but are not limited to, machine and deep learning in medicine, medical computational intelligence, biomedical image processing, and biomedical engineering and systems.

