Generative adversarial networks (GANs)

What are generative adversarial networks?

Generative adversarial networks (GANs) are a class of machine learning models introduced by Ian Goodfellow and colleagues in 2014. GANs are generative models: they create new data instances that resemble your training data.

They are algorithmic architectures that use two neural networks, pitting one against the other (thus the “adversarial”) in order to generate new, synthetic instances of data that can pass for real data. They are used widely in image generation, video generation, and voice generation.

For example, GANs can create images that look like photographs of human faces, even though the faces don't belong to any real person. 

[Figure: generative adversarial networks. Source: GeeksforGeeks]

How do generative adversarial networks (GANs) work?

One neural network, called the generator, generates new data instances, while the other, the discriminator, evaluates them for authenticity; i.e. the discriminator decides whether each instance of data that it reviews belongs to the actual training dataset or not.

Let’s say we’re trying to do something more banal than mimic the Mona Lisa. We’re going to generate hand-written numerals like those found in the MNIST dataset, which is taken from the real world. The goal of the discriminator, when shown an instance from the true MNIST dataset, is to recognize it as authentic.

Meanwhile, the generator is creating new, synthetic images that it passes to the discriminator. It does so in the hopes that they, too, will be deemed authentic, even though they are fake. The goal of the generator is to generate passable hand-written digits: to lie without being caught. The goal of the discriminator is to identify images coming from the generator as fake.
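These two competing goals are commonly written as a single minimax objective. The formulation below is the standard one from the original GAN paper by Goodfellow et al. (2014), not something stated in this article. The symbols are introduced here for illustration: x is a real image, z is the generator's random input, G(z) is a generated image, and D(·) is the discriminator's estimated probability that its input is real.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator raises V by scoring real images near 1 and generated images near 0. The generator lowers V by producing images that the discriminator scores near 1.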

Here are the steps a GAN takes:

  • The generator takes in random numbers and returns an image.
  • This generated image is fed into the discriminator alongside a stream of images taken from the actual, ground-truth dataset.
  • The discriminator takes in both real and fake images and returns a probability for each one: a number between 0 and 1, with 1 representing a prediction of authenticity and 0 representing fake.
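The steps above can be sketched as a single forward pass. This is a minimal illustration using only the Python standard library, not a practical GAN: both networks are assumed to be single untrained layers with random weights, the "real" image is a random placeholder rather than an actual MNIST digit, and the names `generator`, `discriminator`, `NOISE_SIZE`, and `IMAGE_SIZE` are hypothetical.

```python
import math
import random

IMAGE_SIZE = 28 * 28   # an MNIST-sized image, flattened to one list
NOISE_SIZE = 16        # assumed length of the random input vector

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def generator(noise, weights):
    """Map random numbers to a flattened image with pixel values in (0, 1)."""
    return [sigmoid(sum(w * z for w, z in zip(row, noise))) for row in weights]

def discriminator(image, weights):
    """Return the estimated probability that `image` is real (1) rather than fake (0)."""
    return sigmoid(sum(w * p for w, p in zip(weights, image)))

rng = random.Random(0)

# Untrained, randomly initialised weights (one layer per network).
g_weights = [[rng.gauss(0.0, 0.1) for _ in range(NOISE_SIZE)]
             for _ in range(IMAGE_SIZE)]
d_weights = [rng.gauss(0.0, 0.1) for _ in range(IMAGE_SIZE)]

# Step 1: the generator takes in random numbers and returns an image.
noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_SIZE)]
fake_image = generator(noise, g_weights)

# Step 2: the fake image goes to the discriminator alongside a real one.
real_image = [rng.random() for _ in range(IMAGE_SIZE)]  # placeholder, not MNIST

# Step 3: the discriminator returns a probability for each image.
p_fake = discriminator(fake_image, d_weights)
p_real = discriminator(real_image, d_weights)
print(f"D(fake) = {p_fake:.3f}, D(real) = {p_real:.3f}")
```

Because nothing has been trained yet, the two probabilities carry no meaning at this point. Training is what pushes the score for real images toward 1 and the score for generated images toward 0.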

So you have a double feedback loop:

  • The discriminator is in a feedback loop with the ground truth of the images, which we know.
  • The generator is in a feedback loop with the discriminator.
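The two feedback loops can be made concrete with a toy training loop. The sketch below is a deliberately tiny, assumed setup rather than a realistic image GAN: the "images" are single numbers drawn from a normal distribution, the generator is the linear map G(z) = a * z + b, the discriminator is D(x) = sigmoid(w * x + c), and the gradients are derived by hand. The generator uses the common non-saturating loss, -log D(G(z)). All names and hyperparameters are hypothetical.

```python
import math
import random

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def train(steps=500, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters: G(z) = a * z + b
    w, c = 0.1, 0.0   # discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)   # toy stand-in for the ground-truth data
        z = rng.gauss(0.0, 1.0)        # random input to the generator
        x_fake = a * z + b

        # Loop 1: the discriminator learns from labels we know
        # (real = 1, fake = 0), minimising
        # -log D(x_real) - log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Loop 2: the generator learns only from the discriminator's
        # verdict on its output, minimising -log D(G(z)).
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (d_fake - 1.0) * w    # gradient with respect to x_fake
        a -= lr * grad_x * z
        b -= lr * grad_x
    return a, b, w, c

a, b, w, c = train()
print(f"generator: G(z) = {a:.2f} * z + {b:.2f}")
print(f"discriminator score at x = 4: {sigmoid(w * 4.0 + c):.2f}")
```

Note that the generator never sees the real data directly. Its only training signal is the gradient that flows back through the discriminator, which is what the second feedback loop refers to.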

What are generative adversarial networks used for?

1. Image-to-Image Translation

This is a catch-all category for papers presenting GANs that can handle many different image translation tasks.

Phillip Isola et al., in their 2016 paper titled “Image-to-Image Translation with Conditional Adversarial Networks”, demonstrate GANs, specifically their pix2pix approach, for many image-to-image translation tasks.

Examples include translation tasks such as:

  • Translation of semantic images to photographs of cityscapes and buildings.
  • Translation of satellite photographs to Google Maps.
  • Translation of photos from day to night.
  • Translation of black and white photographs to color.
  • Translation of sketches to color photographs.

2. 3D Object Generation

Jiajun Wu, et al. in their 2016 paper titled “Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling” demonstrate a GAN for generating new three-dimensional objects (e.g. 3D models) such as chairs, cars, sofas, and tables.

Matheus Gadelha, et al. in their 2016 paper titled “3D Shape Induction from 2D Views of Multiple Objects” use GANs to generate three-dimensional models given two-dimensional pictures of objects from multiple perspectives.

3. Clothing Translation

Donggeun Yoo, et al. in their 2016 paper titled “Pixel-Level Domain Transfer” demonstrate the use of GANs to generate photographs of clothing as may be seen in a catalog or online store, based on photographs of models wearing the clothing.

4. Photos to Emojis

Yaniv Taigman, et al. in their 2016 paper titled “Unsupervised Cross-Domain Image Generation” used a GAN to translate images from one domain to another, including from street numbers to MNIST handwritten digits, and from photographs of celebrities to what they call emojis or small cartoon faces.

5. Text-to-Image Translation (text2image)

Han Zhang et al., in their 2016 paper titled “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks”, demonstrate the use of GANs, specifically their StackGAN, to generate realistic-looking photographs from textual descriptions of simple objects like birds and flowers.

Scott Reed, et al. in their 2016 paper titled “Generative Adversarial Text to Image Synthesis” also provide an early example of text to image generation of small objects and scenes including birds, flowers, and more.

In another 2016 paper titled “Learning What and Where to Draw”, Scott Reed et al. expand upon this capability, using GANs both to generate images from text and to take bounding boxes and key points as hints about where to draw a described object, like a bird.

What are the applications of generative adversarial networks?

Some of the applications and use cases of GANs include:

Improving cybersecurity

One of the methods that hackers use is known as an adversarial attack. The attackers manipulate images by adding small, carefully crafted changes to them, tricking the neural network into producing wrong outputs and compromising the intended working of the algorithm.

It is possible to train generative adversarial networks to identify such instances of fraud. GANs can make deep learning models more robust and identify any malicious information that might be added to images by hackers.

Generating animation models

Generative adversarial networks can be used to automatically generate the 3D models needed in video games, animated movies, or cartoons. Given an existing dataset of 2D images, the network can create new 3D models, analyzing the 2D photos and quickly recreating the objects in 3D. This saves animators a lot of time and allows them to focus on other tasks.

Editing photographs

This goes beyond regular photo-editing enhancements. GANs can reconstruct images of faces to identify changes in features such as hair color, facial expression, or gender. They can even generate facial images of people at various ages.
