<script type="application/ld+json">
{
 "@context": "https://schema.org",
 "@type": "FAQPage",
 "mainEntity": [{
   "@type": "Question",
   "name": "What are generative adversarial networks (GANs)?",
   "acceptedAnswer": {
     "@type": "Answer",
     "text": "Generative adversarial networks (GANs) are an exciting recent innovation in machine learning. GANs are generative models: they create new data instances that resemble your training data."
   }
 },{
   "@type": "Question",
   "name": "What is generative adversarial network use cases?",
   "acceptedAnswer": {
     "@type": "Answer",
     "text": "1. Image-to-Image Translation.
2. 3D Object Generation.
3. Clothing Translation.
4. Photos to Emojis.
5. Text-to-Image Translation (text2image)."
   }
 }]
}
</script>

Generative adversarial networks (GANs)

What are generative adversarial networks (GANs)?

Generative adversarial networks (GANs) are an exciting recent innovation in machine learning. GANs are generative models: they create new data instances that resemble your training data. 

They are algorithmic architectures that use two neural networks, pitting one against the other (thus the “adversarial”) in order to generate new, synthetic instances of data that can pass for real data. They are used widely in image generation, video generation, and voice generation.

For example, GANs can create images that look like photographs of human faces, even though the faces don't belong to any real person. 

How GANs Work?

One neural network, called the generator, generates new data instances, while the other, the discriminator, evaluates them for authenticity; i.e. the discriminator decides whether each instance of data that it reviews belongs to the actual training dataset or not.

Let’s say we’re trying to do something banaler than mimic the Mona Lisa. We’re going to generate hand-written numerals like those found in the MNIST dataset, which is taken from the real world. The goal of the discriminator, when shown an instance from the true MNIST dataset, is to recognize those that are authentic.

Meanwhile, the generator is creating new, synthetic images that it passes to the discriminator. It does so in the hopes that they, too, will be deemed authentic, even though they are fake. The goal of the generator is to generate passable hand-written digits: to lie without being caught. The goal of the discriminator is to identify images coming from the generator as fake.

Here are the steps a GAN takes:

  • The generator takes in random numbers and returns an image.
  • This generated image is fed into the discriminator alongside a stream of images taken from the actual, ground-truth dataset.
  • The discriminator takes in both real and fake images and returns probabilities, a number between 0 and 1, with 1 representing a prediction of authenticity and 0 representing fake.

So you have a double feedback loop:

  • The discriminator is in a feedback loop with the ground truth of the images, which we know.
  • The generator is in a feedback loop with the discriminator.

Generative adversarial network use cases

1. Image-to-Image Translation

This is a bit of a catch-all task, for those papers that present GANs that can do many image translation tasks.

Phillip Isola, et al. in their 2016 paper titled “Image-to-Image Translation with Conditional Adversarial Networks” demonstrate GANs, specifically their pix2pix approach for many image-to-image translation tasks.

Examples include translation tasks such as:

  • Translation of semantic images to photographs of cityscapes and buildings.
  • Translation of satellite photographs to Google Maps.
  • Translation of photos from day to night.
  • Translation of black and white photographs to color.
  • Translation of sketches to color photographs.

2. 3D Object Generation

Jiajun Wu, et al. in their 2016 paper titled “Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling” demonstrate a GAN for generating new three-dimensional objects (e.g. 3D models) such as chairs, cars, sofas, and tables.

Matheus Gadelha, et al. in their 2016 paper titled “3D Shape Induction from 2D Views of Multiple Objects” use GANs to generate three-dimensional models given two-dimensional pictures of objects from multiple perspectives.

3. Clothing Translation

Donggeun Yoo, et al. in their 2016 paper titled “Pixel-Level Domain Transfer” demonstrate the use of GANs to generate photographs of clothing as may be seen in a catalog or online store, based on photographs of models wearing the clothing.


4. Photos to Emojis

Yaniv Taigman, et al. in their 2016 paper titled “Unsupervised Cross-Domain Image Generation” used a GAN to translate images from one domain to another, including from street numbers to MNIST handwritten digits, and from photographs of celebrities to what they call emojis or small cartoon faces.

5. Text-to-Image Translation (text2image)

Han Zhang, et al. in their 2016 paper titled “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks” demonstrate the use of GANs, specifically their StackGAN to generate realistic looking photographs from textual descriptions of simple objects like birds and flowers.

Scott Reed, et al. in their 2016 paper titled “Generative Adversarial Text to Image Synthesis” also provide an early example of text to image generation of small objects and scenes including birds, flowers, and more.

In another 2016 paper titled “Learning What and Where to Draw” by Scott Reed, et al., they expanded upon this capability and use GANs to both generate images from text and use bounding boxes and key points as hints as to where to draw a described object, like a bird.



Thanks for reading! We hope you found this helpful.

Ready to level-up your business? Click here.

About Engati

Engati powers 45,000+ chatbot & live chat solutions in 50+ languages across the world.

We aim to empower you to create the best customer experiences you could imagine. 

So, are you ready to create unbelievably smooth experiences?

Check us out!

Generative adversarial networks (GANs)

October 14, 2020

Table of contents

Key takeawaysCollaboration platforms are essential to the new way of workingEmployees prefer engati over emailEmployees play a growing part in software purchasing decisionsThe future of work is collaborativeMethodology

What are generative adversarial networks (GANs)?

Generative adversarial networks (GANs) are an exciting recent innovation in machine learning. GANs are generative models: they create new data instances that resemble your training data. 

They are algorithmic architectures that use two neural networks, pitting one against the other (thus the “adversarial”) in order to generate new, synthetic instances of data that can pass for real data. They are used widely in image generation, video generation, and voice generation.

For example, GANs can create images that look like photographs of human faces, even though the faces don't belong to any real person. 

How GANs Work?

One neural network, called the generator, generates new data instances, while the other, the discriminator, evaluates them for authenticity; i.e. the discriminator decides whether each instance of data that it reviews belongs to the actual training dataset or not.

Let’s say we’re trying to do something banaler than mimic the Mona Lisa. We’re going to generate hand-written numerals like those found in the MNIST dataset, which is taken from the real world. The goal of the discriminator, when shown an instance from the true MNIST dataset, is to recognize those that are authentic.

Meanwhile, the generator is creating new, synthetic images that it passes to the discriminator. It does so in the hopes that they, too, will be deemed authentic, even though they are fake. The goal of the generator is to generate passable hand-written digits: to lie without being caught. The goal of the discriminator is to identify images coming from the generator as fake.

Here are the steps a GAN takes:

  • The generator takes in random numbers and returns an image.
  • This generated image is fed into the discriminator alongside a stream of images taken from the actual, ground-truth dataset.
  • The discriminator takes in both real and fake images and returns probabilities, a number between 0 and 1, with 1 representing a prediction of authenticity and 0 representing fake.

So you have a double feedback loop:

  • The discriminator is in a feedback loop with the ground truth of the images, which we know.
  • The generator is in a feedback loop with the discriminator.

Generative adversarial network use cases

1. Image-to-Image Translation

This is a bit of a catch-all task, for those papers that present GANs that can do many image translation tasks.

Phillip Isola, et al. in their 2016 paper titled “Image-to-Image Translation with Conditional Adversarial Networks” demonstrate GANs, specifically their pix2pix approach for many image-to-image translation tasks.

Examples include translation tasks such as:

  • Translation of semantic images to photographs of cityscapes and buildings.
  • Translation of satellite photographs to Google Maps.
  • Translation of photos from day to night.
  • Translation of black and white photographs to color.
  • Translation of sketches to color photographs.

2. 3D Object Generation

Jiajun Wu, et al. in their 2016 paper titled “Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling” demonstrate a GAN for generating new three-dimensional objects (e.g. 3D models) such as chairs, cars, sofas, and tables.

Matheus Gadelha, et al. in their 2016 paper titled “3D Shape Induction from 2D Views of Multiple Objects” use GANs to generate three-dimensional models given two-dimensional pictures of objects from multiple perspectives.

3. Clothing Translation

Donggeun Yoo, et al. in their 2016 paper titled “Pixel-Level Domain Transfer” demonstrate the use of GANs to generate photographs of clothing as may be seen in a catalog or online store, based on photographs of models wearing the clothing.


4. Photos to Emojis

Yaniv Taigman, et al. in their 2016 paper titled “Unsupervised Cross-Domain Image Generation” used a GAN to translate images from one domain to another, including from street numbers to MNIST handwritten digits, and from photographs of celebrities to what they call emojis or small cartoon faces.

5. Text-to-Image Translation (text2image)

Han Zhang, et al. in their 2016 paper titled “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks” demonstrate the use of GANs, specifically their StackGAN to generate realistic looking photographs from textual descriptions of simple objects like birds and flowers.

Scott Reed, et al. in their 2016 paper titled “Generative Adversarial Text to Image Synthesis” also provide an early example of text to image generation of small objects and scenes including birds, flowers, and more.

In another 2016 paper titled “Learning What and Where to Draw” by Scott Reed, et al., they expanded upon this capability and use GANs to both generate images from text and use bounding boxes and key points as hints as to where to draw a described object, like a bird.



Thanks for reading! We hope you found this helpful.

Ready to level-up your business? Click here.

Share

Continue Reading