What Is Generative AI?
Generative AI is a collective term for AI technology that learns patterns from large amounts of data and generates new data such as text, images, audio, and video. Since ChatGPT and image generation AI became popular in 2022, it has spread rapidly through society and is now used in many fields, including business, education, and creative work.
Target audience: Those who are just getting interested in AI, or those who want to get an overview of generative AI.
Estimated learning time: 15 minutes to read
Prerequisites: None
The Difference Between Generative AI and Traditional AI (Discriminative AI)
There are two broad types of AI: discriminative AI and generative AI.
Discriminative AI classifies or identifies input data. It makes judgments like “Is this image a cat or a dog?” or “Is this email spam or not?”
Generative AI creates new data based on the patterns it has learned. It generates things like “Create a new image of a cat” or “Automatically write an email.”
| Comparison | Discriminative AI | Generative AI |
|---|---|---|
| Purpose | Classify or identify data | Generate new data |
| Output | Labels, probabilities, scores | Text, images, audio, video |
| Representative examples | Image classification, spam filters, face recognition | ChatGPT, DALL-E, Stable Diffusion |
| Learning method | Learns from labeled data | Learns the distribution of patterns |
| Main applications | Quality control, medical diagnosis, search | Text writing, image generation, code completion |
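The contrast in the table can be sketched with toy code: a rule-based classifier that outputs a label (discriminative) versus a tiny bigram model that samples new text (generative). This is a minimal illustration only; the spam keywords and the training sentence are invented, whereas real systems learn these patterns from large datasets.

```python
import random

# Discriminative AI: maps input data to a label.
# (Toy rule-based "spam filter" standing in for a trained classifier.)
def classify_email(text: str) -> str:
    spam_words = {"prize", "winner", "free"}
    return "spam" if set(text.lower().split()) & spam_words else "not spam"

# Generative AI: produces new data that follows learned patterns.
# (Toy bigram model standing in for a large language model.)
def train_bigrams(corpus: str) -> dict:
    words = corpus.split()
    model = {}
    for a, b in zip(words, words[1:]):
        model.setdefault(a, []).append(b)  # record which word follows which
    return model

def generate(model: dict, start: str, length: int = 5) -> str:
    out = [start]
    for _ in range(length):
        candidates = model.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))  # sample the next word
    return " ".join(out)

print(classify_email("You are a winner of a free prize"))  # spam
model = train_bigrams("the cat sat on the mat and the cat slept")
print(generate(model, "the"))  # a new word sequence following learned patterns
```

The classifier can only answer a fixed question about its input; the generator produces sequences that never appeared verbatim in its training text, which is the essence of the distinction in the table.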
What Generative AI Can Do
Generative AI is being applied across multiple modalities (types of data).
Text Generation
Text generation AI generates sentences based on an input prompt (instruction).
- Writing, summarizing, and translating text
- Automatic code generation and debugging assistance
- Providing information in conversational format (chatbots)
Examples: ChatGPT (OpenAI), Claude (Anthropic), Gemini (Google)
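Under the hood, a text generation model produces output one token at a time by sampling from a probability distribution it computes over candidate next tokens. The sketch below illustrates just that sampling step; the softmax-with-temperature technique is standard, but the candidate words and their scores here are made up for illustration (a real model scores tens of thousands of tokens with a neural network).

```python
import math
import random

def sample_next_token(scores: dict, temperature: float = 1.0) -> str:
    """Pick one token by sampling from softmax(scores / temperature)."""
    tokens = list(scores)
    weights = [math.exp(scores[t] / temperature) for t in tokens]
    return random.choices(tokens, weights=weights)[0]

# Hypothetical model scores for the word after "The cat sat on the":
scores = {"mat": 2.0, "sofa": 1.0, "moon": -1.0}

print(sample_next_token(scores))  # usually "mat", occasionally "sofa"
```

Lower temperature makes the pick nearly deterministic (the highest-scoring token almost always wins); higher temperature makes the output more varied. Many LLM APIs expose this same dial as a `temperature` parameter.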
Image Generation
Image generation AI creates new images from text descriptions (prompts) or reference images.
- Generating images from text (Text-to-Image)
- Style transfer of existing images
- Image editing and completion (inpainting)
Examples: DALL-E (OpenAI), Stable Diffusion (Stability AI), Midjourney
Music and Audio Generation
Music generation AI creates compositions and audio from text or musical instructions.
- Generating music from text
- Voice cloning and conversion
- Speech synthesis (Text-to-Speech)
Examples: Suno, Udio, ElevenLabs
Video Generation
Video generation AI creates video content from text or images.
- Generating short videos from text (Text-to-Video)
- Converting still images to video
- Video editing and completion
Examples: Sora (OpenAI), Runway, Pika
The History of Generative AI
The development of generative AI accelerated when three elements came together: algorithms, data, and computing power.
```mermaid
timeline
    title Key Milestones in Generative AI
    2014 : GAN introduced (Ian Goodfellow)
    2017 : Transformer paper "Attention Is All You Need"
    2018 : BERT (Google) / GPT-1 (OpenAI)
    2019 : GPT-2
    2020 : GPT-3 (175 billion parameters)
    2021 : DALL-E / Codex
    2022 : ChatGPT / Stable Diffusion / Midjourney
    2023 : GPT-4 / Claude 2 / Llama 2
    2024 : Claude 3 / GPT-4o / Gemini 1.5
    2025 : Claude 3.5/4 / o3 / Rise of reasoning models
    2026 : Practical deployment phase of AI agents
```

Key Milestones
| Year | Event | Significance |
|---|---|---|
| 2014 | GAN (Generative Adversarial Network) — Ian Goodfellow | Two competing networks produce high-quality generation |
| 2017 | Transformer paper “Attention Is All You Need” — Vaswani et al. | A parallelizable architecture emerges, becoming the foundation for large-scale models |
| 2018 | BERT (Google), GPT-1 (OpenAI) | The paradigm of pre-training + fine-tuning is established |
| 2019 | GPT-2 | High-quality text generation ability is first widely recognized |
| 2020 | GPT-3 (175 billion parameters) | Demonstrates versatility to handle diverse tasks with few examples |
| 2021 | DALL-E, Codex | Ability to generate images and code from text is demonstrated |
| 2022 | ChatGPT, Stable Diffusion, Midjourney | Era begins where ordinary users use generative AI daily |
| 2023 | GPT-4, Claude 2, Llama 2 | Dramatic capability improvement and rise of open-source models |
| 2024 | Claude 3, GPT-4o, Gemini 1.5 | Multimodal (integration of text, images, audio) becomes mainstream |
| 2025 | Claude 3.5/4, o3 (OpenAI), rise of reasoning models | "Think before answering" reasoning models used for complex problem solving |
| 2026 | Practical deployment phase of AI agents | Multiple AIs collaborate to autonomously execute complex tasks |
Why Generative AI Is Advancing So Rapidly Now
The rapid advancement of generative AI results from three elements aligning simultaneously.
```mermaid
graph TD
    A["Computing Power\nGPU/TPU advances\nCloud infrastructure development"] --> D["Rapid advancement\nof generative AI"]
    B["Data\nVast amounts of text\nand images on the internet"] --> D
    C["Algorithms\nThe Transformer\nQuality improvement via RLHF"] --> D
```

Computing Power: Advances in GPU/TPU performance and cloud infrastructure development have made it possible to train models with hundreds of billions of parameters.
Data: The vast amounts of text, image, and audio data accumulated on the internet can now be used as training material.
Algorithms: The emergence of the Transformer architecture and the spread of reinforcement learning from human feedback (RLHF) have made practically high-quality generation possible.
Summary
- Generative AI is a collective term for AI technology that learns data patterns and generates new content
- While discriminative AI “classifies and identifies,” generative AI “creates new data”
- Supported modalities are rapidly expanding: text, images, music, and video
- Starting from the 2017 Transformer paper, it has rapidly developed through the combination of computing power, data, and algorithms
Frequently Asked Questions
Q: What’s the difference between generative AI and “regular AI”?
A: What was widely used as “regular AI” was discriminative AI, which classifies and predicts data. Unlike discriminative AI, generative AI generates new data (text, images, etc.) from learned patterns. Both are types of AI, but their purposes and outputs are fundamentally different.
Q: Do I need specialized knowledge to use generative AI?
A: Products like ChatGPT, Claude, and Midjourney can be used from a browser without specialized knowledge. Technical knowledge is needed for development via API or fine-tuning custom models, but no special skills are required just to use them.
Q: How can generative AI “create new things”?
A: Generative AI learns statistical patterns from large amounts of data and probabilistically generates new data that follows those patterns. It doesn’t create something truly “original” — it generates new combinations based on the distribution of its training data.
Q: What’s the difference between a GAN and an LLM?
A: A GAN (Generative Adversarial Network) is a method that produces high-quality images and other outputs through competition between a generator network and a discriminator network. An LLM (Large Language Model) is a Transformer-based model specialized for generating and understanding text. Both are forms of generative AI, but their architectures and strengths differ.
Next step: Transformer Models