What Is Generative AI?

Generative AI is a collective term for AI technologies that learn patterns from large amounts of data and generate new content such as text, images, audio, and video. Since ChatGPT and image generation AI reached the general public in 2022, it has spread rapidly and is now used in business, education, creative work, and many other fields.

Target audience: People who are newly interested in AI, or who want an overview of generative AI.

Estimated learning time: 15 minutes to read

Prerequisites: None

The Difference Between Generative AI and Traditional AI (Discriminative AI)

AI systems can be broadly divided into two types: discriminative AI and generative AI.

Discriminative AI classifies or identifies input data. It makes judgments like “Is this image a cat or a dog?” or “Is this email spam or not?”

Generative AI creates new data based on the patterns it has learned. It responds to requests like “Create a new image of a cat” or “Write an email automatically.”

| Comparison | Discriminative AI | Generative AI |
| --- | --- | --- |
| Purpose | Classify or identify data | Generate new data |
| Output | Labels, probabilities, scores | Text, images, audio, video |
| Representative examples | Image classification, spam filters, face recognition | ChatGPT, DALL-E, Stable Diffusion |
| Learning method | Learns from labeled data | Learns the distribution of patterns |
| Main applications | Quality control, medical diagnosis, search | Text writing, image generation, code completion |
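The contrast in the comparison above can be sketched with a toy example. This is only an illustration under simplifying assumptions, not how production systems work: the weights are made-up data, a learned threshold stands in for a discriminative model, and a fitted Gaussian stands in for a generative model. All function names are hypothetical.

```python
import random
import statistics

# Hypothetical toy data: (body weight in kg, label).
data = [(3.8, "cat"), (4.2, "cat"), (4.5, "cat"), (5.0, "cat"),
        (18.0, "dog"), (22.5, "dog"), (25.0, "dog"), (30.0, "dog")]

def fit_threshold(samples):
    """Discriminative side: pick the weight threshold with the fewest training errors."""
    values = sorted(value for value, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        # Count samples whose label disagrees with the rule "heavier than threshold means dog".
        return sum((label == "dog") != (value > threshold) for value, label in samples)

    return min(candidates, key=errors)

def classify(value, threshold):
    """Output of the discriminative model: a label."""
    return "dog" if value > threshold else "cat"

def fit_gaussian(samples, label):
    """Generative side: estimate the distribution (mean, standard deviation) of one class."""
    values = [value for value, sample_label in samples if sample_label == label]
    return statistics.mean(values), statistics.stdev(values)

def generate(mean, std, rng):
    """Output of the generative model: a new data point sampled from the learned distribution."""
    return rng.gauss(mean, std)

threshold = fit_threshold(data)
print(classify(6.0, threshold))            # prints: cat

cat_mean, cat_std = fit_gaussian(data, "cat")
new_weight = generate(cat_mean, cat_std, random.Random(0))
print(round(new_weight, 2))                # a new, made-up cat weight
```

The first model can only answer "which label?", while the second can produce values that were never in the training data but resemble it.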

Generative AI is being applied across multiple modalities (types of data).

Text generation AI produces text in response to an input prompt (instruction).

  • Writing, summarizing, and translating text
  • Automatic code generation and debugging assistance
  • Providing information in conversational format (chatbots)

Examples: ChatGPT (OpenAI), Claude (Anthropic), Gemini (Google)

Image generation AI creates new images from text descriptions (prompts) or reference images.

  • Generating images from text (Text-to-Image)
  • Style transfer of existing images
  • Image editing and completion (inpainting)

Examples: DALL-E (OpenAI), Stable Diffusion (Stability AI), Midjourney

Music and audio generation AI creates music, speech, and other audio from text or musical instructions.

  • Generating music from text
  • Voice cloning and conversion
  • Speech synthesis (Text-to-Speech)

Examples: Suno, Udio, ElevenLabs

Video generation AI creates video content from text or images.

  • Generating short videos from text (Text-to-Video)
  • Converting still images to video
  • Video editing and completion

Examples: Sora (OpenAI), Runway, Pika

The development of generative AI accelerated when three elements came together: algorithms, data, and computing power.

timeline
    title Key Milestones in Generative AI
    2014 : GAN introduced (Ian Goodfellow)
    2017 : Transformer paper "Attention Is All You Need"
    2018 : BERT (Google) / GPT-1 (OpenAI)
    2019 : GPT-2
    2020 : GPT-3 (175 billion parameters)
    2021 : DALL-E / Codex
    2022 : ChatGPT / Stable Diffusion / Midjourney
    2023 : GPT-4 / Claude 2 / Llama 2
    2024 : Claude 3 / GPT-4o / Gemini 1.5
    2025 : Claude 4 / OpenAI o3 / Rise of reasoning models
    2026 : Practical deployment phase of AI agents
| Year | Event | Significance |
| --- | --- | --- |
| 2014 | GAN (Generative Adversarial Network), introduced by Ian Goodfellow | Two competing networks produce high-quality generation |
| 2017 | Transformer paper “Attention Is All You Need” (Vaswani et al.) | A parallelizable architecture emerges, becoming the foundation for large-scale models |
| 2018 | BERT (Google), GPT-1 (OpenAI) | The paradigm of pre-training + fine-tuning is established |
| 2019 | GPT-2 | High-quality text generation ability is first widely recognized |
| 2020 | GPT-3 (175 billion parameters) | Demonstrates the versatility to handle diverse tasks with few examples |
| 2021 | DALL-E, Codex | Ability to generate images and code from text is demonstrated |
| 2022 | ChatGPT, Stable Diffusion, Midjourney | Era begins where ordinary users use generative AI daily |
| 2023 | GPT-4, Claude 2, Llama 2 | Dramatic capability improvement and rise of open-weight models |
| 2024 | Claude 3, GPT-4o, Gemini 1.5 | Multimodal (integration of text, images, audio) becomes mainstream |
| 2025 | Claude 4, OpenAI o3, rise of reasoning models | “Think before answering” reasoning models are used for complex problem solving |
| 2026 | Practical deployment phase of AI agents | Multiple AIs collaborate to autonomously execute complex tasks |

Why Generative AI Is Advancing So Rapidly Now

The rapid advancement of generative AI results from three elements aligning simultaneously.

graph TD
    A["Computing Power\nGPU/TPU advances\nCloud infrastructure development"] --> D["Rapid advancement\nof generative AI"]
    B["Data\nVast amounts of text\nand images on the internet"] --> D
    C["Algorithms\nThe Transformer\nQuality improvement via RLHF"] --> D

Computing Power: Advances in GPU/TPU performance and cloud infrastructure development have made it possible to train models with hundreds of billions of parameters.
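To give a sense of the scale involved, here is a rough back-of-the-envelope calculation of how much memory is needed just to store a model's weights. It assumes 16-bit (2-byte) or 32-bit (4-byte) weights and ignores everything else training requires (gradients, optimizer state, activations), so treat it as a lower bound rather than a real hardware estimate. The function name is hypothetical.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Memory needed to store the weights alone, in gigabytes (1 GB = 10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9

gpt3_params = 175_000_000_000  # GPT-3 parameter count cited in the timeline

print(weight_memory_gb(gpt3_params))      # 350.0 (16-bit weights)
print(weight_memory_gb(gpt3_params, 4))   # 700.0 (32-bit weights)
```

Even at 16 bits per weight, the result is far more than a single consumer GPU can hold, which is why such models are trained and served across many accelerators.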

Data: The vast amounts of text, image, and audio data accumulated on the internet can now be used as training material.

Algorithms: The emergence of the Transformer architecture and the spread of reinforcement learning from human feedback (RLHF) have made output quality high enough for practical use.

  • Generative AI is a collective term for AI technology that learns data patterns and generates new content
  • While discriminative AI “classifies and identifies,” generative AI “creates new data”
  • Supported modalities are rapidly expanding: text, images, music, and video
  • Since the 2017 Transformer paper, generative AI has developed rapidly through the combination of computing power, data, and algorithms

Q: What’s the difference between generative AI and “regular AI”?

A: What people usually mean by “regular AI” is discriminative AI, which classifies data and makes predictions. Generative AI, by contrast, generates new data (text, images, etc.) from learned patterns. Both are types of AI, but their purposes and outputs are fundamentally different.

Q: Do I need specialized knowledge to use generative AI?

A: Products like ChatGPT, Claude, and Midjourney can be used from a browser without specialized knowledge. Technical knowledge is needed for development via API or fine-tuning custom models, but no special skills are required just to use them.

Q: How can generative AI “create new things”?

A: Generative AI learns statistical patterns from large amounts of data and probabilistically generates new data that follows those patterns. Rather than creating something truly “original,” it produces new combinations based on the distribution of its training data.
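The idea of "learn statistical patterns, then sample from them" can be illustrated with a deliberately tiny model. The sketch below is a word-level bigram (Markov chain) generator, which is far simpler than a modern LLM and is used here only as an assumption-laden stand-in. The corpus and function names are made up for illustration.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Learn the pattern: for each word, record every word that followed it."""
    words = text.split()
    followers = defaultdict(list)
    for current, following in zip(words, words[1:]):
        followers[current].append(following)
    return followers

def generate_text(followers, start, max_words, rng):
    """Generate: repeatedly sample the next word in proportion to how often it was seen."""
    output = [start]
    while len(output) < max_words:
        options = followers.get(output[-1])
        if not options:  # the last word was never followed by anything
            break
        output.append(rng.choice(options))
    return " ".join(output)

corpus = "the cat sat on the mat . the dog sat on the rug . the cat saw the dog ."
model = train_bigram_model(corpus)
generated = generate_text(model, "the", 8, random.Random(42))
print(generated)
```

Every adjacent word pair in the output was seen during training, yet the sentence as a whole may never have appeared in the corpus. That is the sense in which the output is a new combination drawn from the training distribution.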

Q: What’s the difference between a GAN and an LLM?

A: A GAN (Generative Adversarial Network) is a method that produces high-quality images and other outputs through competition between a generator network and a discriminator network. An LLM (Large Language Model) is a language model with a very large number of parameters, typically based on the Transformer, that specializes in generating and understanding text. Both are forms of generative AI, but their architectures and strengths differ.
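For readers who want to see the difference in symbols, the two training objectives are commonly written as follows. These are the standard textbook forms, given here as a sketch; the notation (generator G, discriminator D, noise z, model parameters θ) is introduced for illustration and does not appear elsewhere in this article.

```latex
% GAN: the discriminator D tries to tell real data x from generated data G(z),
% while the generator G tries to fool it (a minimax game).
\[
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
+ \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
\]

% Autoregressive language model: the probability of a token sequence is factored
% into next-token predictions, each conditioned on the tokens before it.
\[
p_\theta(x_1, \dots, x_T) = \prod_{t=1}^{T} p_\theta\bigl(x_t \mid x_1, \dots, x_{t-1}\bigr)
\]
```

In short, a GAN learns through a contest between two networks, while an LLM of this kind learns by predicting the next token.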


Next step: Transformer Models