What Is Machine Learning?
Machine learning (ML) is a computer science method for automatically learning patterns from data. Rather than a programmer writing explicit rules, a computer derives those rules itself when given large amounts of data.
The Difference From Traditional Programming
The essence of machine learning lies in the difference in approach compared with traditional programming.
| | Traditional Programming | Machine Learning |
|---|---|---|
| Input | Rules (written by humans) + Data | Data + Correct labels |
| Output | Answers | Rules (a model) |
| Strengths | Processing with clear logic | Complex pattern recognition |
| Example | Calculating sales tax | Deciding whether an image contains a cat |
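The contrast in the table can be sketched in a few lines of Python. In the first function a human writes the rule; in the second, the rule (here a single number, the rate) is estimated from example prices and the tax charged on them. The 10% rate, the function names, and the data are illustrative assumptions, not part of any real tax system.

```python
def sales_tax_by_rule(price: float) -> float:
    """Traditional programming: a human writes the rule."""
    tax_rate = 0.10  # hypothetical rate, hard-coded by the programmer
    return price * tax_rate


def learn_tax_rate(prices: list[float], taxes: list[float]) -> float:
    """Machine learning: estimate the rule (the rate) from labeled examples.

    Fits tax = rate * price by least squares through the origin:
    rate = sum(x * y) / sum(x * x).
    """
    numerator = sum(x * y for x, y in zip(prices, taxes))
    denominator = sum(x * x for x in prices)
    return numerator / denominator


# Illustrative training data: prices (inputs) and tax charged (correct labels).
prices = [100.0, 250.0, 400.0, 80.0]
taxes = [10.0, 25.0, 40.0, 8.0]

learned_rate = learn_tax_rate(prices, taxes)
print(f"learned rate: {learned_rate:.3f}")
print(f"predicted tax on 300: {learned_rate * 300:.2f}")
```

For a problem this simple the hand-written rule is clearly the better tool. The learned version only pays off when nobody can write the rule down directly.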
Understanding Through Analogy
Imagine teaching a child what a “cat” is.
- Traditional programming: Spelling out every rule — “the ears are pointed, it has whiskers, four legs…”
- Machine learning: Showing thousands of photos and repeatedly saying “this is a cat, this is a dog.” Eventually the child learns to judge “cat-ness” on its own.
Machine learning is most useful for problems whose rules are too complex to write out explicitly, such as spam detection, object recognition in images, and speech-to-text conversion.
The Three Types of Machine Learning
1. Supervised Learning
Supervised learning is a technique that trains on data with correct labels. A model is trained with “answers (supervision)” already provided.
- Examples: Spam filters (classifying emails as “spam” or “not spam”), house price prediction (predicting price from area, age, and other features)
- Representative algorithms: Linear Regression, Logistic Regression, Decision Tree, Random Forest, SVM (Support Vector Machine)
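As a minimal sketch of the train-then-predict cycle, the following uses a nearest-centroid classifier. This is a deliberately simple stand-in chosen for brevity, not one of the representative algorithms listed above. The two features per email and all data values are hypothetical.

```python
from math import dist


def train(features, labels):
    """Training: compute one centroid (mean point) per label."""
    sums, counts = {}, {}
    for x, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, x):
    """Prediction: return the label whose centroid is closest to x."""
    return min(model, key=lambda label: dist(model[label], x))


# Hypothetical labeled data: [number of links, count of the word "free"] per email.
features = [[8, 5], [7, 6], [9, 4], [1, 0], [0, 1], [2, 0]]
labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]

model = train(features, labels)
print(predict(model, [6, 5]))  # an unseen email with many links
print(predict(model, [1, 1]))  # an unseen email with few links
```

The labels are used only during `train`. At prediction time the model sees features alone, which is the defining shape of supervised learning.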
graph LR
D["Labeled Data"] --> M["Training"]
M --> P["Prediction Model"]
P --> R["Predictions on New Data"]
2. Unsupervised Learning
Unsupervised learning is a technique for automatically discovering patterns and structures in data that has no correct labels. Because no “answers” are provided, the model finds groups and features in the data on its own.
- Examples: Customer segmentation (grouping customers with similar purchase patterns), anomaly detection (finding behavior that deviates from normal patterns)
- Representative algorithms: KMeans clustering, PCA (Principal Component Analysis), DBSCAN
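To show what "finding groups without labels" can look like, here is a minimal pure-Python sketch of the k-means idea for customer segmentation. The customer data is hypothetical, and the starting centroids are fixed by hand so the run is reproducible. Library implementations choose starting points more carefully.

```python
from math import dist


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            [sum(coord) / len(cluster) for coord in zip(*cluster)] if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: [visits per month, average spend]. No labels are given.
customers = [[1, 10], [2, 12], [1, 11], [9, 80], [10, 85], [8, 78]]

centroids, clusters = kmeans(customers, centroids=[[1, 10], [9, 80]])
for center, members in zip(centroids, clusters):
    print(center, members)
```

Nothing in the input says which customers belong together. The grouping emerges from distances between the data points alone.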
3. Reinforcement Learning
Reinforcement learning is a technique where an agent learns through trial and error, interacting with an environment to maximize cumulative reward. There are no explicit correct labels; the reward signal from the outcome of each action is what drives learning.
- Examples: Game AI (mastering Go, chess, and video games to superhuman level), robot control (learning to walk or grasp objects)
- Analogy: It resembles how a child learns a video game. At first the controls are pressed at random; a rising score reinforces “good actions” and a game-over reinforces “actions to avoid.”
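The trial-and-error loop can be sketched with tabular Q-learning, one common reinforcement learning method, on a toy environment. The five-cell corridor, the reward of 1.0 at the right end, and all hyperparameter values below are assumptions made for illustration.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
LEFT, RIGHT = 0, 1    # the two available actions


def step(state, action):
    """Environment: move one cell; reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by acting in the environment and observing rewards."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Explore at random with probability epsilon (or when undecided),
            # otherwise exploit the action with the higher estimated value.
            if rng.random() < epsilon or q[state][LEFT] == q[state][RIGHT]:
                action = rng.randrange(2)
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            next_state, reward = step(state, action)
            # Q-learning update: nudge the estimate toward
            # reward + discounted best value of the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_table()
# Greedy policy for the non-goal states 0..3.
policy = ["left" if row[LEFT] > row[RIGHT] else "right" for row in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct. With these settings the learned values should favor "right" in every non-goal state, purely because those actions led to reward sooner.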
When to Use Machine Learning
Machine learning is a strong candidate when the following conditions apply:
- A large amount of data exists: More data generally means better accuracy
- Writing explicit rules is difficult: There are patterns that humans cannot easily articulate
- The environment changes over time: Continuously maintaining hand-written rules is costly
Time Complexity of Major Algorithms
Libraries like scikit-learn let you run ML algorithms in just two or three lines of code. However, using algorithms without understanding their internals makes it hard to find performance bottlenecks, understand what hyperparameters mean, or debug model behavior effectively.
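As a rough illustration of how little code a library call takes, the sketch below fits scikit-learn's `LogisticRegression` with default settings on a tiny one-feature dataset. It assumes scikit-learn is installed, and the data is made up for the example.

```python
from sklearn.linear_model import LogisticRegression

# Made-up data: one feature per sample, label 0 for small values, 1 for large ones.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

model = LogisticRegression().fit(X, y)              # training
predictions = model.predict([[0.5], [2.5]]).tolist()  # inference on new inputs
print(predictions)
```

The optimizer, the regularization strength, and the stopping criterion are all hidden behind the defaults here. That is convenient, and it is also why the internals are easy to overlook.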
Knowing the computational complexity of each algorithm is important for choosing the right one.
| Algorithm | Training Time Complexity | Inference Time Complexity | Notes |
|---|---|---|---|
| Linear Regression (OLS) | O(nm² + m³) | O(m) | n: samples, m: features |
| Linear Regression (SGD) | O(epoch × nm) | O(m) | epoch: number of passes |
| Logistic Regression | O(epoch × nm) | O(m) | |
| Decision Tree | O(n log(n) × m) | O(depth) | depth: tree depth |
| Random Forest | O(trees × n log(n) × m) | O(trees × depth) | trees: number of trees |
| SVM | O(n²m + n³) | O(n_sv × m) | n_sv: support vectors |
| k-Nearest Neighbors (kNN) | O(1) (no training) | O(nm) | |
| Naive Bayes | O(nm) | O(m) | |
| PCA | O(nm² + m³) | O(m²) | Inference cost assumes all m components are kept; O(m × d) when keeping d components |
| KMeans | O(i × k × nm) | O(km) | i: iterations, k: clusters |
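To make the table concrete, the formulas can be evaluated for a specific dataset size. The sketch below plugs hypothetical values of n, m, and the epoch count into three of the expressions above. Big-O notation drops constant factors, so the results are order-of-magnitude comparisons, not runtime predictions.

```python
def ols_training_ops(n: int, m: int) -> int:
    """Rough operation count for OLS linear regression: n*m^2 + m^3."""
    return n * m**2 + m**3


def sgd_training_ops(n: int, m: int, epochs: int) -> int:
    """Rough operation count for SGD linear regression: epochs * n * m."""
    return epochs * n * m


def knn_inference_ops(n: int, m: int) -> int:
    """Rough operation count for one kNN prediction: n * m."""
    return n * m


# Hypothetical dataset sizes, chosen only to illustrate the scaling.
n, m, epochs = 1_000_000, 100, 10

print(f"OLS training:  {ols_training_ops(n, m):.2e}")
print(f"SGD training:  {sgd_training_ops(n, m, epochs):.2e}")
print(f"kNN inference: {knn_inference_ops(n, m):.2e} per prediction")
```

Under these assumptions SGD needs roughly a tenth of the operations of OLS for training. kNN pays nothing up front but spends about n × m operations on every single prediction.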
Q: Are machine learning and AI the same thing?
A: No. AI (Artificial Intelligence) refers broadly to all techniques that replicate human intelligence on a computer. Machine learning is one approach to achieving AI. AI also includes rule-based systems and other methods that do not use machine learning.
Q: Do I need programming experience to use machine learning?
A: Yes, if the goal is implementation — basic Python knowledge is needed. However, no programming experience is required just to understand the concepts. Learning concepts first and then moving to implementation tends to produce better results.
Q: Is supervised or unsupervised learning more commonly used?
A: Supervised learning is used more widely in practice. Most real-world problems have a clear target to predict or classify, and the cost of labeling data is often worth the payoff in performance.