# Introduction to Machine Learning: How We Arrive at Linear Regression

Before we talk about Linear Regression, we need to understand the bigger idea it belongs to: Machine Learning. Machine Learning is what lets today's applications recommend movies on Netflix, suggest products on Amazon, recognize faces on your phone, and even predict house prices or exam scores.

But what exactly is Machine Learning? It is a branch of Artificial Intelligence in which we teach computers to learn patterns from data instead of explicitly programming every rule.

## Traditional Programming vs. Machine Learning

In traditional programming, you give the computer rules + data → it gives you answers.

In Machine Learning, you give the computer data + answers → it learns the rules.

## Simple Analogy

Think of teaching a child:

Traditional programming: You say, “If you see 2 + 2, always answer 4.” You must manually define every rule.

Machine Learning: You show the child many examples: 1 + 1 = 2, 2 + 2 = 4, 3 + 3 = 6. Eventually, the child works out the rule behind them: “Oh… adding numbers follows a pattern.” That is, in essence, how Machine Learning works.
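The analogy can be sketched in a few lines of Python. This is only a toy illustration: the function names are made up for this example, and the "learning" step assumes the rule has the form y = k · x and estimates k from the three examples above.

```python
# Traditional programming: the programmer writes the rule by hand.
def double_by_rule(x):
    return x + x


# Machine learning (toy version): the rule is estimated from examples.
# Each pair is (input, answer), taken from the analogy: 1 + 1 = 2, 2 + 2 = 4, 3 + 3 = 6.
examples = [(1, 2), (2, 4), (3, 6)]


def learn_multiplier(pairs):
    """Estimate k in y = k * x by least squares through the origin."""
    numerator = sum(x * y for x, y in pairs)
    denominator = sum(x * x for x, _ in pairs)
    return numerator / denominator


k = learn_multiplier(examples)  # the "learned rule"


def double_by_learning(x):
    return k * x


print(double_by_rule(5))      # rule supplied by the programmer
print(double_by_learning(5))  # rule inferred from the examples
```

Both functions agree on an input that never appeared in the examples, which is the point: the second one was never told the rule, only shown data and answers.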

## The Goal of Machine Learning

The main goal is simple: to help machines learn patterns from data and make predictions on new, unseen data.

## Types of Machine Learning

There are three main types:

- Supervised Learning: The model learns from inputs (data) paired with outputs (correct answers). Examples: house size → house price; study hours → exam score. This is where Linear Regression belongs.
- Unsupervised Learning: The model is given data without answers and tries to find patterns on its own. Examples: grouping customers by behavior, clustering similar items together.
- Reinforcement Learning: The model learns by trial and error, guided by rewards and penalties. Examples: game-playing AI, robot navigation.

## From Machine Learning to Prediction Problems

Once we focus on supervised learning, we usually ask questions like: “Can we predict a number?” “Can we estimate a value?” “Can we forecast a future outcome?” When the answer we want is a number, these are called regression problems.

## What Is a Regression Problem?

In a regression problem, we try to predict a continuous numerical value. Examples: house price (e.g., 150,000), temperature (e.g., 28°C), exam score (e.g., 75%). This is different from classification, where we predict a category such as yes/no, spam/not spam, or dog/cat.
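The difference shows up in what a model returns. The sketch below is purely illustrative: both functions and their numbers (a 40-point baseline, 5 points per study hour, a 50% pass mark) are hypothetical and hard-coded rather than learned from data.

```python
def predict_exam_score(study_hours: float) -> float:
    """Regression: the output is a continuous number (a score in percent)."""
    # Hypothetical line: 40 points baseline plus 5 points per study hour.
    return 40.0 + 5.0 * study_hours


def predict_pass_or_fail(study_hours: float) -> str:
    """Classification: the output is one of a fixed set of categories."""
    # Hypothetical rule: a predicted score of 50% or more counts as a pass.
    return "pass" if predict_exam_score(study_hours) >= 50.0 else "fail"


print(predict_exam_score(7))    # 75.0
print(predict_pass_or_fail(7))  # pass
```

The regression function can return any value along a range, while the classification function can only ever return "pass" or "fail".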

## Enter Linear Regression

Now that we understand regression problems, we can introduce one of the simplest solutions: Linear Regression. Linear Regression is a supervised learning algorithm that predicts continuous values by finding a linear relationship between input and output variables.

## Why Linear Regression?

Many real-world relationships can be approximated by a straight line. Examples: more study hours → higher exam scores; bigger houses → higher prices; more advertising → more sales. These relationships often follow a pattern that can be simplified as: “As X increases, Y also increases (or decreases) in a predictable way.”

## The Core Idea of Linear Regression

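Linear Regression tries to draw a best-fit line through the data points. This line is used to understand patterns and to make predictions. Mathematically, it is written as y = mx + c, where x is the input, y is the predicted output, m is the slope of the line (how much y changes when x increases by one), and c is the intercept (the value of y when x is 0).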
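To give a feel for what "best-fit" means, here is a minimal sketch in plain Python. It assumes the usual least-squares definition of the best-fit line (the line that minimizes the sum of squared vertical distances to the points); the function name `fit_line` and the study-hours data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m * x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: chosen so the line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Made-up data: study hours and the exam scores that went with them.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

m, c = fit_line(hours, scores)
print(f"y = {m:.2f}x + {c:.2f}")  # y = 4.10x + 47.70

# Use the fitted line to predict the score for an unseen input (6 hours).
predicted = m * 6 + c
print(round(predicted, 1))
```

On this toy data the fitted slope says each extra hour of study is associated with roughly 4 more points, and the prediction for 6 hours comes out at about 72.3.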

Next, we will take a deep dive into Linear Regression. Buckle up!