# Introduction to Machine Learning: How We Arrive at Linear Regression
Before we talk about Linear Regression, we first need to understand the bigger idea it belongs to: Machine Learning. Machine Learning is what lets today's applications recommend movies on Netflix, suggest products on Amazon, recognize faces on your phone, and even predict house prices or exam scores.
But what exactly is Machine Learning? It is a branch of Artificial Intelligence in which we teach computers to learn patterns from data instead of explicitly programming every rule.
## Traditional Programming vs. Machine Learning
In traditional programming: you give the computer rules + data → it gives you answers.
In Machine Learning: you give the computer data + answers → it learns the rules.
## Simple Analogy
Think of teaching a child:
- Traditional programming: You say, “If you see 2 + 2, always answer 4.” You must manually define every rule.
- Machine Learning: You show the child many examples: 1 + 1 = 2, 2 + 2 = 4, 3 + 3 = 6. Eventually, the child learns the pattern: “Oh… adding numbers follows a pattern.” That is the essence of how Machine Learning works.
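
The analogy can be sketched in a few lines of plain Python. This is only an illustration: the function names `double_by_rule` and `learn_multiplier` are made up here, and the "learning" step assumes the pattern has the simple form y = k * x, which happens to fit the three examples above.

```python
# Traditional programming: the rule is written by hand.
def double_by_rule(x):
    return 2 * x

# Machine Learning: the rule is estimated from examples (data + answers).
# Each pair is (input, answer), taken from 1 + 1 = 2, 2 + 2 = 4, 3 + 3 = 6.
examples = [(1, 2), (2, 4), (3, 6)]

def learn_multiplier(pairs):
    """Least-squares estimate of k in y = k * x (assumed form of the rule)."""
    numerator = sum(x * y for x, y in pairs)
    denominator = sum(x * x for x, _ in pairs)
    return numerator / denominator

k = learn_multiplier(examples)
print(k)       # 2.0, the rule recovered from the examples
print(k * 10)  # 20.0, a prediction for an input the "child" never saw
```

Nobody told the second version that the answer is "times two". It recovered that from the examples, which is the data + answers → rules direction described earlier.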
## The Goal of Machine Learning
The main goal is simple: to help machines learn patterns from data and make predictions on new, unseen data.
## Types of Machine Learning
There are three main types:
- Supervised Learning: The model learns from inputs (data) paired with outputs (correct answers). Example: house size → house price; study hours → exam score. This is where Linear Regression belongs.
- Unsupervised Learning: The model is given data without answers and tries to find patterns on its own. Example: grouping customers by behavior, clustering similar items together.
- Reinforcement Learning: The model learns by trial and error, guided by rewards and penalties. Example: game-playing AI, robot navigation.
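
To make the first two types more concrete, here is a rough sketch in plain Python. All numbers are invented for illustration, and `two_group_split` is a hypothetical helper (a tiny one-dimensional version of k-means with two groups), not a standard library function. Reinforcement learning needs an environment to interact with, so it is left out of this sketch.

```python
# Supervised learning: every input comes with its correct answer.
# Hypothetical data: (house size in square meters, price).
labeled = [(50, 100_000), (80, 160_000), (120, 240_000)]

# Unsupervised learning: inputs only, no answers attached.
# Hypothetical data: monthly spend per customer.
unlabeled = [12, 15, 11, 95, 102, 99]

def two_group_split(values, steps=10):
    """Split numbers into a low and a high group (1-D k-means, k = 2).

    Assumes the list contains at least two distinct values.
    """
    low_center, high_center = min(values), max(values)
    for _ in range(steps):
        # Assign each value to the nearer of the two centers.
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        # Move each center to the mean of its group.
        low_center = sum(low) / len(low)
        high_center = sum(high) / len(high)
    return low, high

low_spenders, high_spenders = two_group_split(unlabeled)
print(low_spenders)   # [12, 15, 11]
print(high_spenders)  # [95, 102, 99]
```

The labeled list already contains the answers a supervised model would learn from. The unlabeled list has none, so the best the code can do is discover that the customers fall into two natural groups.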
## From Machine Learning to Prediction Problems
Once we focus on supervised learning, we usually ask questions like: “Can we predict a number?” “Can we estimate a value?” “Can we forecast future outcomes?” When the answer we want is a number, these are called regression problems.
## What is a Regression Problem?
A regression problem is one where we try to predict a continuous numerical value. Examples: house price (e.g., 150,000), temperature (e.g., 28°C), exam score (e.g., 75%). This is different from classification, where we predict categories such as yes/no, spam/not spam, or dog/cat.
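
The difference shows up in the type of value a model returns. The sketch below uses two hypothetical, hand-written functions (nothing is learned here), with a made-up scoring rule and a made-up pass mark of 50, just to contrast a numeric output with a categorical one.

```python
def predict_score(hours):
    """Regression-style output: a continuous number.

    The rule (40 points plus 5 per study hour) is invented for illustration.
    """
    return 40.0 + 5.0 * hours

def predict_pass(hours):
    """Classification-style output: one of a fixed set of categories."""
    return "pass" if predict_score(hours) >= 50.0 else "fail"

print(predict_score(7))  # 75.0
print(predict_pass(7))   # pass
print(predict_pass(1))   # fail
```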
## Enter Linear Regression
Now that we understand regression problems, we can introduce one of the simplest solutions: Linear Regression. Linear Regression is a supervised learning algorithm that predicts continuous values by fitting a linear relationship between input and output variables.
## Why Linear Regression?
Because many real-world relationships can be approximated using a straight line. Examples: more study hours → higher exam scores; bigger houses → higher prices; more advertising → more sales. These relationships often follow a pattern that can be simplified as: “As X increases, Y also increases (or decreases) in a predictable way.”
## The Core Idea of Linear Regression
Linear Regression tries to draw a best-fit line through the data points. This line is used to understand patterns and to make predictions. Mathematically, it is written as y = mx + c, where m is the slope of the line (how much y changes when x increases by one) and c is the intercept (the value of y when x is 0).
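
As a preview, here is a minimal sketch of how m and c can be computed for one input variable using the standard least-squares formulas. The function name `fit_line` and the study-hours data are assumptions made up for this example. In practice you would usually reach for a library rather than write this by hand.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m * x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: chosen so the line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c

# Hypothetical data: hours studied and the exam score obtained.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

m, c = fit_line(hours, scores)
print(f"m = {m:.2f}, c = {c:.2f}")                  # m = 5.60, c = 46.20
print(f"predicted for 6 hours: {m * 6 + c:.2f}")    # predicted for 6 hours: 79.80
```

With this made-up data, each extra hour of study is associated with about 5.6 more points, and the fitted line can be used to estimate a score for an amount of study time that was not in the data.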
Next, we will take a deep dive into Linear Regression. Buckle up!