The Smallest Brain You Can Build: A Perceptron in Python
A perceptron is the smallest brain you can build. One number goes in. One yes-or-no answer comes out. That is the whole thing.
It sounds too simple to matter. But this tiny idea is the seed of every neural network running today. In this post we build a perceptron from scratch in Python, and we watch it learn, live, in your browser. No heavy math. No big libraries. Just a weight, a bias, and a loop.
I am not a native English speaker, and I am still learning this field myself. So I will explain it the way I needed someone to explain it to me: slowly, and from the ground up.
What is a perceptron?
In 1958, a researcher named Frank Rosenblatt built a machine he called the perceptron. It was inspired by a single brain cell, a neuron. A neuron takes in signals, and if those signals are strong enough, it fires. Rosenblatt copied that idea in math:
output = 1 if (w · x + b) > 0 else 0
Here x is the input, w is the weight, and b is the bias. Do not worry about those words yet. We will meet each of them by building something real.
Think like a human first
Before a machine decides anything, let us watch a human decide. Meet John Doe. He has a job offer, and he must answer one question: should he take it?
John does not flip a coin. He weighs things. Some factors matter to him more than others.
| Factor (input) | Value | How much John cares (weight) |
|---|---|---|
| Extra pay | high | a lot |
| Stays in the same city | no, he must move | a lot |
John multiplies each factor by how much he cares about it, then adds everything up. If the total is high enough, he says yes. If not, he says no.
That is a perceptron. The factors are the inputs. How much he cares is the weight. And “high enough” is a threshold he carries in his head. Hold on to that threshold. Later we will move it into the sum and give it a name: the bias.
How John Doe decides: each input is multiplied by a weight, the results are summed with a bias, and the total becomes one yes-or-no answer.
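John's decision can be sketched in a few lines of Python. All the numbers here are made up for illustration, because the table only says "high", "no", and "a lot". The name `decide` is also hypothetical. The bias is written as a negative number so that it plays the role of John's threshold: a total above 0.5 is "high enough".

```python
# Hypothetical numbers: the table only gives words, not values.
inputs = {
    "extra_pay": 1.0,       # 1.0 stands for "high"
    "stays_in_city": 0.0,   # 0.0 stands for "no, he must move"
}
weights = {
    "extra_pay": 0.6,       # John cares a lot
    "stays_in_city": 0.6,   # John cares a lot
}
bias = -0.5                 # assumed: same as a threshold of 0.5


def decide(inputs, weights, bias):
    """Return True (yes) if the weighted sum plus the bias is above zero."""
    total = sum(weights[name] * inputs[name] for name in inputs) + bias
    return total > 0


john_says_yes = decide(inputs, weights, bias)
print(john_says_yes)
```

With these assumed values the extra pay alone is enough to push the total above zero. Change a weight or the bias and the answer can flip, which is exactly what training will do for us later.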
The simplest possible decision: is this number positive?
Let us shrink the problem until almost nothing is left. One input. One question. Is this number positive?
That is it. Feed the machine a number. It should answer True for positive and False for negative.
The machine makes its guess like this:
prediction = (weight * value + bias) > 0
Multiply the input by the weight, add the bias, and check if the result is above zero. If yes, it predicts True. If no, it predicts False. This little formula is the classifier, also called the decision function.
At the start, the weight and bias are just random numbers. So the machine guesses badly. Now comes the only clever part: it learns from its mistakes.
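Here is the decision function as a small runnable sketch. The function name `predict` and the hand-picked parameters are my own choices for illustration; they show what a well-behaved weight and bias look like before any training happens.

```python
def predict(value, weight, bias):
    """Decision function: True if weight * value + bias is above zero."""
    return (weight * value + bias) > 0


# Hand-picked parameters (an assumption, chosen so the answers are correct).
weight, bias = 1.0, 0.0
answers = [predict(v, weight, bias) for v in (-3, -0.5, 2, 10)]
print(answers)  # [False, False, True, True]
```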
if prediction != result:
    error = result - prediction  # True - False = 1, False - True = -1
    weight += learning_rate * error * value
    bias += learning_rate * error
When the guess is wrong, we nudge the weight and bias in the right direction. The error tells us which way to nudge. The learning rate decides how big each nudge is. We do this for every example, then repeat the whole pass again. One full pass over the data is called an epoch. Repeating epochs is training.
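To see a single nudge with real numbers, here is the update rule wrapped in a function. The name `update`, the starting values, and the learning rate of 0.1 are assumptions picked for this example.

```python
def update(weight, bias, value, result, learning_rate=0.1):
    """Apply one correction if the guess for `value` is wrong."""
    prediction = (weight * value + bias) > 0
    if prediction != result:
        # Python treats True as 1 and False as 0 in arithmetic.
        error = result - prediction
        weight += learning_rate * error * value
        bias += learning_rate * error
    return weight, bias


# Made-up start: with weight = -0.5 the machine wrongly says 4 is not positive.
weight, bias = update(weight=-0.5, bias=0.0, value=4, result=True)
print(weight, bias)  # weight moved up toward zero, bias moved up by 0.1
```

One nudge does not fix the guess here: the weight is still slightly negative. It only moves in the right direction, which is why the loop has to repeat.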
How does a perceptron learn? Epochs and learning rate
You saw two dials while training: epochs and learning rate.
An epoch is one full pass over all the data. The machine rarely gets everything right in a single pass, so we go again, and again. More epochs means more chances to fix mistakes. That is why accuracy tends to climb as you keep training.
The learning rate is the size of each correction. In the code it is the learning_rate multiplier:
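The sketch below puts the epochs into a loop and records the accuracy after each pass. The function name `train`, the tiny dataset, the seed, and the default of 100 epochs are all assumptions for illustration, not the settings of the live demo.

```python
import random


def train(data, epochs=100, learning_rate=0.1, seed=0):
    """Train on (value, result) pairs and record accuracy after each epoch."""
    rng = random.Random(seed)
    weight = rng.uniform(-1, 1)   # random start, so early guesses are poor
    bias = rng.uniform(-1, 1)
    history = []
    for _ in range(epochs):
        # One epoch: visit every example once and correct wrong guesses.
        for value, result in data:
            prediction = (weight * value + bias) > 0
            if prediction != result:
                error = result - prediction
                weight += learning_rate * error * value
                bias += learning_rate * error
        # Accuracy with the weights as they stand at the end of this epoch.
        correct = sum(((weight * v + bias) > 0) == r for v, r in data)
        history.append(correct / len(data))
    return weight, bias, history


# Tiny made-up dataset: the label is simply "is the number positive?"
data = [(v, v > 0) for v in (-5, -3, -1, 1, 2, 4)]
weight, bias, history = train(data)
print(history[0], history[-1])
```

Because this data can be split cleanly by a single threshold, the corrections eventually stop and the last accuracy is 1.0. The accuracy does not have to rise on every single epoch along the way.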
weight += learning_rate * error * value
Small steps are…