Share Memory by Communicating: When a Channel Beats a Mutex in Go

You’ve read the proverb. It’s in the Go documentation, on stickers, in half the conference talks: “Do not communicate by sharing memory; instead, share memory by communicating.”

Then you open a real Go codebase and every concurrent type in it is a struct with a sync.Mutex on top. The proverb says channels. The code says mutex. Somebody is wrong, and it’s tempting to assume it’s the code. It isn’t. The proverb is a design hint, not a lint rule. Both tools are correct Go.

The skill is knowing which problem you have in front of you, because they are two different problems wearing similar clothes. One is about guarding a piece of state that several goroutines touch. The other is about handing a piece of state from one goroutine to the next so only one owns it at a time.

The two problems, stated plainly

A mutex guards state. The data stays in one place. Many goroutines reach into that place, one at a time, do their thing, and leave. Nobody owns the data; they take turns.

A channel transfers ownership. The data moves. A goroutine builds a value, sends it, and then must never touch it again. The receiver now owns it, exclusively, until it sends the value somewhere else. There is no shared access because at any moment exactly one goroutine holds the value.

That second idea is the whole proverb. “Share memory by communicating” means: instead of parking data in a shared box and locking the box, pass the data down a channel so ownership moves with it. The synchronization is the send and the receive. You never lock because you never share.

The mutex version

Here is a counter. Many goroutines increment it. This is the guard-state problem, and a mutex is the right answer.

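import "sync"

// Counter is safe for concurrent use; mu guards n.
type Counter struct {
    mu sync.Mutex
    n  int
}

// Inc adds one to the counter under the lock.
func (c *Counter) Inc() {
    c.mu.Lock()
    c.n++
    c.mu.Unlock()
}

// Value returns the current count under the lock.
func (c *Counter) Value() int {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.n
}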

Nothing moves. c.n lives in the Counter for the whole program. Every goroutine that calls Inc reaches into the same integer and takes its turn under the lock. The methods use pointer receivers, so every caller shares the one mutex; a value receiver would copy the struct, lock included, and guard nothing. This is small, obvious, and fast.
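
To see the guard at work, here is a minimal sketch of the counter under load. The helper countTo is hypothetical (it is not part of the code above): it starts n goroutines that each call Inc once and waits for them with a sync.WaitGroup.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is the mutex-guarded counter from above.
type Counter struct {
	mu sync.Mutex
	n  int
}

func (c *Counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countTo is a hypothetical helper for this sketch: it starts n goroutines
// that each call Inc once, waits for all of them, and returns the total.
func countTo(n int) int {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	// Prints 1000: no increment is lost, because each one runs under the lock.
	fmt.Println(countTo(1000))
}
```

Take the Lock and Unlock out of Inc and the same program becomes a data race that usually loses increments.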

Trying to force a channel here makes the code worse. You’d stand up a goroutine that owns the counter, plus a request channel, plus a reply channel for Value. More moving parts to protect a single int++. The proverb doesn’t ask you to do that.
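
For contrast, this is roughly what the forced channel version looks like. It is a sketch, not a recommendation: the ChanCounter type, its fields, and the Stop method are all hypothetical names invented here to make the cost visible.

```go
package main

import "fmt"

// ChanCounter is a hypothetical channel-based counter, shown only for
// contrast. One goroutine owns the count; everyone else sends it messages.
type ChanCounter struct {
	inc   chan struct{}
	value chan chan int // each Value call passes in its own reply channel
	done  chan struct{}
}

func NewChanCounter() *ChanCounter {
	c := &ChanCounter{
		inc:   make(chan struct{}),
		value: make(chan chan int),
		done:  make(chan struct{}),
	}
	go c.loop()
	return c
}

// loop is the only code that ever touches n.
func (c *ChanCounter) loop() {
	n := 0
	for {
		select {
		case <-c.inc:
			n++
		case reply := <-c.value:
			reply <- n
		case <-c.done:
			return
		}
	}
}

func (c *ChanCounter) Inc() { c.inc <- struct{}{} }

func (c *ChanCounter) Value() int {
	reply := make(chan int)
	c.value <- reply
	return <-reply
}

// Stop ends the owner goroutine. Forgetting to call it leaks the goroutine,
// and calling Inc or Value after it blocks forever.
func (c *ChanCounter) Stop() { close(c.done) }

func main() {
	c := NewChanCounter()
	defer c.Stop()
	for i := 0; i < 3; i++ {
		c.Inc()
	}
	fmt.Println(c.Value()) // 3
}
```

Three channels, a goroutine, and a lifecycle method now do the job of two lines under a lock.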

The channel version

Now change the problem. A worker pool processes jobs. Each job is owned by exactly one worker while it runs, then the result moves on. Nothing is shared. This is the transfer-ownership problem, and a channel fits it exactly.

type Job struct {
    ID   int
    Data []byte
}

type Result struct {
    ID  int
    Sum int
}

func worker(jobs <-chan Job, out chan<- Result) {
    for j := range jobs {
        sum := 0
        for _, b := range j.Data {
            sum += int(b)
        }
        out <- Result{ID: j.ID, Sum: sum}
    }
}

When worker receives a Job off jobs, it owns that job. No other goroutine should still be touching it: the sender let go when it sent. It computes, builds a Result, and sends it away. After the send, the worker forgets the result and loops for the next job. No lock appears anywhere, because at no point do two goroutines hold the same value.
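
The worker needs something to feed it and something to collect from it. Here is one way to wire that up, as a sketch: runPool is a hypothetical driver (the name, its parameters, and the map it returns are assumptions made for this example), and the sync.WaitGroup only tracks when the workers have finished. It guards no data.

```go
package main

import (
	"fmt"
	"sync"
)

type Job struct {
	ID   int
	Data []byte
}

type Result struct {
	ID  int
	Sum int
}

func worker(jobs <-chan Job, out chan<- Result) {
	for j := range jobs {
		sum := 0
		for _, b := range j.Data {
			sum += int(b)
		}
		out <- Result{ID: j.ID, Sum: sum}
	}
}

// runPool is a hypothetical driver: it fans inputs out to the given number
// of workers and collects each job's byte sum, keyed by job ID.
func runPool(inputs [][]byte, workers int) map[int]int {
	jobs := make(chan Job)
	out := make(chan Result)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			worker(jobs, out)
		}()
	}

	// Producer: hand each job off, then close jobs so the workers' loops end.
	go func() {
		for id, data := range inputs {
			jobs <- Job{ID: id, Data: data}
		}
		close(jobs)
	}()

	// Close out once every worker has returned, so the loop below ends.
	go func() {
		wg.Wait()
		close(out)
	}()

	// Only this goroutine touches sums, so the map needs no lock.
	sums := make(map[int]int)
	for r := range out {
		sums[r.ID] = r.Sum
	}
	return sums
}

func main() {
	sums := runPool([][]byte{{1, 2, 3}, {10, 20}, {}}, 2)
	fmt.Println(sums[0], sums[1], sums[2]) // 6 30 0
}
```

Closing a channel is part of the hand-off: close(jobs) tells the workers there is nothing more to own, and close(out) tells the collector the same.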

The ownership rule that keeps you honest

The line that separates a correct channel design from a subtle data race is this: after you send a value on a channel, treat it as gone.

job := Job{ID: 1, Data: buf}
jobs <- job
// The bytes behind buf are now owned by whoever received job.
// Writing to buf here is a data race.
buf[0] = 0x00 // BUG: mutates memory the receiver now owns

Job.Data is a slice: a small header holding a pointer to a backing array, plus a length and a capacity. The send copied the slice header, not the bytes. Sender and receiver now point at the same array. The channel gave you the synchronization to hand it off, and then you reached back in and mutated shared memory anyway. The race detector can catch this: run the real thing with go run -race and, once the write and the receiver’s read both execute, it prints a WARNING: DATA RACE report.

The mental discipline is simple. A send is a goodbye. If you need the value after sending it, copy it before the send, or don’t send the original. A mutex asks for no such discipline and does not need to: under a mutex, everyone expects to share, so everyone locks.
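
Copying before the send can be sketched like this. The helper sendCopy is hypothetical, and the buffered channel is only there so the demo runs in a single goroutine.

```go
package main

import "fmt"

type Job struct {
	ID   int
	Data []byte
}

// sendCopy is a hypothetical helper: it sends a Job whose Data is a private
// copy of buf, so the caller may keep using buf after the send.
func sendCopy(jobs chan<- Job, id int, buf []byte) {
	data := make([]byte, len(buf))
	copy(data, buf)
	jobs <- Job{ID: id, Data: data}
}

func main() {
	// Buffered so the send does not block in this single-goroutine demo.
	jobs := make(chan Job, 1)
	buf := []byte{0xAA, 0xBB}

	sendCopy(jobs, 1, buf)
	buf[0] = 0x00 // safe: the job holds its own backing array

	j := <-jobs
	fmt.Println(j.Data[0] == 0xAA, buf[0] == 0x00) // true true
}
```

The copy costs an allocation per job. When that matters, the alternative is the stricter rule: send the original and allocate a fresh buffer for the next job.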

A quick way to pick

Ask one question about the data: does it stay put, or does it move?

  • Stays put, many readers/writers take turns: mutex. Counters, caches, a config struct reloaded in place, connection pools. The data has a home and goroutines visit it.

  • Moves from producer to consumer, one owner at a time: channel. Pipelines, worker pools, event fan-out, request/reply between goroutines. The data has a journey and ownership travels with it.

Two more practical tie-breakers when it’s genuinely ambiguous:

  1. If the shared thing is a single field you read and write in nanoseconds, a mutex is almost always simpler and faster. Don’t build a goroutine and two channels to protect an int.

  2. If you find yourself locking a mutex, kicking off work, and unlocking in a different function than you locked in, you’re probably fighting the guard. The state wants to move, and handing it off on a channel is likely the better fit.
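
The second tie-breaker can be sketched as a hand-off between two stages. Everything here is hypothetical (the Batch type and the fill and total functions are names made up for this example); the point is that the place where you would have locked in one function and unlocked in another becomes a single send.

```go
package main

import "fmt"

// Batch is a hypothetical unit of work built by one stage and consumed by
// the next.
type Batch struct {
	Items []int
}

// fill builds batches and hands each one off. After the send it never
// touches that batch again; the next iteration allocates a new one.
func fill(out chan<- *Batch, batches, size int) {
	for b := 0; b < batches; b++ {
		batch := &Batch{}
		for i := 0; i < size; i++ {
			batch.Items = append(batch.Items, b*size+i)
		}
		out <- batch // hand-off: ownership moves to the receiver
	}
	close(out)
}

// total owns each batch it receives, so it reads Items without a lock.
func total(in <-chan *Batch) int {
	sum := 0
	for batch := range in {
		for _, v := range batch.Items {
			sum += v
		}
	}
	return sum
}

func main() {
	ch := make(chan *Batch)
	// Three batches of four items: the values 0 through 11.
	go fill(ch, 3, 4)
	// Prints 66, the sum of 0 through 11.
	fmt.Println(total(ch))
}
```

Sending a pointer is fine here precisely because fill lets go of it. The rule from earlier still applies: the send is a goodbye.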