Don't answer the first question

In my work on Perfetto, a performance debugging tool, one question I get often is: “how do I split a Perfetto trace into multiple files?” Instead of answering directly, I say: “there isn’t an easy way to do that, but what’s leading you to collect traces large enough to want to split?”

This is one of my golden rules at work: when a user asks me something “weird”, don’t answer the first version of the question. On the surface this might look like the XY problem, but that framing stops one step short. It treats the user’s stated question as a puzzle to decode: figure out what they really meant, answer that, move on. I think we can go much further.

Instead, the confusion that produced the wrong question is itself an opening, and the conversation it sparks is valuable to both sides. The user walks away with a better mental model of the tool. I walk away with a clearer picture of where the product confuses people. And sometimes, between us, we figure out that the product itself needs to change.

I’ve written before about how I can still be a successful engineer while avoiding the spotlight. While that covered the general strategy, this is one of the concrete tactics that makes it work. I should also say this post is aimed at people who build things for other engineers. If you’re building a consumer product, or a B2B service, it will translate less directly, but the underlying instinct might still be useful.

Diagnosing the ask

Some questions are easy, routine, and purely a matter of pointing at documentation; those don’t merit much discussion here. The interesting cases are where something is out of the ordinary, and it’s rare that the user will have given me enough information in their first ask.

So I run a mental checklist to figure out where to go next:

  1. Have I seen this before? If so, I might already have an answer to hand. If not, it’s uncommon enough that I want to slow down.
  2. Does the question even sound reasonable compared to others I’ve seen? If not, why might they be asking it, and is there a more normal question underneath?
  3. Does it fit the shape of the tool? Or is the user fighting the architecture without realizing it?

Once I’ve figured out what feels off, the next step is asking something that will surface the missing context. I might say something like: “Well, the answer to your immediate question is X, but that’s a pretty strange thing to ask for because of reason Y. Can you tell me more about the wider problem you’re trying to solve?”

This will probably be the start of a back-and-forth. How quickly it moves depends on how well the user can communicate their thoughts. But we’ll usually end up in one of a few places: they’re missing the philosophy of the tool, the product is hiding the right path, or the product itself needs to change.

When they’re missing the philosophy

It’s quite common for users to come to us not knowing what they want, or not understanding the problem they’re trying to solve. To be clear, I’m not criticizing them for this; teams are often trying to solve problems with limited time or resources, and they turn to new debugging tools when they’re struggling to make progress. As a result, they often find the tool and see that it does most of what they want, but it doesn’t match their model of “how it should work”. So they file a feature request.

A common version of this: people come to Perfetto, see that a trace is a highly detailed recording of what a device did over a window of time, realize you can compute metrics from a Perfetto trace, and treat it as a holy grail solution to all their problems. Want a frame rate? Count frames in the trace. Memory used by an app? Look at the allocations and frees. In principle, any metric could be computed from a trace.

But this is a bad idea for a simple reason: traces are expensive to collect and process. You’re collecting all the data about the system rather than sampling a single number, so you waste a lot of resources on a job that a dedicated metric collection system would do much more efficiently.
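
To make the cost difference concrete, here is a minimal Python sketch. It does not use Perfetto or its trace format: `TraceEvent`, `fps_from_trace`, `FrameCounter`, and the synthetic workload (60 frames per second with 50 unrelated events per frame) are all hypothetical, invented purely for illustration. Both approaches arrive at the same frame rate, but the trace-like recording has to store and scan every event, while the dedicated counter keeps a single integer.

```python
from dataclasses import dataclass


@dataclass
class TraceEvent:
    """One record in a simplified, hypothetical trace (not Perfetto's format)."""
    ts_ns: int  # timestamp in nanoseconds
    name: str   # event name, e.g. "frame" or "sched_switch"


def fps_from_trace(events: list[TraceEvent], window_ns: int) -> float:
    """Derive the frame rate by scanning every recorded event for frames."""
    frames = sum(1 for e in events if e.name == "frame")
    return frames / (window_ns / 1e9)


class FrameCounter:
    """Dedicated metric: stores one integer regardless of other activity."""

    def __init__(self) -> None:
        self.frames = 0

    def on_frame(self) -> None:
        self.frames += 1

    def fps(self, window_ns: int) -> float:
        return self.frames / (window_ns / 1e9)


def simulate(window_s: int = 2, fps: int = 60, other_events_per_frame: int = 50):
    """Generate a synthetic workload and feed it to both collectors."""
    events: list[TraceEvent] = []
    counter = FrameCounter()
    frame_interval_ns = 1_000_000_000 // fps
    for i in range(window_s * fps):
        ts = i * frame_interval_ns
        events.append(TraceEvent(ts, "frame"))
        counter.on_frame()
        # Unrelated activity that the trace records but the counter ignores.
        for j in range(other_events_per_frame):
            events.append(TraceEvent(ts + j, "sched_switch"))
    return events, counter


events, counter = simulate()
window_ns = 2 * 1_000_000_000
trace_fps = fps_from_trace(events, window_ns)
counter_fps = counter.fps(window_ns)

print(f"trace:   {len(events)} records stored, {trace_fps:.1f} fps")
print(f"counter: 1 integer stored, {counter_fps:.1f} fps")
```

The gap widens as the system gets busier: every additional kind of event inflates the recording, while the counter's cost stays flat.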

My overarching point is that there’s a certain philosophy to how tools are designed, and users often miss it because they’re focused on their immediate problem. A big part of my job is teaching the team how to approach performance engineering in the first place, not just explaining how to use Perfetto. It means making people aware of the tools they have available, how to think about things like startup, frame drops, memory, and power, and how to work with them both in normal situations and when something goes wrong.

When the right path is hidden

Other times the team understands the problem; they just don’t see how to put existing tools together. Our tools are powerful by design, and we have to be mindful that other teams might not understand the full range of what we’ve built. It’s my job to figure out what they actually want. Often, something we built for a different purpose can be repurposed to meet their needs.

A perfect example here is the one I opened with: trace splitting. The conversation goes something like, “…what’s leading you to collect traces large enough to want to split?” They say it’s because they have periods of interest in a long trace and want to slice it up, partly for performance and partly to make visualization easier.

But then I point out that Perfetto already supports periodic trace snapshots, short repeated recordings instead of one long one, which removes the need to collect a long trace at all. They’re trying to solve a problem they shouldn’t be having in the first place. It’s always satisfying to see people say “that’s exactly what I needed!” even though it’s not what they asked for. It means I successfully figured out what they actually wanted rather than what they thought they did.

When the product needs to change

Occasionally, the response reveals something genuinely new, something that could set us on the path to building something big. These cases are tricky: even when the ask is novel, the asker often can’t tell…
