PEP 661 – Sentinel Values, accepted 5 years later
PEP 661 – Sentinel Values, accepted 5 years later
Abstract
Unique placeholder values, commonly known as “sentinel values”, are common in programming. They have many uses, such as for: Default values for function arguments, for when a value was not given: def foo(value=None): ... Return values from functions when something is not found or unavailable: >>> "abc".find("d") -1 Missing data, such as NULL in relational databases or “N/A” (“not available”) in spreadsheets.
摘要
唯一占位符值(通常称为“哨兵值”)在编程中非常常见。它们有许多用途,例如:作为函数参数的默认值,用于表示未提供值的情况:def foo(value=None): ...;作为函数在未找到或不可用时的返回值:>>> "abc".find("d") -1;以及表示缺失数据,例如关系数据库中的 NULL 或电子表格中的“N/A”(不可用)。
Python has the special value None, which is intended to be used as such a sentinel value in most cases. However, sometimes an alternative sentinel value is needed, usually when it needs to be distinct from None since None is a valid value in that context. Such cases are common enough that several idioms for implementing such sentinels have arisen over the years, but uncommon enough that there hasn’t been a clear need for standardization. However, the common implementations, including some in the stdlib, suffer from several significant drawbacks. This PEP proposes adding a built-in class for defining sentinel values, to be used in the stdlib and made publicly available to all Python code.
Python 拥有特殊值 None,在大多数情况下,它旨在用作此类哨兵值。然而,有时需要替代的哨兵值,通常是因为它需要与 None 区分开来,因为在某些上下文中 None 本身就是一个有效值。这种情况非常普遍,以至于多年来出现了几种实现此类哨兵的惯用法,但又不够普遍,因此一直没有明确的标准化需求。然而,常见的实现方式(包括标准库中的一些实现)存在几个重大缺陷。本 PEP 提议添加一个用于定义哨兵值的内置类,供标准库使用,并向所有 Python 代码公开。
Motivation
In May 2021, a question was brought up on the python-dev mailing list about how to better implement a sentinel value for traceback.print_exception. The existing implementation used the following common idiom: _sentinel = object(). However, this object has an uninformative and overly verbose repr, causing the function’s signature to be overly long and hard to read.
动机
2021 年 5 月,python-dev 邮件列表提出了一个问题,探讨如何更好地为 traceback.print_exception 实现哨兵值。现有的实现使用了以下常见的惯用法:_sentinel = object()。然而,该对象的 repr 信息量不足且过于冗长,导致函数签名变得过长且难以阅读。
Additionally, two other drawbacks of many existing sentinels were brought up in the discussion: Some do not have a distinct type, hence it is impossible to define clear type signatures for functions with such sentinels as default values. They behave unexpectedly after being copied, due to a separate instance being created and thus comparisons using is failing. Some common sentinel idioms have similar problems after being pickled and unpickled.
此外,讨论中还提到了许多现有哨兵的另外两个缺陷:一些哨兵没有独特的类型,因此无法为以这些哨兵作为默认值的函数定义清晰的类型签名。它们在被复制后表现出意外行为,因为会创建一个单独的实例,从而导致使用 is 进行的比较失败。一些常见的哨兵惯用法在序列化(pickling)和反序列化后也存在类似问题。
In the ensuing discussion, Victor Stinner supplied a list of currently used sentinel values in the Python standard library. This showed that the need for sentinels is fairly common, that there are various implementation methods used even within the stdlib, and that many of these suffer from at least one of the three above drawbacks. The discussion did not lead to any clear consensus on whether a standard implementation method is needed or desirable, whether the drawbacks mentioned are significant, nor which kind of implementation would be good.
在随后的讨论中,Victor Stinner 提供了一份 Python 标准库中当前使用的哨兵值列表。这表明对哨兵的需求相当普遍,即使在标准库内部也使用了各种实现方法,并且其中许多方法至少存在上述三个缺陷中的一个。讨论并未就“是否需要或值得采用标准实现方法”、“所提到的缺陷是否严重”以及“哪种实现方式更好”达成明确共识。
Rationale
The criteria guiding the chosen implementation were: The sentinel objects should behave as expected by a sentinel object: When compared using the is operator, it should always be considered identical to itself but never to any other object. Creating a sentinel object should be a simple, straightforward one-liner. It should be simple to define as many distinct sentinel values as needed. The sentinel objects should have a clear and short repr. It should be possible to use clear type signatures for sentinels. The sentinel objects should behave correctly after copying, and sentinels should have predictable behavior when pickled and unpickled.
基本原理
指导所选实现的标准如下:哨兵对象应符合哨兵对象的预期行为:当使用 is 运算符进行比较时,它应始终被视为与自身相同,但绝不与任何其他对象相同。创建哨兵对象应简单直接,只需一行代码。定义任意数量的独特哨兵值应非常简单。哨兵对象应具有清晰且简短的 repr。应能够为哨兵使用清晰的类型签名。哨兵对象在复制后应表现正确,并且在序列化和反序列化时应具有可预测的行为。