Python 3.15: features that didn't make the headlines

It’s that time of the year again: a new version of Python is just around the corner. With the Python 3.15.0b1 feature freeze, we know what’s coming to Python later this year. There are many big features on the way, including lazy imports and the tachyon profiler, which I covered previously. Last year, I really enjoyed investigating the smaller features of Python 3.14. I found that many of them were just as interesting as the big PEPs and deserved a lot more attention. This year is no different.

Asyncio TaskGroup Cancellation

There are not many asyncio changes in this release. The main feature here is the ability to cancel a TaskGroup gracefully. TaskGroup is a form of structured concurrency: it enables developers to create multiple concurrent tasks in a clean way.

async with asyncio.TaskGroup() as tg:
    tg.create_task(run())
    tg.create_task(run())
# Waits for all the tasks to complete

Suppose we want to wait in the background for a signal of some sort to interrupt the task group’s execution. It seems like something simple to do in asyncio, but in reality it’s somewhat awkward.

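import asyncio
from contextlib import suppress

class Interrupt(Exception): ...

# Runs inside a coroutine, like the other snippets in this section
with suppress(Interrupt):
    async with asyncio.TaskGroup() as tg:
        tg.create_task(run())
        tg.create_task(run())
        if await wait_for_signal():
            raise Interrupt()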

This works because an exception raised within a task group causes the other tasks to be cancelled. The custom Interrupt exception is raised as part of an ExceptionGroup, which then gets filtered by contextlib.suppress, resulting in a graceful exit. The way suppress works with ExceptionGroup is yet another overlooked feature, this one from 3.12, and one I only learnt about by accident while researching this article.

The new TaskGroup.cancel makes this process a lot easier:

async with asyncio.TaskGroup() as tg:
    tg.create_task(run())
    tg.create_task(run())
    if await wait_for_signal():
        tg.cancel()

Unlike before, it’s so simple that there’s hardly anything to explain: it cancels the group without raising any exceptions.

Context Manager Improvements

Decorators are surprisingly hard to write, so much so that writing one has become a go-to interview question. But did you know that context managers can also double as decorators?

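import time
from collections.abc import Iterator
from contextlib import contextmanager

@contextmanager
def duration(message: str) -> Iterator[None]:
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs on normal exit and when the block raises
        print(f"{message} elapsed {time.perf_counter() - start:.2f} seconds")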

Here I have a very commonly used context manager that prints the time spent in the block. Ever since Python 3.2, we have been able to use it directly as a decorator too:

@duration('workload')
def workload():
    ...

But whilst it’s convenient, there are cases where it doesn’t work at all:

@duration('async workload')
async def async_workload(): ...

@duration('generator workload')
def workload():
    while True: yield ...

Generator functions, async functions and async generator functions don’t work well here because they have different semantics to standard functions. When you call them, they return immediately with a generator object, a coroutine object and an async generator object respectively. So the decorator completes immediately instead of covering the entire lifecycle of what it’s wrapping.
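
The same timing problem can be reproduced without contextlib at all. The sketch below uses two hand-written decorators (naive_traced and lifecycle_traced are hypothetical names, and this is not how ContextDecorator is implemented) to show the exit step firing before a generator body has run, and one way a generator-aware wrapper can cover the whole iteration.

```python
import functools

events = []  # records the order in which things happen


def naive_traced(func):
    """Plain decorator: 'exit' is recorded as soon as func() returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events.append("enter")
        try:
            # For a generator function this returns a generator object
            # immediately, before any of the body has executed.
            return func(*args, **kwargs)
        finally:
            events.append("exit")
    return wrapper


def lifecycle_traced(func):
    """Generator-aware variant: the wrapper is itself a generator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events.append("enter")
        try:
            # Delegating with 'yield from' keeps us inside the try block
            # until the wrapped generator is exhausted or closed.
            return (yield from func(*args, **kwargs))
        finally:
            events.append("exit")
    return wrapper


def numbers():
    events.append("body")
    yield 1
    yield 2


list(naive_traced(numbers)())
naive_order = list(events)      # ['enter', 'exit', 'body']

events.clear()
list(lifecycle_traced(numbers)())
lifecycle_order = list(events)  # ['enter', 'body', 'exit']

print(naive_order, lifecycle_order)
```

Coroutines and async generators need their own wrapper shapes (an `async def` wrapper that awaits, or one that uses `async for`), which is exactly the per-type dispatch that makes hand-written decorators easy to get wrong.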

This is an unfortunate problem I’ve encountered many times, and it’s often a problem for normal decorators too. But this has changed in 3.15: ContextDecorator now checks the type of the function it’s wrapping and ensures that the decorator covers its entire lifespan. In my opinion, this makes context managers the best way to create decorators! They avoid some of the common footguns and provide cleaner syntax. I recommend more people start using them this way.

Thread Safe Iterators

Iterators are one of the foundations of modern Python. The iterator type allows us to separate data sources from data consumers as below, resulting in cleaner abstractions:

def stream_events(...) -> Iterator[str]:
    while True: yield blocking_get_event(...)

events = stream_events(...)
for event in events:
    consume(event)

But this abstraction breaks when using threading or free-threading. Iterators are not thread-safe by default, so we may see skipped values or corrupted internal iterator state. Python 3.15 solves this with threading.serialize_iterator: we simply wrap our original iterator with it and voilà:

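import threading
from concurrent.futures import ThreadPoolExecutor

def consume_all(events: Iterator[str]) -> None:
    # Helper so each worker thread pulls events from the shared iterator
    for event in events:
        consume(event)

events = threading.serialize_iterator(stream_events(...))
with ThreadPoolExecutor() as executor:
    fut1 = executor.submit(consume_all, events)
    fut2 = executor.submit(consume_all, events)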

There is also the threading.synchronized_iterator decorator, which simply applies threading.serialize_iterator to the result of a generator function. Finally, we also have threading.concurrent_tee, which duplicates the values across multiple iterators instead of splitting them.

Before these utilities existed, we primarily relied on Queues to synchronise consumption between threads. With these additions, we can avoid changing our abstractions for multi-threaded code.
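
For intuition about what such a wrapper has to do, here is a hand-rolled, lock-based stand-in that runs on current Python versions. LockedIterator, count_up and drain are hypothetical names, and this is only a sketch of the general idea, not the implementation behind threading.serialize_iterator.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedIterator:
    """Serialises calls to next() on a shared iterator with a lock."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator;
        # StopIteration propagates normally once it is exhausted.
        with self._lock:
            return next(self._it)


def count_up(n):
    for i in range(n):
        yield i


def drain(iterator):
    """Consume whatever this thread manages to pull from the iterator."""
    return list(iterator)


shared = LockedIterator(count_up(10_000))
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(drain, shared) for _ in range(4)]
    chunks = [future.result() for future in futures]

# Every value is delivered to exactly one thread, none skipped or repeated
seen = sorted(value for chunk in chunks for value in chunk)
print(len(seen), [len(chunk) for chunk in chunks])
```

How the values are split between the threads varies from run to run; only the combined result is deterministic.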

Bonus Features

Last year I only highlighted three features, but this year there are a lot more updates that intrigue me. Here are two more changes that are perhaps less impactful but still very interesting.

Counter xor Operation

collections.Counter is a very useful class. It lets us easily count the frequency of discrete occurrences. It behaves very much like a dict[KeyType, int], but with a ton of useful operations.

from collections import Counter

c = Counter(a=3, b=1)
d = Counter(a=1, b=2)
print(f"{c + d = }")  # add two counters together: c[x] + d[x]
print(f"{c - d = }")  # subtract (keeping only positive counts)

But it has some weirder operations too:

print(f"{c & d = }") # intersection: min(c[x], d[x])
print(f"{c | d = }") # union: max(c[x], d[x])