Restartable Sequences

TL;DR

Restartable sequences (rseq) are a Linux kernel feature that enables lockless, scalable data structures on multi-core processors. Currently, rseq critical sections have to be written in handwritten assembly, but future OS updates and language redesigns are expected to bring the technique into the mainstream, promising substantial performance improvements.

Restartable sequences (rseq), a Linux feature introduced in kernel 4.18 in 2018, are gaining attention for their potential to dramatically improve performance on multi-core processors. Although critical sections currently have to be written in handwritten assembly, the developer predicts that rseq will eventually be supported across operating systems and programming languages, giving system programmers and high-performance applications new levels of efficiency.

Rseq lets a thread run a short critical section that the kernel restarts from an abort handler if the thread is preempted, migrated to another CPU, or interrupted by a signal before the section commits. The section therefore behaves atomically with respect to other threads on the same CPU, which reduces the need for locks or atomic operations that can become bottlenecks on many-core CPUs. Using rseq currently requires handwritten assembly, but it offers substantial speedups in memory allocation and other low-level operations. For example, on a 128-core CPU, an rseq-enabled malloc implementation ran up to 34x faster than one using traditional sharding, according to the developer.

In practical terms, applications that rely on frequent memory allocation or concurrent data structures can see significant efficiency gains. The developer reported that rseq made malloc three times faster on a four-core Raspberry Pi 5, 34x faster on a $4,834 System76 Thelio Astra with 128 cores, and 43x faster on a $17,628 AMD Threadripper with 96 cores.

Software such as tcmalloc, jemalloc, glibc, and cosmopolitan already uses rseq, but widespread adoption depends on broader OS support and language integration. The developer predicts that all system programming languages will eventually incorporate rseq capabilities, and that operating systems will natively support them, making lockless, scalable data structures commonplace.

Why It Matters

This development could revolutionize high-performance computing by enabling applications to fully exploit the capabilities of modern many-core processors. For system programmers, rseq offers a low-level tool to implement lockless data structures that scale efficiently, potentially leading to faster databases, real-time systems, and AI workloads. As core counts increase and performance demands rise, techniques like rseq will become essential for maintaining scalability and efficiency.

However, the current necessity of handwritten assembly limits accessibility. The broader industry’s adoption will depend on OS-level support and language features, which could democratize this performance boost for a wider range of developers and applications.

Background

Restartable sequences were added to Linux in 2018 as a way to run short code sequences atomically with respect to preemption, primarily to speed up performance-critical operations such as per-CPU data updates and looking up the current CPU number. Historically, thread synchronization relied heavily on locks and atomic primitives, which become bottlenecks on many-core systems. Rseq sidesteps some of these limitations by letting the kernel safely restart a user-space critical section that was preempted before it finished.

Until now, rseq has remained a niche technique, used mainly in specialized libraries such as tcmalloc and jemalloc. Its potential has been recognized within system programming communities, especially as CPU core counts continue to grow. The developer’s recent demonstrations show that on high-core-count systems, rseq can drastically reduce contention and improve throughput, but widespread adoption is hindered by the need for handwritten assembly and by the lack of support outside Linux.

“All operating systems will be updated to support rseq(), all system programming languages will be redesigned to be able to express restartable sequences, and all data structure libraries will be rewritten to use them.”

— the developer

“On my $4,834 System76 Thelio Astra with 128-core CPU, rseq makes malloc() go 34x faster.”

— the developer

What Remains Unclear

It is not yet clear when or if operating systems other than Linux will support rseq, or how quickly programming languages will integrate the feature. Current use relies on handwritten assembly, which limits accessibility. The long-term stability and security implications of widespread rseq adoption are also still being evaluated.

What’s Next

Expect ongoing efforts to standardize rseq support at the OS level, along with language library updates that abstract away assembly coding. Developers and system architects will likely experiment with rseq in high-performance applications, and industry adoption could accelerate as hardware with even more cores becomes common. Monitoring upcoming Linux kernel updates and language support developments will be key.

Key Questions

What are restartable sequences (rseq)?

Rseq is a Linux feature that lets a small section of code run atomically and be restarted if it is interrupted, enabling lockless, scalable data structures on multi-core processors.

Why is rseq important for high-core CPUs?

Rseq can significantly reduce synchronization overhead and contention, allowing applications to better utilize the processing power of many-core systems, leading to faster performance.

Is rseq currently easy to use?

No. Critical sections must be written in handwritten assembly, and rseq is not yet supported by operating systems other than Linux or by high-level languages, which limits its accessibility.

When will rseq support become widespread?

It is uncertain; future OS updates and language integrations are needed. The developer expects broader support as hardware and software evolve.

What kind of applications benefit most from rseq?

High-performance applications such as memory allocators, databases, real-time systems, and AI workloads that require lockless, scalable data structures will benefit most.

Source: Hacker News
