Show HN: Streambed – Stream Postgres to Iceberg on S3, Supports Postgres Wire

TL;DR

Streambed is a new tool that streams PostgreSQL WAL changes directly to Iceberg tables on S3, allowing analytical queries without modifying existing applications. Its built-in query server speaks the Postgres wire protocol, so psql and other Postgres clients can query the tables directly. The project is in early release, with setup instructions available.

Streambed, an open-source project announced on Hacker News, enables real-time streaming of PostgreSQL WAL changes to Iceberg tables stored on S3, supporting the Postgres wire protocol for querying without traditional ETL or Spark dependencies.

Streambed connects to PostgreSQL as a logical replication subscriber, decoding WAL messages for inserts, updates, and deletes. It buffers these changes and writes them as Parquet files to an S3 bucket, simultaneously updating Iceberg metadata. The system supports updates and deletes through copy-on-write merging. A built-in query server exposes Iceberg tables over the Postgres wire protocol, allowing users to query data with psql or any Postgres-compatible client.

The project requires Go 1.22+ and CGO, and can be deployed locally using Docker or in production environments. Setup involves starting Postgres and MinIO locally, building the Go binary, and running the sync and query server components, with commands for resync and cleanup available.
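To make the copy-on-write merging mentioned above more concrete, here is a minimal sketch in Python. It is not Streambed's code (the project is written in Go), and the `Change` type, the `merge_copy_on_write` function, and the single-column `id` key are all hypothetical names chosen for illustration. The idea it shows is general: rather than editing an existing data file in place, the writer combines the old rows with the buffered changes and produces a fresh set of rows for a new file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One decoded change (hypothetical shape, for illustration only)."""
    op: str                      # "insert", "update", or "delete"
    key: int                     # primary-key value of the affected row
    row: Optional[dict] = None   # new row image; None for deletes


def merge_copy_on_write(base_rows, changes, key_column="id"):
    """Return a new list of rows with the changes applied in order.

    base_rows is never modified, mirroring the copy-on-write idea:
    the old file stays as it is and a rewritten file replaces it.
    """
    merged = {row[key_column]: row for row in base_rows}
    for change in changes:
        if change.op in ("insert", "update"):
            merged[change.key] = change.row
        elif change.op == "delete":
            merged.pop(change.key, None)
        else:
            raise ValueError(f"unknown op: {change.op!r}")
    return list(merged.values())


base = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]
changes = [
    Change("update", 1, {"id": 1, "name": "ada l."}),
    Change("delete", 2),
    Change("insert", 3, {"id": 3, "name": "cy"}),
]
new_file_rows = merge_copy_on_write(base, changes)
```

In a real pipeline the merged rows would be written out as a new Parquet file and the table metadata would be updated to point at it. The sketch leaves both steps out and keeps everything in memory.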

Why It Matters

Streambed offers a way to offload analytical workloads from a production Postgres database without changing the applications that write to it. Because changes stream continuously from the WAL into Iceberg tables, there is no separate batch ETL pipeline to build or schedule, which reduces infrastructure complexity. Support for the Postgres wire protocol means the streamed data can be queried with standard Postgres clients.

Background

Traditional data warehousing often relies on batch ETL processes or complex Spark-based pipelines to move data from transactional systems to analytical stores. Recent efforts aim to simplify this with streaming approaches, but prior solutions have required significant setup or proprietary connectors. Streambed builds on logical replication in Postgres, available as a built-in publish/subscribe feature since PostgreSQL 10, to ingest changes continuously into Iceberg tables on S3. This aligns with industry trends toward real-time analytics and simplified data architecture.

“Streambed streams WAL changes via logical replication, writes Parquet files to S3, and commits Iceberg metadata, supporting real-time analytics without ETL or Spark.”

— Viggy28 (Hacker News user)

“The query server speaks the Postgres wire protocol, so you can connect with psql directly to query your streamed data.”

— Viggy28 (Hacker News user)

What Remains Unclear

Details about performance at scale, stability in production environments, and long-term maintenance are still emerging. It is not yet clear how well Streambed handles very high throughput or complex schema changes, and user feedback is limited to initial releases.

What’s Next

Next steps include broader testing and adoption, potential feature enhancements such as support for more complex schema evolution, and integration with cloud-native orchestration tools. Developers may also explore deploying Streambed in production environments to evaluate performance and reliability.

Key Questions

How does Streambed compare to traditional ETL pipelines?

Streambed provides real-time streaming of Postgres changes directly into Iceberg on S3, eliminating the need for batch ETL jobs and reducing latency. It simplifies architecture by avoiding Spark or other heavy processing frameworks.

Can I query the streamed data with standard Postgres tools?

Yes, Streambed includes a query server that exposes Iceberg tables over the Postgres wire protocol, allowing connection with psql and other Postgres-compatible clients.

What are the system requirements to run Streambed?

Streambed requires Go 1.22+ and CGO. It can be run locally using Docker or deployed directly on servers. It also depends on a Postgres instance with logical replication enabled and an S3-compatible storage service like MinIO or AWS S3.
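As a rough illustration of what "logical replication enabled" means on the Postgres side, the sketch below checks three standard server settings: `wal_level`, `max_replication_slots`, and `max_wal_senders`. These are real PostgreSQL parameters, but the `check_replication_settings` helper is hypothetical and the exact values Streambed expects are an assumption, so the project's own setup instructions take precedence.

```python
def check_replication_settings(settings):
    """Return a list of problems found in a dict of Postgres settings.

    `settings` maps parameter names to values, for example collected by
    running SHOW for each parameter. An empty list means the basic
    prerequisites for logical replication look satisfied.
    """
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")
    for name in ("max_replication_slots", "max_wal_senders"):
        if int(settings.get(name, 0)) < 1:
            problems.append(f"{name} must be at least 1")
    return problems


ok = check_replication_settings(
    {"wal_level": "logical", "max_replication_slots": "10", "max_wal_senders": "10"}
)
not_ok = check_replication_settings(
    {"wal_level": "replica", "max_replication_slots": "10", "max_wal_senders": "10"}
)
```

Note that changing `wal_level` requires a server restart, so it is worth checking before pointing any replication client at a production database.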

Is Streambed suitable for high-volume production environments?

While initial release details are promising, performance at scale and stability in production are still under evaluation. Users should conduct testing before deploying in critical systems.

Source: Hacker News
