🐘

PostgreSQL on Kubernetes: Research & Recommendation

Tags
Engineering
Published
April 11, 2026
Last Updated
April 10, 2026
Author
Claude Code Research Team
Description
Comprehensive research on running PostgreSQL in production on Kubernetes (Hetzner K3s). Covers operators, multi-tenancy patterns, self-hosted alternatives, HA, storage, performance, backup, and monitoring.

Executive Summary

After researching PostgreSQL operators, multi-tenancy patterns, self-hosted alternatives, HA configurations, storage options, performance tuning, backup strategies, and monitoring -- here is the unified recommendation for our Hetzner K3s cluster.
✅
Recommendation: Use CloudNativePG operator with a hybrid multi-tenancy approach (start with one shared HA cluster, promote critical projects to dedicated clusters as needed). Use local NVMe storage, PgBouncer in transaction mode, and backups to Hetzner Object Storage.

1. Operator Comparison

CloudNativePG (CNPG) -- STRONGLY RECOMMENDED

πŸ†
The clear winner for our use case. CNCF Sandbox project (applying for Incubation). 7,700+ GitHub stars, 880 commits/year, 132M+ image downloads. Apache 2.0 license.
  • HA: Kubernetes-native failover (no Patroni/etcd dependency). Quorum-based failover (stable in v1.28). Self-healing with auto pod restart, replica promotion, rolling updates.
  • Backup: Built on Barman. S3-compatible storage, continuous WAL archiving, full PITR, scheduled base backups, compression & encryption.
  • Monitoring: Built-in Prometheus exporter with customizable SQL metrics. PodMonitor auto-creation. Official Grafana dashboard.
  • Connection Pooling: Native PgBouncer via dedicated Pooler CRD. Separate, scalable PgBouncer pods.
  • K3s/Hetzner: Proven in production on K3s + Hetzner (Brella case study: zero issues after 7 months).
  • GitOps: Fully declarative CRDs -- perfect for infrastructure-as-code repos.
  • Multi-tenancy: Namespace-based isolation. Cluster-wide or namespace-scoped operator installation.
One caveat: Failover time on Hetzner K3s can be ~5 minutes for node failures (vs ~30s on cloud providers) due to Hetzner's node detection speed. This is infrastructure-level, not a CNPG issue.
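To make the GitOps point concrete, here is a minimal Cluster manifest sketch. Field names follow CNPG's `postgresql.cnpg.io/v1` API; the name, namespace, and sizes are illustrative placeholders, not values from this research.

```yaml
# Minimal CNPG cluster sketch: one primary + two replicas, declaratively.
# Names and sizes are placeholders.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-pg
  namespace: databases
spec:
  instances: 3               # operator handles failover and replica promotion
  storage:
    size: 50Gi
  monitoring:
    enablePodMonitor: true   # auto-create the PodMonitor for Prometheus
```

Everything else (backups, pooling, tuning) layers onto this same resource, which is why CNPG fits an infrastructure-as-code repo so well.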

Zalando Postgres Operator -- 3rd Place

  • ~4,100 GitHub stars. NOT a CNCF project. Release cadence slowing.
  • Built on Patroni + Spilo. Proven at scale inside Zalando.
  • Unique Team API for multi-tenancy (best among all operators).
  • WAL-G for backups.
  • Community momentum shifting to CNPG. Harder to recommend for new deployments in 2026.

CrunchyData PGO -- 2nd Place

  • ~3,900 GitHub stars. Oldest operator (production since 2017). Backed by Crunchy Data.
  • Built on Patroni. Best reliability test results in independent benchmarks.
  • pgBackRest for backup (gold standard for large databases -- block-level incremental, parallel backup/restore).
  • More complex initial setup than CNPG. Kustomize-first approach.
  • Strong choice if you need pgBackRest or already have Patroni expertise.

Percona Operator -- 4th Place

  • ~72 GitHub stars. Built on top of CrunchyData PGO.
  • Smallest community -- significant risk for small teams needing community support.
  • PMM integration for monitoring (can be heavy).
  • Not recommended unless already invested in the Percona ecosystem.

Operator Comparison Table

| Feature | CloudNativePG | Zalando | CrunchyData PGO | Percona |
| --- | --- | --- | --- | --- |
| GitHub Stars | ~7,700 | ~4,100 | ~3,900 | ~72 |
| CNCF Status | Sandbox (applying Incubation) | None | None | None |
| HA Foundation | K8s-native | Patroni | Patroni | Patroni (via PGO) |
| Backup Tool | Barman | WAL-G | pgBackRest | pgBackRest |
| PgBouncer | Yes (Pooler CRD) | Yes (sidecar) | Yes | Yes |
| Prometheus | Built-in exporter | Community add-on | Built-in | PMM / Prometheus |
| K3s/Hetzner Tested | Yes (production) | Yes | Yes | Yes (documented) |
| Multi-tenancy | Namespace RBAC | Team API (best) | Namespace RBAC | Namespace RBAC |
| Release Cadence | Very high | Slowing | Moderate | Moderate |
| Complexity | Low | Medium | Medium-High | Medium-High |

2. Multi-Tenancy Patterns

Pattern 1: One PG Instance Per Project/Namespace

Each project gets its own dedicated PostgreSQL cluster (primary + replicas).
➕
Pros: Full isolation, independent scaling, independent backups/PITR, independent upgrades, strongest security boundary, no noisy neighbors.
➖
Cons: ~2GB memory per project minimum (primary + replica). 10 projects = ~20-40GB memory. More clusters to monitor, more backup schedules, storage fragmentation.
When to use: Strict compliance/security requirements, very different workload profiles, when you can afford the resource overhead.

Pattern 2: Shared HA Cluster with Multiple Databases

One HA PostgreSQL cluster shared across all projects with database-level isolation.
➕
Pros: Resource efficient (4-8GB for 10-15 databases vs 20-40GB). One cluster to monitor/backup/upgrade. Simpler networking. PgBouncer routes per-database.
➖
Cons: Full blast radius (cluster down = ALL projects down). Noisy neighbor risk. PITR is all-or-nothing (cannot restore one database independently). Shared upgrade cycle. Security relies on PostgreSQL RBAC, not network isolation.
When to use: Resource-constrained environments, small team, similar workload profiles, non-critical applications.

Pattern 3: Hybrid Approach -- RECOMMENDED

🎯
Best of both worlds. Critical apps get dedicated PG instances; less critical apps share a common cluster.
Tier 1 (Dedicated): Production-critical, high-traffic, or compliance-sensitive projects.
Tier 2 (Shared): Internal tools, dev environments, low-traffic microservices.
A project should get a dedicated instance when:
  • It handles PII, payment data, or has compliance requirements
  • High write throughput or large dataset (>50GB)
  • Higher SLA than other projects
  • Needs independent scaling or upgrade schedule
A project can use the shared cluster when:
  • Internal tool or low-traffic service
  • Non-sensitive data
  • Occasional latency spikes are acceptable
  • Small dataset (<5GB)
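For Tier 2, database-level isolation on the shared cluster can itself be declarative. A sketch, assuming CNPG's Database CRD (added in recent releases, 1.25+; on older versions the same setup is done via bootstrap or manual SQL) -- all names are placeholders:

```yaml
# Per-project database on the shared Tier 2 cluster.
# One Database resource (and one owning role) per project keeps
# PostgreSQL RBAC as the isolation boundary.
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: internal-tools-db
  namespace: databases
spec:
  cluster:
    name: shared-pg        # the shared HA cluster (placeholder name)
  name: internal_tools     # database name inside PostgreSQL
  owner: internal_tools    # per-project role owning the database
```

Promoting a project to Tier 1 then means creating a dedicated Cluster and migrating the one database, without touching the other tenants.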

Resource Comparison (10 projects)

| Resource | 10 Dedicated | 1 Shared | Hybrid (2+1) |
| --- | --- | --- | --- |
| Memory | 20-40 GB | 4-8 GB | 8-16 GB |
| CPU | 5-10 cores | 2-4 cores | 3-6 cores |
| PVCs | 20-30 | 3 | 9 |
| PgBouncer Instances | 10 | 1 | 3 |
| Backup Schedules | 10 | 1 | 3 |

3. Self-Hosted Alternatives Assessment

| Solution | Production Ready | K3s/Hetzner | Complexity | Verdict |
| --- | --- | --- | --- | --- |
| Neon (self-hosted) | No | Poor (needs NVMe + S3) | Very High | NOT RECOMMENDED |
| Supabase (self-hosted) | Partial | Yes | High | NOT RECOMMENDED (overkill) |
| Bitnami Helm Charts | No (deprecated) | Yes | Low | NOT RECOMMENDED |
| StackGres | Yes | Yes | Medium | CONDITIONALLY RECOMMENDED |
| Tembo | Early GA | Yes (untested) | Medium | WATCH |
  • Neon: Serverless PG features (scale-to-zero, branching) unavailable in self-hosted mode. Operationally demanding. Not production-ready.
  • Supabase: Full platform (auth, APIs, realtime) is overkill if you just need PostgreSQL. Community-supported only, no official K8s support.
  • Bitnami: Being deprecated. No automated failover, no backup management, no PITR out of the box.
  • StackGres: Solid choice if you want batteries-included with a web console. Patroni HA + WAL-G backups + PgBouncer + Prometheus. Heavier pod footprint than CNPG.
  • Tembo: Interesting Rust-based operator with 200+ extensions and pre-built Stacks. Too young for critical production bet.
Key takeaway: None beat a well-configured CloudNativePG operator for our use case.

4. High Availability

Replication

  • Async replication (recommended default): Primary doesn't wait for replicas. RPO = replication lag (1-5s). No write latency penalty.
  • Sync replication: Primary waits for replica confirmation. RPO approaches zero. Adds 1-3ms latency within same DC. Use only for zero-RPO databases.
  • Quorum-based sync: ANY 1 (replica1, replica2) provides synchronous durability without depending on any single replica.
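In CNPG this quorum configuration is declared on the Cluster resource. A sketch, assuming the `spec.postgresql.synchronous` stanza available in newer CNPG releases (1.24+); the cluster name is a placeholder:

```yaml
# Quorum-based sync sketch for a 3-instance cluster (1 primary + 2 replicas).
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: zero-rpo-pg
spec:
  instances: 3
  postgresql:
    synchronous:
      method: any    # quorum commit: ANY <number> of the standbys
      number: 1      # equivalent to synchronous_standby_names = 'ANY 1 (...)'
```

With `number: 1` a commit waits for whichever replica acknowledges first, so losing one replica does not stall writes.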

Automatic Failover

  • CNPG uses K8s-native leader election (no Patroni/etcd needed)
  • Self-healing: auto pod restart, replica promotion, rolling updates
  • Split-brain prevention through K8s leader election primitives

Failover Timing

| Scenario | RTO | RPO | Procedure |
| --- | --- | --- | --- |
| Single replica failure | 0 (no impact) | 0 | Operator auto-recreates |
| Primary failure (same DC) | 30-60s | 0-5s (async) / 0 (sync) | Auto-failover |
| Full DC failure | 5-15 min | Replication lag | Promote cross-DC replica |
| Data corruption | 15-60 min | To point before corruption | PITR restore |
| Complete cluster loss | 1-4 hours | Last WAL archived | Restore from S3 backup |

5. Storage

Hetzner Storage Options

| Option | IOPS | Latency | Replication | Best For |
| --- | --- | --- | --- | --- |
| Local NVMe (recommended) | 100K+ | Microseconds | None (use PG replication) | Primary DB, max performance |
| Longhorn | ~19K | Higher | Built-in 2-3x | Simpler ops |
| OpenEBS Mayastor | ~28K | NVMe-over-TCP | Configurable | High-perf with replication |
| Hetzner Volumes | ~15K | Milliseconds | Hetzner-managed | Avoid for primary PG |
Recommendations:
  • Use local NVMe via LocalPV for primary PostgreSQL (within 5-10% of bare-metal performance)
  • Use Longhorn if you prefer storage-level replication as a safety net (~30-40% IOPS cost)
  • Use separate WAL volume via CNPG walStorage spec for parallel I/O
  • StorageClass: reclaimPolicy: Retain, allowVolumeExpansion: true, WaitForFirstConsumer
Minimum IOPS targets: 3,000+ random read, 1,000+ random write, <1ms p99 latency for 8K reads.
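The StorageClass recommendation can be sketched as follows. The provisioner is a placeholder -- substitute whatever local-PV provisioner your cluster runs (K3s ships `rancher.io/local-path` by default; note that not every local provisioner supports expansion):

```yaml
# StorageClass sketch matching the recommendations above.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme-retain
provisioner: rancher.io/local-path       # placeholder: your local-PV provisioner
reclaimPolicy: Retain                    # keep PV (and data) if the PVC is deleted
allowVolumeExpansion: true               # grow volumes without recreating clusters
volumeBindingMode: WaitForFirstConsumer  # bind on the node the pod is scheduled to
```

`WaitForFirstConsumer` matters for local volumes: it delays binding until the PostgreSQL pod is scheduled, so the PV lands on the same node as the pod.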

6. Performance Tuning

PostgreSQL Configuration

| Parameter | Formula | Example (8GB RAM, 4 CPU) |
| --- | --- | --- |
| shared_buffers | 25% of RAM | 2GB |
| effective_cache_size | 50-75% of RAM | 6GB |
| work_mem | RAM / (max_connections * 4) | 16MB |
| maintenance_work_mem | 5-10% of RAM | 512MB |
| max_connections | Low (use PgBouncer) | 100-200 |
| random_page_cost | 1.1 for NVMe/SSD | 1.1 |
| effective_io_concurrency | 200 for NVMe/SSD | 200 |
| max_wal_size | 2-4GB for write-heavy | 4GB |

Critical Notes

⚠️
K8s defaults /dev/shm to 64MB. If shared_buffers exceeds this, PostgreSQL will FAIL to start. Most operators handle this automatically -- verify yours does.
⚡
CPU pinning is the single biggest tuning lever. Use Guaranteed QoS (requests == limits) with CPU Manager static policy. Benchmarks show +22% average read/write TPS and -76% write latency with NUMA affinity.
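Both the tuning table and the Guaranteed QoS requirement translate directly into the Cluster spec. A sketch using the 8GB RAM / 4 CPU example (cluster name is a placeholder; CNPG also manages /dev/shm sizing itself):

```yaml
# Tuned cluster sketch: Guaranteed QoS (requests == limits) enables
# static CPU pinning; parameters mirror the table above.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: tuned-pg
spec:
  instances: 3
  resources:
    requests: {cpu: "4", memory: 8Gi}   # identical requests/limits ->
    limits:   {cpu: "4", memory: 8Gi}   # Guaranteed QoS class
  storage:
    size: 100Gi
  walStorage:
    size: 20Gi                          # separate WAL volume for parallel I/O
  postgresql:
    parameters:
      shared_buffers: 2GB
      effective_cache_size: 6GB
      work_mem: 16MB
      maintenance_work_mem: 512MB
      random_page_cost: "1.1"
      effective_io_concurrency: "200"
      max_wal_size: 4GB
```

Static CPU pinning additionally requires the kubelet's CPU Manager policy to be set to `static` on the node; the Guaranteed QoS class alone is not sufficient.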

PgBouncer Sizing

| Setting | Value | Rationale |
| --- | --- | --- |
| pool_mode | transaction | Stateless apps (most common) |
| default_pool_size | 20-30 | Per user/database pair |
| max_client_conn | 1000-5000 | PgBouncer connections are lightweight |
| max_db_connections | 100 | Should be < max_connections |
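With CNPG these settings live in the Pooler CRD, which runs PgBouncer as separate, independently scalable pods. A sketch with placeholder names and values from the table:

```yaml
# Pooler sketch: transaction-mode PgBouncer in front of the primary.
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: shared-pg-pooler-rw
  namespace: databases
spec:
  cluster:
    name: shared-pg          # target Cluster (placeholder name)
  instances: 2               # PgBouncer pods, scaled independently of PG
  type: rw                   # route to the primary; use a second Pooler
  pgbouncer:                 # with type: ro for read replicas
    poolMode: transaction
    parameters:
      default_pool_size: "25"
      max_client_conn: "2000"
      max_db_connections: "100"
```

Apps connect to the Pooler's Service instead of the cluster's `-rw` Service; failovers are then absorbed by PgBouncer rather than by every client.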

K8s vs Bare-Metal Performance

  • Local NVMe + CPU pinning + huge pages: Within 5-10% of bare-metal
  • Local NVMe, no CPU pinning: Within 15-20%
  • Longhorn/network storage: 30-50% slower

7. Backup & Recovery

Strategy

  • Full backup: Weekly (Sunday)
  • Incremental backup: Daily
  • WAL archiving: Continuous (every completed 16MB WAL segment)
  • Retention: 30 days of backups, 7 days of WAL
  • Target: Hetzner Object Storage (S3-compatible) or MinIO
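As a sketch of this strategy in CNPG terms (endpoint, bucket, and secret names are placeholders; CNPG's Barman integration takes full base backups, with the continuous WAL stream providing the between-backup coverage):

```yaml
# Backup sketch: continuous WAL archiving + weekly scheduled base backup
# to S3-compatible object storage, 30-day retention.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-pg
spec:
  instances: 3
  backup:
    retentionPolicy: "30d"
    barmanObjectStore:
      destinationPath: s3://pg-backups/example-pg     # placeholder bucket
      endpointURL: https://objectstorage.example.com  # placeholder endpoint
      s3Credentials:
        accessKeyId:
          name: backup-creds                          # placeholder Secret
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-creds
          key: SECRET_ACCESS_KEY
      wal:
        compression: gzip
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: example-pg-weekly
spec:
  schedule: "0 0 3 * * 0"   # CNPG uses six-field cron (with seconds): Sun 03:00
  cluster:
    name: example-pg
```

Note the six-field cron expression: CNPG's ScheduledBackup includes a seconds field, unlike standard Kubernetes CronJobs.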

Backup Verification

🔑
Backups don't protect your business -- proven restores do. Schedule weekly automated restore tests to a temporary cluster. Monitor backup age (alert if >25 hours), size trends, and WAL archiving lag.

DR Testing Calendar

  • Weekly: Backup restore verification (automated)
  • Quarterly: Simulated primary failure + failover drill
  • Semi-annually: Full DR exercise (restore from backup to fresh cluster)

8. Monitoring & Alerting

Key Metrics to Monitor

  • Health: pg_up, postmaster uptime
  • Connections: Active count by state, utilization vs max_connections (alert > 80%)
  • Performance: Cache hit ratio (should be > 99%), TPS, deadlocks
  • Replication: Replay lag in seconds (alert > 30s warning, > 300s critical)
  • Storage: Database size growth, disk usage (alert > 85%)
  • Backups: Last backup age (alert > 25 hours), WAL archiving failures

Minimum Alert Rules

  1. PostgreSQL down (critical)
  2. Connection utilization > 80% (warning)
  3. Replication lag > 30s / > 300s (warning / critical)
  4. Cache hit ratio < 99% (warning)
  5. Backup age > 25 hours (critical)
  6. Disk usage > 85% (warning)
  7. WAL archiving failures (warning)
  8. Deadlocks detected (warning)
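A sketch of two of these rules as a PrometheusRule, assuming the Prometheus Operator CRDs are installed. The replication-lag metric name follows the CNPG exporter's `cnpg_` prefix; the disk expression is a placeholder to adapt to your volume metrics:

```yaml
# Alert-rule sketch for rules 3 and 6 above; names are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: postgres-alerts
  namespace: databases
spec:
  groups:
    - name: postgresql
      rules:
        - alert: PostgreSQLReplicationLagHigh
          expr: cnpg_pg_replication_lag > 30   # seconds of replay lag
          for: 5m
          labels:
            severity: warning
        - alert: PostgreSQLDiskUsageHigh
          # placeholder expression -- adapt label selectors to your PVCs
          expr: |
            (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.85
          for: 10m
          labels:
            severity: warning
```

The remaining rules follow the same pattern; pair the critical-severity ones with a paging route in Alertmanager.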

Grafana Dashboards

Recommended community dashboards: IDs 9628 (PostgreSQL Database) and 14114 (PostgreSQL Overview).

9. Final Architecture

| Layer | Choice | Rationale |
| --- | --- | --- |
| Operator | CloudNativePG | CNCF, K3s-proven, lightest, most active |
| Multi-tenancy | Hybrid (start shared) | Resource efficient, clear upgrade path |
| Storage | Local NVMe via LocalPV | Best IOPS, within 5-10% of bare-metal |
| WAL Volume | Separate (CNPG walStorage) | Parallel I/O, disk-full protection |
| Replication | Async (default) | Sync only for zero-RPO databases |
| Connection Pooling | PgBouncer (CNPG Pooler CRD) | Transaction mode, 20-30 pool size |
| Backup Target | Hetzner Object Storage (S3) | Weekly full + daily incr + continuous WAL |
| Retention | 30 days | With weekly automated restore verification |
| Monitoring | Built-in CNPG Prometheus + Grafana | Connections, TPS, replication, cache, disk |

10. Sources

Operators

  • Brella Case Study (CNPG on Hetzner K3s)

Multi-Tenancy

  • CNPG Discussion #497 -- Multiple Databases
  • CNPG Discussion #2357 -- Multi-tenant Architecture

Infrastructure

  • PostgreSQL Tuning for Kubernetes best practices
  • Zalando Engineering -- PgBouncer on Kubernetes

Research conducted April 2026 by a Claude Code agent team: K8s Operator Specialist, Database Architect, Solutions Architect, and Infrastructure Engineer.