DB-First Architecture: Why PostgreSQL Beats YAML

TL;DR

Replaced 200+ YAML config files with 40+ PostgreSQL tables — query times dropped 98%
Idempotent upsert migrations (ON CONFLICT DO UPDATE) made every migration safely re-runnable
JSONB columns handle schema evolution without breaking changes
YAML files kept as read-only exports — developers can still browse them, but DB is the source of truth

The Problem: YAML at Scale

If you're building a developer tool that manages projects, sprints, tasks, or configuration, you've probably started with YAML or JSON files. We did too. It works great at first — human-readable, version-controlled, easy to edit.

Then your tool grows. Ours looked like this:

sprints/
  project-a/
    sprint-01.yaml ... sprint-55.yaml
  project-b/
    sprint-01.yaml ...
config/
  learnings/
    2026-01-15-cache-pattern.yaml
    ...213 files...
  events.yaml          # 4000+ lines, growing daily

The breaking point? Two automated processes tried to update the same YAML file simultaneously. One process's changes silently overwrote the other's. We lost data that had to be manually re-entered.

The lesson: The moment you have more than one writer — whether that's a developer, a CI pipeline, or an AI agent — file-based storage becomes a race condition waiting to happen.

Why PostgreSQL, Not Just "Any Database"

We chose PostgreSQL specifically for three features that YAML files can't replicate:

ACID transactions — concurrent writes are safe by default (MVCC handles isolation)
JSONB columns — schema flexibility without losing query power. New fields go in JSONB first, graduate to real columns when stable
pgvector extension — semantic search over knowledge entries using embedding vectors, something no file-based approach can match

The Migration Strategy: Idempotent Upserts

The key insight that made migration painless: every SQL statement must be safely re-runnable.

-- Every table uses IF NOT EXISTS
CREATE TABLE IF NOT EXISTS sprints (
    id SERIAL PRIMARY KEY,
    project_id INTEGER REFERENCES projects(id),
    name VARCHAR(200) NOT NULL,
    status VARCHAR(50) DEFAULT 'active',
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(project_id, name)
);

-- Every INSERT uses ON CONFLICT
INSERT INTO sprints (project_id, name, status)
VALUES (1, 'sprint-alpha', 'active')
ON CONFLICT (project_id, name) DO UPDATE
SET metadata = sprints.metadata || EXCLUDED.metadata,
    updated_at = NOW();

This means you can run any migration file twice (or ten times) and get the same result. No "migration already applied" errors. No manual rollbacks.

Schema Evolution: JSONB as an Escape Hatch

Traditional schema changes are painful — add a column, every config file needs updating, backward compatibility breaks. With JSONB:

-- Need a new field? Just add it to metadata
UPDATE sprints
SET metadata = metadata || '{"priority_score": 85}'
WHERE id = 42;

-- Query it immediately
SELECT name, metadata->>'priority_score' as priority
FROM sprints
WHERE (metadata->>'priority_score')::int > 80;

Our rule: new fields start in metadata JSONB. If they're used consistently across 3+ cycles, promote to a real column.

Performance: The Numbers

Operation	YAML	PostgreSQL	Gain
List all items	2.3s	45ms	51x
Full-text search	1.8s	62ms	29x
Update one record	340ms	38ms	9x
Concurrent writes	Race condition	ACID	Safe
Cross-project query	4.1s	78ms	53x

Five Rules We Live By

DB is the only source of truth — if it's not in PostgreSQL, it doesn't exist
Every INSERT has ON CONFLICT — no migration should fail on a second run
YAML is export-only — generated from DB, never edited directly
New fields start in JSONB — graduate to columns when stable
IF NOT EXISTS on everything — CREATE TABLE, ADD COLUMN, CREATE INDEX

When to Make the Switch

You should consider DB-first when you have more than one writer, need cross-project queries, or your files exceed 50 files or 1000 lines. The migration is worth it earlier than you think.

postgresql database yaml migration architecture

Kishore Rajendra Founder, DevPilot. Building AI-powered development velocity for solo developers and small teams.

DB-First Architecture: Why PostgreSQL Beats YAML for Developer Tools