Project: Harness Engineering (Care Flow Orchestration)
Purpose
Create a concrete, verifiable Care Flow orchestration system using harness engineering principles:
- humans steer (intent + constraints)
- agents execute (automation + glue)
- repo/KB is the system of record
- feedback loops (logs/metrics/tests) make automation reliable
Care Flow scope (from the earlier voice note):
- monitoring
- integration
- provisioning
- dashboard
- alerts
- messaging
Scope
In scope
- Define Care Flow boundaries, actors, and events
- Event schemas and idempotent processing
- Monitoring + alerting pipeline (health, metrics, incidents)
- Provisioning workflow (device onboarding)
- Messaging workflow (status, incidents, escalations)
- Dashboards (operational + management)
- AI agent runbooks for common incidents
Out of scope (for initial MVP)
- Full production HA/DR across regions
- Full multi-tenant RBAC (unless required immediately)
Owner / Team
- Owner: Arif
- Team: (assign)
Status
- Status: Proposed
- Start date: 2026-02-13
- Target date: (set)
Architecture
High-level flow
Devices/Gateways → ingestion → event store → rules/alerts → dashboards + messaging
AI orchestration sits alongside:
- generates/updates execution plans
- proposes actions
- runs verified scripts/playbooks
Key artifacts
- Event schema(s)
- Runbooks (SOPs)
- “Check health” scripts
- “Provision device” scripts
Prereqs
- Knowledge hub established (Docusaurus KB)
- Auth + dashboard workflow stable (see harness-engineering-dad)
Inputs
- Reference principles:
Deliverables
- Care Flow system map (components + ownership)
- Event schema (JSON) + examples
- MVP monitoring + alert rules + notification routing
- MVP provisioning workflow + verification steps
- Incident runbooks (top 5 incidents)
- Dashboard pages/tiles for Care Flow
- A single “careflow-check.sh” that validates the pipeline end-to-end
Procedure (Step-by-step)
Step 1 — Define boundaries + event taxonomy
- Devices, gateways, backends
- Event types: heartbeat, status, error, provision, alert
Step 2 — Define data contracts
- JSON schema + versioning strategy
- Idempotency keys
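A minimal sketch of what a versioned event envelope and a derived idempotency key could look like. The field names (`schema_version`, `device_id`, `occurred_at`, `payload`) are assumptions for illustration, since the actual schema is still to be defined in this step. The event types are the ones listed in Step 1.

```python
import hashlib
import json

# Hypothetical envelope fields; the real contract is decided in Step 2.
REQUIRED_FIELDS = {"schema_version", "event_type", "device_id", "occurred_at", "payload"}
# Event types taken from Step 1.
EVENT_TYPES = {"heartbeat", "status", "error", "provision", "alert"}


def validate_event(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event is acceptable."""
    problems = []
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if event.get("event_type") not in EVENT_TYPES:
        problems.append(f"unknown event_type: {event.get('event_type')!r}")
    return problems


def idempotency_key(event: dict) -> str:
    """Hash the fields that identify one logical event.

    Keys are sorted before hashing so that field order does not change the key.
    """
    identity = {k: event[k] for k in ("device_id", "event_type", "occurred_at", "payload")}
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative event; all values are made up.
example = {
    "schema_version": "1.0",
    "event_type": "heartbeat",
    "device_id": "gw-001",
    "occurred_at": "2026-02-13T09:00:00Z",
    "payload": {"battery_pct": 87},
}
```

Deriving the key from content is one option; a producer-assigned event ID would work as well and avoids collisions between two genuinely distinct events with identical content.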
Step 3 — Build MVP ingestion + persistence
- Minimal reliable storage
- Replay capability
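One way to get minimal storage with replay is an append-only table keyed by the idempotency key. The sketch below uses SQLite from the Python standard library purely as an example; the table layout and the `EventStore` class are hypothetical, and the plan does not commit to any particular store.

```python
import json
import sqlite3


class EventStore:
    """Append-only event log with duplicate suppression and ordered replay."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
            " idem_key TEXT NOT NULL UNIQUE,"
            " body TEXT NOT NULL)"
        )

    def append(self, idem_key: str, event: dict) -> bool:
        """Store the event once; return False if the key was already stored."""
        cur = self.db.execute(
            "INSERT OR IGNORE INTO events (idem_key, body) VALUES (?, ?)",
            (idem_key, json.dumps(event)),
        )
        self.db.commit()
        return cur.rowcount == 1

    def replay(self, after_seq: int = 0):
        """Yield (seq, event) pairs in insertion order, starting after after_seq."""
        rows = self.db.execute(
            "SELECT seq, body FROM events WHERE seq > ? ORDER BY seq", (after_seq,)
        )
        for seq, body in rows:
            yield seq, json.loads(body)
```

A consumer that remembers the last `seq` it processed can resume with `replay(after_seq=last_seq)`, which is the property the rules and dashboard stages would rely on after a restart.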
Step 4 — Alerting + messaging
- Alert rules
- Escalation paths
- Discord/WhatsApp/email integration as needed
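As an illustration of a rule plus escalation, the sketch below flags devices whose last heartbeat is too old and widens the set of notification channels the longer the silence lasts. The thresholds and the ordering of channels are assumptions chosen for the example, not decisions recorded in this plan; only the channel names come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed escalation ladder: (minimum silence, channel to notify).
ESCALATION = [
    (timedelta(minutes=5), "discord"),
    (timedelta(minutes=15), "whatsapp"),
    (timedelta(minutes=60), "email"),
]


@dataclass
class Alert:
    device_id: str
    silent_for: timedelta
    channels: list[str]


def evaluate_heartbeats(last_seen: dict[str, datetime], now: datetime) -> list[Alert]:
    """Return one alert per device that has been silent past the first threshold."""
    alerts = []
    for device_id, seen in sorted(last_seen.items()):
        silent_for = now - seen
        # Every rung whose threshold has been reached is notified.
        channels = [ch for threshold, ch in ESCALATION if silent_for >= threshold]
        if channels:
            alerts.append(Alert(device_id, silent_for, channels))
    return alerts
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the rule deterministic and easy to exercise with synthetic events.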
Step 5 — Dashboards
- Operational dashboard
- Management summary
Step 6 — AI runbooks + automation
- Agent reads logs/metrics
- Proposes action
- Executes scripts with verification
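The propose, execute, verify loop can be sketched as a small gate in front of each action. Everything here is hypothetical (`ProposedAction`, `execute`, the return strings); the point is only that an action carries its own verification step and that a manual-only switch, as described under Rollback, can block execution until a human approves.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    name: str
    run: Callable[[], None]     # the vetted script or playbook step
    verify: Callable[[], bool]  # post-condition check run after the action


def execute(
    action: ProposedAction,
    manual_only: bool,
    approve: Callable[[ProposedAction], bool] = lambda action: False,
) -> str:
    """Run an action only if automation is allowed or a human approved it."""
    if manual_only and not approve(action):
        return "skipped: awaiting human approval"
    action.run()
    return "verified" if action.verify() else "failed verification"
```

With `manual_only=True` and the default `approve`, nothing runs, so the agent can still propose actions while humans stay in control of execution.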
Config
- Define environment variables and secrets policy (no secrets in KB)
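A possible shape for the environment-variable policy is shown below: required values are read from the environment, startup fails fast if any are missing, and the secret is excluded from the object's printed form so it does not leak into logs or the KB. The variable names (`CAREFLOW_INGEST_URL`, `CAREFLOW_API_TOKEN`, `CAREFLOW_MANUAL_ONLY`) are placeholders invented for this sketch.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    ingest_url: str
    api_token: str = field(repr=False)  # never shown when the object is printed
    manual_only: bool = True            # safe default: no automated actions


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, failing fast on gaps."""
    required = ("CAREFLOW_INGEST_URL", "CAREFLOW_API_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Report variable names only, never values.
        raise RuntimeError(f"missing required environment variables: {missing}")
    return Settings(
        ingest_url=env["CAREFLOW_INGEST_URL"],
        api_token=env["CAREFLOW_API_TOKEN"],
        manual_only=env.get("CAREFLOW_MANUAL_ONLY", "1") != "0",
    )
```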
Verification
- Synthetic test events appear in dashboards
- Alerts fire and route correctly
- Provisioning flow completes end-to-end
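The three checks above could be driven by a small runner that reports each result and yields a single exit code, which is roughly the role the planned careflow-check.sh would play. This is a sketch only: `run_checks` and `exit_code` are hypothetical helpers, and the real checks (sending a synthetic event, confirming an alert was routed, completing a provisioning run) would be supplied as callables.

```python
from typing import Callable


def run_checks(checks: dict[str, Callable[[], bool]]) -> dict[str, str]:
    """Run each named check and record PASS, FAIL, or the error it raised."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "PASS" if check() else "FAIL"
        except Exception as exc:  # a crashing check must not hide the others
            results[name] = f"ERROR: {exc}"
    return results


def exit_code(results: dict[str, str]) -> int:
    """Return 0 only when every check passed, otherwise 1."""
    return 0 if all(status == "PASS" for status in results.values()) else 1
```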
Rollback
- Disable automated actions (manual-only mode)
- Disable individual alert rules that prove noisy
Troubleshooting
- Missing events → check the ingestion path and auth first
- Duplicate events → check that idempotency keys are generated and enforced consistently
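When investigating duplicates, a quick first step is to count how often each idempotency key occurs in a batch of stored or logged events. The helper below is a hypothetical diagnostic and assumes the keys have already been extracted into a list.

```python
from collections import Counter


def find_duplicates(keys: list[str]) -> dict[str, int]:
    """Return each key that occurs more than once, with its occurrence count."""
    return {key: count for key, count in Counter(keys).items() if count > 1}
```

If distinct events share a key, the key derivation is too coarse; if the same event appears under different keys, the derivation is unstable (for example, it depends on field order or arrival time).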
References
- Harness engineering:
Changelog
- 2026-02-13: Created project plan page.