Farholm — Operations Runbook

Last reviewed: 2026-06-14. Owner: Jake (sole operator). Keep this short and actionable. For architecture, see README.md; for known risks, see REVIEW.md.

Quick reference

What                                       Where
Hosting / DNS / Worker / KV / R2 / Access  Cloudflare dashboard (one account)
Source + CI                                GitHub → Cloudflare Workers Builds (deploy on push to main)
Booking data                               KV namespace farholm-bookings (booking: / trash: keys)
Backups                                    R2 bucket farholm-backups (snapshots/<date>/<stamp>.json, ~90-day retention)
Contact inquiries                          Web3Forms → email to hello@farholm.com
Privacy/cookie policy                      Termly (embedded on /privacy, /cookies)
Health check                               GET https://trip.farholm.com/api/health → 200 {ok:true}
Admin                                      https://trip.farholm.com/admin (Cloudflare Access, admin email only)

Severity & first steps

First, always: note the time and capture evidence (screenshots, curl -i output, Cloudflare Ray IDs, Worker logs) before changing anything. Then work the relevant scenario below.
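As a rough sketch of the evidence step, the snippet below records a UTC timestamp, the HTTP status, and the cf-ray response header (the Ray ID Cloudflare attaches to responses) for one URL. The function names format_evidence and capture are hypothetical helpers, not part of the Farholm codebase, and the choice of headers to keep is an assumption.

```python
import datetime as dt
import urllib.error
import urllib.request


def format_evidence(url, status, headers, now):
    """Render one evidence record as plain text: time, URL, status, key headers."""
    lines = [f"time: {now.isoformat()}", f"url: {url}", f"status: {status}"]
    lowered = {k.lower(): v for k, v in headers.items()}
    # Header selection is an assumption; cf-ray carries the Cloudflare Ray ID.
    for name in ("cf-ray", "server", "date"):
        lines.append(f"{name}: {lowered.get(name, '(absent)')}")
    return "\n".join(lines)


def capture(url, timeout=10.0):
    """Fetch url once and return an evidence record; failures are recorded, not raised."""
    now = dt.datetime.now(dt.timezone.utc)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return format_evidence(url, resp.status, dict(resp.headers), now)
    except urllib.error.HTTPError as err:
        # The server answered with an error status; its headers are still evidence.
        return format_evidence(url, err.code, dict(err.headers), now)
    except (urllib.error.URLError, OSError) as err:
        return f"time: {now.isoformat()}\nurl: {url}\nerror: {err}"
```

Calling print(capture("https://trip.farholm.com/api/health")) and pasting the output into the incident notes would preserve the state before any change is made.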

Scenarios

1. Portal / site outage

  1. Confirm scope: curl -I https://farholm.com/ and …/api/health (expect 200).
  2. Cloudflare → Workers & Pages → farholm-site → Deployments. If a recent deploy correlates, roll back: open the previous good deployment → promote.
  3. Check Workers Builds logs for a failed build; check Cloudflare Status for an incident affecting the zone.
  4. If only dynamic routes fail but static pages serve, suspect a Worker error — check Worker Logs/real-time logs; redeploy last known-good.
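The scope check in steps 1 and 4 could be scripted roughly as follows. This is a minimal sketch: probe and classify are hypothetical names, and the mapping from statuses to a diagnosis only restates the steps above. The URLs are passed in by the caller rather than hard-coded.

```python
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Send a HEAD request and return the HTTP status, or None if unreachable.

    Note: unlike curl -I without -L, urllib follows redirects.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, timeout, connection refused, etc.


def classify(static_status, health_status):
    """Map the two probe results to the scenario step they point at."""
    static_ok = static_status == 200
    health_ok = health_status == 200
    if static_ok and health_ok:
        return "healthy"
    if static_ok:
        return "worker-error"  # static serves, dynamic fails: see step 4
    if health_ok:
        return "static-only-failure"
    return "full-outage"  # check deployments, builds, Cloudflare Status
```

For example, classify(probe("https://farholm.com/"), probe("https://trip.farholm.com/api/health")) uses the site root from step 1 and the health URL from the quick reference.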

2. Failed or missing backup

  1. Run a manual snapshot: admin → Snapshot now (or POST /api/admin/snapshot).
  2. If it errors, check the R2 binding (BACKUPS) and bucket existence, then KV read access. Re-run.
  3. Confirm a fresh object exists in farholm-backups.
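Step 3 can be checked mechanically once the object keys from farholm-backups are listed. The sketch below assumes the <date> path segment is an ISO date (YYYY-MM-DD) and that "fresh" means at most one day old; both are assumptions, since the runbook does not state the date format or the snapshot frequency. How the keys are obtained (dashboard, API, or CLI) is left to the operator.

```python
import datetime as dt


def latest_snapshot_date(keys):
    """Return the newest date among keys shaped like snapshots/<date>/<stamp>.json."""
    dates = []
    for key in keys:
        parts = key.split("/")
        if len(parts) != 3 or parts[0] != "snapshots" or not parts[2].endswith(".json"):
            continue  # not a snapshot object
        try:
            # Assumption: <date> is an ISO date such as 2026-06-14.
            dates.append(dt.date.fromisoformat(parts[1]))
        except ValueError:
            continue  # unexpected date format; skip rather than guess
    return max(dates, default=None)


def is_fresh(keys, today, max_age_days=1):
    """True if the newest snapshot is no older than max_age_days (an assumed threshold)."""
    latest = latest_snapshot_date(keys)
    return latest is not None and (today - latest).days <= max_age_days
```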

3. Lost or corrupted booking data

  1. Soft-deleted by mistake: read trash:<REF> in KV (Cloudflare dashboard) and re-save its JSON through the admin. (Trash is retained for 90 days.)
  2. Corrupted/overwritten: download the most recent good snapshots/<date>/<stamp>.json from R2, find the booking, re-save via admin.
  3. Bulk loss (whole namespace): restore from the latest R2 snapshot into a fresh KV namespace, repoint the BOOKINGS binding, and redeploy. Then test with booking FH-9QXM4K7P.
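Finding one booking inside a downloaded snapshot (step 2) might look like the sketch below. The snapshot layout is not documented here, so this assumes the file is a single JSON object mapping KV keys (booking:<REF>, trash:<REF>) to booking JSON. If the real snapshot is shaped differently, the lookup needs adjusting. find_booking is a hypothetical helper.

```python
import json


def find_booking(snapshot_text, ref):
    """Return (kv_key, booking) for ref, or None if the snapshot lacks it.

    Assumption: the snapshot is one JSON object keyed by KV key.
    """
    data = json.loads(snapshot_text)
    # Prefer the live record; fall back to the soft-deleted copy.
    for prefix in ("booking:", "trash:"):
        key = prefix + ref
        if key in data:
            return key, data[key]
    return None
```

The returned booking can then be pretty-printed with json.dumps(booking, indent=2) and re-saved through the admin.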

4. Suspected unauthorized access

  1. Cloudflare Zero Trust → Access → review the application logs for /admin and /api/admin/*; confirm only the admin email authenticated.
  2. Rotate: Cloudflare account password + MFA, Access policy, and any exposed Worker secrets (GOOGLE_PLACES_KEY, AERODATABOX_KEY).
  3. Take a snapshot for evidence; review recent booking:/trash: changes.
  4. If client PII was exposed, see Client notification below.
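For step 1, exported Access log entries can be filtered for successful logins by anyone other than the admin. This is a sketch under assumptions: it treats each entry as a dict with user_email and allowed fields, which should be checked against the actual export before relying on it, and the admin address is supplied by the caller.

```python
def unexpected_logins(entries, admin_email):
    """Return allowed log entries whose user is not the admin.

    Assumption: each entry is a dict with 'user_email' and 'allowed' keys.
    """
    admin = admin_email.strip().lower()
    flagged = []
    for entry in entries:
        email = str(entry.get("user_email", "")).strip().lower()
        # Only successful authentications by someone else are suspicious here;
        # denied attempts are worth reviewing separately.
        if entry.get("allowed") and email != admin:
            flagged.append(entry)
    return flagged
```

An empty result supports "only the admin email authenticated"; any flagged entry should be saved as evidence before rotating credentials.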

5. Contact form failing

6. Third-party outage

Client notification (data events)

Notify affected clients without undue delay if their personal data was exposed or lost. Include what happened, what data, what you're doing, and what they should do. Check state breach-notification obligations for affected residents.

Recovery objectives (targets)

Exercise schedule

Twice a year: (1) roll back to a previous Worker deployment on a quiet day and confirm the site serves; (2) restore one booking from an R2 snapshot into a scratch namespace. Record the date and result here.