# Homelab Release Runbook — New Project → Local Host → Public
**Audience:** Work Lumen (and future me). This is the authoritative, battle-tested
procedure for taking a new alpha app from zero to (a) running on TrueNAS for LAN
testing, then (b) publicly reachable over HTTPS — using only Gitea Actions + frp +
IONOS Apache. No expensive per-project DevOps rediscovery.
**Status:** Proven twice end-to-end (InspectFlow, CannaManage). Supersedes the
stale [`homelab-proxy-architecture.md`](homelab-proxy-architecture.md:1) (that doc
describes a WireGuard plan that was **never used** — the VPS is OpenVZ and cannot
run WireGuard; we use **frp** instead).

---
## 0. The mental model (read this once)
```
PUBLIC (optional, additive)
  browser ─HTTPS─► IONOS Apache ─ProxyPass─► VPS frps ─frp tunnel─► TrueNAS frpc ─► frontend:PORT
                   (82.165.206.45,           (85.214.154.199)       (192.168.188.119)
                    Let's Encrypt)
LOCAL-FIRST (always works on its own)
  LAN browser ──────────────────────────────────────────────────► TrueNAS 192.168.188.119:HOSTPORT
```
Two **decoupled** phases:
1. **Local phase:** `git push` to `main` → Gitea Actions self-hosted runner on
TrueNAS builds + runs the stack in-place. App is live at
`http://192.168.188.119:<hostPort>`. **Zero VPS / IONOS involvement.** This is
where every project starts and stays during early alpha.
2. **Public phase** — purely *additive*. Run [`homelab-publish.sh`](../scripts/homelab-publish.sh:1)
once to wire the frp tunnel + IONOS vhost + Let's Encrypt cert. Nothing about
the app changes; you only add a tunnel from `frontend:PORT` out to the world.
To "unpublish", stop the frpc proxy block — local phase keeps working.
**Why frontend-only tunnelling:** the Next.js frontend proxies `/api/backend/*`
to the backend server-side (see the route.ts catch-all proxy in the template), so
only the frontend host port needs to be exposed. The backend and db never touch
the public path.

---
## 1. Fixed infrastructure (already exists — do not rebuild)
| Component | Where | Detail |
|---|---|---|
| Gitea | TrueNAS `192.168.188.119:30008` | source of truth; `http://192.168.188.119:30008/` |
| act_runner | TrueNAS container, **instance-level** | auto-picks-up **any** new repo — no per-repo runner registration needed |
| frps (frp server) | VPS `85.214.154.199:7000` | token auth; bind ports in the 300xx range |
| frpc (frp client) | TrueNAS `systemd` service | config `/mnt/VM_SSD_Pool/frp/frpc.toml`; reload `systemctl restart frpc` |
| IONOS Apache | `82.165.206.45` (apex `plate-software.de`) | terminates HTTPS, ProxyPass to VPS frps; acme.sh for certs |
| DNS | IONOS | each subdomain `A` record → **82.165.206.45** (the IONOS box, NOT the VPS) |
SSH aliases assumed: `ssh truenas`, `ssh ionos`, `ssh vps` (confirm in `~/.ssh/config`).

---
## 2. Port & subdomain registry ⚠️ SINGLE SOURCE OF TRUTH — update on every new project
**frp remote ports (on VPS frps).** Each project gets exactly one. Allocate the
next free one and record it here *before* writing any config.
| Project | frp remotePort | subdomain | frontend hostPort (LAN) | backend hostPort (LAN, debug) | Status |
|---|---|---|---|---|---|
| gitea | 30008 | (direct) | — | — | live |
| inspectflow | 30009 | inspectflow.plate-software.de | (Caddy-fronted) | — | live |
| cannamanage | 30010 | cannamanage.plate-software.de | 3000 | 8081→8080 | live |
| **— next free —** | **30011** | — | — | — | — |
**Allocation rules:**
- **frp remotePort**: strictly increment from the table. Next = **30011**.
- **frontend hostPort**: each compose project runs on its **own bridge network**, so
internal ports (3000 / 8080 / 5432) never collide across stacks. Only *published*
host ports can clash. Pick a unique LAN host port per project (e.g. 3000, 3001, …)
for the frontend; keep backend published only if you need LAN debugging (use a
unique port like 8081, 8082, …). **db must NOT publish a host port** (see §6).
- **subdomain**: `<project>.plate-software.de`, A-record → 82.165.206.45.
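
The "strictly increment" rule can be checked mechanically instead of by eye. A minimal sketch, assuming the registry rows keep the shape shown above (project in the first column, frp remotePort in the second); `next_free_port` is a hypothetical helper name, not part of the template or the publish script. It ignores the "next free" marker row and derives the value from the allocated rows only, so a stale marker cannot mislead it:

```bash
# next_free_port: read a Markdown registry table on stdin and print
# (highest allocated frp remotePort + 1). Exits non-zero if no port row is found.
next_free_port() {
  awk -F'|' '
    # With a leading pipe, field 2 is the project and field 3 the remotePort.
    { gsub(/[ *]/, "", $3) }                 # strip spaces and bold markers
    $3 ~ /^[0-9]+$/ && $2 !~ /next free/ {   # numeric port, not the marker row
      p = $3 + 0                             # force a numeric comparison
      if (p > max) max = p
    }
    END { if (max) print max + 1; else exit 1 }
  '
}

# usage (path is an assumption):
#   next_free_port < plans/homelab-release-runbook.md
```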
---
## 3. NEW PROJECT — local phase (every project does this)
Goal: app live at `http://192.168.188.119:<frontendHostPort>` via push-to-deploy.
1. **Create the repo from the template.**
- Generate from `homelab-app-template` in Gitea (or copy its `.gitea/`,
`docker-compose.truenas.yml`, frontend proxy route, `.env.example`).
- The template ships a working `.gitea/workflows/deploy.yml` and TrueNAS compose
override. You only fill in placeholders.
2. **Fill placeholders** (template uses `__PROJECT__`, `__FRONTEND_PORT__`,
`__BACKEND_PORT__`):
- compose `-p` project name = `__PROJECT__`
- frontend `ports: "__FRONTEND_PORT__:3000"`
- backend `ports: "__BACKEND_PORT__:8080"` (or drop entirely if no LAN debug)
- container names `__PROJECT__-frontend`, `__PROJECT__-backend`, `__PROJECT__-db`
3. **Set Gitea Actions secrets** (repo → Settings → Actions → Secrets). Minimum:
- `AUTH_SECRET`: `openssl rand -base64 32` (NextAuth)
- `JWT_SECRET`: `openssl rand -base64 32` (backend, if it issues JWTs)
- `DB_PASSWORD`: `openssl rand -base64 24`
```bash
# quick generate
for s in AUTH_SECRET JWT_SECRET DB_PASSWORD; do echo "$s=$(openssl rand -base64 32)"; done
```
4. **Push to `main`.** The instance-level act_runner picks it up automatically.
Watch the run:
```bash
# list recent runs via Gitea API (token in ~/.config or use web UI)
curl -s -H "Authorization: token $GITEA_TOKEN" \
"http://192.168.188.119:30008/api/v1/repos/<owner>/<repo>/actions/tasks" | jq '.workflow_runs[:5]'
```
Or just open `http://192.168.188.119:30008/<owner>/<repo>/actions`.
5. **Verify locally:**
```bash
curl -I http://192.168.188.119:<frontendHostPort>/ # expect 200 or 307→/login
```
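Step 2 (filling the placeholders) can be scripted rather than done by hand. A minimal sketch, assuming GNU `sed` (for `-i`) and values that contain no `/` or `&`; `fill_placeholders` and `leftover_placeholders` are hypothetical helper names, not something the template ships:

```bash
# fill_placeholders <project> <frontendPort> <backendPort> <file>...
# Replaces the three template placeholders in place.
fill_placeholders() {
  local project=$1 fport=$2 bport=$3
  shift 3
  sed -i \
    -e "s/__PROJECT__/${project}/g" \
    -e "s/__FRONTEND_PORT__/${fport}/g" \
    -e "s/__BACKEND_PORT__/${bport}/g" \
    "$@"
}

# leftover_placeholders <file>...
# Prints any remaining __NAME__ tokens; succeeds only if at least one is found.
leftover_placeholders() {
  grep -n '__[A-Z][A-Z_]*__' "$@"
}

# usage:
#   fill_placeholders myapp 3001 8082 docker-compose.truenas.yml .gitea/workflows/deploy.yml
#   leftover_placeholders docker-compose.truenas.yml && echo "unfilled placeholders remain"
```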
✅ At this point the app is fully usable on the LAN. You can stop here for as long
as you want. The next section is **optional** and additive.

---
## 4. GO PUBLIC — the switch (run once per project, when ready)
Everything below is automated by [`homelab-publish.sh`](../scripts/homelab-publish.sh:1).
Run it and skip to §5 to verify. The manual steps are documented here so the script
is auditable and so you can debug if a step fails.
**Prereq:** DNS A-record `<project>.plate-software.de → 82.165.206.45` exists and
has propagated (`dig +short <project>.plate-software.de` must return 82.165.206.45).
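
The propagation check can be turned into a gate that blocks until the record resolves correctly. A minimal sketch; `dns_points_to` and `wait_for_dns` are hypothetical helper names, and the 30 tries at 10 seconds each are arbitrary assumptions. The matching is done on whole lines, so a CNAME line in the `dig +short` output does not cause a false match:

```bash
# dns_points_to <expected_ip>
# Reads `dig +short` output on stdin; succeeds if one line is exactly the IP.
dns_points_to() {
  grep -qxF -- "$1"
}

# wait_for_dns <hostname> <expected_ip>
# Polls until the hostname resolves to the expected IP or the tries run out.
wait_for_dns() {
  local host=$1 ip=$2 i
  for i in $(seq 1 30); do
    dig +short "$host" | dns_points_to "$ip" && return 0
    sleep 10
  done
  return 1
}

# usage:
#   wait_for_dns <project>.plate-software.de 82.165.206.45 || echo "DNS not ready"
```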
### 4a. frp tunnel (TrueNAS frpc → VPS frps)
Append a proxy block to `/mnt/VM_SSD_Pool/frp/frpc.toml` on TrueNAS:
```toml
[[proxies]]
name = "__PROJECT__"
type = "tcp"
localIP = "127.0.0.1"
localPort = <frontendHostPort> # the LAN host port the frontend publishes
remotePort = <frpRemotePort> # from the registry, e.g. 30011
```
Reload: `ssh truenas 'systemctl restart frpc'`.
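
Appending by hand risks adding the same block twice on a re-run. A minimal sketch of an idempotent append, assuming each proxy is identified by its `name` line; `add_frpc_proxy` is a hypothetical helper name, and whether `homelab-publish.sh` guards against duplicates in this way is not confirmed here:

```bash
# add_frpc_proxy <frpc.toml> <project> <localPort> <remotePort>
# Appends a tcp proxy block unless a proxy with that name already exists.
add_frpc_proxy() {
  local conf=$1 name=$2 local_port=$3 remote_port=$4
  if grep -qF "name = \"${name}\"" "$conf"; then
    echo "proxy '${name}' already present in ${conf}, skipping" >&2
    return 0
  fi
  cat >> "$conf" <<EOF

[[proxies]]
name = "${name}"
type = "tcp"
localIP = "127.0.0.1"
localPort = ${local_port}
remotePort = ${remote_port}
EOF
}

# usage (on TrueNAS):
#   add_frpc_proxy /mnt/VM_SSD_Pool/frp/frpc.toml myapp 3001 30011 && systemctl restart frpc
```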
### 4b. IONOS Apache vhost
Create `/etc/apache2/sites-available/<project>.plate-software.de.conf`:
```apache
<VirtualHost *:80>
ServerName <project>.plate-software.de
Alias /.well-known/acme-challenge/ /var/www/html/.well-known/acme-challenge/
ProxyPass /.well-known/acme-challenge/ !
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/\.well-known/acme-challenge/
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L]
</VirtualHost>
<VirtualHost *:443>
ServerName <project>.plate-software.de
SSLEngine on
SSLCertificateFile /root/.acme.sh/<project>.plate-software.de_ecc/<project>.plate-software.de.cer
SSLCertificateKeyFile /root/.acme.sh/<project>.plate-software.de_ecc/<project>.plate-software.de.key
SSLCertificateChainFile /root/.acme.sh/<project>.plate-software.de_ecc/ca.cer
ProxyPreserveHost On
ProxyPass / http://85.214.154.199:<frpRemotePort>/
ProxyPassReverse / http://85.214.154.199:<frpRemotePort>/
RequestHeader set X-Forwarded-Proto https
RequestHeader set X-Real-IP %{REMOTE_ADDR}s
</VirtualHost>
```
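Enable the HTTP vhost first (needed for the ACME challenge) and issue the cert
before the 443 vhost goes live. Apache refuses to load a config whose
`SSLCertificateFile` does not exist yet, so if `configtest` fails on the first
enable, keep the `:443` block out of the file (or commented out) until §4c has
produced the cert: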
```bash
a2ensite <project>.plate-software.de.conf
apache2ctl configtest && systemctl reload apache2
```
### 4c. Let's Encrypt cert (acme.sh) — ⚠️ force the right CA
acme.sh defaults to **ZeroSSL**, which stalled on us (order stuck at
`retryafter=86400`). Always pin Let's Encrypt:
```bash
acme.sh --set-default-ca --server letsencrypt
acme.sh --issue -d <project>.plate-software.de -w /var/www/html --server letsencrypt
```
Then reload Apache so the 443 vhost picks up the freshly issued cert:
```bash
apache2ctl configtest && systemctl reload apache2
```
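Since the failure mode here was the wrong CA, it can be worth confirming who actually issued the cert. A minimal sketch that inspects the output of `openssl x509 -noout -issuer`; `issuer_is_letsencrypt` is a hypothetical helper name, and the cert path in the usage comment simply reuses the one from the vhost above:

```bash
# issuer_is_letsencrypt
# Reads the issuer line of a certificate on stdin; succeeds if it names Let's Encrypt.
issuer_is_letsencrypt() {
  grep -q "Let's Encrypt"
}

# usage (on the IONOS box):
#   openssl x509 -noout -issuer \
#     -in /root/.acme.sh/<project>.plate-software.de_ecc/<project>.plate-software.de.cer \
#     | issuer_is_letsencrypt || echo "cert was not issued by Let's Encrypt"
```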
---
## 5. Verify public
⚠️ Your workstation DNS may be stale-cached. Force-resolve to be sure you're
testing the real path, not a cache:
```bash
curl -I --resolve <project>.plate-software.de:443:82.165.206.45 \
https://<project>.plate-software.de/ # expect 307 → /login, valid TLS
# full smoke (login → an authed endpoint)
curl -s --resolve <project>.plate-software.de:443:82.165.206.45 \
-c /tmp/cj https://<project>.plate-software.de/api/auth/... # adapt per app
```
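For scripted checks, the "expect 200 or 307" rule can be made explicit instead of reading headers by eye. A minimal sketch; `http_status` and `status_ok` are hypothetical helper names, and treating only 200 and 307 as healthy is an assumption taken from the comments above:

```bash
# http_status <url> [extra curl args...]
# Prints only the HTTP status code of the response (000 if the request failed).
http_status() {
  local url=$1
  shift
  curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
}

# status_ok <code>
# 200, or a 307 redirect (to /login), counts as healthy; anything else fails.
status_ok() {
  case "$1" in
    200|307) return 0 ;;
    *)       return 1 ;;
  esac
}

# usage:
#   code=$(http_status https://<project>.plate-software.de/ \
#            --resolve <project>.plate-software.de:443:82.165.206.45)
#   status_ok "$code" || echo "public check failed with HTTP $code"
```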
Check that the db port is not published on the host (and, in the Gitea UI, that
the latest deploy run is green):
```bash
ssh truenas 'docker exec <project>-db sh -c "netstat -tln | grep 5432 || true"'  # listening inside the container only
ssh truenas 'ss -tln | grep 5432 || true'  # on the TrueNAS host: should print nothing (no host publish)
```
---
## 6. Security baseline (end-of-alpha minimum)
- **db is internal-only.** Never publish Postgres to the LAN. In
`docker-compose.truenas.yml` use `ports: !override []` on the db service to drop
any inherited host publish. The backend reaches it as `db:5432` on the compose
net; deploy-time role reconcile uses `docker exec`.
- **Secrets via Gitea Actions secrets**, never committed. Injected at job `env`
level in deploy.yml.
- **Postgres password rotation gotcha:** `POSTGRES_PASSWORD` only applies on first
volume init. A persistent volume keeps the *old* role password. The deploy.yml
includes an `ALTER USER ... WITH PASSWORD` reconcile step (guarded by
`if [ -n "$DB_PASSWORD" ]`) so rotating the secret actually takes effect.
- **Frontend verify** in deploy.yml uses a container-loopback node probe
(`docker exec <project>-frontend node -e "require('http').get(...)"`), NOT a host
wget — the host probe gave transient false-failures while the container was still
recreating.
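
The password reconcile step can be sketched as follows. This is an illustration under assumptions, not the actual step from deploy.yml: `alter_role_sql` is a hypothetical helper name, `DB_USER` is an assumed variable, and running `psql` as the `postgres` superuser is also an assumption. Single quotes in the password are doubled so that a generated secret cannot break the SQL literal:

```bash
# alter_role_sql <role> <password>
# Prints an ALTER USER statement with single quotes in the password doubled.
# Assumes the role name itself contains no double quotes.
alter_role_sql() {
  local role=$1 pw=$2
  pw=${pw//\'/\'\'}
  printf "ALTER USER \"%s\" WITH PASSWORD '%s';\n" "$role" "$pw"
}

# usage inside the deploy job (guarded as described above):
#   if [ -n "$DB_PASSWORD" ]; then
#     alter_role_sql "$DB_USER" "$DB_PASSWORD" \
#       | docker exec -i "<project>-db" psql -U postgres -v ON_ERROR_STOP=1
#   fi
```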
---
## 7. Auth gotchas (NextAuth v5 over the HTTPS→HTTP proxy boundary)
- Use **`auth()`**, not `getToken()`. `getToken`'s `__Secure-` cookie
autodetection breaks across the HTTPS-frontend → HTTP-internal boundary.
- Frontend env (set in compose override): `NEXTAUTH_URL` / `AUTH_URL` =
`https://<project>.plate-software.de`, `AUTH_TRUST_HOST=true`,
`BACKEND_URL=http://backend:8080`.
- The server-side proxy route (`/api/backend/[...path]/route.ts`) injects the
Bearer token from the session and streams bodies with `duplex: "half"`. This is
the systemic fix that unblocked CannaManage — it ships in the template.
---
## 8. Quick reference — "do the whole public switch"
```bash
# 1. allocate port in §2 registry (next = 30011), commit the runbook update
# 2. create DNS A-record <project>.plate-software.de → 82.165.206.45 (IONOS panel)
# 3. run the switch script:
./scripts/homelab-publish.sh <project> <frontendHostPort> <frpRemotePort>
# 4. verify (§5)
```
---
## Appendix — files this runbook references
- Template repo: `homelab-app-template` (Gitea) — scaffold for new projects
- Switch script: [`scripts/homelab-publish.sh`](../scripts/homelab-publish.sh:1)
- Proven examples: CannaManage `deploy.yml` + `docker-compose.truenas.yml`,
InspectFlow (Caddy-fronted variant)
- Handover notes: `lumen-exchange/from-homelab/2026-06-22-cannamanage-public-hosting-LIVE.md`