Files
cannamanage/docs/ci-cd-review.md
T
Patrick Plate ade9673f02
CI — Build, Lint & Security Scan / frontend (push) Has been cancelled
CI — Build, Lint & Security Scan / image-scan (push) Has been cancelled
CI — Build, Lint & Security Scan / secrets-scan (push) Has been cancelled
CI — Build, Lint & Security Scan / backend (push) Has been cancelled
Deploy to TrueNAS / deploy (push) Has been cancelled
fix: harden CI security gates, parallelize builds, externalize secrets
- Make OWASP, Gitleaks, pnpm audit blocking (remove || true fallbacks)
- Add Maven -T 1C for parallel reactor threads
- Fix parallel Docker build race condition (PID tracking + set -euo pipefail)
- Externalize JWT/NextAuth secrets via env vars with dev-only defaults
- Add .env.example with generation instructions
- Add CI/CD infrastructure review document
2026-06-19 16:04:09 +02:00

258 lines
10 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# CI/CD Infrastructure Review — CannaManage
**Date:** 2026-06-19
**Reviewer:** Lumen (Code + Security)
**Scope:** `.gitea/workflows/ci.yml`, `.gitea/workflows/deploy.yml`, `Dockerfile.backend`, `docker-compose*.yml`, `.snyk`
**Verdict:** ⚠️ Approved with findings — 2 High, 4 Medium, 3 Low
---
## 1. Architecture Overview
```
Push to main
├── CI Pipeline (ci.yml)
│ ├── backend (compile + test + OWASP SCA) ─┐
│ ├── frontend (lint + type-check + pnpm audit) ─┼── parallel
│ ├── secrets-scan (Gitleaks) ─┘
│ └── image-scan (Trivy) ── needs: [backend, frontend]
└── Deploy Pipeline (deploy.yml)
└── docker compose build + up + healthcheck
```
**Good:** CI jobs are already parallelized. `concurrency` prevents redundant runs. `cancel-in-progress: true` on CI is correct (abort stale runs). `cancel-in-progress: false` on deploy prevents interrupted deployments.
---
## 2. Security Findings
### ❌ HIGH — S-01: Hardcoded JWT Secret in docker-compose.yml
**File:** `docker-compose.yml:34`
```yaml
CANNAMANAGE_SECURITY_JWT_SECRET: hmSULRhmFYcOXDwYxb7bGXp7Bovh+hXgua/VqF44Ts/N+8YELWpWiqQ+aLrymCuM
```
**Risk:** This base64-encoded HMAC key is committed to git. Anyone with repo access can forge valid JWTs. If this key is the same as production, it's a full auth bypass.
**Mitigation:**
- Replace with `${JWT_SECRET}` and use `.env` file (gitignored) for local dev
- Alternatively, generate a random key at container startup for dev-only:
```yaml
CANNAMANAGE_SECURITY_JWT_SECRET: ${JWT_SECRET:-$(openssl rand -base64 48)}
```
- Verify that production (`docker-compose.prod.yml`) uses env vars ✅ — it does use `${JWT_SECRET}`
**Accepted Risk?** If this is intentionally a dev-only secret different from production, document it in `.snyk` and add a comment. Gitleaks *should* flag this — check if it's being suppressed.
---
### ❌ HIGH — S-02: Hardcoded NextAuth Secret in docker-compose.yml
**File:** `docker-compose.yml:54` and `docker-compose.truenas.yml:19`
```yaml
NEXTAUTH_SECRET: docker-dev-nextauth-secret-minimum-32chars
AUTH_SECRET: docker-dev-nextauth-secret-minimum-32chars
```
**Risk:** Same as S-01. If this secret is reused on the TrueNAS deployment (which is internet-accessible via `cannamanage.plate-software.de`), session cookies can be forged.
**Mitigation:**
- `docker-compose.truenas.yml` should reference `${AUTH_SECRET}` from an env file
- The word "dev" in the value suggests it's dev-only, but TrueNAS *is* the production host per `deploy.yml`
---
### ⚠️ MEDIUM — S-03: OWASP Dependency-Check Always Passes (|| true)
**File:** `ci.yml:44`
```yaml
run: |
./mvnw org.owasp:dependency-check-maven:check \
... || true
```
**Risk:** The `|| true` means the step NEVER fails the build, regardless of CVSS score. The `failBuildOnCVSS=7` flag is effectively useless.
**Mitigation:** Remove `|| true`. If you need to allow known issues, use the suppression file that's already configured (`-DsuppressionFile=.snyk-maven-suppressions.xml`).
---
### ⚠️ MEDIUM — S-04: Gitleaks Non-Blocking (echo instead of exit)
**File:** `ci.yml:168`
```yaml
gitleaks detect ... --exit-code 1 \
|| echo "::error::Gitleaks found potential secrets in the repository"
```
**Risk:** `--exit-code 1` should fail the step, but the `||` fallback to `echo` means the step always succeeds. Secrets can be pushed without blocking the pipeline.
**Mitigation:** Remove the `|| echo` fallback. Let Gitleaks fail the build:
```yaml
run: gitleaks detect --source . --report-format json --report-path gitleaks-report.json --exit-code 1
```
---
### ⚠️ MEDIUM — S-05: pnpm audit Non-Blocking
**File:** `ci.yml:83`
```yaml
pnpm audit --audit-level=high || echo "::warning::pnpm audit found vulnerabilities"
```
**Risk:** High/Critical frontend CVEs produce a warning but don't fail the build.
**Mitigation:** Remove `|| echo ...` to make High/Critical CVEs block the pipeline. Use `.snyk` or `pnpm audit --ignore-path` for documented exceptions.
---
### ⚠️ MEDIUM — S-06: Parallel Docker Build Race Condition
**File:** `ci.yml:96-99`
```yaml
docker build -t cannamanage-backend:scan -f Dockerfile.backend . &
docker build -t cannamanage-frontend:scan -f cannamanage-frontend/Dockerfile cannamanage-frontend/ &
wait
```
**Risk:** If either `docker build` fails, `wait` returns the exit code of the *last* process that exited, not the first failure. One image could fail silently while the other succeeds.
**Mitigation:** Use `set -euo pipefail` and capture PIDs:
```yaml
run: |
set -euo pipefail
docker build -t cannamanage-backend:scan -f Dockerfile.backend . &
PID1=$!
docker build -t cannamanage-frontend:scan -f cannamanage-frontend/Dockerfile cannamanage-frontend/ &
PID2=$!
wait $PID1 $PID2
```
With `wait $PID1 $PID2`, if either fails, the step fails.
---
### ️ LOW — S-07: Trivy Uses curl|sh Install Pattern
**File:** `ci.yml:103`
```yaml
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
```
**Risk:** Supply-chain risk — if the upstream script is compromised, arbitrary code runs in CI. Using `main` branch means no version pinning.
**Mitigation:** Pin to a specific version tag or commit hash, or use the official Trivy GitHub Action:
```yaml
- uses: aquasecurity/trivy-action@v0.28.0
with:
image-ref: cannamanage-backend:scan
severity: HIGH,CRITICAL
exit-code: 1
```
---
### ️ LOW — S-08: Deploy Workflow Uses Hardcoded IP
**File:** `deploy.yml:60, 74`
```yaml
wget -q -O /dev/null http://192.168.188.119:8081/actuator/health
```
**Risk:** Low — private IP only reachable on LAN. But if the TrueNAS IP changes (DHCP), deploy silently breaks.
**Mitigation:** Use `localhost` or a DNS name (e.g., `truenas.local`) instead. Or use container-internal checks via `docker compose exec`.
---
### ️ LOW — S-09: No Image Signing or SBOM
**Risk:** Docker images are built and deployed without signatures or Software Bill of Materials. Can't verify image integrity post-deployment.
**Mitigation (future):** Add `docker sbom` or `syft` for SBOM generation. Consider `cosign` for image signing when maturing to multi-env deployments.
---
## 3. Code Quality Findings
### CI Workflow (ci.yml)
| # | Check | Status | Notes |
|---|-------|--------|-------|
| 1 | Jobs parallelized | ✅ | backend, frontend, secrets-scan run concurrently |
| 2 | Concurrency control | ✅ | Group + cancel-in-progress |
| 3 | Maven caching | ✅ | `cache: maven` in setup-java |
| 4 | Maven `-T 1C` | ✅ | Multi-threaded reactor (just added) |
| 5 | Surefire forkCount | ✅ | `forkCount=2` in parent POM |
| 6 | Artifact upload | ✅ | Reports preserved on failure (`if: always()`) |
| 7 | Frontend lockfile | ✅ | `--frozen-lockfile` prevents silent dep changes |
| 8 | Trivy ignore-unfixed | ✅ | Only actionable vulns fail the build |
| 9 | Security gates blocking | ❌ | All 3 security tools use `|| true/echo` fallbacks |
| 10 | Docker parallel build | ⚠️ | Race condition on failure (see S-06) |
### Deploy Workflow (deploy.yml)
| # | Check | Status | Notes |
|---|-------|--------|-------|
| 1 | No test step | ⚠️ | Comment says "tests remain local-only gate" — acceptable for self-hosted |
| 2 | Health checks | ✅ | 20 retries × 6s for backend, 10 × 5s for frontend |
| 3 | Image pruning | ✅ | Prevents disk exhaustion |
| 4 | `set -euo pipefail` | ✅ | Fail-fast on errors |
| 5 | Concurrency non-cancelling | ✅ | Prevents interrupted deploys |
### Dockerfile.backend
| # | Check | Status | Notes |
|---|-------|--------|-------|
| 1 | Multi-stage build | ✅ | Builder + runtime stages |
| 2 | Non-root user | ✅ | `appuser` in runtime |
| 3 | Alpine base | ✅ | Minimal attack surface |
| 4 | Layer caching | ✅ | POMs copied before source |
| 5 | No HEALTHCHECK | ⚠️ | Compose defines it, but image itself has no HEALTHCHECK |
| 6 | `dependency:go-offline` | ✅ | Pre-downloads deps for cache |
| 7 | No COPY of secrets | ✅ | No .env or keys copied |
### Docker Compose Files
| # | Check | Status | Notes |
|---|-------|--------|-------|
| 1 | Production uses env vars | ✅ | `docker-compose.prod.yml` uses `${...}` |
| 2 | Resource limits (prod) | ✅ | Memory/CPU limits set |
| 3 | Log rotation (prod) | ✅ | `json-file` driver with max-size |
| 4 | Dev compose has secrets | ❌ | S-01, S-02 — hardcoded in plain text |
| 5 | Postgres healthcheck | ✅ | `pg_isready` |
| 6 | Service dependencies | ✅ | `condition: service_healthy` |
| 7 | Prod binds 127.0.0.1 only | ✅ | Prevents external access without reverse proxy |
| 8 | Dev binds 0.0.0.0 | ⚠️ | `ports: "8080:8080"` is all-interfaces |
---
## 4. Snyk Policy (.snyk)
✅ **Well-structured.** CSRF ignore is correctly scoped to the JWT API filter chain with clear rationale and expiry date. No concerns.
---
## 5. Recommendations (Priority Order)
| Priority | Finding | Action |
|----------|---------|--------|
| 🔴 High | S-01, S-02 | Move secrets to `.env` (gitignored) or generate at runtime |
| 🟡 Medium | S-03, S-04, S-05 | Remove `|| true` / `|| echo` from security tools — make them blocking |
| 🟡 Medium | S-06 | Fix parallel Docker build error propagation |
| 🟢 Low | S-07 | Pin Trivy version or use official action |
| 🟢 Low | S-08 | Replace hardcoded IP with DNS/localhost |
| 🟢 Low | S-09 | Add SBOM generation (future) |
---
## 6. Summary
The CI/CD setup is **architecturally sound** — parallel jobs, proper caching, multi-stage Docker builds, non-root containers, and comprehensive security scanning (OWASP, Trivy, Gitleaks, pnpm audit).
The main weakness is that **all security gates are non-blocking** (`|| true` everywhere), which means the scanning tools generate reports but never actually prevent a vulnerable build from deploying. This is the single most impactful fix: remove the fallbacks and let CVEs/secrets block the pipeline.
The hardcoded secrets in `docker-compose.yml` are a moderate risk — they're clearly dev-only values different from production, but the TrueNAS deployment (which is internet-facing) reuses the same NEXTAUTH_SECRET, which should be fixed.