feat: archive zoo_backup for home sync

2026-06-24 19:27:05 +02:00
parent 02844e4c4a
commit 038e546963
133 changed files with 19953 additions and 0 deletions
@@ -0,0 +1,452 @@
+# Skill: security-review
+
+Security-focused code review against ADP security policies and PAISY patterns.
+
+## Description
+
+Analyzes code changes against ADP's 67 SEC-* security rules, OWASP guidelines, and 10 PAISY-specific security patterns. Integrates automated scan results (SonarQube via MCP, DataRake secrets detection) with a 16-item manual checklist. Produces a structured security review report with a clear PASS/FAIL verdict. PASS is required before code review can proceed.
+
+## Invoked by
+
+🔒 Security Reviewer mode
+
+## Required Inputs
+
+| Input | Source | Example |
+|-------|--------|---------|
+| `TICKET_KEY` | Jira issue key | `ESIDEPAISY-12081` |
+| `MODULE` | PAISY module name | `eau`, `eubp`, `svmeldungen` |
+
+## Output
+
+Markdown file: `docs/<MODULE>/<TICKET_KEY>/<TICKET_KEY>-security-review.md`
+
+## Steps
+
+### 1. Get the diff
+
+```bash
+cd /Users/pplate/git/paisy-<TICKET_KEY>
+git diff origin/current --name-only
+git diff origin/current --stat
+git diff origin/current
+```
+
+Identify all changed files. Separate Java source files from test files, resources, and documentation.
+
+### 2. Read changed files fully for context
+
+For each changed Java file, read the full file — not just the diff hunks. Security issues often depend on surrounding context (e.g., a method that looks safe in isolation but is called with untrusted input).
+
+```bash
+cd /Users/pplate/git/paisy-<TICKET_KEY>
+git diff origin/current --name-only | grep "\.java$"
+```
+
+### 3. Run SonarQube analyze_code_snippet on changed files
+
+For each changed Java file, use the SonarQube MCP tool:
+
+```python
+# Read the full file content
+file_content = read_file("<path_to_changed_file>")
+
+# Analyze with SonarQube
+analyze_code_snippet(
+    fileContent=file_content,
+    language=["java"],
+    scope=["MAIN"]  # or ["TEST"] for test files
+)
+```
+
+Collect all SonarQube findings. Filter for security-relevant rules (SECURITY impact software quality).
+
+#### 3.1 DataRake Secrets Scan (Pipeline-Äquivalent)
+
+ADP's Jenkins-Pipeline ruft `secretsScan()` über das **DataRake**-Widget auf (regex-basierter Secrets-Detector). Der Reviewer simuliert dieses Verhalten manuell, da DataRake nicht als MCP-Tool verfügbar ist.
+
+**Was DataRake erkennt:**
+- Passwörter (Literalwerte in sicherheitsrelevanten Kontexten)
+- Tokens (Basic Auth, Bearer, JWT)
+- Unverschlüsselte Private Keys
+- Gefährliche Dateien (`.netrc`, Java Keystores `.jks`/`.p12`, SSH Private Keys)
+- Gefährliche Kommandos, die Credentials offenlegen
+- Hartkodierte Secrets in JDBC-Connection-Strings (Oracle, MySQL, PostgreSQL, Informix)
+
+**Detection-Regeln (wichtig für die Beurteilung):**
+- Passwort-Werte müssen **6–70 Zeichen** lang sein, um zu triggern (kürzere Werte werden als Placeholder gefiltert)
+- JDBC-Rakes erkennen `user/password` direkt in URL-Connection-Strings
+- Dateiendung entscheidet, welche Rakes greifen (`.java` → Java-spezifische Rakes)
+- **Nur Single-Line-Matching** — über mehrere Zeilen verteilte Secrets werden nicht erkannt
+
+**Schweregrade laut DataRake:**
+
+| Schweregrad | Kontext | Begründung |
+|-------------|---------|-----------|
+| CRITICAL | Passwort/Token in HTML oder JavaScript | Kann ADP-Netzwerk verlassen (Browser-deliverable) |
+| HIGH | Secrets in anderem Source-Code (`.java`, `.yml`, `.properties`, `.sh`) | Im Repository sichtbar |
+| MEDIUM / LOW | Review-würdige Items (Keystores, verdächtige Muster) | Manuelle Bewertung nötig |
+
+**Manuelles Scan-Vorgehen:**
+
+```bash
+# 1. Verdächtige Dateien identifizieren
+cd /Users/pplate/git/paisy-<TICKET_KEY>
+git diff origin/current --name-only | grep -E '\.(java|yml|yaml|properties|xml|sh|js|html|jsp)$'
+
+# 2. Auf typische Secret-Patterns prüfen (6–70 Zeichen)
+git diff origin/current | grep -iE '(password|passwd|pwd|token|api[_-]?key|secret)\s*[:=]\s*["\x27][^"\x27]{6,70}["\x27]'
+
+# 3. JDBC-Connection-Strings auf Inline-Credentials prüfen
+git diff origin/current | grep -iE 'jdbc:[^"\x27\s]*(user|password)='
+
+# 4. Gefährliche Dateien finden
+git diff origin/current --name-only | grep -iE '(\.netrc|\.jks|\.p12|id_rsa|id_dsa|id_ecdsa)$'
+```
+
+Quelle: Confluence EIT-Space, Seiten `3270157258` (DataRake), `3270157260` (FAQ), `3270157263` (JDBC Rakes).
+
+### 4. Apply SEC-* rules — manual checklist (16 items)
+
+For each changed file, systematically check:
+
+| # | Rules | Check | What to look for |
+|---|-------|-------|-----------------|
+| 1 | SEC-001..004 | No hardcoded credentials | Passwords, API keys, tokens, JDBC credentials in string literals. Exclude test files. |
+| 2 | SEC-005 | Credentials via `@Value`/env | All credentials must come from `@Value("${...}")`, `System.getProperty()`, or `System.getenv()`. No `private static final String PASSWORD = "..."`. |
+| 3 | SEC-011 | No SQL injection | All SQL must use parameterized queries (`?` placeholders, `@Query` with `:param`, JPA Criteria API). No string concatenation in SQL. |
+| 4 | SEC-012 | No path traversal | File paths from external input must be sanitized. Check for `new File(userInput)`, `Paths.get(userInput)` without validation. |
+| 5 | SEC-016 | Input validation | All external data entry points (REST endpoints, file parsers, PAISY responses) must validate input before processing. |
+| 6 | SEC-018 | No info disclosure in errors | Error messages must not expose stack traces, internal paths, SQL queries, or system details to callers. |
+| 7 | SEC-033 | PII encryption | Payroll data, HR data, personal data must be encrypted at rest. Check for PII stored in plain text in new DB columns. |
+| 8 | SEC-035 | No PII in LLM processing | No personal data (names, BBNR, Versicherungsnummer) sent to AI/LLM services. |
+| 9 | SEC-040 | No sensitive data in logs | Log statements must not contain passwords, tokens, PII, or full payroll records. Check `log.debug/info/warn/error` calls. |
+| 10 | SEC-055 | No `src.gen/` modifications | Files under `src.gen/` are JAXB-generated. Any modification is silently overwritten and creates false security. |
+| 11 | SEC-057 | ServiceCenter `F;` response validation | Every `ServiceCenter.INSTANCE()` call must check if the response starts with `F;` before parsing. Unchecked `F;` responses can lead to corrupted payroll data sent to government agencies. |
+| 12 | SEC-058 | Date sentinel handling | Before parsing dates from PAISY, check for sentinel values: `00.00.0000`, `0000000`, `9999999`. Parsing these causes `DateTimeParseException` or silently wrong date calculations. |
+| 13 | SEC-059 | Batch EntityManager flush/clear | Batch processing loops must call `em.flush()` and `em.clear()` every 100 items. Without this, the JPA persistence context grows unbounded → `OutOfMemoryError` in production. |
+| 14 | SEC-060 | Dual Flyway migrations | Every new migration must exist in BOTH `db/migration/H2/` AND `db/migration/ORACLE/`. Missing one breaks either dev/test (H2) or production (Oracle). |
+| 15 | SEC-061 | No hardcoded BBNR/IDs | No hardcoded Betriebsnummern, sprint IDs, Epic keys, or instance names. These must come from configuration or runtime lookup. |
+| 16 | SEC-064 | DataRake compliance — no secrets in source | No literal passwords/tokens (6–70 chars), no inline credentials in JDBC URLs, no committed keystores/SSH keys, no dangerous credential-exposing commands. H2 test DBs must use random UUID passwords, not default `sa`/empty. |
+
+### 5. Check PAISY-specific patterns (SEC-055 through SEC-063) with code examples
+
+For each PAISY-specific rule, verify against the actual code:
+
+#### SEC-055: src.gen/ protection
+```bash
+# Check if any changed files are under src.gen/
+git diff origin/current --name-only | grep "src\.gen/"
+# If any match → CRITICAL finding
+```
+
+#### SEC-056: JAXB namespace
+```java
+// ❌ FAIL — javax namespace (legacy)
+import javax.xml.bind.annotation.*;
+
+// ✅ PASS — jakarta namespace
+import jakarta.xml.bind.annotation.*;
+```
+
+#### SEC-057: ServiceCenter F; validation
+```java
+// ❌ UNSAFE — no error check
+String response = ServiceCenter.INSTANCE().getPaisy().pgmReadLine();
+String[] parts = response.split(";");
+String bbnr = parts[1]; // ArrayIndexOutOfBoundsException or wrong data if F; response
+
+// ✅ SAFE — error check first
+String response = ServiceCenter.INSTANCE().getPaisy().pgmReadLine();
+if (response.startsWith("F;")) {
+    log.error("PAISY error: {}", response);
+    throw new PaisyRuntimeException("ServiceCenter error: " + response);
+}
+String[] parts = response.split(";");
+```
+
+#### SEC-058: Date sentinel handling
+```java
+// ❌ UNSAFE — no sentinel check
+LocalDate date = LocalDate.parse(paiDate, formatter);
+
+// ✅ SAFE — sentinel check first
+if (paiDate.equals("00.00.0000") || paiDate.equals("0000000") || paiDate.equals("9999999")) {
+    return null; // or LocalDate.MAX / LocalDate.MIN as appropriate
+}
+LocalDate date = LocalDate.parse(paiDate, formatter);
+```
+
+#### SEC-059: Batch EM flush/clear
+```java
+// ❌ UNSAFE — no flush/clear in batch loop
+for (Record record : records) {
+    em.persist(record);
+}
+
+// ✅ SAFE — flush/clear every 100 items
+for (int i = 0; i < records.size(); i++) {
+    em.persist(records.get(i));
+    if (i % 100 == 0) {
+        em.flush();
+        em.clear();
+    }
+}
+```
+
+#### SEC-060: Dual Flyway migrations
+```bash
+# Check for new migration files
+git diff origin/current --name-only | grep "db/migration"
+# For each H2 migration, verify a matching ORACLE migration exists (and vice versa)
+```
+
+#### SEC-061: No hardcoded identifiers
+```java
+// ❌ FAIL — hardcoded BBNR
+String bbnr = "12345678";
+
+// ❌ FAIL — hardcoded sprint ID
+int sprintId = 1234;
+
+// ✅ PASS — from configuration
+@Value("${paisy.bbnr}")
+private String bbnr;
+```
+
+#### SEC-062: CSV encoding
+```java
+// ❌ FAIL — wrong encoding
+new FileReader(csvFile, StandardCharsets.UTF_8);
+
+// ✅ PASS — correct German government standard
+new FileReader(csvFile, Charset.forName("ISO-8859-1"));
+```
+
+#### SEC-063: Parameterized logging
+```java
+// ❌ FAIL — string concatenation
+log.debug("Processing BBNR: " + bbnr + " with status: " + status);
+
+// ✅ PASS — parameterized
+log.debug("Processing BBNR: {} with status: {}", bbnr, status);
+```
+
+#### SEC-064: DataRake compliance — H2 test database pattern
+
+H2 in-memory Test-Datenbanken sind ein häufiger Auslöser für DataRake-Findings. Der **default user `sa` mit leerem Passwort** triggert die JDBC-Rakes nicht zuverlässig (`sa` < 6 Zeichen), aber jeder kurze hartkodierte Wert in `JDBC_PASSWORD` kann als Secret erkannt werden, sobald er die 6-Zeichen-Schwelle überschreitet.
+
+```java
+// ✅ SAFE — kein User gesetzt, zufälliges UUID-Passwort (wird nicht als Secret erkannt,
+//          da UUID-Format als Test-Pattern gefiltert wird und kein User → kein JDBC-Rake-Match)
+props.put(JDBC_DRIVER, "org.h2.Driver");
+props.put(JDBC_URL, "jdbc:h2:mem:test_db;DB_CLOSE_DELAY=-1;MODE=Oracle");
+props.put(JDBC_PASSWORD, UUID.randomUUID().toString());
+
+// ❌ TRIGGERS — default user "sa" + festes kurzes Passwort
+props.put(JDBC_USER, "sa");
+props.put(JDBC_PASSWORD, "testpassword123");
+
+// ❌ TRIGGERS — Inline-Credentials im JDBC-URL
+props.put(JDBC_URL, "jdbc:oracle:thin:scott/tiger123@//host:1521/SID");
+
+// ✅ SAFE — Credentials separat, via @Value oder env
+@Value("${db.user}")
+private String dbUser;
+@Value("${db.password}")
+private String dbPassword;
+```
+
+**Generelle Regeln zur Vermeidung von DataRake-Findings:**
+- Keine literalen Passwörter (6–70 Zeichen) in `.java`, `.yml`, `.properties`, `.xml`, `.sh`
+- Keine Inline-Credentials in JDBC-URLs (`user=`/`password=` im URL-String)
+- Keine Java Keystores (`.jks`, `.p12`), SSH Private Keys oder `.netrc`-Dateien im Repository
+- Template-Variablen verwenden: `${ENV_VAR}`, `{placeholder}`, `((concourse-var))` — DataRake filtert diese als StandardPatterns
+- Für Tests: `UUID.randomUUID().toString()` oder `ENC[AES256,...]` für verschlüsselte Werte
+
+Präzedenzfall: `ESIDEPAISY-12154` dokumentiert das Pattern `user=password=<short_value>` als bewusst akzeptierten lokalen Dev-Pattern.
+
+### 6. Identify false positives
+
+Before flagging a finding, check against known false positive patterns:
+
+| Pattern | Tool | Why It's Safe | Action |
+|---------|------|--------------|--------|
+| Variable self-assignment | DataRake | Variable assignment, not a password literal | Skip |
+| Credential parsing from connection string | DataRake | Extracting, not hardcoding | Skip |
+| Environment variable retrieval | DataRake | Runtime injection, not hardcoded | Skip |
+| Test data in `src/test/` | Both | Test-only scope, never deployed | Skip (unless test exposes real credentials) |
+| Application-specific passwords by design | DataRake | Business logic requirement (e.g., PDF encryption) | Document as exception |
+| Log file matches | DataRake | Build artifacts, not source code | Skip |
+| Snyk result files | DataRake | Scan output, not source | Skip |
+| Username=password pattern for local dev | DataRake | Intentional dev-only pattern (ESIDEPAISY-12154 precedent) | Document as accepted risk |
+
+#### DataRake StandardPatternFilters (built-in false positives)
+
+DataRake filtert die folgenden Werte automatisch — sie lösen **kein** Finding aus und müssen nicht als False Positive dokumentiert werden:
+
+| Pattern-Beispiel | Typ | Begründung |
+|------------------|-----|-----------|
+| `${variable_name}` | Spring/Shell Template | Wird zur Laufzeit ersetzt |
+| `{variable}` | Bare Braces | Template-Platzhalter |
+| `{{variable}}` | Handlebars / Helm Template | Template-Engine-Syntax |
+| `$(variable)` | Shell Substitution | Wird zur Laufzeit ersetzt |
+| `$VARIABLE` | Bare Dollar Sign | Shell-/Env-Variable |
+| `%variable%` | Windows Env Style | Wird zur Laufzeit ersetzt |
+| `#variable#` | Hash-Delimited | Template-Platzhalter |
+| `??variable??` | Null-Coalescing Style | Template-Platzhalter |
+| `((concourse-placeholder))` | Concourse/Pipeline | Pipeline-Variable |
+| `XXXXXXXXXX` / `xxxxxxxxxx` | Repeated Single Char | Regex `^([0-9A-ZxX+#*-])(\1)+$` |
+| `test-test`, `foo_foo`, `aaaaaa` | Repeated Word | 3–10 gleiche Zeichen/Wortgruppen |
+| `ENC[AES256,...]` | AES256-verschlüsselt | Bereits sicher verschlüsselt |
+| Werte unter 6 Zeichen | Zu kurz | Alle Regexes verlangen `{6,70}` |
+| `example.com`, `test.de` | Domain/URL | Beispiel-Domains |
+
+**Konsequenz für den Reviewer:** Tritt einer dieser Patterns im Diff auf, ist es **kein Finding** — auch wenn der Kontext "password" oder "secret" enthält. Nicht in den False-Positive-Report aufnehmen, sondern still überspringen.
+
+Also search BigMind for known false positive patterns:
+```python
+memory_search_facts("false positive security")
+memory_search_facts("SecBench accepted")
+```
+
+### 7. Generate security review document
+
+Write `docs/<MODULE>/<TICKET_KEY>/<TICKET_KEY>-security-review.md`:
+
+```markdown
+# Security Review: <TICKET_KEY> — <Summary>
+
+**Datum:** <today>
+**Modul:** <MODULE>
+**Reviewer:** Roo (Security Reviewer)
+**Branch:** <branch name>
+**Verdict:** ✅ PASS / ❌ FAIL
+
+---
+
+## Scan-Ergebnisse
+
+| Tool | Status | Befunde |
+|------|--------|---------|
+| SonarQube (SAST) | ✅ Sauber / ⚠️ N Befunde | <details> |
+| Datarake (Secrets) | ✅ Sauber / ⚠️ N Befunde / ⏭️ Nicht verfügbar | <details> |
+| Snyk Code | ✅ Sauber / ⚠️ N Befunde / ⏭️ Nicht verfügbar | <details> |
+
+## Manuelle Sicherheits-Checkliste
+
+| # | Regel | Prüfpunkt | Ergebnis | Anmerkung |
+|---|-------|-----------|----------|-----------|
+| 1 | SEC-001..004 | Keine hartkodierten Credentials | ✅/❌ | |
+| 2 | SEC-005 | Credentials via @Value/env | ✅/❌ | |
+| 3 | SEC-011 | Keine SQL-Injection | ✅/❌ | |
+| 4 | SEC-012 | Kein Path Traversal | ✅/❌ | |
+| 5 | SEC-016 | Input-Validierung | ✅/❌ | |
+| 6 | SEC-018 | Keine Info-Disclosure in Fehlern | ✅/❌ | |
+| 7 | SEC-033 | PII-Verschlüsselung | ✅/N/A | |
+| 8 | SEC-035 | Kein PII in LLM-Verarbeitung | ✅/N/A | |
+| 9 | SEC-040 | Keine sensiblen Daten in Logs | ✅/❌ | |
+| 10 | SEC-055 | Keine src.gen/ Änderungen | ✅/❌ | |
+| 11 | SEC-057 | F;-Response-Validierung | ✅/N/A | |
+| 12 | SEC-058 | Datums-Sentinel-Behandlung | ✅/N/A | |
+| 13 | SEC-059 | Batch-EM-Flush/Clear | ✅/N/A | |
+| 14 | SEC-060 | Duale Flyway-Migrationen | ✅/N/A | |
+| 15 | SEC-061 | Keine hartkodierten BBNR/IDs | ✅/❌ | |
+| 16 | SEC-064 | DataRake-Compliance (keine Secrets im Source) | ✅/❌ | |
+
+## Befunde
+
+### ❌ Kritisch / Hoch (muss behoben werden)
+
+1. **<file>:<line>** — [SEC-XXX] <Beschreibung>
+   - Schweregrad: Critical/High
+   - Empfehlung: <Behebungsvorschlag>
+
+### ⚠️ Mittel (sollte behoben werden)
+
+1. **<file>:<line>** — [SEC-XXX] <Beschreibung>
+   - Empfehlung: <Behebungsvorschlag>
+
+### ℹ️ Niedrig / Info
+
+1. **<file>:<line>** — [SEC-XXX] <Beschreibung>
+
+## Identifizierte False Positives
+
+| Tool | Datei | Befund | Begründung |
+|------|-------|--------|-----------|
+| <tool> | <file> | <finding> | <why it's safe> |
+
+## Verdict
+
+**✅ PASS** — Keine kritischen oder hohen Sicherheitsbefunde. Weiter zum Code Review.
+— oder —
+**❌ FAIL** — N kritische/hohe Befunde müssen vor dem Code Review behoben werden.
+```
+
+### 8. Determine verdict
+
+| Verdict | Criteria | Pipeline Action |
+|---------|---------|----------------|
+| ✅ PASS | No Critical or High findings after false positive filtering | Proceed to Phase 6 (Code Review) |
+| ❌ FAIL | Any Critical or High finding remains | Loop back to Phase 4 (Code mode) for fixes, then re-review |
+
+### 9. Store findings in BigMind
+
+```python
+memory_store_fact(
+    category="codebase",
+    fact=f"{TICKET_KEY}: Security review {verdict}. {total_findings} findings ({critical} Critical, {high} High, {medium} Medium, {low} Low). {false_positives} false positives identified."
+)
+
+# For significant findings, store details
+memory_append_chunk(
+    session_id=SESSION_ID,
+    content=f"Security review for {TICKET_KEY}:\nVerdict: {verdict}\nFindings: {details}\nFalse positives: {fp_details}",
+    flag_reason="security review findings"
+)
+```
+
+## Expected Output
+
+- Security review document at `docs/<MODULE>/<TICKET_KEY>/<TICKET_KEY>-security-review.md`
+- SonarQube scan results integrated (where available)
+- All 16 manual checklist items evaluated (incl. DataRake secrets compliance)
+- False positives identified and documented with rationale
+- Clear PASS/FAIL verdict
+- BigMind fact stored
+
+## Error Handling
+
+| Error | Resolution |
+|-------|------------|
+| Worktree not found | Check if `/Users/pplate/git/paisy-<TICKET_KEY>` exists, or use main repo with branch checkout |
+| SonarQube MCP unavailable | Skip automated scan, note in report as "⏭️ Nicht verfügbar", rely on manual checklist |
+| DataRake not available as MCP | Expected — always perform manual DataRake simulation via Step 3.1 grep commands |
+| No Java files changed | Skip SonarQube scan and SAST checks, focus on configuration/resource security |
+| Empty diff | Branch identical to `current` — report "no changes to review" |
+| Large diff (>50 files) | Focus on high-risk files first: files with `ServiceCenter`, `EntityManager`, `@Value`, SQL, file I/O |
+
+## Severity Levels
+
+| Severity | Symbol | Definition | SLA | Pipeline Impact |
+|----------|--------|-----------|-----|----------------|
+| Critical | ❌ | Exploitable vulnerability, hardcoded production credentials, data corruption risk | Fix immediately | Blocks pipeline — FAIL |
+| High | ❌ | Security weakness that could be exploited with effort, missing input validation on external data | Fix before merge | Blocks pipeline — FAIL |
+| Medium | ⚠️ | Security improvement needed but not immediately exploitable | Fix in sprint | Advisory — PASS with warnings |
+| Low | ℹ️ | Best practice suggestion, defense-in-depth improvement | Fix when convenient | Advisory — PASS |
+
+## Language
+
+- Document content: **German**
+- Code references (class names, methods, file paths): English as-is
+- Checklist items: German
+- BigMind facts: English
+
+## Conventions
+
+- One security review per ticket — don't split across multiple documents
+- Always run SonarQube `analyze_code_snippet` when MCP tool is available
+- Document ALL false positives with rationale — this builds the knowledge base
+- Reference SEC-* rule IDs in all findings for traceability
+- If a finding was previously accepted as risk (check BigMind), note it but don't re-flag