feat: archive zoo_backup for home sync

This commit is contained in:
Patrick Plate
2026-06-24 19:27:05 +02:00
parent 02844e4c4a
commit 038e546963
133 changed files with 19953 additions and 0 deletions
@@ -0,0 +1,452 @@
# Skill: security-review
Security-focused code review against ADP security policies and PAISY patterns.
## Description
Analyzes code changes against ADP's 67 SEC-* security rules, OWASP guidelines, and 10 PAISY-specific security patterns. Integrates automated scan results (SonarQube via MCP, DataRake secrets detection) with a 16-item manual checklist. Produces a structured security review report with a clear PASS/FAIL verdict. PASS is required before code review can proceed.
## Invoked by
🔒 Security Reviewer mode
## Required Inputs
| Input | Source | Example |
|-------|--------|---------|
| `TICKET_KEY` | Jira issue key | `ESIDEPAISY-12081` |
| `MODULE` | PAISY module name | `eau`, `eubp`, `svmeldungen` |
## Output
Markdown file: `docs/<MODULE>/<TICKET_KEY>/<TICKET_KEY>-security-review.md`
## Steps
### 1. Get the diff
```bash
cd /Users/pplate/git/paisy-<TICKET_KEY>
git diff origin/current --name-only
git diff origin/current --stat
git diff origin/current
```
Identify all changed files. Separate Java source files from test files, resources, and documentation.
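The separation described above can be sketched as a small helper. This is a minimal sketch, assuming a Maven-style layout in which tests live under `src/test/`; the function names and the documentation suffixes are hypothetical and not part of the skill.

```python
from pathlib import PurePosixPath

DOC_SUFFIXES = {".md", ".adoc", ".txt"}  # assumption: typical documentation suffixes


def classify_changed_file(path: str) -> str:
    """Bucket one changed path as 'test', 'java', 'docs', or 'resource'."""
    p = PurePosixPath(path)
    parts = p.parts
    # Assumption: anything below a src/test/ directory is test-only code.
    if any(parts[i:i + 2] == ("src", "test") for i in range(len(parts) - 1)):
        return "test"
    if p.suffix == ".java":
        return "java"
    if p.suffix in DOC_SUFFIXES or "docs" in parts:
        return "docs"
    return "resource"


def group_changed_files(paths):
    """Group the output of `git diff --name-only` by review category."""
    groups = {"java": [], "test": [], "docs": [], "resource": []}
    for path in paths:
        groups[classify_changed_file(path)].append(path)
    return groups
```

Feeding it the lines printed by `git diff origin/current --name-only` gives one list per category, so the Java sources can go to SonarQube while resources go to the configuration checks.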
### 2. Read changed files fully for context
For each changed Java file, read the full file — not just the diff hunks. Security issues often depend on surrounding context (e.g., a method that looks safe in isolation but is called with untrusted input).
```bash
cd /Users/pplate/git/paisy-<TICKET_KEY>
git diff origin/current --name-only | grep "\.java$"
```
### 3. Run SonarQube analyze_code_snippet on changed files
For each changed Java file, use the SonarQube MCP tool:
```python
# Read the full file content
file_content = read_file("<path_to_changed_file>")
# Analyze with SonarQube
analyze_code_snippet(
fileContent=file_content,
language=["java"],
scope=["MAIN"] # or ["TEST"] for test files
)
```
Collect all SonarQube findings. Keep only the security-relevant ones, i.e. rules whose impacted software quality is SECURITY.
#### 3.1 DataRake Secrets Scan (pipeline equivalent)
ADP's Jenkins pipeline calls `secretsScan()` through the **DataRake** widget, a regex-based secrets detector. Because DataRake is not available as an MCP tool, the reviewer simulates its behavior manually.
**What DataRake detects:**
- Passwords (literal values in security-relevant contexts)
- Tokens (Basic Auth, Bearer, JWT)
- Unencrypted private keys
- Dangerous files (`.netrc`, Java keystores `.jks`/`.p12`, SSH private keys)
- Dangerous commands that expose credentials
- Hardcoded secrets in JDBC connection strings (Oracle, MySQL, PostgreSQL, Informix)
**Detection rules (important for the assessment):**
- Password values must be **6 to 70 characters** long to trigger (shorter values are filtered as placeholders)
- JDBC rakes detect `user/password` directly in URL connection strings
- The file extension decides which rakes apply (`.java` → Java-specific rakes)
- **Single-line matching only**: secrets spread across multiple lines are not detected
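The length and single-line rules above can be illustrated with a rough approximation. This is a sketch only: the regex mirrors the grep pattern used in the manual scan, not DataRake's real rule set, and `find_secret_candidates` is a hypothetical name.

```python
import re

# Approximation of the rules above; NOT DataRake's actual regexes.
SECRET_RE = re.compile(
    r"""(password|passwd|pwd|token|api[_-]?key|secret)\s*[:=]\s*["']([^"']{6,70})["']""",
    re.IGNORECASE,
)


def find_secret_candidates(text: str):
    """Return (line_number, value) pairs, matching each line in isolation."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = SECRET_RE.search(line)
        if match:
            hits.append((number, match.group(2)))
    return hits
```

Because each line is matched on its own, a value placed on the line after its key is not reported, which is the same blind spot the single-line rule describes.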
**Severity levels according to DataRake:**
| Severity | Context | Rationale |
|----------|---------|-----------|
| CRITICAL | Password/token in HTML or JavaScript | Can leave the ADP network (browser-deliverable) |
| HIGH | Secrets in other source code (`.java`, `.yml`, `.properties`, `.sh`) | Visible in the repository |
| MEDIUM / LOW | Items worth reviewing (keystores, suspicious patterns) | Manual assessment required |
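As a hedged illustration of the severity table, the mapping from file type to severity might look like the sketch below. The exact extension sets are assumptions (`.html` and `.js` for browser-deliverable code, `.jks` and `.p12` for keystores); the real DataRake classification may consider more than the suffix.

```python
from pathlib import PurePosixPath

BROWSER_DELIVERABLE = {".html", ".js"}  # assumption: suffixes for HTML/JavaScript
REVIEW_ONLY = {".jks", ".p12"}          # keystores need manual assessment


def datarake_severity(path: str) -> str:
    """Map a file path containing a secret candidate to a DataRake-style severity."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in BROWSER_DELIVERABLE:
        return "CRITICAL"
    if suffix in REVIEW_ONLY:
        return "MEDIUM/LOW"
    # Any other source file: the secret is visible in the repository.
    return "HIGH"
```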
**Manual scan procedure:**
```bash
# 1. Identify suspicious files
cd /Users/pplate/git/paisy-<TICKET_KEY>
git diff origin/current --name-only | grep -E '\.(java|yml|yaml|properties|xml|sh|js|html|jsp)$'
# 2. Check for typical secret patterns (6 to 70 characters).
#    POSIX character classes keep the pattern portable across grep implementations.
git diff origin/current | grep -iE "(password|passwd|pwd|token|api[_-]?key|secret)[[:space:]]*[:=][[:space:]]*[\"'][^\"']{6,70}[\"']"
# 3. Check JDBC connection strings for inline credentials
git diff origin/current | grep -iE "jdbc:[^\"'[:space:]]*(user|password)="
# 4. Find dangerous files
git diff origin/current --name-only | grep -iE '(\.netrc|\.jks|\.p12|id_rsa|id_dsa|id_ecdsa)$'
```
Source: Confluence EIT space, pages `3270157258` (DataRake), `3270157260` (FAQ), `3270157263` (JDBC Rakes).
### 4. Apply SEC-* rules — manual checklist (16 items)
For each changed file, systematically check:
| # | Rules | Check | What to look for |
|---|-------|-------|-----------------|
| 1 | SEC-001..004 | No hardcoded credentials | Passwords, API keys, tokens, JDBC credentials in string literals. Exclude test files. |
| 2 | SEC-005 | Credentials via `@Value`/env | All credentials must come from `@Value("${...}")`, `System.getProperty()`, or `System.getenv()`. No `private static final String PASSWORD = "..."`. |
| 3 | SEC-011 | No SQL injection | All SQL must use parameterized queries (`?` placeholders, `@Query` with `:param`, JPA Criteria API). No string concatenation in SQL. |
| 4 | SEC-012 | No path traversal | File paths from external input must be sanitized. Check for `new File(userInput)`, `Paths.get(userInput)` without validation. |
| 5 | SEC-016 | Input validation | All external data entry points (REST endpoints, file parsers, PAISY responses) must validate input before processing. |
| 6 | SEC-018 | No info disclosure in errors | Error messages must not expose stack traces, internal paths, SQL queries, or system details to callers. |
| 7 | SEC-033 | PII encryption | Payroll data, HR data, personal data must be encrypted at rest. Check for PII stored in plain text in new DB columns. |
| 8 | SEC-035 | No PII in LLM processing | No personal data (names, BBNR, Versicherungsnummer) sent to AI/LLM services. |
| 9 | SEC-040 | No sensitive data in logs | Log statements must not contain passwords, tokens, PII, or full payroll records. Check `log.debug/info/warn/error` calls. |
| 10 | SEC-055 | No `src.gen/` modifications | Files under `src.gen/` are JAXB-generated. Any modification is silently overwritten and creates false security. |
| 11 | SEC-057 | ServiceCenter `F;` response validation | Every `ServiceCenter.INSTANCE()` call must check if the response starts with `F;` before parsing. Unchecked `F;` responses can lead to corrupted payroll data sent to government agencies. |
| 12 | SEC-058 | Date sentinel handling | Before parsing dates from PAISY, check for sentinel values: `00.00.0000`, `0000000`, `9999999`. Parsing these causes `DateTimeParseException` or silently wrong date calculations. |
| 13 | SEC-059 | Batch EntityManager flush/clear | Batch processing loops must call `em.flush()` and `em.clear()` every 100 items. Without this, the JPA persistence context grows unbounded → `OutOfMemoryError` in production. |
| 14 | SEC-060 | Dual Flyway migrations | Every new migration must exist in BOTH `db/migration/H2/` AND `db/migration/ORACLE/`. Missing one breaks either dev/test (H2) or production (Oracle). |
| 15 | SEC-061 | No hardcoded BBNR/IDs | No hardcoded Betriebsnummern, sprint IDs, Epic keys, or instance names. These must come from configuration or runtime lookup. |
| 16 | SEC-064 | DataRake compliance: no secrets in source | No literal passwords/tokens (6 to 70 chars), no inline credentials in JDBC URLs, no committed keystores/SSH keys, no dangerous credential-exposing commands. H2 test DBs must use random UUID passwords, not default `sa`/empty. |
### 5. Check PAISY-specific patterns (SEC-055 through SEC-064) with code examples
For each PAISY-specific rule, verify against the actual code:
#### SEC-055: src.gen/ protection
```bash
# Check if any changed files are under src.gen/
git diff origin/current --name-only | grep "src\.gen/"
# If any match → CRITICAL finding
```
#### SEC-056: JAXB namespace
```java
// ❌ FAIL — javax namespace (legacy)
import javax.xml.bind.annotation.*;
// ✅ PASS — jakarta namespace
import jakarta.xml.bind.annotation.*;
```
#### SEC-057: ServiceCenter F; validation
```java
// ❌ UNSAFE — no error check
String response = ServiceCenter.INSTANCE().getPaisy().pgmReadLine();
String[] parts = response.split(";");
String bbnr = parts[1]; // ArrayIndexOutOfBoundsException or wrong data if F; response
// ✅ SAFE — error check first
String response = ServiceCenter.INSTANCE().getPaisy().pgmReadLine();
if (response.startsWith("F;")) {
log.error("PAISY error: {}", response);
throw new PaisyRuntimeException("ServiceCenter error: " + response);
}
String[] parts = response.split(";");
```
#### SEC-058: Date sentinel handling
```java
// ❌ UNSAFE — no sentinel check
LocalDate date = LocalDate.parse(paiDate, formatter);
// ✅ SAFE — sentinel check first
if (paiDate.equals("00.00.0000") || paiDate.equals("0000000") || paiDate.equals("9999999")) {
return null; // or LocalDate.MAX / LocalDate.MIN as appropriate
}
LocalDate date = LocalDate.parse(paiDate, formatter);
```
#### SEC-059: Batch EM flush/clear
```java
// ❌ UNSAFE — no flush/clear in batch loop
for (Record record : records) {
em.persist(record);
}
// ✅ SAFE — flush/clear every 100 items
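for (int i = 0; i < records.size(); i++) {
    em.persist(records.get(i));
    // (i + 1) counts persisted items, so the first flush happens after item 100, not after item 1
    if ((i + 1) % 100 == 0) {
        em.flush();
        em.clear();
    }
}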
```
#### SEC-060: Dual Flyway migrations
```bash
# Check for new migration files
git diff origin/current --name-only | grep "db/migration"
# For each H2 migration, verify a matching ORACLE migration exists (and vice versa)
```
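The pairing check above can be sketched as follows. This is a minimal sketch, assuming that matching migrations carry the same file name in both directories and that both files appear in the same diff; `unpaired_migrations` and the file names in the usage note are hypothetical.

```python
def unpaired_migrations(changed_paths):
    """Report migration files that exist for only one of the two dialects."""
    seen = {"H2": set(), "ORACLE": set()}
    for path in changed_paths:
        for dialect in seen:
            marker = f"db/migration/{dialect}/"
            if marker in path:
                # Keep only the part after the dialect directory, e.g. the file name.
                seen[dialect].add(path.split(marker, 1)[1])
    return {
        "missing_in_oracle": sorted(seen["H2"] - seen["ORACLE"]),
        "missing_in_h2": sorted(seen["ORACLE"] - seen["H2"]),
    }
```

For a diff that adds `V13__idx.sql` only under `db/migration/H2/`, the result lists that name under `missing_in_oracle`. A non-empty list in either direction is a SEC-060 finding candidate.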
#### SEC-061: No hardcoded identifiers
```java
// ❌ FAIL — hardcoded BBNR
String bbnr = "12345678";
// ❌ FAIL — hardcoded sprint ID
int sprintId = 1234;
// ✅ PASS — from configuration
@Value("${paisy.bbnr}")
private String bbnr;
```
#### SEC-062: CSV encoding
```java
// ❌ FAIL — wrong encoding
new FileReader(csvFile, StandardCharsets.UTF_8);
// ✅ PASS — correct German government standard
new FileReader(csvFile, StandardCharsets.ISO_8859_1);
```
#### SEC-063: Parameterized logging
```java
// ❌ FAIL — string concatenation
log.debug("Processing BBNR: " + bbnr + " with status: " + status);
// ✅ PASS — parameterized
log.debug("Processing BBNR: {} with status: {}", bbnr, status);
```
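For large diffs, a line-based heuristic can pre-select candidates for this rule. The sketch below is an assumption-laden helper, not part of the skill: it expects the logger variable to be named `log`, as in the examples above, and only catches calls whose first argument is a string literal followed by `+`.

```python
import re

# Heuristic: log.<level>("literal" + ...) on a single line.
CONCAT_LOG_RE = re.compile(r'\blog\.(?:trace|debug|info|warn|error)\(\s*"[^"]*"\s*\+')


def has_concatenated_log(line: str) -> bool:
    """True if the line looks like a log call built by string concatenation."""
    return CONCAT_LOG_RE.search(line) is not None
```

Hits still need a manual look, since string literals containing escaped quotes or calls split across lines are not handled.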
#### SEC-064: DataRake compliance — H2 test database pattern
H2 in-memory test databases are a frequent trigger for DataRake findings. The **default user `sa` with an empty password** does not reliably trigger the JDBC rakes (`sa` is shorter than 6 characters), but any hardcoded value in `JDBC_PASSWORD` can be detected as a secret once it reaches the 6-character threshold.
```java
// ✅ SAFE: no user set, random UUID password (not detected as a secret,
// because the UUID format is filtered as a test pattern and no user means no JDBC rake match)
props.put(JDBC_DRIVER, "org.h2.Driver");
props.put(JDBC_URL, "jdbc:h2:mem:test_db;DB_CLOSE_DELAY=-1;MODE=Oracle");
props.put(JDBC_PASSWORD, UUID.randomUUID().toString());
// ❌ TRIGGERS: default user "sa" + fixed literal password
props.put(JDBC_USER, "sa");
props.put(JDBC_PASSWORD, "testpassword123");
// ❌ TRIGGERS: inline credentials in the JDBC URL
props.put(JDBC_URL, "jdbc:oracle:thin:scott/tiger123@//host:1521/SID");
// ✅ SAFE: credentials kept separate, via @Value or env
@Value("${db.user}")
private String dbUser;
@Value("${db.password}")
private String dbPassword;
```
**General rules for avoiding DataRake findings:**
- No literal passwords (6 to 70 characters) in `.java`, `.yml`, `.properties`, `.xml`, `.sh`
- No inline credentials in JDBC URLs (`user=`/`password=` in the URL string)
- No Java keystores (`.jks`, `.p12`), SSH private keys, or `.netrc` files in the repository
- Use template variables: `${ENV_VAR}`, `{placeholder}`, `((concourse-var))`. DataRake filters these as StandardPatterns
- For tests: `UUID.randomUUID().toString()`, or `ENC[AES256,...]` for encrypted values
Precedent: `ESIDEPAISY-12154` documents the pattern `user=password=<short_value>` as a deliberately accepted local dev pattern.
### 6. Identify false positives
Before flagging a finding, check against known false positive patterns:
| Pattern | Tool | Why It's Safe | Action |
|---------|------|--------------|--------|
| Variable self-assignment | DataRake | Variable assignment, not a password literal | Skip |
| Credential parsing from connection string | DataRake | Extracting, not hardcoding | Skip |
| Environment variable retrieval | DataRake | Runtime injection, not hardcoded | Skip |
| Test data in `src/test/` | Both | Test-only scope, never deployed | Skip (unless test exposes real credentials) |
| Application-specific passwords by design | DataRake | Business logic requirement (e.g., PDF encryption) | Document as exception |
| Log file matches | DataRake | Build artifacts, not source code | Skip |
| Snyk result files | DataRake | Scan output, not source | Skip |
| Username=password pattern for local dev | DataRake | Intentional dev-only pattern (ESIDEPAISY-12154 precedent) | Document as accepted risk |
#### DataRake StandardPatternFilters (built-in false positives)
DataRake filters the following values automatically. They do **not** raise a finding and do not need to be documented as false positives:
| Pattern example | Type | Rationale |
|-----------------|------|-----------|
| `${variable_name}` | Spring/Shell template | Replaced at runtime |
| `{variable}` | Bare braces | Template placeholder |
| `{{variable}}` | Handlebars / Helm template | Template engine syntax |
| `$(variable)` | Shell substitution | Replaced at runtime |
| `$VARIABLE` | Bare dollar sign | Shell/env variable |
| `%variable%` | Windows env style | Replaced at runtime |
| `#variable#` | Hash-delimited | Template placeholder |
| `??variable??` | Null-coalescing style | Template placeholder |
| `((concourse-placeholder))` | Concourse/pipeline | Pipeline variable |
| `XXXXXXXXXX` / `xxxxxxxxxx` | Repeated single char | Regex `^([0-9A-ZxX+#*-])(\1)+$` |
| `test-test`, `foo_foo`, `aaaaaa` | Repeated word | 3 to 10 identical characters/word groups |
| `ENC[AES256,...]` | AES256-encrypted | Already securely encrypted |
| Values shorter than 6 characters | Too short | All regexes require `{6,70}` |
| `example.com`, `test.de` | Domain/URL | Example domains |
**Consequence for the reviewer:** If one of these patterns appears in the diff, it is **not a finding**, even if the context contains "password" or "secret". Do not include it in the false-positive report; skip it silently.
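A reviewer-side approximation of these filters could look like the sketch below. Only the repeated-single-character regex is taken from the table; every other pattern is an assumption derived from the example column, and the repeated-word and domain rules are left out because their exact definitions are not given.

```python
import re

# Assumed approximations of the StandardPatternFilters; not DataRake's real rules.
PLACEHOLDER_PATTERNS = [
    re.compile(r"^\$\{[^}]+\}$"),                # ${variable_name}
    re.compile(r"^\{\{[^}]+\}\}$"),              # {{variable}}
    re.compile(r"^\{[^{}]+\}$"),                 # {variable}
    re.compile(r"^\$\([^)]+\)$"),                # $(variable)
    re.compile(r"^\$[A-Za-z_][A-Za-z0-9_]*$"),   # $VARIABLE
    re.compile(r"^%[^%]+%$"),                    # %variable%
    re.compile(r"^#[^#]+#$"),                    # #variable#
    re.compile(r"^\?\?[^?]+\?\?$"),              # ??variable??
    re.compile(r"^\(\([^()]+\)\)$"),             # ((concourse-placeholder))
    re.compile(r"^([0-9A-ZxX+#*-])(\1)+$"),      # repeated single char (regex from the table)
    re.compile(r"^ENC\[AES256,.*\]$"),           # ENC[AES256,...]
]


def is_filtered_value(value: str) -> bool:
    """True if a candidate value would be skipped silently rather than reported."""
    if len(value) < 6:  # all secret regexes require {6,70}
        return True
    return any(pattern.match(value) for pattern in PLACEHOLDER_PATTERNS)
```

Running each candidate from the manual grep scan through `is_filtered_value` before writing it into the report keeps built-in false positives out of the findings list.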
Also search BigMind for known false positive patterns:
```python
memory_search_facts("false positive security")
memory_search_facts("SecBench accepted")
```
### 7. Generate security review document
Write `docs/<MODULE>/<TICKET_KEY>/<TICKET_KEY>-security-review.md`:
```markdown
# Security Review: <TICKET_KEY> — <Summary>
**Datum:** <today>
**Modul:** <MODULE>
**Reviewer:** Roo (Security Reviewer)
**Branch:** <branch name>
**Verdict:** ✅ PASS / ❌ FAIL
---
## Scan-Ergebnisse
| Tool | Status | Befunde |
|------|--------|---------|
| SonarQube (SAST) | ✅ Sauber / ⚠️ N Befunde | <details> |
| DataRake (Secrets) | ✅ Sauber / ⚠️ N Befunde / ⏭️ Nicht verfügbar | <details> |
| Snyk Code | ✅ Sauber / ⚠️ N Befunde / ⏭️ Nicht verfügbar | <details> |
## Manuelle Sicherheits-Checkliste
| # | Regel | Prüfpunkt | Ergebnis | Anmerkung |
|---|-------|-----------|----------|-----------|
| 1 | SEC-001..004 | Keine hartkodierten Credentials | ✅/❌ | |
| 2 | SEC-005 | Credentials via @Value/env | ✅/❌ | |
| 3 | SEC-011 | Keine SQL-Injection | ✅/❌ | |
| 4 | SEC-012 | Kein Path Traversal | ✅/❌ | |
| 5 | SEC-016 | Input-Validierung | ✅/❌ | |
| 6 | SEC-018 | Keine Info-Disclosure in Fehlern | ✅/❌ | |
| 7 | SEC-033 | PII-Verschlüsselung | ✅/N/A | |
| 8 | SEC-035 | Kein PII in LLM-Verarbeitung | ✅/N/A | |
| 9 | SEC-040 | Keine sensiblen Daten in Logs | ✅/❌ | |
| 10 | SEC-055 | Keine src.gen/ Änderungen | ✅/❌ | |
| 11 | SEC-057 | F;-Response-Validierung | ✅/N/A | |
| 12 | SEC-058 | Datums-Sentinel-Behandlung | ✅/N/A | |
| 13 | SEC-059 | Batch-EM-Flush/Clear | ✅/N/A | |
| 14 | SEC-060 | Duale Flyway-Migrationen | ✅/N/A | |
| 15 | SEC-061 | Keine hartkodierten BBNR/IDs | ✅/❌ | |
| 16 | SEC-064 | DataRake-Compliance (keine Secrets im Source) | ✅/❌ | |
## Befunde
### ❌ Kritisch / Hoch (muss behoben werden)
1. **<file>:<line>** — [SEC-XXX] <Beschreibung>
- Schweregrad: Critical/High
- Empfehlung: <Behebungsvorschlag>
### ⚠️ Mittel (sollte behoben werden)
1. **<file>:<line>** — [SEC-XXX] <Beschreibung>
- Empfehlung: <Behebungsvorschlag>
### ℹ️ Niedrig / Info
1. **<file>:<line>** — [SEC-XXX] <Beschreibung>
## Identifizierte False Positives
| Tool | Datei | Befund | Begründung |
|------|-------|--------|-----------|
| <tool> | <file> | <finding> | <why it's safe> |
## Verdict
**✅ PASS** — Keine kritischen oder hohen Sicherheitsbefunde. Weiter zum Code Review.
— oder —
**❌ FAIL** — N kritische/hohe Befunde müssen vor dem Code Review behoben werden.
```
### 8. Determine verdict
| Verdict | Criteria | Pipeline Action |
|---------|---------|----------------|
| ✅ PASS | No Critical or High findings after false positive filtering | Proceed to Phase 6 (Code Review) |
| ❌ FAIL | Any Critical or High finding remains | Loop back to Phase 4 (Code mode) for fixes, then re-review |
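The verdict rule in the table can be sketched as a small function. This is a minimal sketch: the tuple shape `(severity, is_false_positive)` and the name `determine_verdict` are assumptions chosen for illustration.

```python
from collections import Counter

BLOCKING = ("Critical", "High")


def determine_verdict(findings):
    """findings: iterable of (severity, is_false_positive) tuples.

    Returns the verdict and the per-severity counts after false positive filtering.
    """
    counts = Counter(severity for severity, is_fp in findings if not is_fp)
    verdict = "FAIL" if any(counts[level] for level in BLOCKING) else "PASS"
    return verdict, dict(counts)
```

A High finding that was classified as a false positive does not block: `determine_verdict([("High", True), ("Medium", False)])` yields `("PASS", {"Medium": 1})`, while a single remaining Critical or High finding yields `FAIL`.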
### 9. Store findings in BigMind
```python
memory_store_fact(
category="codebase",
fact=f"{TICKET_KEY}: Security review {verdict}. {total_findings} findings ({critical} Critical, {high} High, {medium} Medium, {low} Low). {false_positives} false positives identified."
)
# For significant findings, store details
memory_append_chunk(
session_id=SESSION_ID,
content=f"Security review for {TICKET_KEY}:\nVerdict: {verdict}\nFindings: {details}\nFalse positives: {fp_details}",
flag_reason="security review findings"
)
```
## Expected Output
- Security review document at `docs/<MODULE>/<TICKET_KEY>/<TICKET_KEY>-security-review.md`
- SonarQube scan results integrated (where available)
- All 16 manual checklist items evaluated (incl. DataRake secrets compliance)
- False positives identified and documented with rationale
- Clear PASS/FAIL verdict
- BigMind fact stored
## Error Handling
| Error | Resolution |
|-------|------------|
| Worktree not found | Check if `/Users/pplate/git/paisy-<TICKET_KEY>` exists, or use main repo with branch checkout |
| SonarQube MCP unavailable | Skip automated scan, note in report as "⏭️ Nicht verfügbar", rely on manual checklist |
| DataRake not available as MCP | Expected — always perform manual DataRake simulation via Step 3.1 grep commands |
| No Java files changed | Skip SonarQube scan and SAST checks, focus on configuration/resource security |
| Empty diff | Branch identical to `current` — report "no changes to review" |
| Large diff (>50 files) | Focus on high-risk files first: files with `ServiceCenter`, `EntityManager`, `@Value`, SQL, file I/O |
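For the large-diff case, the prioritization could be sketched as a simple marker count. The marker list is an assumption: `ServiceCenter`, `EntityManager`, and `@Value` come from the table, `new File(` and `Paths.get(` from the path traversal check, and no SQL marker is included because the skill does not name one.

```python
# Assumed markers for high-risk code; extend as needed.
HIGH_RISK_MARKERS = ("ServiceCenter", "EntityManager", "@Value", "new File(", "Paths.get(")


def rank_by_risk(file_contents):
    """file_contents: dict mapping path -> source text.

    Returns the paths ordered by number of distinct markers found, highest first;
    ties are broken alphabetically so the order is stable.
    """
    def score(text):
        return sum(marker in text for marker in HIGH_RISK_MARKERS)

    return sorted(file_contents, key=lambda path: (-score(file_contents[path]), path))
```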
## Severity Levels
| Severity | Symbol | Definition | SLA | Pipeline Impact |
|----------|--------|-----------|-----|----------------|
| Critical | ❌ | Exploitable vulnerability, hardcoded production credentials, data corruption risk | Fix immediately | Blocks pipeline — FAIL |
| High | ❌ | Security weakness that could be exploited with effort, missing input validation on external data | Fix before merge | Blocks pipeline — FAIL |
| Medium | ⚠️ | Security improvement needed but not immediately exploitable | Fix in sprint | Advisory — PASS with warnings |
| Low | ℹ️ | Best practice suggestion, defense-in-depth improvement | Fix when convenient | Advisory — PASS |
## Language
- Document content: **German**
- Code references (class names, methods, file paths): English as-is
- Checklist items: German
- BigMind facts: English
## Conventions
- One security review per ticket — don't split across multiple documents
- Always run SonarQube `analyze_code_snippet` when MCP tool is available
- Document ALL false positives with rationale — this builds the knowledge base
- Reference SEC-* rule IDs in all findings for traceability
- If a finding was previously accepted as risk (check BigMind), note it but don't re-flag