Implements the Sprint 10 Phase 1 foundation for the Smart Payment Import feature:
Domain layer:
- 3 new enums: BankFormat (MT940, CAMT053, CSV), ImportSessionStatus, MatchStatus
- StaffPermission.FINANCE_IMPORT
- AuditEventType: BANK_IMPORT_STARTED/COMPLETED/FAILED + BANK_PAYMENT_CONFIRMED
- NotificationType.BANK_IMPORT_COMPLETED
- ConsentType.BANK_DATA (DSGVO consent for IBAN storage)
- 3 new entities: BankImportSession, BankTransaction, CsvColumnMapping
- Member: + iban (VARCHAR 34) + ibanConsentDate
- MemberStatus.LEFT (semantic alias for RESIGNED, referenced by Sprint 9 RetentionService)
Persistence:
- V30__bank_import_sessions.sql
- V31__bank_transactions.sql
- V32__csv_column_mappings.sql (also adds iban + iban_consent_date to members)
- 3 Spring Data repositories
Parser infrastructure (cannamanage-service/src/main/java/de/cannamanage/service/bankimport):
- BankStatementParser interface (Strategy pattern, Spring-injected list)
- ParsedTransaction + ParseResult records
- BankStatementParseException (parse errors)
- Mt940Parser: custom state machine, CENTURY_BOUNDARY=70 for YY→YYYY, proprietary
header tolerance (skips lines before first :20: for StarMoney/WISO/Hibiscus wrappers)
- Camt053Parser: StAX streaming with XXE hardening (IS_SUPPORTING_EXTERNAL_ENTITIES,
SUPPORT_DTD, IS_REPLACING_ENTITY_REFERENCES all false); supports camt.053.001.02
and camt.053.001.08 namespaces
- CsvBankParser: Apache Commons CSV with configurable columns per club; German number
format ("1.234,56"); ISO-8859-1 default encoding
- BankStatementParserService: filename-extension hint + content probe; throws
UnrecognizedFormatException when no parser claims the file
Build verified via Docker (cannamanage-api:sprint10-phase1).
Sprint 9 fix (incidental, required to compile):
- Added MemberStatus.LEFT (Sprint 9 RetentionService referenced it but the enum
value was missing)
- MemberListRegistryGenerator: added LEFT to formatStatus() switch (mapped to
"Ausgetreten", same as RESIGNED)
Sprint 10 docs: analysis, plan, plan-review, testplan.
Co-Authored-By: Lumen <lumen@cannamanage.de>
14 KiB
Sprint 10 Analysis — Smart Payment Import (Kontoauszug-Abgleich)
Date: 2026-06-15 Author: Patrick Plate / Lumen (Architect) Status: Draft v1 Sprint Goal: Automated bank statement import with intelligent payment matching
1. Problem Statement
Cannabis clubs (Anbauvereinigungen) collect monthly/quarterly member fees. Currently, the admin must manually record each payment in the system — comparing bank statements line-by-line against expected member fees. For a club with 200 members making monthly payments, this means 200+ manual Payment entries per month.
Pain points:
- Time-consuming: ~2-4 hours/month for 200-member clubs
- Error-prone: manual transcription leads to mismatches
- Delayed bookkeeping: payments sit unrecorded for days/weeks
- No audit trail of the import source (original bank file)
- Difficult to identify missing payments (overdue detection lags)
Opportunity: German banks universally support MT940 (SWIFT standard) and increasingly CAMT.053 (ISO 20022 XML) for electronic bank statement export. Automating the import and matching process reduces the workload from hours to minutes.
2. Affected Components
Existing (Sprint 8 — Finance Module)
| Component | Path | Role |
|---|---|---|
FinanceService |
cannamanage-service/.../service/FinanceService.java |
Payment recording, ledger management |
Payment entity |
cannamanage-domain/.../entity/Payment.java |
Payment records with member + period |
LedgerEntry entity |
cannamanage-domain/.../entity/LedgerEntry.java |
Double-entry financial journal |
FeeSchedule |
cannamanage-domain/.../entity/FeeSchedule.java |
Fee definitions (amount, interval) |
MemberFeeAssignment |
cannamanage-domain/.../entity/MemberFeeAssignment.java |
Member ↔ Fee schedule link |
PaymentMethod enum |
cannamanage-domain/.../enums/PaymentMethod.java |
CASH, BANK_TRANSFER, SEPA_LASTSCHRIFT, OTHER |
finance.ts service |
cannamanage-frontend/src/services/finance.ts |
Frontend API hooks for finance |
| Finance pages | cannamanage-frontend/src/app/.../finance/ |
Admin finance management UI |
New (Sprint 10)
| Component | Purpose |
|---|---|
BankImportSession entity |
Tracks each uploaded file + processing status |
BankTransaction entity |
Individual parsed transactions from bank file |
CsvColumnMapping entity |
Saved CSV column templates per bank |
BankStatementParserService |
Format detection + delegation to format-specific parsers |
Mt940Parser |
SWIFT MT940 text format parser |
Camt053Parser |
ISO 20022 XML parser |
CsvBankParser |
Configurable CSV parser |
PaymentMatchingService |
Weighted matching algorithm with confidence scoring |
BankImportService |
Orchestration: upload → parse → match → confirm |
| Import UI wizard | 4-step frontend wizard for the import flow |
3. Current State (Ist-Zustand)
Finance Module (Sprint 8)
The finance module already supports:
- Fee Schedules: Define fee amounts + intervals (MONTHLY, QUARTERLY, YEARLY, ONE_TIME)
- Member Fee Assignments: Link members to fee schedules with validity periods
- Payment Recording: Manual payment entry with member, amount, method, period, reference
- Ledger Entries: Automatic INCOME entry on payment, EXPENSE entries for club costs
- Member Balance: Calculated outstanding amounts per member
- Notifications: PAYMENT_REMINDER, PAYMENT_OVERDUE, PAYMENT_RECEIVED types exist
- Receipts: Auto-generated on payment confirmation
What's Missing
- No file upload capability for bank statements
- No MT940/CAMT.053 parsing
- No automatic matching logic
- No import session tracking / audit trail
- No member IBAN storage (for enhanced matching)
- No CSV column mapping configuration
- No GDPR consent type for bank data / IBAN storage
- No bulk payment confirmation workflow
4. Format Specifications
4.1 MT940 (SWIFT Standard)
The most common bank statement export format from German banks (Sparkasse, Volksbank, Deutsche Bank, Commerzbank, ING DiBa, DKB).
Structure:
:20:TRANSACTION_REF -- Transaction reference number
:25:BLZKTO/IBAN -- Account identification (BLZ + Account or IBAN)
:28C:STATEMENT/PAGE -- Statement number / sequence
:60F:D/CYYMMDDCURRENCY AMT -- Opening balance (D=debit, C=credit)
:61:YYMMDDYYMMDDCD AMOUNT -- Transaction line (value date, booking date, D/C, amount)
:86:PURPOSE TEXT -- Multi-line Verwendungszweck (up to 6 lines of 27 chars)
:62F:D/CYYMMDDCURRENCY AMT -- Closing balance
:64:D/CYYMMDDCURRENCY AMT -- Available balance
Key parsing challenges:
- Multi-line
:86:field contains the Verwendungszweck (payment reference) - Amount uses comma as decimal separator (German:
1234,56) - Date format is YYMMDD (2-digit year)
D= debit (outgoing),C= credit (incoming)- Some banks prefix IBAN in
:25:, others use BLZ + account number - Field
:61:encodes transaction type codes (N = normal, S = SEPA, etc.)
Example transaction block:
:61:2506150615CR50,00NMSCNONREF
:86:SVWZ+M-0042 Mitgliedsbeitrag Juni
EREF+NOTPROVIDED
KREF+NOTPROVIDED
MREF+NOTPROVIDED
CRED+DE98ZZZ09999999999
DEBT+DE89370400440532013000
4.2 CAMT.053 (ISO 20022 XML)
Newer XML-based format, richer structured data. Increasingly used by German banks alongside MT940.
Structure (simplified):
<BkToCstmrStmt>
<Stmt>
<Id>Statement-ID</Id>
<Acct><Id><IBAN>DE89370400440532013000</IBAN></Id></Acct>
<Bal><!-- Opening/Closing balances --></Bal>
<Ntry>
<Amt Ccy="EUR">50.00</Amt>
<CdtDbtInd>CRDT</CdtDbtInd> <!-- CRDT=incoming, DBIT=outgoing -->
<BookgDt><Dt>2025-06-15</Dt></BookgDt>
<ValDt><Dt>2025-06-15</Dt></ValDt>
<NtryDtls>
<TxDtls>
<RmtInf>
<Ustrd>M-0042 Mitgliedsbeitrag Juni</Ustrd> <!-- Verwendungszweck -->
</RmtInf>
<RltdPties>
<Dbtr><Nm>Max Mustermann</Nm></Dbtr>
<DbtrAcct><Id><IBAN>DE12345678901234567890</IBAN></Id></DbtrAcct>
</RltdPties>
</TxDtls>
</NtryDtls>
</Ntry>
</Stmt>
</BkToCstmrStmt>
Advantages over MT940:
- Full counterparty IBAN always available
- Counterparty name in structured field
- Verwendungszweck clearly separated (Ustrd = unstructured, Strd = structured)
- ISO date format (YYYY-MM-DD)
- Amount with dot decimal separator
- Richer metadata (end-to-end reference, mandate ID for SEPA)
4.3 Generic CSV
Every German bank offers CSV download but with wildly different column layouts:
| Bank | Date Col | Amount Col | Reference Col | Separator | Encoding |
|---|---|---|---|---|---|
| Sparkasse | 0 | 4 | 8 | ; |
ISO-8859-1 |
| ING DiBa | 0 | 7 | 4 | ; |
ISO-8859-1 |
| DKB | 0 | 7 | 3 | ; |
ISO-8859-1 |
| Commerzbank | 0 | 4 | 3 | ; |
UTF-8 |
| Volksbank | 0 | 11 | 8 | ; |
ISO-8859-1 |
Common German CSV traits:
- Semicolon delimiter (not comma)
- ISO-8859-1 encoding (not UTF-8) for most banks
- German date format: dd.MM.yyyy
- German number format:
1.234,56(dot as thousands separator, comma as decimal) - Header rows (1-2 lines to skip)
- Quote character:
"for fields containing semicolons
5. Matching Algorithm Design
5.1 Candidate Generation
For each incoming transaction (credit only — debits are expenses):
- Filter members with active fee assignments in the club
- Calculate expected payment amount for current/recent periods
- Generate candidate pairs: (transaction, member)
5.2 Weighted Scoring
| Criterion | Weight | Score Logic |
|---|---|---|
| Amount match | 35% | 100 if exact match; 80 if within ±5%; 0 otherwise |
| Verwendungszweck contains member number | 30% | 100 if "M-XXXX" found; 50 if partial match |
| Verwendungszweck contains member name | 15% | 100 if full name match; 70 if last name only; 50 if fuzzy (Levenshtein ≤2) |
| IBAN match | 15% | 100 if exact IBAN match (requires stored IBAN) |
| Date within payment window | 5% | 100 if within expected period; 50 if ±30 days; 0 otherwise |
5.3 Confidence Thresholds
| Confidence | Classification | Action |
|---|---|---|
| ≥90% | AUTO_MATCHED | Green badge, ready for bulk confirm |
| 60–89% | SUGGESTED | Yellow badge, needs manual review |
| <60% | UNMATCHED | Red badge, manual assignment required |
5.4 Conflict Resolution
- If multiple members match a single transaction: pick highest confidence, mark as SUGGESTED (never AUTO)
- If a member matches multiple transactions: possible double payment — flag for review
- Negative amounts (outgoing): skip matching, offer expense categorization
6. Legal & Compliance Analysis
6.1 DSGVO (GDPR)
| Aspect | Legal Basis | Implementation |
|---|---|---|
| IBAN storage | Art. 6(1)(b) — contract performance + Art. 6(1)(a) — explicit consent | New BANK_DATA consent type, opt-in per member |
| Bank statement data | Art. 6(1)(f) — legitimate interest (bookkeeping) | Process & match, don't store raw file permanently |
| Counterparty names | Art. 6(1)(f) — legitimate interest | Only display during import review, don't persist for non-members |
| Data minimization | Art. 5(1)(c) | Delete raw import file after 30 days, keep only parsed transactions |
| Right to erasure | Art. 17 | bank_transactions linked to member — cascade on member deletion |
6.2 Financial Retention (§147 AO / GoBD)
| Data | Retention Period | Basis |
|---|---|---|
| Bank import sessions | 10 years | §147 Abs. 1 Nr. 5 AO (Buchungsbelege) |
| Bank transactions (parsed) | 10 years | §147 Abs. 1 Nr. 5 AO |
| Confirmed payments | 10 years | §147 Abs. 1 Nr. 5 AO |
| Raw import files | 30 days (then delete) | Data minimization — parsed data is the Beleg |
| CSV column mappings | Until club deletion | Configuration data, no retention requirement |
6.3 Tier Restrictions
| Feature | Starter | Pro | Enterprise |
|---|---|---|---|
| Imports per month | 1 | Unlimited | Unlimited |
| Formats | CSV only | MT940 + CAMT.053 + CSV | All formats |
| Saved column templates | 0 | 3 | Unlimited |
| Auto-confirm (≥90%) | ❌ | ❌ | ✅ |
| Import history retention | 3 months | 12 months | Unlimited |
7. Technology Decisions
7.1 Parser Libraries
| Format | Approach | Rationale |
|---|---|---|
| MT940 | Custom parser (no external lib) | Format is simple enough, no mature Java MT940 lib exists for Spring Boot 3.x. Hand-rolled parser gives full control over German bank quirks. |
| CAMT.053 | JAXB with XSD-generated classes | ISO 20022 has official XSD schemas. JAXB (Jakarta XML Binding) generates type-safe classes. Already used in PAISY for GKV data exchange. |
| CSV | Apache Commons CSV | Already a pattern in the project (semicolon + ISO-8859-1). Configurable delimiter, quote, encoding. |
7.2 File Upload
- Spring Boot multipart upload (
MultipartFile) - Max file size: 10 MB (sufficient for years of bank statements)
- Temporary storage: server filesystem during processing, then delete
- No S3/cloud storage needed — parsed data is persisted in DB
7.3 Fuzzy Matching
- Levenshtein distance for name matching: Apache Commons Text
StringUtils.getLevenshteinDistance() - Already available via Spring Boot starter dependencies
- No external NLP/AI needed — rule-based matching is sufficient for this domain
8. Risk Assessment
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| MT940 format variations across banks | High | Medium | Extensive test fixtures from multiple banks; graceful degradation on parse errors |
| False positive matches (wrong member) | Medium | High | Never auto-confirm without admin review; require ≥90% confidence for green badge |
| Performance with large files (5000+ transactions) | Low | Medium | Stream-based parsing; batch DB inserts (flush/clear every 100); async processing for large files |
| IBAN storage GDPR complaints | Low | Medium | Explicit opt-in consent; easy deletion; clear privacy notice |
| Duplicate import (same file uploaded twice) | Medium | Low | Detect by filename + date range + total; warn but allow (idempotent match re-run) |
| CSV encoding issues (mojibake) | Medium | Low | Default ISO-8859-1; allow encoding override in mapping template |
9. Data Flow
graph TD
A[Admin uploads bank file] --> B[Format Detection]
B -->|MT940| C[Mt940Parser]
B -->|CAMT.053| D[Camt053Parser]
B -->|CSV| E[CsvBankParser]
C --> F[List of BankTransaction entities]
D --> F
E --> F
F --> G[PaymentMatchingService]
G --> H{Confidence Score}
H -->|>= 90%| I[AUTO_MATCHED - green]
H -->|60-89%| J[SUGGESTED - yellow]
H -->|< 60%| K[UNMATCHED - red]
I --> L[Admin reviews match table]
J --> L
K --> L
L --> M[Confirm matches]
M --> N[Create Payment + LedgerEntry]
N --> O[Send PAYMENT_RECEIVED notification]
M --> P[Categorize as expense]
P --> Q[Create expense LedgerEntry]
10. Open Questions
- Should we support MT940 multi-statement files (multiple accounts in one file)?
- Recommendation: Yes — filter by club's own IBAN, ignore other accounts.
- Should debit transactions (outgoing) be auto-categorized as expenses?
- Recommendation: Offer optional categorization, but don't auto-create. Admin selects category.
- Should we generate a PDF reconciliation report after import completion?
- Recommendation: Nice-to-have for Sprint 11. Focus on core import/match flow first.
- Should the matching algorithm learn from previous confirmations (ML)?
- Recommendation: Out of scope. Rule-based matching is sufficient for club-size data (50-500 members).
11. Integration Points
| System | Direction | What |
|---|---|---|
| FinanceService | Outgoing | recordPayment() called for each confirmed match |
| NotificationDispatchService | Outgoing | PAYMENT_RECEIVED to member on confirmation |
| AuditService | Outgoing | BANK_IMPORT_STARTED, BANK_IMPORT_COMPLETED events |
| RetentionService | Scheduled | Auto-delete raw files after 30 days; retain parsed data 10 years |
| ConsentService | Read | Check BANK_DATA consent before storing member IBAN |
| TierLimitService | Read | Enforce import count + format restrictions per plan tier |