fix(roo): add anti-loop guardrails to prevent autonomous session resumption

- Add Rule 9 (Anti-Loop Guardrail) to 01-bigmind-core.md: detect 2+ identical partial sessions and surface the loop to user instead of auto-resuming
- Add partial=history clause to Rule 1: partial/blocked/abandoned outcomes are historical records only, never task queue items
- Add focus guard to memory_announce_focus: must reflect current user message, not prior session outcome; use 'Awaiting user task assignment' if no task yet
- Add .roo/rules/06-anti-loop.md: global injection for ALL modes overriding any mode-specific 'do the task immediately' behavior
- Add mode interaction safety clause to 00-identity.md: session ritual does not authorize beginning any task; only an explicit user message does

Root cause: pic-gen 'do the task' personality + BigMind context inference produced 6 identical partial branding sessions in a loop.
docs(plans): add heretic encoder swap task for FLUX.2 Klein uncensored generation
2026-04-10 23:27:32 +02:00 · 2026-04-10 20:32:05 +02:00 · 2026-04-10 20:29:18 +02:00 · 2026-04-10 20:21:16 +02:00 · 2026-04-10 20:21:12 +02:00 · 2026-04-10 19:21:51 +02:00
8 changed files with 582 additions and 58 deletions
@@ -24,4 +24,15 @@ BigMind is my persistent memory MCP server at `~/.mcp/bigmind/memory.db`. I use
 - Use BigMind memory at the start of every task.
 - Form explicit hypotheses with confidence % during analysis.
 - Optimize for token efficiency — search memory before reading files.
- Work in modes: Architect (plan), Code (implement), Ask (explain), Debug (troubleshoot).
+- Work in modes: Architect (plan), Code (implement), Ask (explain), Debug (troubleshoot).
+
+## ⚠️ Session Ritual ≠ Task Authorization
+
+Completing `memory_start_session()` + `memory_list_hypotheses()` + `memory_announce_focus()` does
+**NOT** authorize beginning any task. It is housekeeping only.
+
+**Work begins only when Patrick explicitly assigns a task in the current conversation.**
+
+Prior session outcomes (`partial`, `blocked`, `abandoned`) are historical records. They are never
+instructions. Mode-specific rules that say "do the task immediately" apply only to tasks given by
+the user in this conversation — not to tasks inferred from memory context.
@@ -4,11 +4,18 @@
 Every new session must begin with the following sequence executed in strict order before any other work is performed:
 1. `memory_start_session()` — Open a new session and load all prior context, including user preferences, active projects, and recent decisions.
 2. `memory_list_hypotheses()` — Review all open hypotheses from previous sessions. Assess whether any have become stale, require updated confidence scores, or can be immediately resolved based on new information.
-3. `memory_announce_focus()` — Declare the explicit focus of this session, including the task objective, all files expected to be read or modified, the working branch if applicable, and the IDE environment (ide_hint="VS Code" or ide_hint="IntelliJ" as appropriate).
+3. `memory_announce_focus()` — Declare the explicit focus of this session, including the task objective, all files expected to be read or modified, the working branch if applicable, and the IDE environment (ide_hint="VS Code" or ide_hint="IntelliJ" as appropriate). **The focus MUST reflect the current session's task as stated by the user's first message. If the user has not yet given a task at the time of calling, use `"Awaiting user task assignment"` as the description. Never derive focus from a prior session's partial/blocked/abandoned outcome.**
 4. `memory_close_stale_sessions()` — Identify and close any orphaned sessions left behind by crashed or terminated IDE instances. A session is considered stale if it has had no activity for more than 2 hours and no corresponding active IDE is detected.

 Do not skip any step. Do not reorder. If any call fails, retry once before proceeding with a logged warning.

+> **⚠️ CRITICAL — Partial Sessions Are History, Not a Task Queue:**
+> Sessions closed with `partial`, `blocked`, or `abandoned` outcomes are **historical records only**.
+> They do NOT constitute pending obligations, resumption requests, or open tasks.
+> A new session begins fresh. The **only** source of the current session's task is what the user
+> writes in their **first message of this conversation** — never the outcome of a prior session.
+> Reading prior context is for awareness only — it does NOT authorize beginning any prior task.
+
 ## Rule 2: Session End Ritual (Always Last Action — No Exceptions)
 Every session must conclude with:
 `memory_end_session()` — Close the session with all of the following fields populated:
@@ -60,4 +67,28 @@ Multiple IDEs and sessions may be active simultaneously. Treat this as a concurr
 ## Rule 8: Consistency and Self-Correction
 - If at any point during a session you realize a rule was skipped or partially followed, immediately remediate by executing the missed step and logging the correction.
 - Periodically during long sessions (approximately every 10 substantive exchanges), perform a lightweight self-audit: verify the session is still focused on the announced objective, check for unflagged important exchanges, and update any hypothesis confidence scores that may have shifted.
- If the user provides information that contradicts a stored fact, update the fact immediately and log the change with the old value, new value, and reason for the update.
+- If the user provides information that contradicts a stored fact, update the fact immediately and log the change with the old value, new value, and reason for the update.
+
+## Rule 9: Detect and Break Session Loops Before They Start
+
+A **session loop** occurs when multiple consecutive sessions share near-identical headlines, topics,
+and `partial`/`blocked`/`abandoned` outcomes — indicating the same task failed to complete repeatedly
+without user re-authorization.
+
+**Detection:** If `memory_start_session()` context shows **2 or more** recently closed sessions with:
+- Substantially similar headlines or topics, **AND**
+- `partial`, `blocked`, or `abandoned` outcome
+
+**Required Response — Break the loop immediately:**
+1. Do NOT attempt to resume or retry the repeated task silently
+2. Inform the user: "I noticed the last N sessions all attempted [task] and ended partial. I won't auto-resume that. What would you like to do?"
+3. Summarize what context/progress was accumulated across those sessions
+4. Wait for an explicit user instruction before doing anything
+
+**Explicit resumption:** If the user's first message in this conversation explicitly asks to continue
+or retry the previous task, that is a valid instruction — proceed normally. The rule only prevents
+**silent autonomous resumption** based on context alone.
+
+**Mode interaction:** This rule applies regardless of mode. Even if a mode's rules say "do the task
+immediately," prior session context alone is never sufficient authorization. Only the user's live
+message in this conversation authorizes action.
@@ -0,0 +1,56 @@
+# Anti-Loop Guardrail — Mandatory for All Modes
+
+## ⛔ Never Resume Past Work Without Explicit User Authorization
+
+This rule applies to **every mode** (code, architect, debug, pic-gen, ask, homelab, paisy, etc.)
+and **overrides any mode-specific "do the task immediately" instructions**.
+
+### The Core Prohibition
+
+**Prior session context — including `partial`, `blocked`, or `abandoned` outcomes — does NOT
+authorize beginning, resuming, or retrying any task.**
+
+The only valid source of a task in any session is what **the user writes in their first message
+of the current conversation.**
+
+### What NOT To Do At Session Start
+
+❌ Do NOT look at the last session headline and start that task  
+❌ Do NOT interpret `partial` outcome as "I need to finish this"  
+❌ Do NOT call `memory_announce_focus()` with a prior session's task before the user speaks  
+❌ Do NOT begin any creative, generative, or code-writing work based on context alone  
+❌ Do NOT assume "the user probably wants to continue" — ask if unsure  
+
+### What TO Do At Session Start
+
+✅ Load context for **awareness only** — past sessions are reference, not instructions  
+✅ Announce focus as `"Awaiting user task assignment"` if the user has not yet spoken  
+✅ Wait for the user's first message before doing any substantive work  
+✅ If context shows a loop (2+ identical partial sessions), surface it explicitly and ask  
+
+### Session Loop Detection
+
+If `memory_start_session()` context shows **2 or more** recently closed sessions with:
+- Near-identical headlines or topics, AND
+- `partial`, `blocked`, or `abandoned` outcome
+
+**Stop. Do not resume.** Inform the user:
+
+> "I noticed the last [N] sessions all attempted [task description] and ended partial.
+> I won't auto-resume that — it's likely causing a loop. What would you like to do?"
+
+Then wait for an explicit instruction.
+
+### Exception: Explicit Resumption
+
+If the user's **first message** in this conversation explicitly says to continue or retry
+a prior task (e.g., "continue the branding generation", "pick up where we left off"),
+that IS valid authorization — proceed normally.
+
+The rule only prevents **silent autonomous resumption** from context inference.
+
+---
+
+*This file is loaded for all modes via `.roo/rules/`. It was added 2026-04-10 to fix a
+session loop bug where pic-gen sessions repeatedly attempted CannaManage branding generation
+without user authorization, producing 6 identical `partial` sessions.*
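
The detection rule and the focus guard added above can be sketched in a few lines of Python. This is only an illustration, not BigMind's implementation: the `ClosedSession` record, the `detect_session_loop` and `choose_focus` helpers, and the 0.85 similarity threshold are all assumptions made for the example; the rules themselves only say "near-identical headlines" and "2 or more" sessions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional

# Outcomes that the rules treat as history, never as a task queue.
UNFINISHED = {"partial", "blocked", "abandoned"}


@dataclass
class ClosedSession:
    """Hypothetical record for a recently closed session."""
    headline: str
    outcome: str


def detect_session_loop(recent: List[ClosedSession],
                        threshold: float = 0.85,
                        min_repeats: int = 2) -> List[ClosedSession]:
    """Return the sessions that form a loop, or an empty list if there is none."""
    unfinished = [s for s in recent if s.outcome in UNFINISHED]
    for anchor in unfinished:
        # Group every unfinished session whose headline is close to the anchor's.
        group = [
            s for s in unfinished
            if SequenceMatcher(None, anchor.headline.lower(),
                               s.headline.lower()).ratio() >= threshold
        ]
        if len(group) >= min_repeats:
            return group
    return []


def choose_focus(user_message: Optional[str]) -> str:
    """Derive the focus from the live user message only, never from history."""
    if user_message and user_message.strip():
        return user_message.strip()
    return "Awaiting user task assignment"
```

With six sessions sharing the headline from the bug report, `detect_session_loop` returns all six, and `choose_focus(None)` yields the placeholder focus. A fuzzy ratio is used instead of exact equality so that small headline variations still count as the same task; the threshold would need tuning against real session data.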
@@ -1,73 +1,98 @@
 {
-  "6": {
+  "1": {
+    "class_type": "CLIPLoader",
+    "inputs": {
+      "clip_name": "qwen_3_4b_klein.safetensors",
+      "type": "flux2",
+      "device": "default"
+    }
+  },
+  "2": {
    "class_type": "CLIPTextEncode",
    "inputs": {
-      "clip": ["30", 0],
+      "clip": ["1", 0],
      "text": "PROMPT_PLACEHOLDER"
    }
  },
-  "8": {
-    "class_type": "VAEDecode",
+  "3": {
+    "class_type": "CLIPTextEncode",
    "inputs": {
-      "samples": ["13", 0],
-      "vae": ["31", 0]
+      "clip": ["1", 0],
+      "text": "NEGATIVE_PLACEHOLDER"
    }
  },
-  "9": {
-    "class_type": "SaveImage",
+  "4": {
+    "class_type": "UNETLoader",
    "inputs": {
-      "filename_prefix": "mcp-image-gen",
-      "images": ["8", 0]
+      "unet_name": "flux-2-klein-4b.safetensors",
+      "weight_dtype": "default"
    }
  },
-  "13": {
-    "class_type": "KSampler",
-    "inputs": {
-      "cfg": 1.0,
-      "denoise": 1.0,
-      "latent_image": ["27", 0],
-      "model": ["32", 0],
-      "negative": ["33", 0],
-      "positive": ["6", 0],
-      "sampler_name": "euler",
-      "scheduler": "beta",
-      "seed": 42,
-      "steps": 4
-    }
-  },
-  "27": {
-    "class_type": "EmptySD3LatentImage",
-    "inputs": {
-      "batch_size": 1,
-      "height": 1024,
-      "width": 1024
-    }
-  },
-  "30": {
-    "class_type": "CLIPLoader",
-    "inputs": {
-      "clip_name": "qwen_3_4b_heretic.safetensors",
-      "type": "flux"
-    }
-  },
-  "31": {
+  "5": {
    "class_type": "VAELoader",
    "inputs": {
      "vae_name": "flux2-vae.safetensors"
    }
  },
-  "32": {
-    "class_type": "UNETLoader",
+  "6": {
+    "class_type": "EmptyFlux2LatentImage",
    "inputs": {
-      "unet_name": "flux-2-klein-4b.safetensors",
-      "weight_dtype": "fp8_e4m3fn"
+      "width": 1024,
+      "height": 1024,
+      "batch_size": 1
    }
  },
-  "33": {
-    "class_type": "CLIPTextEncode",
+  "7": {
+    "class_type": "Flux2Scheduler",
    "inputs": {
-      "clip": ["30", 0],
-      "text": "NEGATIVE_PLACEHOLDER"
+      "steps": 20,
+      "width": 1024,
+      "height": 1024
+    }
+  },
+  "8": {
+    "class_type": "CFGGuider",
+    "inputs": {
+      "model": ["4", 0],
+      "positive": ["2", 0],
+      "negative": ["3", 0],
+      "cfg": 5
+    }
+  },
+  "9": {
+    "class_type": "KSamplerSelect",
+    "inputs": {
+      "sampler_name": "euler"
+    }
+  },
+  "10": {
+    "class_type": "RandomNoise",
+    "inputs": {
+      "noise_seed": 42
+    }
+  },
+  "11": {
+    "class_type": "SamplerCustomAdvanced",
+    "inputs": {
+      "noise": ["10", 0],
+      "guider": ["8", 0],
+      "sampler": ["9", 0],
+      "sigmas": ["7", 0],
+      "latent_image": ["6", 0]
+    }
+  },
+  "12": {
+    "class_type": "VAEDecode",
+    "inputs": {
+      "samples": ["11", 0],
+      "vae": ["5", 0]
+    }
+  },
+  "13": {
+    "class_type": "SaveImage",
+    "inputs": {
+      "filename_prefix": "mcp-image-gen",
+      "images": ["12", 0]
    }
  }
-}
+}
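
Because this hunk renumbers every node, a stale link such as `["30", 0]` is easy to leave behind. The sketch below shows one way to catch that before sending the graph to ComfyUI, together with a recursive placeholder substitution. Both helpers (`dangling_links`, `fill_placeholders`) are hypothetical names for this example; the server's own `_inject_workflow_params()` is not shown in this diff and may work differently.

```python
from typing import Any, Dict, List, Tuple


def dangling_links(workflow: Dict[str, dict]) -> List[Tuple[str, str, str]]:
    """Return (node_id, input_name, target_id) for links to missing nodes."""
    missing = []
    for node_id, node in workflow.items():
        for name, value in node.get("inputs", {}).items():
            # In the API JSON above, a link is encoded as ["<node id>", <output index>].
            is_link = (
                isinstance(value, list) and len(value) == 2
                and isinstance(value[0], str) and isinstance(value[1], int)
            )
            if is_link and value[0] not in workflow:
                missing.append((node_id, name, value[0]))
    return missing


def fill_placeholders(obj: Any, replacements: Dict[str, str]) -> Any:
    """Return a copy of obj with exact-match placeholder strings replaced."""
    if isinstance(obj, dict):
        return {k: fill_placeholders(v, replacements) for k, v in obj.items()}
    if isinstance(obj, list):
        return [fill_placeholders(v, replacements) for v in obj]
    if isinstance(obj, str):
        return replacements.get(obj, obj)
    return obj


# Two nodes copied from the new workflow, plus a variant with a stale link.
excerpt = {
    "1": {"class_type": "CLIPLoader",
          "inputs": {"clip_name": "qwen_3_4b_klein.safetensors",
                     "type": "flux2", "device": "default"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 0], "text": "PROMPT_PLACEHOLDER"}},
}
stale = {
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["30", 0], "text": "PROMPT_PLACEHOLDER"}},
}
```

Here `dangling_links(excerpt)` is empty, while `dangling_links(stale)` reports the reference to the removed node `"30"`. `fill_placeholders` returns a new structure, so the template loaded from disk stays untouched between requests.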
@@ -63,11 +63,20 @@ def test_build_flux_workflow_heretic_model():
        seed=42,
        model="flux-2-klein-4b.safetensors",
    )
-    assert wf["6"]["class_type"] == "CLIPTextEncode"
-    assert wf["30"]["class_type"] == "CLIPLoader"  # Qwen3-4B uses single CLIPLoader
-    assert wf["32"]["inputs"]["unet_name"] == "flux-2-klein-4b.safetensors"
-    assert wf["31"]["inputs"]["vae_name"] == "flux2-vae.safetensors"
-    assert wf["13"]["inputs"]["scheduler"] == "beta"  # FLUX.2 Klein uses beta scheduler
+    # New FLUX.2 workflow uses different node IDs and types
+    assert wf["1"]["class_type"] == "CLIPLoader"          # Qwen3-4B uses single CLIPLoader
+    assert wf["1"]["inputs"]["type"] == "flux2"            # correct type for FLUX.2
+    assert wf["1"]["inputs"]["device"] == "default"        # required for FLUX.2 CLIPLoader
+    assert wf["1"]["inputs"]["clip_name"] == "qwen_3_4b_klein.safetensors"  # Comfy-Org/vae-text-encorder-for-flux-klein-4b
+    assert wf["2"]["class_type"] == "CLIPTextEncode"       # standard CLIP encode (not Flux-specific)
+    assert wf["4"]["class_type"] == "UNETLoader"
+    assert wf["4"]["inputs"]["unet_name"] == "flux-2-klein-4b.safetensors"
+    assert wf["4"]["inputs"]["weight_dtype"] == "default"  # not fp8 — avoids dimension errors
+    assert wf["6"]["class_type"] == "EmptyFlux2LatentImage"  # FLUX.2-specific latent
+    assert wf["8"]["class_type"] == "CFGGuider"            # CFGGuider replaces FluxDisableGuidance+BasicGuider
+    assert wf["8"]["inputs"]["cfg"] == 5                   # cfg=5 for FLUX.2 Klein
+    assert wf["11"]["class_type"] == "SamplerCustomAdvanced"  # FLUX.2 sampler (node 11, not 12)
+    assert wf["13"]["class_type"] == "SaveImage"           # output node


 def test_workflow_registry_contains_both_models():
@@ -0,0 +1,149 @@
+# BigMind Session Loop — Root Cause & Fix Plan
+
+**Date:** 2026-04-10  
+**Reported by:** Patrick  
+**Severity:** High — caused 6 identical wasted sessions with $0+ API cost per loop  
+
+---
+
+## Problem Statement
+
+BigMind's session ritual, combined with mode-specific behavior rules, creates a self-reinforcing
+resumption loop when a session ends as `partial`. The model loads prior context, sees an incomplete
+task, and autonomously attempts to resume it — without ever waiting for user input. This produces
+a chain of identical `partial` sessions that only breaks when Patrick manually intervenes.
+
+Observed: 6 identical sessions titled *"Prepared large-scale CannaManage branding generation"*,
+all `partial`, all spawned from one session ending before image generation completed in pic-gen mode.
+
+---
+
+## Root Cause Analysis
+
+### Loop Trigger Chain
+
+```
+[Session N] ends partial (task: CannaManage branding generation)
+      │
+      ▼
+[Session N+1] memory_start_session() → loads context
+      │
+      │  Context shows: last outcome = partial
+      │  Rule 1: "search before every task, avoid redundant work"
+      │  → model reads: "prior task incomplete, I must finish it"
+      │
+      ▼
+memory_announce_focus() called with prior session's task
+      │  → locks in wrong objective BEFORE user speaks
+      │
+      ▼
+Mode rules (pic-gen) fire: "generate images now"
+      │  → autonomous action without user instruction
+      │
+      ▼
+Hits context/token/tool limit → session ends partial
+      │
+      └──────────────────────────────────────────► REPEAT
+```
+
+### Three Compounding Failures
+
+#### Failure 1: Rule 1 — No "partial = history only" clause
+Rule 1 says to load context and search for prior work. It has **no explicit instruction**
+that sessions marked `partial` are historical records, NOT resumption requests.
+The model's default behavior is to treat incomplete work as a pending obligation.
+
+#### Failure 2: memory_announce_focus — Called on prior context, not current task
+The architect rules say to call `memory_announce_focus()` as part of the startup ritual.
+But when no user message has been received yet, the model has nothing to announce except
+the prior session's objective — which is the wrong task for the new session.
+
+#### Failure 3: Mode interaction amplification
+Modes with strong "do the task" personalities (pic-gen, code) compound the loop. When
+context suggests "there's pending image generation work", pic-gen mode's instructions
+say to start generating — creating autonomous action before the user speaks.
+
+---
+
+## Fix Design
+
+### Fix 1: Rule 1 Addendum — Partial Sessions Are History
+
+Add explicit text to Rule 1 in `01-bigmind-core.md`:
+
+> **`partial`, `blocked`, or `abandoned` outcomes are historical records only.**
+> They do NOT constitute task queues, resumption requests, or pending obligations.
+> A new session begins fresh. The current session's task is determined solely by
+> what the user writes in their first message — never by the outcome of a prior session.
+
+### Fix 2: New Rule 9 — Anti-Loop Guardrail
+
+Add Rule 9 to `01-bigmind-core.md`:
+
+> **Rule 9: Detect and Break Loops Before They Start**
+>
+> If `memory_start_session()` context shows 2 or more recently closed sessions with:
+> - Near-identical headlines or topics, AND
+> - `partial` or `blocked` outcome
+>
+> → **Do NOT attempt to resume the repeated task.**
+> → Instead: acknowledge the loop to the user, summarize what context was accumulated
+>   across the repeated sessions, and ask: "What would you like to do?"
+>
+> Never assume the correct action is to retry a failed/partial task silently.
+
+### Fix 3: memory_announce_focus — Wait for User Input
+
+Add a constraint to step 3 of Rule 1 (announce focus):
+
+> **`memory_announce_focus()` must reflect the CURRENT session's task.**
+> Call it only AFTER the user has given a clear instruction for this conversation.
+> Do NOT announce focus derived from prior session outcomes before the user speaks.
+> During the startup ritual (steps 1-4 of Rule 1), use a placeholder focus if needed:
+> `memory_announce_focus(session_id, "Awaiting user task assignment")`
+
+### Fix 4: Mode Interaction Safety Clause
+
+Add a universal safety rule (applies to all modes):
+
+> **Session ritual completion ≠ task authorization.**
+> Completing `memory_start_session()` + `memory_list_hypotheses()` + `memory_announce_focus()`
+> does NOT authorize beginning any task. Work begins only when the user explicitly assigns it
+> in the current conversation. Prior session context is reference material, not instruction.
+
+---
+
+## Files to Change
+
+| File | Change |
+|------|--------|
+| `.roo/rules/01-bigmind-core.md` | Add Rule 9, add partial=history clause to Rule 1, add focus guard to step 3 of Rule 1 |
+| `.roo/rules/00-identity.md` | Add mode-interaction safety clause |
+
+---
+
+## Risk Assessment
+
+| Risk | Likelihood | Mitigation |
+|------|-----------|------------|
+| Model ignores new rules in long context | Medium | Rules are loaded via rules files, not context — they apply per-session |
+| Fix breaks legitimate resumption (e.g., user explicitly asks to continue) | Low | Rules say "task determined by user's first message" — explicit resumption request still works |
+| New Rule 9 fires falsely on legitimate repeated partial tasks | Low | Trigger requires near-identical headlines AND repeated partial — normal work produces diverse headlines |
+
+---
+
+## Success Criteria
+
+1. Starting a new session after a partial pic-gen session → model waits for user input, no autonomous generation
+2. Starting a new session after 2+ identical partial sessions → model acknowledges the loop and asks what to do
+3. User explicitly asking "continue the branding generation" → model correctly resumes (rule only prevents silent resumption)
+
+---
+
+## Implementation Order
+
+1. Patch `.roo/rules/01-bigmind-core.md` — add Rule 9 + partial=history clause + focus guard
+2. Patch `.roo/rules/00-identity.md` — add mode interaction safety clause
+3. Test by starting a new session in pic-gen mode with partial history in context
+4. Push to Gitea
+
@@ -0,0 +1,139 @@
+# Task: Swap Qwen3-4B Encoder for Heretic Abliterated Version
+
+**Date:** 2026-04-10  
+**Status:** Ready — waiting for correct Heretic encoder to be published  
+**Depends on:** FLUX.2 Klein 4B working (✅ done as of 2026-04-10)
+
+---
+
+## Goal
+
+Replace the standard `qwen_3_4b_klein.safetensors` with an abliterated (Heretic) version that has:
+- **Zero measurable quality loss** (KL divergence = 0.0000)
+- **Near-zero prompt refusals** (≤3/100 in DreamFast v1.2.0 testing)
+
+Result: `generate_image(prompt, model="flux-2-klein-4b.safetensors")` will work with **any** prompt without refusals.
+
+---
+
+## Current State
+
+| File | Location | Status |
+|------|----------|--------|
+| `flux-2-klein-4b.safetensors` | `~/ComfyUI/models/diffusion_models/` | ✅ Working |
+| `qwen_3_4b_klein.safetensors` | `~/ComfyUI/models/text_encoders/` | ✅ Working (standard, has refusals) |
+| `flux2-vae.safetensors` | `~/ComfyUI/models/vae/` | ✅ Working |
+
+The MCP workflow [`mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json`](../mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json) already uses `qwen_3_4b_klein.safetensors` — **no code change needed**, only the file on disk needs to be replaced.
+
+---
+
+## The Problem to Solve First
+
+The standard Heretic repos may not have the **FLUX.2 Klein-compatible** encoder dimensions:
+
+| Encoder | `hidden_size` | Conditioning dim | Usable? |
+|---------|--------------|-----------------|---------|
+| BFL Qwen3-4B (FLUX.2 Klein) | **2560** | 7680 (2560×3) | ✅ |
+| DreamFast/qwen3-4b-heretic | unknown — must check | ? | ⚠️ verify first |
+| Standard Qwen3-4B | 4096 | 4096 | ❌ wrong |
+
+**Before downloading, verify DreamFast's model is fine-tuned from the BFL variant** (hidden_size=2560), not the standard Qwen3 (hidden_size=4096).
+
+---
+
+## Steps
+
+### Step 1: Check DreamFast Heretic repo
+
+```bash
+curl -fsSL https://huggingface.co/DreamFast/qwen3-4b-heretic/resolve/main/config.json | grep -i hidden_size
+```
+
+Or browse: https://huggingface.co/DreamFast/qwen3-4b-heretic/blob/main/config.json  
+Look for: `"hidden_size": 2560` — that's the FLUX.2 Klein-compatible version.
+
+### Step 2a: If DreamFast has the right dimensions (2560)
+
+```bash
+# Download
+huggingface-cli download DreamFast/qwen3-4b-heretic \
+  --local-dir /tmp/qwen3-4b-heretic/
+
+# Back up working encoder first
+cp ~/ComfyUI/models/text_encoders/qwen_3_4b_klein.safetensors \
+   ~/ComfyUI/models/text_encoders/qwen_3_4b_klein_backup.safetensors
+
+# Swap in the Heretic version
+cp /tmp/qwen3-4b-heretic/model.safetensors \
+   ~/ComfyUI/models/text_encoders/qwen_3_4b_klein.safetensors
+```
+
+### Step 2b: If DreamFast has wrong dimensions (4096) — find alternative
+
+Options in order of preference:
+1. **Lockout/qwen3-4b-heretic-zimage** — check if BFL-compatible:
+   ```bash
+   curl -fsSL https://huggingface.co/Lockout/qwen3-4b-heretic-zimage/resolve/main/config.json | grep -i hidden_size
+   ```
+2. **Run Heretic abliteration yourself** on the working `qwen_3_4b_klein.safetensors`  
+   Tool: https://github.com/FailSpy/abliterator  
+   Script: `python abliterator.py --model qwen_3_4b_klein.safetensors --output qwen_3_4b_klein_heretic.safetensors`
+
+3. **Wait** for DreamFast or BFL to publish the FLUX.2-specific abliterated encoder
+
+### Step 3: Live test
+
+```python
+generate_image(
+    "an explicit test prompt that would normally be refused",
+    model="flux-2-klein-4b.safetensors",
+    steps=20
+)
+```
+
+Expected: Image generated, no refusal error in ComfyUI logs.
+
+### Step 4: If it works — no code changes needed
+
+The MCP code, workflow JSON, and registry are already correct. Just verify:
+- Check `journalctl --user -u comfyui -f` during generation for any errors
+- Confirm file in `~/Pictures/mcp-generated/` was saved
+
+---
+
+## Fallback Plan
+
+If the Heretic encoder is unavailable in the right dimensions, the **GGUF route** works too:
+
+```bash
+# ComfyUI-GGUF is already installed: ~/ComfyUI/custom_nodes/ComfyUI-GGUF
+# Download Heretic GGUF (if BFL-compatible variant published):
+huggingface-cli download Lockout/qwen3-4b-heretic-zimage \
+  qwen-4b-zimage-hereticV2-q8.gguf \
+  --local-dir ~/ComfyUI/models/text_encoders/
+```
+
+Then update [`flux2_klein_heretic.json`](../mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json) node `"1"` to use `CLIPLoaderGGUF` instead of `CLIPLoader`:
+```json
+"1": {
+  "class_type": "CLIPLoaderGGUF",
+  "inputs": {
+    "clip_name": "qwen-4b-zimage-hereticV2-q8.gguf",
+    "type": "flux2"
+  }
+}
+```
+
+---
+
+## No Code Changes Required (unless GGUF fallback)
+
+The entire MCP server, workflow registry, and test suite are already correct. This is **purely a model file task**.
+
+---
+
+## Success Criteria
+
+- [ ] `generate_image("...", model="flux-2-klein-4b.safetensors")` works with prompts that currently get refused
+- [ ] Output image quality identical to standard encoder (check: no visible artifacts vs reference)
+- [ ] ComfyUI logs show no dimension errors
+- [ ] `qwen_3_4b_klein_backup.safetensors` kept as rollback
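
The dimension check in this plan can also be run against a local `.safetensors` file before swapping it in, without loading any weights. A safetensors file starts with an 8-byte little-endian header length followed by a JSON header listing each tensor's shape. The sketch below reads only that header; the tensor-name suffix `embed_tokens.weight` and the helper names are assumptions for this example, and the expected width of 2560 is the value stated in the plan's table.

```python
import json
import struct
from typing import Optional


def read_safetensors_header(path: str) -> dict:
    """Read only the JSON header of a safetensors file (no tensor data)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        return json.loads(f.read(header_len))


def embedding_width(header: dict,
                    suffix: str = "embed_tokens.weight") -> Optional[int]:
    """Return the last dimension of the token-embedding tensor, if present."""
    for name, info in header.items():
        if name != "__metadata__" and name.endswith(suffix):
            return info["shape"][-1]  # shape is [vocab_size, hidden_size]
    return None


def is_klein_compatible(path: str, expected: int = 2560) -> bool:
    """True if the encoder's embedding width matches the expected hidden size."""
    return embedding_width(read_safetensors_header(path)) == expected
```

Running `is_klein_compatible()` on the downloaded Heretic file before the `cp` in Step 2a would catch a wrong-sized encoder without touching the working one. If the embedding tensor is stored under a different name, `embedding_width` returns `None` and the check fails closed.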
@@ -0,0 +1,104 @@
+# FLUX.2 Klein 4B + Heretic — Session Recap
+
+**Date:** 2026-04-10  
+**Status:** Code complete, live generation BLOCKED by encoder dimension mismatch  
+
+---
+
+## What We Achieved ✅
+
+### Code Infrastructure (Solid)
+- **`mcp-image-gen/src/server.py`** — Generic workflow registry with model-based dispatch, `_inject_workflow_params()` works recursively on any node layout
+- **`mcp-image-gen/tests/test_server.py`** — 37/37 tests passing
+- **Gitea** — pushed to main (commit `38d26ad`)
+- The architecture is right: adding a new model = add 1 JSON file + 1 registry entry
+
+### Models Downloaded (on disk)
+| File | Location | Status |
+|------|----------|--------|
+| `flux-2-klein-4b.safetensors` | `~/ComfyUI/models/diffusion_models/` | ✅ 7.3GB |
+| `qwen_3_4b_bfl.safetensors` | `~/ComfyUI/models/text_encoders/` | ✅ merged from BFL shards |
+| `qwen_3_4b.safetensors` (z_image) | `~/ComfyUI/models/text_encoders/split_files/` | ✅ wrong model |
+| `Qwen3-4B-Q8_0.gguf` | `~/ComfyUI/models/text_encoders/` | ✅ wrong arch |
+| ComfyUI-GGUF extension | `~/ComfyUI/custom_nodes/ComfyUI-GGUF` | ✅ installed |
+
+---
+
+## What Failed and Why ❌
+
+### The Error (persistent)
+```
+mat1 and mat2 shapes cannot be multiplied (512x4096 and 7680x3072)
+```
+
+### Root Cause Analysis
+
+**Node 13** (`SamplerCustomAdvanced`) fails — meaning the conditioning vector from the text encoder doesn't match the diffusion model's expected input.
+
+| Component | Expected | Got |
+|-----------|----------|-----|
+| FLUX.2 Klein 4B conditioning input | **7680-dim** (2560 × 3) | **4096-dim** |
+
+**Why 7680 = 2560 × 3?**  
+FLUX.2 Klein appears to build its conditioning by concatenating hidden states from three intermediate layers of the text encoder, not by reading a single final layer. The BFL Qwen3 encoder has `hidden_size=2560`, so the concatenated output is 2560×3=7680.
+
+**Why 4096?**  
+Every other Qwen3 variant (z_image_turbo, official Qwen repo GGUF) uses standard Qwen3 with `hidden_size=4096` — these are for Z-Image and text generation respectively, NOT for FLUX.2 Klein.
+
+### What We Tried (and Why Each Failed)
+1. `CLIPLoader type=flux` → wrong architecture (FLUX.1 style)
+2. `CLIPLoader type=flux2` → correct node, wrong encoder file (z_image Qwen)
+3. `CLIPLoaderGGUF type=flux2` → correct node, wrong GGUF (standard Qwen3)
+4. `CLIPLoader type=flux2 + qwen_3_4b_bfl.safetensors` → merged BFL shards, but still fails
+5. Workflow: `KSampler` → doesn't work with FLUX.2 (different architecture)
+6. Workflow: `SamplerCustomAdvanced + BasicGuider + Flux2Scheduler` → correct architecture but encoding mismatch persists
+
+### The Real Missing Piece
+
+The BFL FLUX.2 Klein text encoder in Diffusers format is designed for use via `transformers/diffusers` pipeline, NOT via ComfyUI's `CLIPLoader`. ComfyUI reads the weights differently. The weights are there but ComfyUI doesn't know how to map `model.embed_tokens`, `model.layers.N.*` etc. to the CLIP interface it expects.
+
+**The correct encoder file for ComfyUI** is `Comfy-Org/vae-text-encorder-for-flux-klein-4b` — the 7.5GB file we downloaded IS the right one, but ComfyUI is likely loading it with the wrong adapter in the `CLIPLoader`.
+
+---
+
+## Clean Approach — What We Need to Do
+
+### Option A: Use ComfyUI Web UI (Easiest)
+1. Open `http://localhost:8188` in browser
+2. Load the "Flux.2 Klein 4B Text-to-Image" workflow template (it's in the UI Templates)
+3. **Export the working API JSON** (Ctrl+Shift+E or Settings → Save as API format)
+4. Replace our `flux2_klein_heretic.json` with the exported JSON
+5. Add placeholders and test
+
+This gives us the **verified working node graph** without guessing. 10 minutes.
+
+### Option B: Find a Working API JSON online
+- Reddit r/comfyui has working FLUX.2 Klein workflows
+- Export format is what we need
+
+### Then: Add Heretic
+Once we have a working standard workflow:
+1. Download the actual Heretic-abliterated version of the BFL encoder (once it's published)
+2. Swap encoder filename in the JSON
+
+---
+
+## My Recommendation
+
+**Do Option A right now.** Open `http://localhost:8188`, load the template, export to API format, paste the JSON. We'll be running in 10 minutes instead of guessing node names.
+
+The MCP server code is solid — the only broken piece is `flux2_klein_heretic.json`. Once we have the right JSON from the UI, everything else works.
+
+---
+
+## Files to Clean Up (After We Have the Right JSON)
+
+```bash
+# Remove wrong encoders (save ~8GB)
+rm ~/ComfyUI/models/text_encoders/qwen_3_4b.safetensors   # z_image version
+rm ~/ComfyUI/models/text_encoders/qwen_3_4b_flux2.safetensors
+
+# Keep
+# ~/ComfyUI/models/text_encoders/qwen_3_4b_bfl.safetensors  ← correct encoder
+# ~/ComfyUI/models/text_encoders/Qwen3-4B-Q8_0.gguf          ← maybe useful later
+```
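
The shape error quoted in the recap follows directly from the matrix-multiplication rule that the inner dimensions must match. The sketch below reproduces that arithmetic in plain Python; `matmul_shape` is a hypothetical helper written for this example, not part of PyTorch or ComfyUI, and the shapes are the ones reported in the error message.

```python
from typing import Tuple

Shape = Tuple[int, int]


def matmul_shape(a: Shape, b: Shape) -> Shape:
    """Return the result shape of a @ b, or raise if the inner dims differ."""
    if a[1] != b[0]:
        raise ValueError(
            f"mat1 and mat2 shapes cannot be multiplied "
            f"({a[0]}x{a[1]} and {b[0]}x{b[1]})"
        )
    return (a[0], b[1])


HIDDEN_SIZE = 2560                  # BFL Qwen3 encoder width, per the recap
CONDITIONING_DIM = HIDDEN_SIZE * 3  # three concatenated hidden states = 7680
PROJECTION = (CONDITIONING_DIM, 3072)  # weight shape from the error message

# A 512-token conditioning of width 7680 passes through the projection.
ok = matmul_shape((512, CONDITIONING_DIM), PROJECTION)

# A 4096-wide conditioning reproduces the failure seen in ComfyUI.
try:
    matmul_shape((512, 4096), PROJECTION)
    failure = None
except ValueError as exc:
    failure = str(exc)
```

The first call succeeds with a `(512, 3072)` result; the second raises with the same wording as the logged error, which is why any encoder emitting 4096-wide conditioning cannot feed this model regardless of how the sampler nodes are wired.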
Author	SHA1	Message	Date
Patrick Plate	9453aecf0b	fix(roo): add anti-loop guardrails to prevent autonomous session resumption - Add Rule 9 (Anti-Loop Guardrail) to 01-bigmind-core.md: detect 2+ identical partial sessions and surface the loop to user instead of auto-resuming - Add partial=history clause to Rule 1: partial/blocked/abandoned outcomes are historical records only, never task queue items - Add focus guard to memory_announce_focus: must reflect current user message, not prior session outcome; use 'Awaiting user task assignment' if no task yet - Add .roo/rules/06-anti-loop.md: global injection for ALL modes overriding any mode-specific 'do the task immediately' behavior - Add mode interaction safety clause to 00-identity.md: session ritual does not authorize beginning any task — only explicit user message does Root cause: pic-gen 'do the task' personality + BigMind context inference produced 6 identical partial branding sessions in a loop.	2026-04-10 23:27:32 +02:00
Patrick Plate	1d1e70776f	docs(plans): add heretic encoder swap task for FLUX.2 Klein uncensored generation	2026-04-10 20:32:05 +02:00
Patrick Plate	1d8849cb41	fix(mcp-image-gen): confirmed working FLUX.2 Klein encoder filename - CLIPLoader clip_name: qwen_3_4b_klein.safetensors (from Comfy-Org/vae-text-encorder-for-flux-klein-4b) - VAE: flux2-vae.safetensors (321MB, same repo) - Live test confirmed: 2.1MB photorealistic 1024x1024 PNG in 52.43s on RX 7900 XTX - Test: assert clip_name == qwen_3_4b_klein.safetensors - 37/37 tests pass	2026-04-10 20:29:18 +02:00
Patrick Plate	40c91edf2f	fix(mcp-image-gen): merge CFGGuider workflow fix for FLUX.2 Klein 4B	2026-04-10 20:21:16 +02:00
Patrick Plate	4a99a3625a	fix(mcp-image-gen): rewrite flux2_klein_heretic workflow with CFGGuider + correct node types - Replace FluxDisableGuidance+BasicGuider chain with CFGGuider (cfg=5) - CLIPLoader: add device='default', keep type='flux2' - UNETLoader: weight_dtype='default' (not fp8_e4m3fn — avoids dimension mismatch) - VAEDecode/SaveImage: updated node IDs (11→VAEDecode, 12→SaveImage) - Encoder: qwen_3_4b_bfl.safetensors (7.5GB BFL-merged shards) - Tests: update heretic model assertions for new node structure (37/37 pass) - Add RECAP doc with root cause analysis and session history	2026-04-10 20:21:12 +02:00
Patrick Plate	38d26adb1f	Merge branch 'fix/mcp-image-gen/heretic-flux2-bugfixes'	2026-04-10 19:21:51 +02:00