Merge branch 'fix/mcp-image-gen/heretic-flux2-bugfixes'

Patrick Plate
2026-04-10 19:21:51 +02:00
6 changed files with 531 additions and 38 deletions
+5 -1
@@ -2,7 +2,11 @@
**FastMCP server for AI image generation via ComfyUI.**
This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client. It supports FLUX.1-schnell, FLUX.1-dev, SDXL, and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so Claude can display them directly in chat.
This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client.
**New:** Support for **FLUX.2 Klein 4B** with the **Heretic-abliterated Qwen3-4B text encoder** (reported KL divergence 0.0000, 3/100 refusals). Select via `model="flux-2-klein-4b-fp8.safetensors"`.
It supports FLUX.1-schnell (default), FLUX.2 Klein (Heretic), and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so Claude can display them directly in chat.
---
+50 -1
@@ -565,7 +565,56 @@ Then pass it back: `seed=3847291045`
---
## 10. Known Limitations
## 10. FLUX.2 Klein 4B with Heretic Abliteration (New)
**New in this release:** Support for **FLUX.2 Klein 4B** using an **abliterated Qwen3-4B text encoder** via Heretic.
### Why Heretic?
FLUX.2 Klein uses a full LLM (Qwen3-4B) as its text encoder instead of CLIP+T5. This LLM has safety alignment that can refuse certain prompts. The Heretic-abliterated encoder removes most of that alignment: the published model reports a KL divergence of **0.0000** against the original (no measurable model damage) and 3/100 remaining refusals.
### How to use it
```python
generate_image(
    prompt="a beautiful cyberpunk fox in neon tokyo, highly detailed",
    model="flux-2-klein-4b-fp8.safetensors",
    width=1024,
    height=1024,
    steps=4,
)
```
### Models to download
```bash
# 1. FLUX.2 Klein 4B (distilled, fp8)
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux-2-klein-4b-fp8.safetensors \
  --local-dir ~/ComfyUI/models/diffusion_models/

# 2. FLUX.2 VAE
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux2-vae.safetensors \
  --local-dir ~/ComfyUI/models/vae/

# 3. Heretic-abliterated Qwen3-4B (from DreamFast)
huggingface-cli download DreamFast/qwen3-4b-heretic \
  --local-dir /tmp/qwen3-heretic/
cp /tmp/qwen3-heretic/model.safetensors \
  ~/ComfyUI/models/text_encoders/qwen_3_4b_heretic.safetensors
```
### Supported models (via `model=` parameter)
| Model | Description | VRAM | Speed | Censorship |
|-------|-------------|------|-------|------------|
| `flux1-schnell.safetensors` | Original (default) | ~8GB | Very fast | None |
| `flux-2-klein-4b-fp8.safetensors` | **New** — with Heretic Qwen3-4B | ~12GB | Fast | **Removed** |
---
## 11. Known Limitations
### ComfyUI must run locally
+44 -20
@@ -39,8 +39,14 @@ COMFYUI_DIR = Path(
# Maximum number of images allowed in a single batch call
MAX_COUNT = 10
# Path to the bundled FLUX.1-schnell workflow template
_WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
# Workflow registry: model filename → workflow JSON path
# This allows us to support multiple models (FLUX.1-schnell + FLUX.2 Klein with Heretic encoder)
_WORKFLOW_REGISTRY: dict[str, Path] = {
"flux1-schnell.safetensors": Path(__file__).parent / "workflows" / "flux_schnell.json",
"flux-2-klein-4b.safetensors": Path(__file__).parent / "workflows" / "flux2_klein_heretic.json",
}
_DEFAULT_MODEL = "flux1-schnell.safetensors"
# ---------------------------------------------------------------------------
@@ -181,21 +187,37 @@ class ComfyUIClient:
return resp.content
async def get_models(self) -> list[str]:
"""Return the list of available checkpoint model filenames."""
"""Return the list of available checkpoint model filenames.
Combines models known to ComfyUI with our internal registry
(including FLUX.2 Klein with Heretic encoder).
"""
models = set()
# Get models from ComfyUI
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{self.base_url}/object_info/CheckpointLoaderSimple"
)
resp.raise_for_status()
data = resp.json()
# ComfyUI returns: {"CheckpointLoaderSimple": {"input": {"required": {"ckpt_name": [["model1.safetensors", ...], ...]}}}}
node_info = data.get("CheckpointLoaderSimple", {})
ckpt_list = (
node_info.get("input", {})
.get("required", {})
.get("ckpt_name", [[]])[0]
)
return ckpt_list if isinstance(ckpt_list, list) else []
if isinstance(ckpt_list, list):
models.update(ckpt_list)
except Exception:
# ComfyUI not reachable — fall back to registry only
pass
# Add our registered models
models.update(_WORKFLOW_REGISTRY.keys())
return sorted(models)
# ---------------------------------------------------------------------------
@@ -209,13 +231,20 @@ def build_flux_workflow(
height: int,
steps: int,
seed: int,
model: str,
model: str = _DEFAULT_MODEL,
) -> dict:
"""Build a ComfyUI API-format workflow dict for FLUX.1-schnell text-to-image.
"""Build a ComfyUI API-format workflow dict for the requested model.
This is a pure function — no I/O, fully testable.
Supports (keys of _WORKFLOW_REGISTRY):
- "flux1-schnell.safetensors" (original)
- "flux-2-klein-4b.safetensors" (with Heretic-abliterated Qwen3-4B text encoder)
Falls back to FLUX.1-schnell if model is unknown.
This is a pure function — no I/O outside the registry, fully testable.
"""
with open(_WORKFLOW_PATH) as f:
workflow_path = _WORKFLOW_REGISTRY.get(model, _WORKFLOW_REGISTRY[_DEFAULT_MODEL])
with open(workflow_path) as f:
wf = json.load(f)
wf = copy.deepcopy(wf)
@@ -277,18 +306,13 @@ async def _generate_single(
) -> list:
"""Generate a single image and return [TextContent, ImageContent] or [TextContent] on error.
Args:
client: ComfyUIClient instance.
prompt: Positive text prompt.
negative_prompt: Negative text prompt.
width / height: Image dimensions.
steps: Inference steps.
seed: Seed value (-1 = random).
model: ComfyUI model filename.
resolved_output_dir: Resolved output directory Path.
name: User-supplied name prefix (unsanitized).
label: Human-readable label for TextContent prefix (e.g. "[lumen 1/3]").
Supports two models:
- flux1-schnell.safetensors (default, fast 4-step)
- flux-2-klein-4b.safetensors (with Heretic-abliterated Qwen3-4B text encoder — no refusals)
"""
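if model not in _WORKFLOW_REGISTRY:
    # Log the requested name before overwriting it, otherwise the
    # warning would print the default model twice.
    logger.warning("Unknown model %s, falling back to %s", model, _DEFAULT_MODEL)
    model = _DEFAULT_MODEL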
# Build and submit workflow
try:
workflow = build_flux_workflow(
@@ -0,0 +1,73 @@
{
"6": {
"class_type": "CLIPTextEncode",
"inputs": {
"clip": ["30", 0],
"text": "PROMPT_PLACEHOLDER"
}
},
"8": {
"class_type": "VAEDecode",
"inputs": {
"samples": ["13", 0],
"vae": ["31", 0]
}
},
"9": {
"class_type": "SaveImage",
"inputs": {
"filename_prefix": "mcp-image-gen",
"images": ["8", 0]
}
},
"13": {
"class_type": "KSampler",
"inputs": {
"cfg": 1.0,
"denoise": 1.0,
"latent_image": ["27", 0],
"model": ["32", 0],
"negative": ["33", 0],
"positive": ["6", 0],
"sampler_name": "euler",
"scheduler": "beta",
"seed": 42,
"steps": 4
}
},
"27": {
"class_type": "EmptySD3LatentImage",
"inputs": {
"batch_size": 1,
"height": 1024,
"width": 1024
}
},
"30": {
"class_type": "CLIPLoader",
"inputs": {
"clip_name": "qwen_3_4b_heretic.safetensors",
"type": "flux"
}
},
"31": {
"class_type": "VAELoader",
"inputs": {
"vae_name": "flux2-vae.safetensors"
}
},
"32": {
"class_type": "UNETLoader",
"inputs": {
"unet_name": "flux-2-klein-4b.safetensors",
"weight_dtype": "fp8_e4m3fn"
}
},
"33": {
"class_type": "CLIPTextEncode",
"inputs": {
"clip": ["30", 0],
"text": "NEGATIVE_PLACEHOLDER"
}
}
}
+47 -4
@@ -31,7 +31,7 @@ COMFYUI_BASE = "http://test-comfyui:8188"
# ---------------------------------------------------------------------------
def test_build_flux_workflow_structure():
"""Verify build_flux_workflow returns a dict with correct node types."""
"""Verify build_flux_workflow returns a dict with correct node types for default model."""
wf = build_flux_workflow(
prompt="a red cat",
neg_prompt="ugly",
@@ -52,6 +52,47 @@ def test_build_flux_workflow_structure():
assert wf["33"]["class_type"] == "CLIPTextEncode"
def test_build_flux_workflow_heretic_model():
"""Verify FLUX.2 Klein 4B with Heretic Qwen3-4B encoder uses correct nodes."""
wf = build_flux_workflow(
prompt="a red cat",
neg_prompt="ugly",
width=1024,
height=1024,
steps=4,
seed=42,
model="flux-2-klein-4b.safetensors",
)
assert wf["6"]["class_type"] == "CLIPTextEncode"
assert wf["30"]["class_type"] == "CLIPLoader" # Qwen3-4B uses single CLIPLoader
assert wf["32"]["inputs"]["unet_name"] == "flux-2-klein-4b.safetensors"
assert wf["31"]["inputs"]["vae_name"] == "flux2-vae.safetensors"
assert wf["13"]["inputs"]["scheduler"] == "beta" # FLUX.2 Klein uses beta scheduler
def test_workflow_registry_contains_both_models():
"""Verify the registry contains both supported models."""
assert "flux1-schnell.safetensors" in server._WORKFLOW_REGISTRY
assert "flux-2-klein-4b.safetensors" in server._WORKFLOW_REGISTRY
assert len(server._WORKFLOW_REGISTRY) == 2
def test_workflow_registry_fallback():
"""Unknown model falls back to default (FLUX.1-schnell)."""
wf = build_flux_workflow(
prompt="test",
neg_prompt="",
width=512,
height=512,
steps=4,
seed=42,
model="unknown-model.safetensors",
)
# Should have used default workflow (DualCLIPLoader)
assert wf["30"]["class_type"] == "DualCLIPLoader"
assert wf["32"]["inputs"]["unet_name"] == "unknown-model.safetensors"
def test_build_flux_workflow_params_injected():
"""Verify all parameters are injected into correct nodes."""
wf = build_flux_workflow(
@@ -202,14 +243,16 @@ async def test_list_available_models():
@respx.mock
@pytest.mark.asyncio
async def test_list_available_models_comfyui_offline():
"""When ComfyUI is unreachable, list_available_models returns error message."""
"""When ComfyUI is unreachable, list_available_models falls back to registry models."""
respx.get(f"{COMFYUI_BASE}/object_info/CheckpointLoaderSimple").mock(
side_effect=httpx.ConnectError("connection refused")
)
result = await list_available_models()
assert len(result) == 1
assert "not reachable" in result[0].lower()
# Should return registry models even when ComfyUI is offline
assert isinstance(result, list)
assert "flux1-schnell.safetensors" in result
assert "flux-2-klein-4b.safetensors" in result
# ---------------------------------------------------------------------------
+300
@@ -0,0 +1,300 @@
# Plan: FLUX.2 Klein 4B + Heretic Abliterated Text Encoder in mcp-image-gen
**Date:** 2026-04-10
**Author:** Lumen / Patrick Plate
**Status:** Ready for Implementation
---
## Goal
Extend the existing `mcp-image-gen` ComfyUI backend with a second model:
**FLUX.2 Klein 4B** with the abliterated **Qwen3-4B-Heretic** as its text encoder.
Result: `generate_image` can select between two workflows via the `model` parameter:
- `flux1-schnell.safetensors` → existing workflow (unchanged)
- `flux-2-klein-4b-fp8.safetensors` → new Heretic workflow (no prompt refusals)
---
## Technical Background
### Why Heretic + FLUX.2 Klein?
FLUX.2 Klein 4B uses **Qwen3-4B as an LLM text encoder** (instead of CLIP+T5 as in FLUX.1).
This LLM encoder carries safety alignment and refuses certain prompts, so it is swapped for an abliterated variant.
`DreamFast/qwen3-4b-heretic` (HuggingFace):
- **KL divergence: 0.0000**, i.e. no measurable damage to the model
- Only **3/100 refusals** after Heretic v1.2.0 (200 trials)
- Drop-in replacement for `qwen_3_4b.safetensors`
### Model Architecture Differences
| | FLUX.1-schnell | FLUX.2 Klein 4B |
|---|---|---|
| Diffusion Model | `flux1-schnell.safetensors` (UNet) | `flux-2-klein-4b-fp8.safetensors` |
| Text Encoder | `DualCLIPLoader` (T5+CLIP) | `CLIPLoader` (Qwen3-4B) |
| VAE | `ae.safetensors` | `flux2-vae.safetensors` |
| Steps | 4 | 4 (distilled) |
| VRAM | ~8GB | ~8.4GB |
| Refusals | none (no LLM encoder) | none (abliterated) |
---
## Files & Folders
### New model files (to be downloaded)
```
~/ComfyUI/models/
├── diffusion_models/
│   └── flux-2-klein-4b-fp8.safetensors   ← FLUX.2 Klein distilled 4B
├── text_encoders/
│   └── qwen_3_4b_heretic.safetensors     ← Heretic-abliterated (from DreamFast/qwen3-4b-heretic)
└── vae/
    └── flux2-vae.safetensors             ← VAE for FLUX.2
```
### New/changed project files
```
mcp/mcp-image-gen/
├── src/
│   ├── server.py                         ← add workflow registry
│   └── workflows/
│       ├── flux_schnell.json             ← unchanged
│       └── flux2_klein_heretic.json      ← NEW
├── tests/
│   └── test_server.py                    ← new tests for registry + workflow
└── USAGE.md                              ← add download instructions
```
---
## Phase 1: Download the Models
### 1a. FLUX.2 Klein 4B (Diffusion Model)
```bash
# From the Black Forest Labs HuggingFace repository
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux-2-klein-4b-fp8.safetensors \
  --local-dir ~/ComfyUI/models/diffusion_models/
```
### 1b. FLUX.2 VAE
```bash
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux2-vae.safetensors \
  --local-dir ~/ComfyUI/models/vae/
```
### 1c. Qwen3-4B-Heretic (abliterated text encoder)
```bash
# From DreamFast: already abliterated, no Heretic run needed
huggingface-cli download DreamFast/qwen3-4b-heretic \
  --local-dir /tmp/qwen3-4b-heretic/
# Place the safetensors file in ComfyUI's text_encoders folder
cp /tmp/qwen3-4b-heretic/model.safetensors \
  ~/ComfyUI/models/text_encoders/qwen_3_4b_heretic.safetensors
```
> **Note:** DreamFast/qwen3-4b-heretic contains a mix of GGUF and SafeTensors files.
> ComfyUI needs the `.safetensors` variant. If only GGUF is available:
> `huggingface-cli download Lockout/qwen3-4b-heretic-zimage qwen-4b-zimage-hereticV2-q8.gguf`
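
After the three downloads, it can help to confirm that every file landed in the folder ComfyUI expects. The following is a minimal sketch under the assumption that the file names and subfolders are exactly those used in the commands above; `missing_model_files` and `EXPECTED_FILES` are hypothetical names, not part of `mcp-image-gen`.

```python
from pathlib import Path

# Subfolder -> file name, copied from the download steps above.
EXPECTED_FILES = {
    "diffusion_models": "flux-2-klein-4b-fp8.safetensors",
    "text_encoders": "qwen_3_4b_heretic.safetensors",
    "vae": "flux2-vae.safetensors",
}


def missing_model_files(models_dir: Path) -> list[str]:
    """Return the relative paths of expected model files that are absent."""
    missing = []
    for subdir, filename in EXPECTED_FILES.items():
        if not (models_dir / subdir / filename).is_file():
            missing.append(f"{subdir}/{filename}")
    return missing


if __name__ == "__main__":
    # Assumes the default ComfyUI location used in the commands above.
    for path in missing_model_files(Path.home() / "ComfyUI" / "models"):
        print(f"missing: {path}")
```

An empty result means all three files are in place; anything listed points at the download step to repeat.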
---
## Phase 2: New Workflow JSON
**File:** [`mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json`](mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json)
FLUX.2 Klein uses different ComfyUI nodes than FLUX.1-schnell:
- `DualCLIPLoader` → `CLIPLoader` (single Qwen encoder)
- `UNETLoader` with the `diffusion_models/` path instead of `checkpoints/`
- `EmptySD3LatentImage` → same (compatible)
- `KSampler` → same, but with `sampler_name: "euler"`, `scheduler: "beta"`, `steps: 4`
```json
{
"6": {
"class_type": "CLIPTextEncode",
"inputs": {
"clip": ["30", 0],
"text": "PROMPT_PLACEHOLDER"
}
},
"8": {
"class_type": "VAEDecode",
"inputs": {
"samples": ["13", 0],
"vae": ["31", 0]
}
},
"9": {
"class_type": "SaveImage",
"inputs": {
"filename_prefix": "mcp-image-gen",
"images": ["8", 0]
}
},
"13": {
"class_type": "KSampler",
"inputs": {
"cfg": 1.0,
"denoise": 1.0,
"latent_image": ["27", 0],
"model": ["32", 0],
"negative": ["33", 0],
"positive": ["6", 0],
"sampler_name": "euler",
"scheduler": "beta",
"seed": 42,
"steps": 4
}
},
"27": {
"class_type": "EmptySD3LatentImage",
"inputs": {
"batch_size": 1,
"height": 1024,
"width": 1024
}
},
"30": {
"class_type": "CLIPLoader",
"inputs": {
"clip_name": "qwen_3_4b_heretic.safetensors",
"type": "flux"
}
},
"31": {
"class_type": "VAELoader",
"inputs": {
"vae_name": "flux2-vae.safetensors"
}
},
"32": {
"class_type": "UNETLoader",
"inputs": {
"unet_name": "flux-2-klein-4b-fp8.safetensors",
"weight_dtype": "fp8_e4m3fn"
}
},
"33": {
"class_type": "CLIPTextEncode",
"inputs": {
"clip": ["30", 0],
"text": "NEGATIVE_PLACEHOLDER"
}
}
}
```
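
The template carries two text placeholders (`PROMPT_PLACEHOLDER`, `NEGATIVE_PLACEHOLDER`) plus default values for seed, steps, and image size. A minimal sketch of how these could be filled per request, assuming the node IDs shown in the JSON ("6", "13", "27", "33"); `inject_params` and the trimmed `TEMPLATE` are hypothetical and are not the project's `build_flux_workflow`.

```python
import copy

# Trimmed stand-in for the template: only the nodes that receive parameters.
TEMPLATE = {
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["30", 0], "text": "PROMPT_PLACEHOLDER"}},
    "13": {"class_type": "KSampler",
           "inputs": {"seed": 42, "steps": 4}},
    "27": {"class_type": "EmptySD3LatentImage",
           "inputs": {"batch_size": 1, "height": 1024, "width": 1024}},
    "33": {"class_type": "CLIPTextEncode",
           "inputs": {"clip": ["30", 0], "text": "NEGATIVE_PLACEHOLDER"}},
}


def inject_params(template: dict, prompt: str, neg_prompt: str,
                  width: int, height: int, steps: int, seed: int) -> dict:
    """Return a filled copy of the template; the template itself is not mutated."""
    wf = copy.deepcopy(template)
    wf["6"]["inputs"]["text"] = prompt
    wf["33"]["inputs"]["text"] = neg_prompt
    wf["27"]["inputs"]["width"] = width
    wf["27"]["inputs"]["height"] = height
    wf["13"]["inputs"]["steps"] = steps
    wf["13"]["inputs"]["seed"] = seed
    return wf
```

Working on a deep copy keeps the loaded template reusable across requests, which matters when several images are generated in one batch.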
---
## Phase 3: Workflow Registry in server.py
### Change 1: Workflow registry dict (after `_WORKFLOW_PATH`)
```python
# Path to the bundled FLUX.1-schnell workflow template
_WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
# Workflow registry: model filename → workflow JSON path
_WORKFLOW_REGISTRY: dict[str, Path] = {
    "flux1-schnell.safetensors": Path(__file__).parent / "workflows" / "flux_schnell.json",
    "flux-2-klein-4b-fp8.safetensors": Path(__file__).parent / "workflows" / "flux2_klein_heretic.json",
}
_DEFAULT_MODEL = "flux1-schnell.safetensors"
```
### Change 2: `_load_workflow()` helper function
```python
def _load_workflow(model: str) -> dict:
    """Load the correct workflow JSON for the requested model.

    Falls back to FLUX.1-schnell if model not in registry.
    """
    path = _WORKFLOW_REGISTRY.get(model, _WORKFLOW_PATH)
    if not path.exists():
        raise FileNotFoundError(f"Workflow JSON not found: {path}")
    return json.loads(path.read_text())
```
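
A silent fallback can hide a mistyped model name. One possible variant, sketched here under the assumption that the caller wants to log the originally requested name, returns both the resolved model and a flag. `resolve_model`, `REGISTRY`, and `DEFAULT_MODEL` are hypothetical stand-ins for the names in `server.py`.

```python
from pathlib import Path

# Stand-in registry mirroring the plan's entries (paths are illustrative).
REGISTRY: dict[str, Path] = {
    "flux1-schnell.safetensors": Path("workflows/flux_schnell.json"),
    "flux-2-klein-4b-fp8.safetensors": Path("workflows/flux2_klein_heretic.json"),
}
DEFAULT_MODEL = "flux1-schnell.safetensors"


def resolve_model(model: str) -> tuple[str, bool]:
    """Return (model_to_use, fell_back) without touching the file system."""
    if model in REGISTRY:
        return model, False
    return DEFAULT_MODEL, True


resolved, fell_back = resolve_model("unknown-model.safetensors")
if fell_back:
    # The requested name is still available here for the log message.
    print(f"Unknown model, using {resolved} instead")
```

Keeping the requested name separate from the resolved one makes it straightforward to report both in a warning.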
### Change 3: `_generate_single()` uses the registry
The current code always loads `_WORKFLOW_PATH`. Change: call `_load_workflow(model)` instead:
```python
async def _generate_single(
    client: ComfyUIClient,
    prompt: str,
    negative_prompt: str,
    model: str,
    seed: int,
    width: int,
    height: int,
    steps: int,
    output_dir: Path,
    name: str,
) -> tuple[TextContent, ImageContent | None]:
    workflow = _load_workflow(model)  # ← instead of json.loads(_WORKFLOW_PATH.read_text())
    # ... rest unchanged
```
---
## Phase 4: Tests
New tests in [`mcp/mcp-image-gen/tests/test_server.py`](mcp/mcp-image-gen/tests/test_server.py):
1. **`test_workflow_registry_contains_both_models`**: the registry contains flux1-schnell + flux2-klein
2. **`test_load_workflow_flux1_schnell`**: loads flux_schnell.json correctly
3. **`test_load_workflow_flux2_klein`**: loads flux2_klein_heretic.json correctly
4. **`test_load_workflow_unknown_model_falls_back`**: unknown model → FLUX.1-schnell
5. **`test_generate_image_uses_flux2_workflow`**: end-to-end mock with flux-2-klein-4b-fp8.safetensors
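
Tests 3 and 4 from the list could look roughly like the sketch below. It is self-contained: `make_registry` and `load_workflow` are hypothetical stand-ins that write tiny workflow files into a temporary directory, so the assertions do not depend on the real `server.py` or the bundled JSON files.

```python
import json
import tempfile
from pathlib import Path

DEFAULT_MODEL = "flux1-schnell.safetensors"


def make_registry(tmp: Path) -> dict[str, Path]:
    """Write two tiny stand-in workflow files and return a registry for them."""
    schnell = tmp / "flux_schnell.json"
    klein = tmp / "flux2_klein_heretic.json"
    schnell.write_text(json.dumps({"30": {"class_type": "DualCLIPLoader"}}))
    klein.write_text(json.dumps({"30": {"class_type": "CLIPLoader"}}))
    return {
        DEFAULT_MODEL: schnell,
        "flux-2-klein-4b-fp8.safetensors": klein,
    }


def load_workflow(model: str, registry: dict[str, Path]) -> dict:
    """Stand-in loader: unknown models fall back to the default entry."""
    path = registry.get(model, registry[DEFAULT_MODEL])
    return json.loads(path.read_text())


def test_load_workflow_flux2_klein():
    with tempfile.TemporaryDirectory() as d:
        registry = make_registry(Path(d))
        wf = load_workflow("flux-2-klein-4b-fp8.safetensors", registry)
        assert wf["30"]["class_type"] == "CLIPLoader"


def test_load_workflow_unknown_model_falls_back():
    with tempfile.TemporaryDirectory() as d:
        registry = make_registry(Path(d))
        wf = load_workflow("unknown-model.safetensors", registry)
        assert wf["30"]["class_type"] == "DualCLIPLoader"
```

Checking node "30" is enough to tell the two workflows apart, since it is the text-encoder loader that differs between them.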
---
## Phase 5: USAGE.md Update
New section "FLUX.2 Klein 4B (Heretic)" in [`mcp/mcp-image-gen/USAGE.md`](mcp/mcp-image-gen/USAGE.md):
- Download commands for all 3 new model files
- Explanation of why Heretic is used (abliterated text encoder, KL=0)
- Example call: `generate_image("...", model="flux-2-klein-4b-fp8.safetensors")`
---
## VRAM Analysis
| Model | Total VRAM | Fits in 24GB? |
|---|---|---|
| FLUX.1-schnell (fp8) | ~8GB | ✅ |
| FLUX.2 Klein 4B (fp8) + Qwen3-4B | ~8.4GB + ~4GB = ~12.4GB | ✅ |
| Both loaded at once | ~20GB | ✅ with margin |
The RX 7900 XTX with 24GB VRAM can comfortably hold both models.
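
The budget in the table is plain addition. A short sketch that reproduces it from the approximate per-component figures above; the dictionary keys are hypothetical labels, and the numbers are the plan's estimates rather than measurements.

```python
# Approximate VRAM per component in GB, taken from the table above.
VRAM_GB = {
    "flux1_schnell_fp8": 8.0,
    "flux2_klein_fp8": 8.4,
    "qwen3_4b_encoder": 4.0,
}
BUDGET_GB = 24.0  # RX 7900 XTX

klein_total = VRAM_GB["flux2_klein_fp8"] + VRAM_GB["qwen3_4b_encoder"]
both_loaded = klein_total + VRAM_GB["flux1_schnell_fp8"]
headroom = BUDGET_GB - both_loaded

print(f"FLUX.2 Klein + encoder: ~{klein_total:.1f} GB")
print(f"Both models loaded:     ~{both_loaded:.1f} GB")
print(f"Headroom on 24 GB:      ~{headroom:.1f} GB")
```

The remaining headroom has to cover activations and latents during sampling, so the "with margin" verdict depends on image size and batch count.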
---
## Risks & Mitigations
| Risk | Likelihood | Mitigation |
|---|---|---|
| `CLIPLoader` node not available in ComfyUI | low | Update ComfyUI; alternatively use a custom node |
| DreamFast model only available as GGUF | medium | Lockout/qwen3-4b-heretic-zimage GGUF as fallback |
| Qwen3-4B needs a different node type | medium | Live-test in the ComfyUI UI first; adjust the workflow |
| ROCm + Qwen3-4B compatibility | low | Same ROCm environment as FLUX.1-schnell |
---
## Decision
**Recommendation: implement.** Minimal code changes, no breaking change, clear added value.
The only uncertain point is the exact ComfyUI node name for the Qwen3-4B loader.
**Recommended approach:** First build a workflow with Qwen3-4B manually in the ComfyUI web UI → export the JSON → save it as `flux2_klein_heretic.json`. That guarantees correct node names without guesswork.
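
Once a workflow has been exported from the UI, a quick structural check can catch a wrong or missing node before the server loads the file. This sketch assumes the export is in API format (a dict keyed by node ID, each entry with a `class_type`), and that the node types listed in this plan are the ones required; `missing_node_types` and `REQUIRED_CLASS_TYPES` are hypothetical names.

```python
import json
from pathlib import Path

# Node types the plan's workflow uses; adjust if the exported graph differs.
REQUIRED_CLASS_TYPES = {
    "CLIPLoader", "CLIPTextEncode", "UNETLoader", "VAELoader",
    "EmptySD3LatentImage", "KSampler", "VAEDecode", "SaveImage",
}


def missing_node_types(workflow: dict) -> set[str]:
    """Return the required class types that do not appear in the workflow."""
    present = {
        node.get("class_type")
        for node in workflow.values()
        if isinstance(node, dict)
    }
    return REQUIRED_CLASS_TYPES - present


def check_workflow_file(path: Path) -> set[str]:
    """Load an exported workflow JSON and report missing node types."""
    return missing_node_types(json.loads(path.read_text()))
```

If the live test shows that Qwen3-4B needs a different loader node, the set of required types is the single place to update.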