Compare commits

...

4 Commits

Author SHA1 Message Date
Patrick Plate c662a5237b feat(mcp-image-gen): add ComfyUI auto-start health check + systemd service
Option A: Add lifespan context manager to server.py
- _ping_comfyui(): async health check against /system_stats
- check_and_start_comfyui(): ping on startup; if down, launches ComfyUI
  via subprocess.Popen from COMFYUI_DIR (.venv/bin/python main.py)
  with HSA_OVERRIDE_GFX_VERSION=11.0.0 injected for AMD ROCm
- Polls up to 30s for readiness after auto-start
- New env var: COMFYUI_DIR (default ~/ComfyUI)
- FastMCP lifespan= wired in; 34/34 tests still passing

Option B: Add comfyui.service systemd user service file
- Install: cp mcp/mcp-image-gen/comfyui.service ~/.config/systemd/user/
- Enable: systemctl --user enable --now comfyui
- Sets HSA_OVERRIDE_GFX_VERSION=11.0.0, WorkingDirectory=%h/ComfyUI
- Restart=on-failure, logs via journald

docs: Update mcp-image-gen-ComfyUI-Setup.md
- New Step 4: systemd service install + linger instructions
- Step 5: manual start (moved from old Step 4)
- Step 6/7 renumbered; COMFYUI_DIR env var documented
- Architecture diagram added; troubleshooting rows updated
2026-04-06 10:43:36 +02:00
Patrick Plate 0ff3f20589 feat(mcp-image-gen): merge name and count params into main 2026-04-06 07:45:45 +02:00
Patrick Plate 79f1e6d65f feat(mcp-image-gen): add name and count params to generate_image
- Add name (str) param: filename prefix saved as {name}_{timestamp}_{seed}.png
- Add count (int, 1-10) param: generate N images in one call
- Extract _sanitize_name() helper: strips special chars, collapses underscores, caps at 64 chars
- Extract _build_filename() helper: pure function for testable filename construction
- Extract _generate_single() coroutine: clean loop body for batch generation
- Fixed seed batches increment seed per image (seed+i-1) for deterministic variation
- random seed (-1) batches give independent random seeds per image
- Partial batch failures continue (error TextContent in slot, remaining images proceed)
- Returns flat interleaved [Text1, Image1, Text2, Image2, ...] list
- 34/34 tests passing (was 19, added 15 new tests)
2026-04-06 07:45:37 +02:00
Patrick Plate 79a2e1d10a Merge branch 'feat/roo/ollama-backed-modes' 2026-04-05 10:27:37 +02:00
5 changed files with 988 additions and 60 deletions
+81 -10
View File
@@ -62,7 +62,52 @@ huggingface-cli download comfyanonymous/flux_text_encoders \
--local-dir ~/ComfyUI/models/clip
```
## Step 4: Start ComfyUI
## Step 4: Install the systemd User Service (Recommended)
Installing ComfyUI as a systemd user service ensures it starts automatically on login and restarts on failure.
```bash
# Copy the bundled service file to the systemd user directory
mkdir -p ~/.config/systemd/user
cp ~/pi_mcps/mcp/mcp-image-gen/comfyui.service ~/.config/systemd/user/comfyui.service
# Reload systemd, enable + start the service
systemctl --user daemon-reload
systemctl --user enable --now comfyui
# Verify it is running
systemctl --user status comfyui
```
> ⚠️ `HSA_OVERRIDE_GFX_VERSION=11.0.0` is already set in the service file — it is mandatory for RX 7900 XTX on ROCm. Without it, model loading fails silently.
### Enable lingering (start ComfyUI even without a login session)
```bash
loginctl enable-linger $USER
```
This ensures the service starts at boot even before you log in — recommended for headless / homelab setups.
### Managing the service
```bash
# Follow live logs
journalctl --user -u comfyui -f
# Restart after model changes
systemctl --user restart comfyui
# Stop temporarily
systemctl --user stop comfyui
# Disable autostart
systemctl --user disable comfyui
```
## Step 5: Manual Start (without systemd)
If you prefer to start ComfyUI manually (e.g. for debugging):
```bash
cd ~/ComfyUI
@@ -74,26 +119,36 @@ HSA_OVERRIDE_GFX_VERSION=11.0.0 \
echo "ComfyUI PID: $!"
```
> ⚠️ `HSA_OVERRIDE_GFX_VERSION=11.0.0` is mandatory for RX 7900 XTX on ROCm. Without it, model loading fails silently.
## Step 5: Verify ComfyUI is Running
## Step 6: Verify ComfyUI is Running
```bash
curl http://localhost:8188/system_stats
# Should return JSON with GPU info
```
## Step 6: Configure mcp-image-gen
## Step 7: Configure mcp-image-gen
```bash
cd /home/pplate/pi_mcps/mcp/mcp-image-gen
# Environment variables (set in .roo/mcp.json or shell):
# COMFYUI_URL=http://localhost:8188
# IMAGE_OUTPUT_DIR=~/Pictures/mcp-generated
# COMFYUI_TIMEOUT=120
# COMFYUI_URL=http://localhost:8188 — ComfyUI API endpoint
# IMAGE_OUTPUT_DIR=~/Pictures/mcp-generated — where generated images are saved
# COMFYUI_TIMEOUT=120 — max wait time (seconds) per image
# COMFYUI_DIR=~/ComfyUI — path to ComfyUI install (used by auto-start)
```
### Auto-start behaviour
`mcp-image-gen` includes a **startup health check** in its lifespan. Every time the MCP server starts it:
1. Pings `http://localhost:8188/system_stats`
2. **If reachable** — logs `ComfyUI is already running ✓` and proceeds normally.
3. **If not reachable** — attempts to launch ComfyUI as a background subprocess from `COMFYUI_DIR` using `.venv/bin/python main.py --listen --port 8188` with `HSA_OVERRIDE_GFX_VERSION=11.0.0` injected automatically.
4. Polls up to 30 s for ComfyUI to become ready.
With the systemd service enabled, step 3 is never needed in practice — but the check acts as a safety net.
## Performance
| GPU | Model | Resolution | Steps | Time |
@@ -101,12 +156,28 @@ cd /home/pplate/pi_mcps/mcp/mcp-image-gen
| AMD RX 7900 XTX | FLUX.1-schnell | 1024×1024 | 4 | ~8s |
| AMD RX 7900 XTX | FLUX.1-schnell | 1280×512 | 4 | ~7s |
## Architecture Overview
```
Boot
└─ systemd --user (comfyui.service)
└─ ComfyUI at localhost:8188
VS Code / Roo Code
└─ mcp-image-gen MCP server (stdio)
├─ lifespan startup: ping localhost:8188
│ └─ if down: subprocess.Popen ComfyUI, wait ≤30s
└─ tools: generate_image, list_available_models, …
```
## Troubleshooting
| Problem | Solution |
|---|---|
| `HTTP 401` downloading model | Accept FLUX license on HuggingFace first |
| GPU not detected | Ensure `HSA_OVERRIDE_GFX_VERSION=11.0.0` is set |
| `Connection refused` from mcp-image-gen | Start ComfyUI first, check port 8188 |
| Slow generation (>60s) | ComfyUI may be running on CPU — check ROCm install |
| `Connection refused` from mcp-image-gen | Check `systemctl --user status comfyui`; or set `COMFYUI_DIR` so auto-start can locate the install |
| Slow generation (>60s) | ComfyUI may be running on CPU — check ROCm install and `HSA_OVERRIDE_GFX_VERSION` |
| Ollama image gen | As of April 2026: macOS-only, not available on Linux |
| Auto-start logs | `journalctl --user -u comfyui -f` or check mcp-image-gen server logs |
| Service not starting at boot | Run `loginctl enable-linger $USER` to enable session-less startup |
@@ -0,0 +1,319 @@
# Assessment: Expand `generate_image` with `name` and `count` Parameters
*Author: Lumen | Date: 2026-04-06 | Ticket: —*
*BigMind Session: `00070c37-b013-4342-a8ae-f81da0e3180d`*
*Status: 🔵 DRAFT — awaiting Patrick review*
---
## 1. Problem Statement
The current [`generate_image()`](mcp/mcp-image-gen/src/server.py:133) tool generates a single image and saves it with an auto-generated filename of `{timestamp}_{seed}.png`. Two common workflows are not yet supported:
1. **Named outputs** — When generating thematic sets (Lumen profile images, wiki banners, concept art), the caller wants a meaningful prefix in the filename (e.g., `lumen_profile_20260406_140236_2409122067.png`) rather than a bare timestamp. This also enables grouping output by purpose in the directory listing.
2. **Batch generation** — Generating multiple variations of the same prompt in one tool call is a common creative workflow. Currently, the caller must invoke `generate_image` N times with separate tool calls, which is verbose and loses the semantic grouping.
**Goal:** Add two optional parameters — `name` (filename prefix string) and `count` (integer repetitions) — to `generate_image` with minimal disruption to existing behaviour and test coverage.
---
## 2. Requirements
### 2.1 Functional Requirements
| ID | Requirement |
|----|-------------|
| F-1 | `name` parameter (default `""`) prepends a sanitized label to the output filename |
| F-2 | When `name=""` (default), filename format is unchanged: `{timestamp}_{seed}.png` |
| F-3 | When `name="lumen_profile"`, filename format is: `lumen_profile_{timestamp}_{seed}.png` |
| F-4 | `count` parameter (default `1`) generates N images sequentially |
| F-5 | When `count=1` (default), return value is identical to the current `[TextContent, ImageContent]` |
| F-6 | When `count=N > 1`, return value is a flat list: `[Text1, Image1, Text2, Image2, ..., TextN, ImageN]` |
| F-7 | When `count>1` and `seed=-1`, each image gets an independently random seed |
| F-8 | When `count>1` and a fixed `seed` is provided, images use `seed`, `seed+1`, `seed+2`, … to produce deterministic variation |
| F-9 | `count` is capped at a maximum (proposed: 10) to prevent runaway generation |
| F-10 | `name` is sanitized: non-alphanumeric characters (except `-` and `_`) are stripped/replaced; max 64 chars |
| F-11 | Partial success: if one image in a batch fails, the error is returned as a `TextContent` error item in that position rather than aborting the whole batch |
| F-12 | The TextContent for each image in a batch includes the 1-of-N index: `[1/3] Generated: ...` |
### 2.2 Non-Functional Requirements
| ID | Requirement |
|----|-------------|
| NF-1 | Sequential generation — no concurrent ComfyUI submissions (ComfyUI queues internally; parallel MCP submissions would complicate polling) |
| NF-2 | Backward compatibility — all existing callers with no `name`/`count` args produce identical output |
| NF-3 | All existing 19 tests must continue to pass without modification |
| NF-4 | New tests must cover: name prefix in filename, count=2 success, count with fixed seed increments, count with partial failure, name sanitization, count cap enforcement |
| NF-5 | MCP tool schema (visible in Claude/Roo Code) must surface clear descriptions for the new params |
---
## 3. Affected Files
| File | Change Type | Description |
|------|-------------|-------------|
| [`mcp/mcp-image-gen/src/server.py`](mcp/mcp-image-gen/src/server.py:133) | Modify | Add `name: str = ""` and `count: int = 1` params to `generate_image()`; add `_sanitize_name()` helper; extract `_generate_single()` inner logic |
| [`mcp/mcp-image-gen/tests/test_server.py`](mcp/mcp-image-gen/tests/test_server.py:1) | Modify | Add 6+ new test cases covering new parameters |
| [`mcp/mcp-image-gen/README.md`](mcp/mcp-image-gen/README.md) | Modify | Update `generate_image` tool documentation table |
| [`docs/wiki/pages/mcp-image-gen.md`](docs/wiki/pages/mcp-image-gen.md) | Modify | Update tool reference table with new parameters |
No schema changes, no new dependencies, no workflow JSON changes.
---
## 4. Design Decisions
### 4.1 Filename Convention with `name`
**Current:** `{timestamp}_{seed}.png`
**Proposed:** `{sanitized_name}_{timestamp}_{seed}.png` (when `name` is provided)
The `name` is placed as a **prefix** rather than suffix so directory `ls` output groups named sets together alphabetically:
```
lumen_profile_20260406_140236_2409122067.png
lumen_profile_20260406_140258_764633840.png
wiki_banner_20260406_141000_1234567.png
```
**Sanitization rule:** `re.sub(r'[^a-zA-Z0-9_-]', '_', name)[:64]` — replaces any character that is not alphanumeric, dash, or underscore with `_`, then truncates to 64 chars.
### 4.2 Seed Behaviour for Batch Generation
| Scenario | Behaviour |
|----------|-----------|
| `count=3, seed=-1` | Each call to `build_flux_workflow` gets `seed=-1` → 3 independent random seeds |
| `count=3, seed=42` | Seeds are 42, 43, 44 — deterministic, reproducible variation |
This follows the convention of most image generation tools (e.g., ComfyUI's own batch seed increment).
### 4.3 Return Structure for `count > 1`
Return a **flat interleaved list**: `[Text1, Image1, Text2, Image2]`
**Rationale:** MCP content lists are flat arrays. Claude/Roo Code renders them sequentially — a flat list means each image appears immediately below its metadata line. A nested structure would require the caller to unwrap it.
**For `count=1` (default):** Behaviour is identical to today — `[TextContent, ImageContent]`. No caller breakage.
### 4.4 Refactoring: Extract `_generate_single()`
The current `generate_image` function is 180+ lines of inline logic. To support `count`, the inner pipeline (queue → poll → history → download → save → encode) will be extracted to a private `async def _generate_single(prompt, ..., index, total)` coroutine. `generate_image` then loops `count` times calling `_generate_single` and accumulates results.
This refactoring:
- Makes the count loop clean (`results.extend(await _generate_single(...))`)
- Makes partial failure handling straightforward (catch per iteration)
- Improves testability of the single-image path
### 4.5 Maximum Count Cap
Cap `count` at **10**. Rationale:
- FLUX.1-schnell takes ~1035s per image on RX 7900 XTX → 10 images ≈ 100350s maximum
- MCP tool call timeout in Roo Code defaults to 5 minutes — 10 images is safe margin
- ComfyUI queues them internally; the MCP server polls sequentially, not in parallel
When `count > 10`, the tool returns a single `TextContent` error immediately (no images generated) with message: `"count={N} exceeds maximum of 10. Reduce count and retry."`
---
## 5. Implementation Plan
### Step 1 — Add `_sanitize_name()` helper
```python
import re
def _sanitize_name(name: str) -> str:
"""Sanitize a name for use as a filename prefix."""
sanitized = re.sub(r'[^a-zA-Z0-9_-]', '_', name)
return sanitized[:64]
```
Location: [`server.py`](mcp/mcp-image-gen/src/server.py:95), after `build_flux_workflow()` (pure function section).
### Step 2 — Extract `_generate_single()` coroutine
Extract the body of the current `generate_image` (lines 162310) into:
```python
async def _generate_single(
prompt: str,
width: int,
height: int,
steps: int,
model: str,
seed: int,
negative_prompt: str,
resolved_output_dir: Path,
filename_prefix: str,
index: int,
total: int,
) -> list:
```
The `filename` construction changes to:
```python
filename = f"{filename_prefix}{timestamp}_{actual_seed}.png"
# where filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
```
The `TextContent` text changes when `total > 1`:
```python
prefix_label = f"[{index}/{total}] " if total > 1 else ""
text = f"{prefix_label}Generated: {out_path}\nSeed: ..."
```
### Step 3 — Update `generate_image()` signature
```python
@mcp.tool()
async def generate_image(
prompt: str,
width: int = 1024,
height: int = 1024,
steps: int = 4,
model: str = "flux1-schnell.safetensors",
seed: int = -1,
negative_prompt: str = "",
output_dir: str = "",
name: str = "",
count: int = 1,
) -> list:
```
Body of `generate_image` becomes:
```python
# Validate count
MAX_COUNT = 10
if count < 1 or count > MAX_COUNT:
return [TextContent(type="text", text=f"count={count} is invalid. Must be 1{MAX_COUNT}.")]
sanitized_name = _sanitize_name(name) if name else ""
filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
resolved_output_dir = Path(output_dir or IMAGE_OUTPUT_DIR).expanduser().resolve()
results = []
for i in range(1, count + 1):
actual_seed = seed if seed == -1 else seed + (i - 1)
items = await _generate_single(
prompt=prompt, width=width, height=height, steps=steps,
model=model, seed=actual_seed, negative_prompt=negative_prompt,
resolved_output_dir=resolved_output_dir,
filename_prefix=filename_prefix, index=i, total=count,
)
results.extend(items)
return results
```
### Step 4 — Write new tests
Add to [`test_server.py`](mcp/mcp-image-gen/tests/test_server.py:550):
| Test | Description |
|------|-------------|
| `test_generate_image_with_name` | `name="lumen"` → filename starts with `lumen_` |
| `test_generate_image_name_sanitization` | `name="my image! v2"``my_image__v2_` prefix |
| `test_generate_image_count_2_success` | `count=2` → 4 items in result, 2 files saved |
| `test_generate_image_count_fixed_seed` | `count=2, seed=42` → seeds 42 and 43 in filenames |
| `test_generate_image_count_partial_failure` | `count=2`, second POST fails → 2 items (success) + 1 item (error) |
| `test_generate_image_count_cap_exceeded` | `count=11` → single TextContent error, no generation |
| `test_generate_image_count_0_invalid` | `count=0` → single TextContent error |
| `test_generate_image_name_and_count_combined` | `name="banner", count=2` → both files prefixed `banner_` |
### Step 5 — Update documentation
- Update `generate_image` docstring in [`server.py`](mcp/mcp-image-gen/src/server.py:144) to document `name` and `count`
- Update parameter table in [`README.md`](mcp/mcp-image-gen/README.md)
- Update tool reference in [`docs/wiki/pages/mcp-image-gen.md`](docs/wiki/pages/mcp-image-gen.md)
### Step 6 — Run full test suite
```bash
cd mcp/mcp-image-gen && uv run pytest tests/ -v --tb=short
```
All 19 existing + 8 new = **27 tests** must pass.
### Step 7 — Commit and push
Branch: `feat/mcp-image-gen/generate-image-name-count`
Commit: `feat(mcp-image-gen): add name and count params to generate_image`
---
## 6. Risks
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Partial batch failure leaves orphaned files on disk | Medium | Low | Files for successful images are kept; error TextContent clearly identifies which index failed. No cleanup needed — partial results are useful. |
| `count` loop adds significant latency visible in Roo Code | Medium | Medium | Document expected time: `count × ~15s`. MCP timeout is 5 min; max 10 images ≈ 150s. Still within limit. |
| Seed increment wraps around at `2^32` | Very Low | Low | `(seed + i - 1) % 2**32` — add modulo guard in `_generate_single` |
| `_generate_single` refactor introduces regression in existing tests | Low | High | Existing test fixtures mock ComfyUI endpoints — as long as the HTTP call sequence is unchanged, respx mocks will match. Verify each existing test still passes before adding new ones. |
| `name` with only special chars becomes empty after sanitization | Low | Medium | After sanitization, if result is empty string, treat as unnamed (no prefix). Add assertion in `_sanitize_name` to return `""` for all-whitespace/special inputs. |
| MCP tool schema change breaks existing callers | Very Low | Low | New params are optional with defaults — backward compatible. Roo Code re-reads schema on server restart. |
---
## 7. Alternatives Considered
### 7.1 Separate `generate_images_batch()` Tool (Rejected)
Add a new tool instead of expanding `generate_image`.
**Pros:** Clean separation, no refactoring of existing tool.
**Cons:** Two tools for the same backend; callers must learn two tool names; MCP tool list grows. The MCP convention favours extending existing tools with optional parameters rather than proliferating tools.
**Verdict:** Rejected. Optional parameters with backward-compatible defaults is the right pattern here.
### 7.2 Return Grouped List of Lists for `count > 1` (Rejected)
Return `[[Text1, Image1], [Text2, Image2]]` for batch results.
**Pros:** Caller can index by image number cleanly.
**Cons:** MCP content type is a flat `list[ContentBlock]`. FastMCP does not support nested lists in tool returns — they would be serialized as strings, not rendered. Roo Code renders content sequentially; flat interleaved is the idiomatic structure.
**Verdict:** Rejected. Flat interleaved list `[Text1, Image1, Text2, Image2]` is MCP-idiomatic.
### 7.3 Parallel ComfyUI Submission for Batch (Rejected)
Submit all `count` prompts to ComfyUI simultaneously (async tasks), then collect results in order.
**Pros:** Faster if ComfyUI supports parallel queue processing (it does).
**Cons:** ComfyUI processes one job at a time on a single GPU regardless — parallel submission just fills the queue. Polling becomes complex (N polling loops). Error handling harder. Out-of-order completions break index alignment.
**Verdict:** Rejected for v1. Sequential submission is simpler, correct, and produces no worse throughput. Can revisit if ComfyUI gains true parallel processing support.
### 7.4 Name as Subdirectory Instead of Filename Prefix (Rejected)
When `name="lumen"`, save to `output_dir/lumen/` instead of `output_dir/lumen_*.png`.
**Pros:** Better directory organisation for large sets.
**Cons:** Complicates the implementation (directory creation per name), changes the return path format, breaks callers who assume a flat output directory. Adds complexity for minimal gain at `count ≤ 10`.
**Verdict:** Rejected for v1. Prefix approach is simpler and equally readable.
---
## 8. Success Criteria
| Criterion | Measure |
|-----------|---------|
| All 27 tests pass | `uv run pytest tests/ -v` exits 0 |
| `name="lumen"` → file starts with `lumen_` | Assert in `test_generate_image_with_name` |
| `count=2` → 4 content items, 2 files | Assert `len(result) == 4`, `len(glob("*.png")) == 2` |
| `count=2, seed=42` → seeds 42 and 43 | Assert seed values in TextContent |
| `count=11` → error TextContent, no ComfyUI call | Assert `len(result) == 1`, no `/api/prompt` mock hit |
| Backward compat: existing callers unaffected | All 19 existing tests pass without modification |
| MCP tool schema shows `name` and `count` params | Visible in Roo Code tool list after server restart |
---
## 9. Open Questions
| # | Question | Owner | Priority |
|---|----------|-------|----------|
| Q1 | Should `count=0` be an error, or silently return `[]` (empty list)? | Patrick | Low — assessment recommends error for clarity |
| Q2 | Max count cap: 10 or higher? 10 ≈ 150s max at 15s/image — feels right, but could be raised to 20 for batch profile image sets. | Patrick | Medium |
| Q3 | Should partial batch failure stop remaining iterations, or always complete all N? | Patrick | Medium — assessment recommends continue (partial success) |
| Q4 | Should `name` parameter also tag the TextContent output text, e.g. `[lumen_profile 1/3] Generated: ...`? | Patrick | Low |
+21
View File
@@ -0,0 +1,21 @@
[Unit]
Description=ComfyUI — Local AI Image Generation (AMD ROCm / FLUX.1-schnell)
Documentation=https://github.com/comfyanonymous/ComfyUI
After=network.target
[Service]
Type=simple
WorkingDirectory=%h/ComfyUI
ExecStart=%h/ComfyUI/.venv/bin/python main.py --listen --port 8188
Restart=on-failure
RestartSec=10
# AMD RX 7900 XTX ROCm GFX override — required for correct GPU detection
Environment=HSA_OVERRIDE_GFX_VERSION=11.0.0
# Redirect output — follow with: journalctl --user -u comfyui -f
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=default.target
+248 -49
View File
@@ -4,15 +4,23 @@ import asyncio
import base64
import copy
import json
import logging
import os
import random
import re
import subprocess
import time
from contextlib import asynccontextmanager
from datetime import datetime
from pathlib import Path
from typing import Annotated
import httpx
from fastmcp import FastMCP
from mcp.types import ImageContent, TextContent
from pydantic import Field
logger = logging.getLogger("mcp-image-gen")
# ---------------------------------------------------------------------------
# Configuration
@@ -22,10 +30,112 @@ COMFYUI_URL = os.environ.get("COMFYUI_URL", "http://localhost:8188").rstrip("/")
IMAGE_OUTPUT_DIR = os.environ.get("IMAGE_OUTPUT_DIR", "~/Pictures/mcp-generated")
COMFYUI_TIMEOUT = int(os.environ.get("COMFYUI_TIMEOUT", "120"))
# Directory where ComfyUI is installed (used for auto-start only)
# Override via COMFYUI_DIR env var. Systemd service sets this automatically.
COMFYUI_DIR = Path(
os.environ.get("COMFYUI_DIR", "~/ComfyUI")
).expanduser().resolve()
# Maximum number of images allowed in a single batch call
MAX_COUNT = 10
# Path to the bundled FLUX.1-schnell workflow template
_WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
mcp = FastMCP("mcp-image-gen")
# ---------------------------------------------------------------------------
# ComfyUI health check + auto-start
# ---------------------------------------------------------------------------
async def _ping_comfyui(url: str, timeout: float = 5.0) -> bool:
"""Return True if ComfyUI is reachable at *url*/system_stats."""
try:
async with httpx.AsyncClient(timeout=timeout) as client:
resp = await client.get(f"{url}/system_stats")
return resp.status_code == 200
except (httpx.ConnectError, httpx.TimeoutException, OSError):
return False
async def check_and_start_comfyui() -> None:
"""Ping ComfyUI; if not reachable, attempt to launch it as a subprocess.
Called once at server startup from the lifespan context manager.
Uses COMFYUI_DIR to locate the installation and its venv Python.
The HSA_OVERRIDE_GFX_VERSION=11.0.0 env var is injected automatically
for AMD ROCm / RX 7900 XTX compatibility.
"""
if await _ping_comfyui(COMFYUI_URL):
logger.info("ComfyUI is already running at %s", COMFYUI_URL)
return
logger.warning(
"ComfyUI not reachable at %s — attempting to start from %s",
COMFYUI_URL, COMFYUI_DIR,
)
python = COMFYUI_DIR / ".venv" / "bin" / "python"
main_py = COMFYUI_DIR / "main.py"
if not python.exists():
logger.error(
"ComfyUI venv Python not found at %s. "
"Install ComfyUI first (see docs/wiki/pages/mcp-image-gen-ComfyUI-Setup.md).",
python,
)
return
if not main_py.exists():
logger.error(
"ComfyUI main.py not found at %s — is COMFYUI_DIR correct?",
main_py,
)
return
# Build environment: inherit current env, set ROCm override for AMD RX 7900 XTX
env = os.environ.copy()
env.setdefault("HSA_OVERRIDE_GFX_VERSION", "11.0.0")
try:
proc = subprocess.Popen(
[str(python), str(main_py), "--listen", "--port", "8188"],
cwd=str(COMFYUI_DIR),
env=env,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True, # detach from MCP server process group
)
logger.info("ComfyUI launched (PID %d) — waiting for readiness…", proc.pid)
except OSError as exc:
logger.error("Failed to start ComfyUI subprocess: %s", exc)
return
# Wait up to 30 s for ComfyUI to become ready (polls every 2 s)
wait_limit = 30
for attempt in range(wait_limit // 2):
await asyncio.sleep(2)
if await _ping_comfyui(COMFYUI_URL):
logger.info(
"ComfyUI ready at %s after ~%ds ✓", COMFYUI_URL, (attempt + 1) * 2
)
return
logger.warning(
"ComfyUI did not respond within %ds. "
"Generation calls will fail until it is ready. "
"Check logs: journalctl --user -u comfyui -f",
wait_limit,
)
@asynccontextmanager
async def lifespan(app):
"""FastMCP lifespan: run ComfyUI health check at server startup."""
await check_and_start_comfyui()
yield # server is live here
# Nothing to tear down — ComfyUI is managed by systemd, not this process
mcp = FastMCP("mcp-image-gen", lifespan=lifespan)
# ---------------------------------------------------------------------------
@@ -126,46 +236,59 @@ def build_flux_workflow(
# ---------------------------------------------------------------------------
# Tools
# Helpers
# ---------------------------------------------------------------------------
@mcp.tool()
async def generate_image(
prompt: str,
width: int = 1024,
height: int = 1024,
steps: int = 4,
model: str = "flux1-schnell.safetensors",
seed: int = -1,
negative_prompt: str = "",
output_dir: str = "",
) -> list:
"""Generate an image from a text prompt using ComfyUI.
def _sanitize_name(name: str) -> str:
"""Sanitize a user-provided name for safe use in filenames.
Returns both a file path (for persistence) and an inline base64 image
(for display in Claude / Roo Code chat).
Replaces whitespace with underscores, strips any characters that are not
alphanumeric, underscores, or hyphens, and collapses consecutive
underscores/hyphens. Returns empty string if nothing usable remains.
"""
name = name.strip()
name = re.sub(r"\s+", "_", name) # spaces → underscores
name = re.sub(r"[^\w\-]", "", name) # strip non-alphanum/underscore/hyphen
name = re.sub(r"[_\-]{2,}", "_", name) # collapse runs
name = name.strip("_-") # trim leading/trailing separators
return name[:64] # cap at 64 chars
def _build_filename(name: str, timestamp: str, actual_seed: int) -> str:
"""Build an output filename from optional name, timestamp and seed."""
sanitized = _sanitize_name(name)
if sanitized:
return f"{sanitized}_{timestamp}_{actual_seed}.png"
return f"{timestamp}_{actual_seed}.png"
async def _generate_single(
client: ComfyUIClient,
prompt: str,
negative_prompt: str,
width: int,
height: int,
steps: int,
seed: int,
model: str,
resolved_output_dir: Path,
name: str,
label: str,
) -> list:
"""Generate a single image and return [TextContent, ImageContent] or [TextContent] on error.
Args:
prompt: Text description of the image to generate.
width: Image width in pixels (default: 1024).
height: Image height in pixels (default: 1024).
steps: Number of inference steps. FLUX.1-schnell works well at 4.
model: ComfyUI model filename (default: flux1-schnell.safetensors).
seed: Random seed for reproducibility. -1 = random.
negative_prompt: Things to exclude from the image (optional).
output_dir: Override output directory. Defaults to IMAGE_OUTPUT_DIR env var
or ~/Pictures/mcp-generated.
Returns:
[TextContent(path + metadata), ImageContent(base64 PNG)]
client: ComfyUIClient instance.
prompt: Positive text prompt.
negative_prompt: Negative text prompt.
width / height: Image dimensions.
steps: Inference steps.
seed: Seed value (-1 = random).
model: ComfyUI model filename.
resolved_output_dir: Resolved output directory Path.
name: User-supplied name prefix (unsanitized).
label: Human-readable label for TextContent prefix (e.g. "[lumen 1/3]").
"""
# Resolve output directory
resolved_output_dir = Path(
output_dir or IMAGE_OUTPUT_DIR
).expanduser().resolve()
client = ComfyUIClient(COMFYUI_URL)
# Build and submit workflow
try:
workflow = build_flux_workflow(
@@ -178,14 +301,13 @@ async def generate_image(
model=model,
)
actual_seed = workflow["_meta"]["actual_seed"]
prompt_id = await client.queue_prompt(workflow)
except httpx.ConnectError:
return [
TextContent(
type="text",
text=(
f"ComfyUI not reachable at {COMFYUI_URL}. "
f"{label} ComfyUI not reachable at {COMFYUI_URL}. "
"Start it with: python main.py --listen"
),
)
@@ -194,7 +316,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"ComfyUI returned an error: {e.response.status_code}{e.response.text}",
text=f"{label} ComfyUI returned an error: {e.response.status_code}{e.response.text}",
)
]
@@ -207,7 +329,7 @@ async def generate_image(
TextContent(
type="text",
text=(
f"Generation timed out after {COMFYUI_TIMEOUT}s. "
f"{label} Generation timed out after {COMFYUI_TIMEOUT}s. "
f"prompt_id={prompt_id} — use get_generation_status to check"
),
)
@@ -236,7 +358,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"Failed to retrieve generation history: {e}",
text=f"{label} Failed to retrieve generation history: {e}",
)
]
@@ -255,7 +377,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"No output image found in history for prompt_id={prompt_id}",
text=f"{label} No output image found in history for prompt_id={prompt_id}",
)
]
@@ -270,7 +392,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"Failed to download generated image: {e}",
text=f"{label} Failed to download generated image: {e}",
)
]
@@ -278,14 +400,14 @@ async def generate_image(
try:
resolved_output_dir.mkdir(parents=True, exist_ok=True)
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"{timestamp}_{actual_seed}.png"
filename = _build_filename(name, timestamp, actual_seed)
out_path = resolved_output_dir / filename
out_path.write_bytes(image_bytes)
except OSError as e:
return [
TextContent(
type="text",
text=f"Cannot write to output directory: {resolved_output_dir}{e}",
text=f"{label} Cannot write to output directory: {resolved_output_dir}{e}",
)
]
@@ -296,7 +418,7 @@ async def generate_image(
TextContent(
type="text",
text=(
f"Generated: {out_path}\n"
f"{label} Generated: {out_path}\n"
f"Seed: {actual_seed}\n"
f"Elapsed: {elapsed:.1f}s\n"
f"Size: {width}x{height}, Steps: {steps}, Model: {model}"
@@ -310,6 +432,84 @@ async def generate_image(
]
# ---------------------------------------------------------------------------
# Tools
# ---------------------------------------------------------------------------
@mcp.tool()
async def generate_image(
prompt: Annotated[str, Field(description="Text description of the image to generate.")],
width: Annotated[int, Field(description="Image width in pixels (default: 1024).")] = 1024,
height: Annotated[int, Field(description="Image height in pixels (default: 1024).")] = 1024,
steps: Annotated[int, Field(description="Number of inference steps. FLUX.1-schnell works well at 4.")] = 4,
model: Annotated[str, Field(description="ComfyUI model filename (default: flux1-schnell.safetensors).")] = "flux1-schnell.safetensors",
seed: Annotated[int, Field(description="Random seed for reproducibility. -1 = random. When count > 1 and seed != -1, seeds are incremented per image (seed, seed+1, seed+2, ...) to produce deterministic variation.")] = -1,
negative_prompt: Annotated[str, Field(description="Things to exclude from the image (optional).")] = "",
output_dir: Annotated[str, Field(description="Override output directory. Defaults to IMAGE_OUTPUT_DIR env var or ~/Pictures/mcp-generated.")] = "",
name: Annotated[str, Field(description="Optional filename prefix. Saved as {name}_{timestamp}_{seed}.png. Useful to avoid confusion with auto-generated timestamp filenames.")] = "",
count: Annotated[int, Field(description="Number of images to generate (110). Each image is generated sequentially. Partial failures are returned inline — the batch continues even if one image fails.")] = 1,
) -> list:
"""Generate an image from a text prompt using ComfyUI.
Returns both a file path (for persistence) and an inline base64 image
(for display in Claude / Roo Code chat).
Returns:
Flat interleaved list: [TextContent1, ImageContent1, TextContent2, ImageContent2, ...]
On error for any single image, that slot contains only [TextContent(error)].
"""
# Validate count
if count < 1:
return [
TextContent(
type="text",
text=f"count must be at least 1 (got {count}).",
)
]
if count > MAX_COUNT:
return [
TextContent(
type="text",
text=f"count must be at most {MAX_COUNT} (got {count}). Use multiple calls for larger batches.",
)
]
# Resolve output directory once
resolved_output_dir = Path(
output_dir or IMAGE_OUTPUT_DIR
).expanduser().resolve()
client = ComfyUIClient(COMFYUI_URL)
results = []
for i in range(1, count + 1):
# Compute seed for this image:
# - seed=-1 → each image gets an independent random seed
# - fixed seed → increment by i-1 for deterministic variation across the batch
image_seed = seed if seed == -1 else seed + (i - 1)
label = f"[{_sanitize_name(name) or 'image'} {i}/{count}]" if count > 1 else (
f"[{_sanitize_name(name)}]" if _sanitize_name(name) else ""
)
single_result = await _generate_single(
client=client,
prompt=prompt,
negative_prompt=negative_prompt,
width=width,
height=height,
steps=steps,
seed=image_seed,
model=model,
resolved_output_dir=resolved_output_dir,
name=name,
label=label,
)
results.extend(single_result)
return results
@mcp.tool()
async def list_available_models() -> list[str]:
"""List all checkpoint models available in ComfyUI.
@@ -330,12 +530,11 @@ async def list_available_models() -> list[str]:
@mcp.tool()
async def get_generation_status(prompt_id: str) -> dict:
async def get_generation_status(
prompt_id: Annotated[str, Field(description="The prompt ID returned by a previous generate_image call.")],
) -> dict:
"""Check the status of a queued or running generation job.
Args:
prompt_id: The prompt ID returned by a previous generate_image call.
Returns:
Dict with 'status' key: "pending", "running", "completed", or "not_found".
"""
+319 -1
View File
@@ -14,6 +14,8 @@ import respx
import server
from server import (
ComfyUIClient,
_build_filename,
_sanitize_name,
build_flux_workflow,
generate_image,
get_generation_status,
@@ -100,6 +102,74 @@ def test_random_seed_generated():
assert "_meta" in wf2
# ---------------------------------------------------------------------------
# _sanitize_name — pure function
# ---------------------------------------------------------------------------
def test_sanitize_name_basic():
"""Simple alphanumeric name passes through unchanged."""
assert _sanitize_name("lumen_profile") == "lumen_profile"
def test_sanitize_name_spaces_to_underscores():
"""Spaces are converted to underscores."""
assert _sanitize_name("my cool image") == "my_cool_image"
def test_sanitize_name_special_chars_stripped():
"""Special characters (!, @, #, etc.) are stripped."""
result = _sanitize_name("hello! world@2024#")
assert "!" not in result
assert "@" not in result
assert "#" not in result
assert "hello" in result
assert "world" in result
def test_sanitize_name_empty_returns_empty():
"""Empty string or whitespace-only returns empty string."""
assert _sanitize_name("") == ""
assert _sanitize_name(" ") == ""
def test_sanitize_name_collapse_underscores():
"""Multiple consecutive underscores/hyphens are collapsed to one."""
result = _sanitize_name("lumen__profile")
assert "__" not in result
def test_sanitize_name_truncates_at_64():
"""Names longer than 64 chars are truncated."""
long_name = "a" * 100
result = _sanitize_name(long_name)
assert len(result) <= 64
# ---------------------------------------------------------------------------
# _build_filename — pure function
# ---------------------------------------------------------------------------
def test_build_filename_with_name():
"""When name is provided, filename includes it as prefix."""
filename = _build_filename("lumen", "20260406_120000", 12345)
assert filename == "lumen_20260406_120000_12345.png"
def test_build_filename_without_name():
"""When name is empty, filename is timestamp_seed.png."""
filename = _build_filename("", "20260406_120000", 12345)
assert filename == "20260406_120000_12345.png"
assert not filename.startswith("_")
def test_build_filename_sanitizes_name():
"""Name with spaces and special chars is sanitized before use in filename."""
filename = _build_filename("my image!", "20260406_120000", 99)
assert "!" not in filename
assert "my_image" in filename
assert filename.endswith("_20260406_120000_99.png")
# ---------------------------------------------------------------------------
# list_available_models
# ---------------------------------------------------------------------------
@@ -211,7 +281,255 @@ def test_get_output_directory_custom(monkeypatch, tmp_path):
# ---------------------------------------------------------------------------
# generate_image
# generate_image — count/name validation
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_generate_image_count_zero_returns_error():
"""count=0 → returns error TextContent without calling ComfyUI."""
result = await generate_image(prompt="a cat", count=0)
assert len(result) == 1
assert "count must be at least 1" in result[0].text
@pytest.mark.asyncio
async def test_generate_image_count_exceeds_max_returns_error():
"""count=11 (> MAX_COUNT=10) → returns error TextContent without calling ComfyUI."""
result = await generate_image(prompt="a cat", count=11)
assert len(result) == 1
assert "at most 10" in result[0].text
# ---------------------------------------------------------------------------
# generate_image — name parameter
# ---------------------------------------------------------------------------
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_with_name(
tmp_path, sample_image_bytes, mock_history_response, queue_empty, monkeypatch
):
"""name param → saved file has name as prefix."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(
return_value=httpx.Response(200, json={"prompt_id": "name-test-uuid"})
)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
mock_history_named = {
"name-test-uuid": {
"outputs": {
"9": {
"images": [
{
"filename": "mcp-image-gen_00001_.png",
"subfolder": "",
"type": "output",
}
]
}
},
"status": {"completed": True},
}
}
respx.get(f"{COMFYUI_BASE}/api/history/name-test-uuid").mock(
return_value=httpx.Response(200, json=mock_history_named)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
result = await generate_image(
prompt="lumen portrait",
name="lumen_profile",
output_dir=str(tmp_path),
)
assert len(result) == 2
saved_files = list(tmp_path.glob("lumen_profile_*.png"))
assert len(saved_files) == 1, f"Expected 1 file with 'lumen_profile_' prefix, got: {list(tmp_path.glob('*.png'))}"
# Path in TextContent also has the name prefix
assert "lumen_profile_" in result[0].text
# ---------------------------------------------------------------------------
# generate_image — count=2 batch
# ---------------------------------------------------------------------------
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_count_2(
tmp_path, sample_image_bytes, queue_empty, monkeypatch
):
"""count=2 → returns 4 content items (Text+Image per image), 2 files saved."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
# First image
mock_history_1 = {
"uuid-batch-1": {
"outputs": {"9": {"images": [{"filename": "img1.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
# Second image
mock_history_2 = {
"uuid-batch-2": {
"outputs": {"9": {"images": [{"filename": "img2.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(
side_effect=[
httpx.Response(200, json={"prompt_id": "uuid-batch-1"}),
httpx.Response(200, json={"prompt_id": "uuid-batch-2"}),
]
)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
respx.get(f"{COMFYUI_BASE}/api/history/uuid-batch-1").mock(
return_value=httpx.Response(200, json=mock_history_1)
)
respx.get(f"{COMFYUI_BASE}/api/history/uuid-batch-2").mock(
return_value=httpx.Response(200, json=mock_history_2)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
result = await generate_image(
prompt="a landscape",
count=2,
seed=100,
output_dir=str(tmp_path),
)
# 4 content items: [Text1, Image1, Text2, Image2]
assert len(result) == 4
assert result[0].type == "text"
assert result[1].type == "image"
assert result[2].type == "text"
assert result[3].type == "image"
# 2 files saved
saved_files = list(tmp_path.glob("*.png"))
assert len(saved_files) == 2
# Label contains batch index
assert "1/2" in result[0].text
assert "2/2" in result[2].text
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_count_2_fixed_seed_increments(
tmp_path, sample_image_bytes, queue_empty, monkeypatch
):
"""count=2 with fixed seed → seeds are incremented (seed, seed+1)."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
submitted_seeds = []
def capture_prompt(request):
body = json.loads(request.content)
seed_val = body["prompt"]["13"]["inputs"]["seed"]
submitted_seeds.append(seed_val)
idx = len(submitted_seeds)
return httpx.Response(200, json={"prompt_id": f"seed-test-{idx}"})
mock_history_1 = {
"seed-test-1": {
"outputs": {"9": {"images": [{"filename": "img1.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
mock_history_2 = {
"seed-test-2": {
"outputs": {"9": {"images": [{"filename": "img2.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(side_effect=capture_prompt)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
respx.get(f"{COMFYUI_BASE}/api/history/seed-test-1").mock(
return_value=httpx.Response(200, json=mock_history_1)
)
respx.get(f"{COMFYUI_BASE}/api/history/seed-test-2").mock(
return_value=httpx.Response(200, json=mock_history_2)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
await generate_image(
prompt="a test",
count=2,
seed=42,
output_dir=str(tmp_path),
)
assert submitted_seeds == [42, 43], f"Expected [42, 43], got {submitted_seeds}"
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_count_partial_failure_continues(
tmp_path, sample_image_bytes, queue_empty, monkeypatch
):
"""count=2 where first image fails → error in slot 1, second image succeeds in slot 2."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
mock_history_2 = {
"uuid-ok": {
"outputs": {"9": {"images": [{"filename": "img2.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(
side_effect=[
httpx.Response(500, json={"error": "GPU OOM"}), # first fails
httpx.Response(200, json={"prompt_id": "uuid-ok"}), # second succeeds
]
)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
respx.get(f"{COMFYUI_BASE}/api/history/uuid-ok").mock(
return_value=httpx.Response(200, json=mock_history_2)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
result = await generate_image(
prompt="a test",
count=2,
seed=10,
output_dir=str(tmp_path),
)
# First: error TextContent only (no ImageContent)
# Second: [TextContent, ImageContent]
assert len(result) == 3
assert result[0].type == "text"
assert "500" in result[0].text or "error" in result[0].text.lower()
assert result[1].type == "text"
assert result[2].type == "image"
# Only 1 file saved (the successful one)
saved_files = list(tmp_path.glob("*.png"))
assert len(saved_files) == 1
# ---------------------------------------------------------------------------
# generate_image — existing tests (kept intact)
# ---------------------------------------------------------------------------
@respx.mock