Patrick Plate ea0c5d39c4 fix(mcp-image-gen): fix Heretic/FLUX2 integration bugs
- Fix syntax error in server.py (dangling docstring lines)
- Correct model filename: flux-2-klein-4b.safetensors (without -fp8)
- Fix _WORKFLOW_REGISTRY key to match actual downloaded filename
- Update get_models() to always include registry models as fallback
- Fix test expectations to match corrected model names
- All 37 tests passing
2026-04-10 19:21:51 +02:00


mcp-image-gen

FastMCP server for AI image generation via ComfyUI.

This MCP server wraps a locally running ComfyUI instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client.

New: Support for FLUX.2 Klein 4B with a Heretic-abliterated Qwen3-4B text encoder (zero KL divergence, no refusals). Select it with model="flux-2-klein-4b.safetensors" (note: no -fp8 suffix).

It supports FLUX.1-schnell (default), FLUX.2 Klein (Heretic), and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk and returned as inline base64 so Claude can display them directly in chat.


Prerequisites

  1. ComfyUI installed and running at http://localhost:8188
  2. At least one checkpoint model downloaded (see ComfyUI Setup below)
  3. Python 3.11+ and uv installed on the system

Installation

cd mcp/mcp-image-gen
uv sync

Configuration

All configuration is via environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| COMFYUI_URL | http://localhost:8188 | Base URL of the running ComfyUI instance |
| IMAGE_OUTPUT_DIR | ~/Pictures/mcp-generated | Directory where generated PNG files are saved |
| COMFYUI_TIMEOUT | 120 | Maximum time in seconds to wait for a generation before timing out |
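
As a minimal sketch of how these three variables could be read with their documented defaults: the `Settings` class and `load_settings` function below are hypothetical names chosen for illustration and are not taken from `src/server.py`.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""

    comfyui_url: str
    output_dir: Path
    timeout_s: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        # Strip a trailing slash so endpoint paths can be appended safely.
        comfyui_url=env.get("COMFYUI_URL", "http://localhost:8188").rstrip("/"),
        # Expand "~" so the default resolves to the user's home directory.
        output_dir=Path(env.get("IMAGE_OUTPUT_DIR", "~/Pictures/mcp-generated")).expanduser(),
        timeout_s=float(env.get("COMFYUI_TIMEOUT", "120")),
    )
```

Passing the environment as a mapping keeps the function easy to exercise in tests, for example `load_settings({"COMFYUI_TIMEOUT": "30"})`.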

Usage

Add to .roo/mcp.json (Roo Code)

"mcp-image-gen": {
  "command": "uv",
  "args": [
    "--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
    "run", "src/server.py"
  ],
  "env": {
    "COMFYUI_URL": "http://localhost:8188",
    "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
  }
}

Add to Claude Desktop (claude_desktop_config.json)

{
  "mcpServers": {
    "mcp-image-gen": {
      "command": "uv",
      "args": [
        "--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
        "run", "src/server.py"
      ],
      "env": {
        "COMFYUI_URL": "http://localhost:8188",
        "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
      }
    }
  }
}

Run directly

cd mcp/mcp-image-gen
./run.sh

Available Tools

| Tool | Description |
| --- | --- |
| generate_image | Generate an image from a text prompt. Returns the file path plus an inline base64 PNG. |
| list_available_models | List all checkpoint models loaded in ComfyUI. |
| get_generation_status | Check the status of a running or queued generation by prompt_id. |
| get_output_directory | Return the current output directory path. |
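
One way a status lookup by prompt_id could work is to compare the ID against ComfyUI's queue and history responses. The sketch below is an assumption about the approach, not the server's actual code: `classify_status` is a hypothetical helper, and it assumes that each queue entry is a list whose second element is the prompt ID and that finished prompts appear as keys in the history response.

```python
from typing import Any, Mapping


def classify_status(
    prompt_id: str,
    queue: Mapping[str, Any],
    history: Mapping[str, Any],
) -> str:
    """Map already-fetched /queue and /history JSON to a coarse status string."""
    # Finished prompts are assumed to be keyed by prompt_id in the history.
    if prompt_id in history:
        return "completed"
    # Queue entries are assumed to look like [number, prompt_id, ...].
    if any(item[1] == prompt_id for item in queue.get("queue_running", [])):
        return "running"
    if any(item[1] == prompt_id for item in queue.get("queue_pending", [])):
        return "queued"
    return "unknown"
```

The function takes parsed JSON rather than performing HTTP requests itself, which keeps it independent of whichever HTTP client the server uses.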

generate_image parameters

| Parameter | Default | Description |
| --- | --- | --- |
| prompt | (required) | Text description of the image |
| width | 1024 | Image width in pixels |
| height | 1024 | Image height in pixels |
| steps | 4 | Inference steps (4 is the recommended value for FLUX.1-schnell) |
| model | flux1-schnell.safetensors | Checkpoint model filename |
| seed | -1 | Seed for reproducibility (-1 = random) |
| negative_prompt | "" | Things to exclude from the image |
| output_dir | (IMAGE_OUTPUT_DIR) | Override the output directory |
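
The defaults above, including the "-1 means random" seed convention, might be applied along these lines. This is an illustrative sketch only: `build_request`, `DEFAULTS`, and the 32-bit `MAX_SEED` bound are assumptions made for the example, and `output_dir` is left out for brevity.

```python
import random
from typing import Any, Dict, Optional

MAX_SEED = 2**32 - 1  # assumed bound for this sketch, not a documented limit

# Defaults copied from the parameter table above.
DEFAULTS: Dict[str, Any] = {
    "width": 1024,
    "height": 1024,
    "steps": 4,
    "model": "flux1-schnell.safetensors",
    "seed": -1,
    "negative_prompt": "",
}


def build_request(prompt: str, rng: Optional[random.Random] = None, **overrides: Any) -> Dict[str, Any]:
    """Merge caller overrides onto the defaults and resolve a random seed."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides, "prompt": prompt}
    if params["seed"] == -1:
        # Replace the sentinel with a concrete seed so the run can be repeated.
        generator = rng if rng is not None else random.Random()
        params["seed"] = generator.randint(0, MAX_SEED)
    return params
```

Returning the resolved seed alongside the other parameters lets a caller reproduce a result it liked by passing that seed back in.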

ComfyUI Setup (Fedora + AMD ROCm)

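# Install ComfyUI from source (the main.py started below lives in this checkout)
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt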

# Download FLUX.1-schnell model (~8GB, Apache 2.0)
# Place in: ComfyUI/models/checkpoints/flux1-schnell.safetensors
# Source: https://huggingface.co/black-forest-labs/FLUX.1-schnell

# Start ComfyUI with ROCm support for AMD RX 7900 XTX
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen

# Verify the API is reachable
curl http://localhost:8188/system_stats

Note: HSA_OVERRIDE_GFX_VERSION=11.0.0 may be needed for the RX 7900 XTX (gfx1100) to be recognized correctly by ROCm libraries.
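
The same reachability check can be done from Python. The following is a minimal sketch using only the standard library; `comfyui_reachable` and `fetch_json` are hypothetical helper names, and the only assumption about the response is that /system_stats returns a JSON object.

```python
import json
import urllib.request
from typing import Any, Callable


def fetch_json(url: str, timeout: float = 5.0) -> Any:
    """GET a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def comfyui_reachable(
    base_url: str = "http://localhost:8188",
    fetch: Callable[[str], Any] = fetch_json,
) -> bool:
    """Return True if /system_stats answers with a JSON object."""
    try:
        stats = fetch(f"{base_url.rstrip('/')}/system_stats")
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers bad JSON.
        return False
    return isinstance(stats, dict)
```

The `fetch` parameter exists so the check can be tried without a running ComfyUI instance by passing a stub function.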

PyTorch with ROCm (if needed separately)

pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.1

Testing

cd mcp/mcp-image-gen
uv run pytest tests/ -v

All tests mock the ComfyUI HTTP API, so no running ComfyUI instance is needed.
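
To illustrate the mocking approach, here is a self-contained sketch of a test that never touches the network. It is not taken from the repository's tests/ directory: `queue_prompt` is a hypothetical helper, and the standard library's urllib is used so the example runs without extra dependencies, whereas the real server may use a different HTTP client.

```python
import io
import json
import urllib.request
from unittest import mock


def queue_prompt(workflow: dict, base_url: str = "http://localhost:8188") -> str:
    """Hypothetical helper: POST a workflow to ComfyUI's /prompt endpoint."""
    body = json.dumps({"prompt": workflow}).encode()
    req = urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["prompt_id"]


def test_queue_prompt_returns_id() -> None:
    # BytesIO stands in for the HTTP response; it supports "with" and read().
    fake = io.BytesIO(json.dumps({"prompt_id": "abc-123"}).encode())
    with mock.patch("urllib.request.urlopen", return_value=fake) as urlopen:
        assert queue_prompt({"1": {"class_type": "KSampler"}}) == "abc-123"
    # The request object passed to urlopen shows which endpoint was targeted.
    sent = urlopen.call_args.args[0]
    assert sent.full_url == "http://localhost:8188/prompt"
```

Saved in a file under tests/, a function like this would be collected by the pytest command shown above.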


Ollama Migration Path

When Ollama adds Linux image generation support (announced as "coming soon" as of April 2026, currently macOS-only), this server is intended to switch backends via a single environment variable:

IMAGE_BACKEND=ollama  # currently only "comfyui" is implemented

The tool signatures, return types, and MCP interface will remain unchanged. Only the underlying HTTP calls will switch from ComfyUI to Ollama's /api/generate endpoint.
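
A backend switch of this kind could be structured as a small registry keyed by the IMAGE_BACKEND value. The sketch below is an assumption about the design: `ImageBackend`, `ComfyUIBackend`, and `make_backend` are hypothetical names, and the actual HTTP calls are omitted.

```python
import os
from typing import Callable, Dict, Mapping, Protocol


class ImageBackend(Protocol):
    """Interface every backend would share, so the MCP tools stay unchanged."""

    def generate(self, prompt: str) -> bytes: ...


class ComfyUIBackend:
    """Placeholder for the existing ComfyUI code path."""

    def generate(self, prompt: str) -> bytes:
        raise NotImplementedError("HTTP calls are omitted in this sketch")


# Only "comfyui" is registered; an Ollama backend would be added here later.
_BACKENDS: Dict[str, Callable[[], ImageBackend]] = {"comfyui": ComfyUIBackend}


def make_backend(env: Mapping[str, str] = os.environ) -> ImageBackend:
    """Instantiate the backend named by IMAGE_BACKEND (default: comfyui)."""
    name = env.get("IMAGE_BACKEND", "comfyui").strip().lower()
    try:
        factory = _BACKENDS[name]
    except KeyError:
        raise NotImplementedError(f"image backend {name!r} is not implemented") from None
    return factory()
```

With this shape, requesting `IMAGE_BACKEND=ollama` today fails loudly instead of silently falling back to ComfyUI.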


Architecture

Roo Code / Claude Desktop
        │
        │ MCP (stdio)
        ▼
  mcp-image-gen (FastMCP)
        │
        │ HTTP REST
        ▼
  ComfyUI @ localhost:8188
        │
        │ ROCm / AMD GPU
        ▼
  FLUX.1-schnell / SDXL / SD3.5

The server submits a ComfyUI API-format workflow (FLUX.1-schnell by default), polls until generation completes, downloads the PNG, saves it to disk, and returns both a text summary and a base64-encoded inline image.
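
The poll, download, save, and encode steps can be sketched as follows. This is not the code in `src/server.py`: `wait_and_collect` and its injected `get_json` / `get_bytes` callables are hypothetical, and the sketch assumes ComfyUI's /history/{prompt_id} and /view endpoints behave as commented.

```python
import base64
import time
import urllib.parse
from pathlib import Path
from typing import Any, Callable, Tuple


def wait_and_collect(
    prompt_id: str,
    base_url: str,
    out_dir: Path,
    get_json: Callable[[str], Any],
    get_bytes: Callable[[str], bytes],
    timeout_s: float = 120.0,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Path, str]:
    """Poll until the prompt finishes, then save and base64-encode the first image."""
    deadline = time.monotonic() + timeout_s
    while True:
        # Assumed: the history stays empty until the prompt has finished.
        history = get_json(f"{base_url}/history/{prompt_id}")
        if prompt_id in history:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prompt {prompt_id} did not finish in {timeout_s}s")
        sleep(poll_s)

    for node in history[prompt_id]["outputs"].values():
        for image in node.get("images", []):
            query = urllib.parse.urlencode({
                "filename": image["filename"],
                "subfolder": image.get("subfolder", ""),
                "type": image.get("type", "output"),
            })
            png = get_bytes(f"{base_url}/view?{query}")
            out_dir.mkdir(parents=True, exist_ok=True)
            # Keep only the final path component so the file stays inside out_dir.
            path = out_dir / Path(image["filename"]).name
            path.write_bytes(png)
            return path, base64.b64encode(png).decode("ascii")
    raise RuntimeError("generation finished without any image output")
```

Injecting the HTTP functions keeps the control flow testable with stubs, in the same spirit as the mocked test suite.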