Patrick Plate 8112ff2f12 feat(mcp-image-gen): scaffold ComfyUI-backed image generation MCP server

- FastMCP server with 4 tools: generate_image, list_available_models,
  get_generation_status, get_output_directory
- ComfyUI REST API client (httpx) polling lifecycle
- FLUX.1-schnell workflow JSON template
- Dual output: TextContent (path + seed) + ImageContent (base64 PNG)
- 14 passing pytest tests with respx HTTP mocking
- ROCm/AMD RX 7900 XTX optimized setup in README
- Ollama Linux migration path documented (future)

2026-04-04 11:49:31 +02:00

4.6 KiB


mcp-image-gen

FastMCP server for AI image generation via ComfyUI.

This MCP server wraps a locally running ComfyUI instance and exposes image generation as MCP tools callable from Roo Code, Claude Desktop, or any other MCP-compatible client. It supports FLUX.1-schnell, FLUX.1-dev, SDXL, and other ComfyUI-compatible checkpoint models. Generated images are saved to disk and also returned as inline base64 PNG data, so a client such as Claude Desktop can display them directly in chat.

Prerequisites

ComfyUI installed and running at http://localhost:8188
At least one checkpoint model downloaded (see ComfyUI Setup below)
Python 3.11+ and uv installed on the system

Installation

cd mcp/mcp-image-gen
uv sync

Configuration

All configuration is via environment variables:

Variable	Default	Description
`COMFYUI_URL`	`http://localhost:8188`	Base URL of the running ComfyUI instance
`IMAGE_OUTPUT_DIR`	`~/Pictures/mcp-generated`	Directory where generated PNG files are saved
`COMFYUI_TIMEOUT`	`120`	Maximum time, in seconds, to wait for a generation before timing out
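
A minimal sketch of how these three variables might be read, using the defaults from the table. The `Settings` class and `load_settings` function are hypothetical names chosen for illustration and are not taken from the server's source; trimming the trailing slash and expanding `~` are assumptions about sensible handling, not documented behaviour.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""
    comfyui_url: str
    output_dir: Path
    timeout: float


def load_settings(env=None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Drop a trailing slash so paths such as "/prompt" can be appended safely.
        comfyui_url=env.get("COMFYUI_URL", "http://localhost:8188").rstrip("/"),
        # Expand "~" because the default is given relative to the home directory.
        output_dir=Path(env.get("IMAGE_OUTPUT_DIR", "~/Pictures/mcp-generated")).expanduser(),
        # The timeout arrives as a string and is converted to seconds.
        timeout=float(env.get("COMFYUI_TIMEOUT", "120")),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the function easy to exercise in tests, for example `load_settings({"COMFYUI_TIMEOUT": "30"})`.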

Usage

Add to `.roo/mcp.json` (Roo Code)

{
  "mcpServers": {
    "mcp-image-gen": {
      "command": "uv",
      "args": [
        "--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
        "run", "src/server.py"
      ],
      "env": {
        "COMFYUI_URL": "http://localhost:8188",
        "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
      }
    }
  }
}

Add to Claude Desktop (`claude_desktop_config.json`)

{
  "mcpServers": {
    "mcp-image-gen": {
      "command": "uv",
      "args": [
        "--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
        "run", "src/server.py"
      ],
      "env": {
        "COMFYUI_URL": "http://localhost:8188",
        "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
      }
    }
  }
}

Run directly

cd mcp/mcp-image-gen
./run.sh

Available Tools

Tool	Description
`generate_image`	Generate an image from a text prompt. Returns file path + inline base64 PNG.
`list_available_models`	List all checkpoint models loaded in ComfyUI.
`get_generation_status`	Check status of a running/queued generation by `prompt_id`.
`get_output_directory`	Return the current output directory path.
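
To illustrate the "file path + inline base64 PNG" result of `generate_image`, here is a rough sketch that builds the two content blocks as plain dicts in the shape MCP uses on the wire (`type`, `text`, `data`, `mimeType`). The function name `build_tool_result` and the wording of the text message are hypothetical; the actual server presumably returns the SDK's `TextContent` and `ImageContent` objects instead of dicts.

```python
import base64


def build_tool_result(image_path: str, seed: int, png_bytes: bytes) -> list:
    """Return a text block and an image block in MCP content-block shape."""
    return [
        # Human-readable summary: where the file was saved and which seed was used.
        {"type": "text", "text": f"Saved image to {image_path} (seed {seed})"},
        {
            "type": "image",
            # Binary PNG data must be base64-encoded to travel inside JSON.
            "data": base64.b64encode(png_bytes).decode("ascii"),
            "mimeType": "image/png",
        },
    ]
```

Returning both blocks lets clients that cannot render images still show the saved path, while clients that can render them get the picture inline.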

`generate_image` parameters

Parameter	Default	Description
`prompt`	(required)	Text description of the image
`width`	`1024`	Image width in pixels
`height`	`1024`	Image height in pixels
`steps`	`4`	Inference steps (FLUX.1-schnell: 4 is optimal)
`model`	`flux1-schnell.safetensors`	Checkpoint model filename
`seed`	`-1`	Seed for reproducibility (`-1` = random)
`negative_prompt`	`""`	Things to exclude from the image
`output_dir`	(value of `IMAGE_OUTPUT_DIR`)	Override the output directory for this call
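
The `seed` default of `-1` is a sentinel meaning "pick a random seed". One way this could be handled is sketched below; `resolve_seed` and `MAX_SEED` are hypothetical names, and the upper bound is an assumption chosen for illustration, not a value taken from the server or from ComfyUI.

```python
import random

MAX_SEED = 2**32 - 1  # assumed upper bound, for illustration only


def resolve_seed(seed: int = -1, rng=None) -> int:
    """Map the documented sentinel -1 to a concrete random seed."""
    if seed == -1:
        if rng is None:
            rng = random.Random()
        return rng.randint(0, MAX_SEED)
    if seed < 0:
        raise ValueError("seed must be -1 (random) or a non-negative integer")
    # Any explicit non-negative seed is passed through unchanged.
    return seed
```

Resolving the sentinel before submitting the workflow means the concrete seed can be reported back to the caller, who can then pass it explicitly to reproduce the same image.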

ComfyUI Setup (Fedora + AMD ROCm)

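# Install ComfyUI from source (this provides the main.py used below)
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt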

# Download FLUX.1-schnell model (Apache 2.0; the official bf16 file is roughly 24 GB)
# Place in: ComfyUI/models/checkpoints/flux1-schnell.safetensors
# Source: https://huggingface.co/black-forest-labs/FLUX.1-schnell

# Start ComfyUI with ROCm support for AMD RX 7900 XTX
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen

# Verify the API is reachable
curl http://localhost:8188/system_stats

Note: HSA_OVERRIDE_GFX_VERSION=11.0.0 may be needed for the RX 7900 XTX (gfx1100) to be recognized correctly by ROCm libraries.

PyTorch with ROCm (if needed separately)

pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.1

Testing

cd mcp/mcp-image-gen
uv run pytest tests/ -v

All tests mock the ComfyUI HTTP API with respx, so no running ComfyUI instance is needed.

Ollama Migration Path

Once Ollama adds image generation support on Linux (announced as "coming soon" and macOS-only as of April 2026), this server is intended to switch backends via a single environment variable:

IMAGE_BACKEND=ollama  # currently only "comfyui" is implemented

The tool signatures, return types, and MCP interface will remain unchanged; only the underlying HTTP calls will switch from ComfyUI to Ollama's /api/generate endpoint.
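
A small sketch of how the `IMAGE_BACKEND` switch could be validated at startup. The function name `select_backend`, the two set constants, and the choice of `comfyui` as the default when the variable is unset are all assumptions for illustration; the README only states that `comfyui` is the sole implemented backend.

```python
import os

SUPPORTED_BACKENDS = {"comfyui"}  # the only backend reported as implemented
PLANNED_BACKENDS = {"ollama"}     # documented as a future option


def select_backend(env=None) -> str:
    """Return the backend name, failing loudly for planned or unknown values."""
    env = os.environ if env is None else env
    # Assumed default: fall back to ComfyUI when the variable is unset.
    name = env.get("IMAGE_BACKEND", "comfyui").strip().lower()
    if name in SUPPORTED_BACKENDS:
        return name
    if name in PLANNED_BACKENDS:
        raise NotImplementedError(f"backend {name!r} is planned but not implemented yet")
    raise ValueError(f"unknown IMAGE_BACKEND: {name!r}")
```

Distinguishing "planned" from "unknown" gives a clearer error to someone who sets `IMAGE_BACKEND=ollama` before that backend exists.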

Architecture

Roo Code / Claude Desktop
        │
        │ MCP (stdio)
        ▼
  mcp-image-gen (FastMCP)
        │
        │ HTTP REST
        ▼
  ComfyUI @ localhost:8188
        │
        │ ROCm / AMD GPU
        ▼
  FLUX.1-schnell / SDXL / SD3.5

The server submits a FLUX.1-schnell workflow in ComfyUI's API format, polls until the generation completes, downloads the resulting PNG, saves it to disk, and returns both a text summary (path and seed) and a base64-encoded inline image.
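
The submit-and-poll part of that lifecycle can be sketched roughly as follows. This is not the server's code: the real client uses httpx, whereas this sketch uses the standard library's `urllib` to stay dependency-free, and `http_json` and `run_workflow` are hypothetical names. It relies on ComfyUI's `POST /prompt` and `GET /history/{prompt_id}` routes, and assumes that a prompt appears in the history response once it has finished.

```python
import json
import time
import urllib.request


def http_json(url, payload=None):
    """GET `url`, or POST `payload` as JSON when given; return the decoded body."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data else {}
    request = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def run_workflow(base_url, workflow, timeout=120.0, interval=1.0,
                 request=http_json, sleep=time.sleep, clock=time.monotonic):
    """Queue a workflow, poll the history until it appears, return its image entries."""
    prompt_id = request(f"{base_url}/prompt", {"prompt": workflow})["prompt_id"]
    deadline = clock() + timeout
    while clock() < deadline:
        history = request(f"{base_url}/history/{prompt_id}")
        if prompt_id in history:  # assumed: present only once the prompt has finished
            outputs = history[prompt_id].get("outputs", {})
            # Flatten the per-node image lists into one list of image descriptors.
            return [img for node in outputs.values() for img in node.get("images", [])]
        sleep(interval)
    raise TimeoutError(f"prompt {prompt_id} did not finish within {timeout} s")
```

Each returned entry carries `filename`, `subfolder`, and `type` fields, which can then be passed as query parameters to ComfyUI's `/view` route to download the PNG. The `request`, `sleep`, and `clock` parameters exist only so the loop can be tested without a network or real waiting.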
