feat(mcp-image-gen): scaffold ComfyUI-backed image generation MCP server

- FastMCP server with 4 tools: generate_image, list_available_models, get_generation_status, get_output_directory
- ComfyUI REST API client (httpx) polling lifecycle
- FLUX.1-schnell workflow JSON template
- Dual output: TextContent (path + seed) + ImageContent (base64 PNG)
- 14 passing pytest tests with respx HTTP mocking
- ROCm/AMD RX 7900 XTX optimized setup in README
- Ollama Linux migration path documented (future)
2026-04-04 11:49:31 +02:00
parent ba7d4bc248
commit 8112ff2f12
11 changed files with 1748 additions and 0 deletions
@@ -0,0 +1,178 @@
+# mcp-image-gen
+
+**FastMCP server for AI image generation via ComfyUI.**
+
+This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client. It supports FLUX.1-schnell, FLUX.1-dev, SDXL, and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so the client can display them directly in chat.
+
+---
+
+## Prerequisites
+
+1. **ComfyUI** installed and running at `http://localhost:8188`
+2. At least one checkpoint model downloaded (see ComfyUI Setup below)
+3. **Python 3.11+** and **uv** installed on the system
+
+---
+
+## Installation
+
+```bash
+cd mcp/mcp-image-gen
+uv sync
+```
+
+---
+
+## Configuration
+
+All configuration is via environment variables:
+
+| Variable | Default | Description |
+|---|---|---|
+| `COMFYUI_URL` | `http://localhost:8188` | Base URL of the running ComfyUI instance |
+| `IMAGE_OUTPUT_DIR` | `~/Pictures/mcp-generated` | Directory where generated PNG files are saved |
+| `COMFYUI_TIMEOUT` | `120` | Max seconds to wait for generation before timeout |
+
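A minimal sketch of how these three variables might be read at startup, using the defaults from the table above. The `Settings` class and `load_settings` function are hypothetical names for illustration and are not taken from this commit's `src/server.py`.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""
    comfyui_url: str
    output_dir: Path
    timeout_s: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        # Strip a trailing slash so paths such as "/prompt" can be appended safely.
        comfyui_url=env.get("COMFYUI_URL", "http://localhost:8188").rstrip("/"),
        # Expand "~" so the default works regardless of the current user.
        output_dir=Path(env.get("IMAGE_OUTPUT_DIR", "~/Pictures/mcp-generated")).expanduser(),
        timeout_s=float(env.get("COMFYUI_TIMEOUT", "120")),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the function easy to exercise in tests.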
+---
+
+## Usage
+
+### Add to `.roo/mcp.json` (Roo Code)
+
+```json
+"mcp-image-gen": {
+  "command": "uv",
+  "args": [
+    "--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
+    "run", "src/server.py"
+  ],
+  "env": {
+    "COMFYUI_URL": "http://localhost:8188",
+    "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
+  }
+}
+```
+
+### Add to Claude Desktop (`claude_desktop_config.json`)
+
+```json
+{
+  "mcpServers": {
+    "mcp-image-gen": {
+      "command": "uv",
+      "args": [
+        "--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
+        "run", "src/server.py"
+      ],
+      "env": {
+        "COMFYUI_URL": "http://localhost:8188",
+        "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
+      }
+    }
+  }
+}
+```
+
+### Run directly
+
+```bash
+cd mcp/mcp-image-gen
+./run.sh
+```
+
+---
+
+## Available Tools
+
+| Tool | Description |
+|---|---|
+| `generate_image` | Generate an image from a text prompt. Returns file path + inline base64 PNG. |
+| `list_available_models` | List all checkpoint models loaded in ComfyUI. |
+| `get_generation_status` | Check status of a running/queued generation by `prompt_id`. |
+| `get_output_directory` | Return the current output directory path. |
+
+### `generate_image` parameters
+
+| Parameter | Default | Description |
+|---|---|---|
+| `prompt` | *(required)* | Text description of the image |
+| `width` | `1024` | Image width in pixels |
+| `height` | `1024` | Image height in pixels |
+| `steps` | `4` | Inference steps (FLUX.1-schnell: 4 is optimal) |
+| `model` | `flux1-schnell.safetensors` | Checkpoint model filename |
+| `seed` | `-1` | Seed for reproducibility (`-1` = random) |
+| `negative_prompt` | `""` | Things to exclude from the image |
+| `output_dir` | *(value of `IMAGE_OUTPUT_DIR`)* | Override output directory |
+
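The `seed` convention above (`-1` means random) could be handled as in the sketch below. `resolve_seed` and `MAX_SEED` are hypothetical names, and the 32-bit upper bound is an assumption rather than the server's actual limit. Returning the resolved value lets the caller report it alongside the file path, so a run can be reproduced later.

```python
import random
from typing import Optional

# Assumption: seeds are capped at 32 bits here; the real upper bound is not shown in this commit.
MAX_SEED = 2**32 - 1


def resolve_seed(seed: int = -1, rng: Optional[random.Random] = None) -> int:
    """Return `seed` unchanged if it is non-negative, otherwise draw a random one."""
    if seed >= 0:
        return seed
    rng = rng or random.Random()
    return rng.randint(0, MAX_SEED)
```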
+---
+
+## ComfyUI Setup (Fedora + AMD ROCm)
+
+```bash
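+# Install ComfyUI from source (provides the main.py used below)
+git clone https://github.com/comfyanonymous/ComfyUI
+cd ComfyUI
+pip install -r requirements.txt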
+
+# Download FLUX.1-schnell model (~8GB, Apache 2.0)
+# Place in: ComfyUI/models/checkpoints/flux1-schnell.safetensors
+# Source: https://huggingface.co/black-forest-labs/FLUX.1-schnell
+
+# Start ComfyUI with ROCm support for AMD RX 7900 XTX
+HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen
+
+# Verify the API is reachable
+curl http://localhost:8188/system_stats
+```
+
+> **Note:** `HSA_OVERRIDE_GFX_VERSION=11.0.0` may be needed for the RX 7900 XTX (gfx1100)
+> to be recognized correctly by ROCm libraries.
+
+### PyTorch with ROCm (if needed separately)
+
+```bash
+pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.1
+```
+
+---
+
+## Testing
+
+```bash
+cd mcp/mcp-image-gen
+uv run pytest tests/ -v
+```
+
+All tests mock the ComfyUI HTTP API — no running ComfyUI instance needed.
+
+---
+
+## Ollama Migration Path
+
+Once Ollama adds Linux image generation support (announced "coming soon" as of April 2026, currently macOS-only), this server is designed to switch backends via a single environment variable:
+
+```bash
+IMAGE_BACKEND=ollama  # currently only "comfyui" is implemented
+```
+
+The tool signatures, return types, and MCP interface will remain unchanged — only the underlying HTTP calls switch from ComfyUI to Ollama's `/api/generate` endpoint.
+
+---
+
+## Architecture
+
+```
+Roo Code / Claude Desktop
+        │
+        │ MCP (stdio)
+        ▼
+  mcp-image-gen (FastMCP)
+        │
+        │ HTTP REST
+        ▼
+  ComfyUI @ localhost:8188
+        │
+        │ ROCm / AMD GPU
+        ▼
+  FLUX.1-schnell / SDXL / SD3.5
+```
+
+The server submits a FLUX.1-schnell workflow in ComfyUI's API format, polls until generation completes, downloads the PNG, saves it to disk, and returns both a text summary and a base64-encoded inline image.
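
The polling and encoding steps of that lifecycle can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: `wait_for_images` and `to_inline_png` are hypothetical names, and `fetch_history` stands in for an HTTP call to ComfyUI's `GET /history/{prompt_id}` route, which returns an empty object until the prompt has finished.

```python
import base64
import time
from typing import Callable, Dict, List


def wait_for_images(
    fetch_history: Callable[[str], Dict],
    prompt_id: str,
    timeout_s: float = 120.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Dict]:
    """Poll until the history contains `prompt_id`, then return its image records."""
    deadline = clock() + timeout_s
    while True:
        entry = fetch_history(prompt_id).get(prompt_id)
        if entry is not None:
            # Each output node may list images as {"filename", "subfolder", "type"}.
            return [
                image
                for node in entry.get("outputs", {}).values()
                for image in node.get("images", [])
            ]
        if clock() >= deadline:
            raise TimeoutError(f"generation {prompt_id} did not finish in {timeout_s}s")
        sleep(interval_s)


def to_inline_png(png_bytes: bytes) -> Dict[str, str]:
    """Build a dict shaped like an MCP image content item with base64 PNG data."""
    return {
        "type": "image",
        "mimeType": "image/png",
        "data": base64.b64encode(png_bytes).decode("ascii"),
    }
```

Injecting `fetch_history`, `sleep`, and `clock` keeps the loop testable without a running ComfyUI instance, in the same spirit as the mocked test suite.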