# mcp-image-gen

**FastMCP server for AI image generation via ComfyUI.**

This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client.

**New:** Support for **FLUX.2 Klein 4B** with **Heretic-abliterated Qwen3-4B text encoder** (zero KL divergence, no refusals). Select via `model="flux-2-klein-4b-fp8.safetensors"`.

It supports FLUX.1-schnell (default), FLUX.2 Klein (Heretic), and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so Claude can display them directly in chat.

---

## Prerequisites

1. **ComfyUI** installed and running at `http://localhost:8188`
2. At least one checkpoint model downloaded (see ComfyUI Setup below)
3. **Python 3.11+** and **uv** installed on the system

---

## Installation

```bash
cd mcp/mcp-image-gen
uv sync
```

---

## Configuration

All configuration is via environment variables:

| Variable | Default | Description |
|---|---|---|
| `COMFYUI_URL` | `http://localhost:8188` | Base URL of the running ComfyUI instance |
| `IMAGE_OUTPUT_DIR` | `~/Pictures/mcp-generated` | Directory where generated PNG files are saved |
| `COMFYUI_TIMEOUT` | `120` | Max seconds to wait for generation before timeout |

---

## Usage

### Add to `.roo/mcp.json` (Roo Code)

```json
"mcp-image-gen": {
  "command": "uv",
  "args": [
    "--directory",
    "/home/pplate/pi_mcps/mcp/mcp-image-gen",
    "run",
    "src/server.py"
  ],
  "env": {
    "COMFYUI_URL": "http://localhost:8188",
    "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
  }
}
```

### Add to Claude Desktop (`claude_desktop_config.json`)

```json
{
  "mcpServers": {
    "mcp-image-gen": {
      "command": "uv",
      "args": [
        "--directory",
        "/home/pplate/pi_mcps/mcp/mcp-image-gen",
        "run",
        "src/server.py"
      ],
      "env": {
        "COMFYUI_URL": "http://localhost:8188",
        "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
      }
    }
  }
}
```

### Run directly

```bash
cd mcp/mcp-image-gen
./run.sh
```

---

## Available Tools

| Tool | Description |
|---|---|
| `generate_image` | Generate an image from a text prompt. Returns file path + inline base64 PNG. |
| `list_available_models` | List all checkpoint models loaded in ComfyUI. |
| `get_generation_status` | Check status of a running/queued generation by `prompt_id`. |
| `get_output_directory` | Return the current output directory path. |

### `generate_image` parameters

| Parameter | Default | Description |
|---|---|---|
| `prompt` | *(required)* | Text description of the image |
| `width` | `1024` | Image width in pixels |
| `height` | `1024` | Image height in pixels |
| `steps` | `4` | Inference steps (FLUX.1-schnell: 4 is optimal) |
| `model` | `flux1-schnell.safetensors` | Checkpoint model filename |
| `seed` | `-1` | Seed for reproducibility (`-1` = random) |
| `negative_prompt` | `""` | Things to exclude from the image |
| `output_dir` | *(IMAGE_OUTPUT_DIR)* | Override output directory |

---

## ComfyUI Setup (Fedora + AMD ROCm)

```bash
# Install ComfyUI from source (main.py lives in the repository root)
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
pip install -r requirements.txt

# Download FLUX.1-schnell model (~8GB, Apache 2.0)
# Place in: ComfyUI/models/checkpoints/flux1-schnell.safetensors
# Source: https://huggingface.co/black-forest-labs/FLUX.1-schnell

# Start ComfyUI with ROCm support for AMD RX 7900 XTX
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen

# Verify the API is reachable
curl http://localhost:8188/system_stats
```

> **Note:** `HSA_OVERRIDE_GFX_VERSION=11.0.0` may be needed for the RX 7900 XTX (gfx1100)
> to be recognized correctly by ROCm libraries.

### PyTorch with ROCm (if needed separately)

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.1
```

---

## Testing

```bash
cd mcp/mcp-image-gen
uv run pytest tests/ -v
```

All tests mock the ComfyUI HTTP API, so no running ComfyUI instance is needed.
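
To illustrate the mocking approach, here is a minimal sketch of a test that exercises a ComfyUI call without a live instance. It is not taken from this repository's test suite: `fetch_system_stats` is a hypothetical helper, the sketch assumes plain `urllib` rather than whichever HTTP client the server actually uses, and the fake payload is invented for the example. Only the `/system_stats` endpoint comes from the setup instructions above.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_system_stats(base_url: str = "http://localhost:8188") -> dict:
    """Hypothetical helper: GET /system_stats and return the parsed JSON."""
    with urllib.request.urlopen(f"{base_url}/system_stats", timeout=5) as resp:
        return json.load(resp)


def test_fetch_system_stats_without_comfyui() -> None:
    # Invented payload for the test; a real response carries more fields.
    fake_body = io.BytesIO(json.dumps({"devices": [{"name": "fake-gpu"}]}).encode())

    with mock.patch("urllib.request.urlopen") as fake_urlopen:
        # urlopen() is used as a context manager, so stub what __enter__ yields.
        fake_urlopen.return_value.__enter__.return_value = fake_body
        stats = fetch_system_stats()

    # The helper hit the expected URL and decoded the fake response.
    fake_urlopen.assert_called_once_with(
        "http://localhost:8188/system_stats", timeout=5
    )
    assert stats["devices"][0]["name"] == "fake-gpu"
```

Because the network call is patched at the `urllib.request.urlopen` boundary, the test runs on any machine, including CI hosts with no GPU.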

---

## Ollama Migration Path

Once Ollama adds Linux image generation support (announced "coming soon" as of April 2026, currently macOS-only), this server is intended to switch backends via a single environment variable:

```bash
IMAGE_BACKEND=ollama # currently only "comfyui" is implemented
```

The tool signatures, return types, and MCP interface will remain unchanged; only the underlying HTTP calls switch from ComfyUI to Ollama's `/api/generate` endpoint.

---

## Architecture

```
Roo Code / Claude Desktop
        │
        │ MCP (stdio)
        ▼
mcp-image-gen (FastMCP)
        │
        │ HTTP REST
        ▼
ComfyUI @ localhost:8188
        │
        │ ROCm / AMD GPU
        ▼
FLUX.1-schnell / SDXL / SD3.5
```

The server submits a FLUX.1-schnell ComfyUI API-format workflow, polls until complete, downloads the PNG, saves it to disk, and returns both a text summary and a base64-encoded inline image.