pplate 2f01ff0639 fix(mcp-image-gen): correct ComfyUI install instructions in USAGE.md

ComfyUI is NOT on PyPI — `pip install comfyui` fails with
"No matching distribution found". Remove the wrong Option A.

Replace with:
- Warning note that pip install does not work
- Only correct method: git clone from GitHub + pip install -r requirements.txt

ROCm status confirmed: rocm-smi 3.1.0 / ROCm-SMI-LIB 7.7.0 installed.

2026-04-04 12:20:28 +02:00


mcp-image-gen — Usage Guide

Comprehensive reference for using the ComfyUI-backed image generation MCP server.

Prerequisites — ComfyUI Setup
Quick Start — Running the MCP Server
How to Ask Lumen to Generate Images
Available Tools
Parameters Reference
Output Format
Environment Variables
Test Status
Prompt Tips for FLUX.1-schnell
Known Limitations

1. Prerequisites — ComfyUI Setup

ComfyUI must be running before any image generation tool call succeeds.

The MCP server connects to ComfyUI's REST API at http://localhost:8188. If ComfyUI is not running, generate_image and list_available_models will return a graceful error message — no crash.

Install ComfyUI

⚠️ ComfyUI is NOT on PyPI — pip install comfyui will fail with "No matching distribution found". It must be installed from source via git clone.

# Clone from source (the only correct installation method)
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

Install PyTorch with ROCm (AMD RX 7900 XTX)

Patrick's RX 7900 XTX (gfx1100, 24GB VRAM) uses the ROCm backend. Standard CUDA builds will not work on AMD hardware.

# PyTorch with ROCm 6.1 support
pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.1

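ROCm version note: the index URL above installs wheels built against ROCm 6.1. ROCm 7.2.1 is the current production release as of April 2026, so check your installed ROCm version (for example with rocm-smi) and pick the closest matching PyTorch ROCm wheel index before installing torch.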
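
After installing torch, a short check like the one below can confirm which build is active. This is a sketch and not part of the MCP server; the function name describe_torch_backend is hypothetical. It relies on ROCm builds of PyTorch exposing the GPU through the torch.cuda API and setting torch.version.hip.

```python
def describe_torch_backend() -> str:
    """Return a one-line summary of the installed PyTorch build."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    # ROCm wheels set torch.version.hip; CUDA and CPU wheels leave it as None.
    hip = getattr(torch.version, "hip", None)
    backend = f"ROCm/HIP {hip}" if hip else "non-ROCm build"
    # On ROCm the GPU is still reported through the torch.cuda namespace.
    gpu = "GPU visible" if torch.cuda.is_available() else "no GPU visible"
    return f"torch {torch.__version__} ({backend}), {gpu}"


print(describe_torch_backend())
```

If the summary reports a non-ROCm build or no visible GPU, ComfyUI is likely to fall back to CPU.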

Download FLUX.1-schnell (Primary Model)

FLUX.1-schnell is the recommended model — fast (4 steps), Apache 2.0 licensed, excellent quality.

# Download into ComfyUI/models/checkpoints/ (large file; check the model page for the exact size)
wget https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors \
     -O ~/ComfyUI/models/checkpoints/flux1-schnell.safetensors

# Or use huggingface_hub:
huggingface-cli download black-forest-labs/FLUX.1-schnell \
    flux1-schnell.safetensors \
    --local-dir ~/ComfyUI/models/checkpoints/

You'll also need the CLIP and VAE models. See the ComfyUI FLUX guide for the full model list.

Start ComfyUI (AMD ROCm)

# Standard start — listens on all interfaces at port 8188
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen

# Or with explicit port
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen --port 8188

HSA_OVERRIDE_GFX_VERSION=11.0.0: set for the RX 7900 XTX (gfx1100). It tells the HIP runtime to treat the GPU as the gfx1100 architecture. Without it, ROCm may fail to detect the GPU correctly.

Verify ComfyUI is Running

curl -s http://localhost:8188/system_stats | python3 -m json.tool | head -20

The expected response includes a system object with python_version, pytorch_version, embedded_python, and comfyui_version.
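
The same check can be done from Python, which also illustrates the "graceful error, no crash" behaviour described above. This is a minimal sketch using only the standard library; check_comfyui and fetch_json are hypothetical names, not the server's actual functions.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def check_comfyui(base_url: str = "http://localhost:8188", fetch=fetch_json) -> str:
    """Return a status line instead of raising when ComfyUI is down."""
    try:
        stats = fetch(f"{base_url}/system_stats")
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Connection refused, timeouts, and malformed JSON all end up here.
        return f"ComfyUI not reachable at {base_url}: {exc}"
    system = stats.get("system", {})
    return f"ComfyUI {system.get('comfyui_version', 'unknown')} is running"
```

The fetch parameter exists only so the function can be exercised without a running ComfyUI instance.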

2. Quick Start — Running the MCP Server

Via `run.sh` (recommended)

cd /home/pplate/pi_mcps/mcp/mcp-image-gen
./run.sh

run.sh automatically:

Sets PATH to include ~/.local/bin for uv
Creates IMAGE_OUTPUT_DIR (~/Pictures/mcp-generated) if it doesn't exist
Launches the FastMCP server via uv run src/server.py (stdio transport)

Via uv directly

cd /home/pplate/pi_mcps/mcp/mcp-image-gen
uv run src/server.py

Wired into `.roo/mcp.json`

The server is already configured in .roo/mcp.json:

"mcp-image-gen": {
  "command": "uv",
  "args": [
    "--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
    "run", "src/server.py"
  ],
  "env": {
    "COMFYUI_URL": "http://localhost:8188",
    "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
  }
}

Roo Code / Claude Desktop will auto-start the server when any image generation tool is invoked. The MCP server itself starts in ~1 second — ComfyUI must already be running separately.

Install dependencies (first time)

cd /home/pplate/pi_mcps/mcp/mcp-image-gen
uv sync

3. How to Ask Lumen to Generate Images

Just speak naturally. Lumen will call the appropriate MCP tool automatically.

Basic generation

"Generate an image of a futuristic city at sunset"

→ generate_image(prompt="futuristic city at sunset", width=1024, height=1024, steps=4)

Specific style and size

"Create a portrait of a red fox in watercolor style, 1024x1024"

→ generate_image(
    prompt="portrait of a red fox, watercolor style, detailed fur, soft brushstrokes",
    width=1024, height=1024
  )

Reproducible with a fixed seed

"Make an image with seed 42 so I can reproduce it"

→ generate_image(prompt="...", seed=42)

The seed is reported in the text output so you can use the same seed again.
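
As an illustration of how seed=-1 could be turned into a concrete, reportable seed, here is a small sketch. The helper name resolve_seed and the range check are assumptions; the 0 to 2³²−1 range is the one given in the parameters reference.

```python
import random

MAX_SEED = 2**32 - 1  # upper bound from the parameters reference


def resolve_seed(seed: int = -1) -> int:
    """Return the seed to send to ComfyUI, drawing a random one for -1."""
    if seed == -1:
        return random.randint(0, MAX_SEED)  # inclusive on both ends
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be -1 or in [0, {MAX_SEED}], got {seed}")
    return seed
```

Resolving the seed before submitting the workflow is what makes it possible to report the value back and embed it in the filename.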

Landscape format

"Generate a wide cinematic landscape of a Norwegian fjord, 1920x1080"

→ generate_image(prompt="Norwegian fjord, cinematic, golden hour", width=1920, height=1080)

Excluding unwanted elements

"Generate a clean product photo of a coffee mug, no background clutter, no text"

→ generate_image(
    prompt="product photo of a ceramic coffee mug, studio lighting, white background",
    negative_prompt="clutter, text, watermark, blurry, shadows"
  )

More inference steps for higher quality

"Generate a highly detailed oil painting of a medieval castle, use 20 steps"

→ generate_image(
    prompt="oil painting of a medieval castle, highly detailed, dramatic lighting",
    steps=20,
    model="flux1-dev.safetensors"   # FLUX.1-dev supports higher step counts better
  )

Check what models are available

"List what models are available in ComfyUI"

→ list_available_models()

Check status of a long-running job

"What's the status of prompt ID abc-123?"

→ get_generation_status(prompt_id="abc-123")

Find out where images are saved

"Where are my generated images being saved?"

→ get_output_directory()

4. Available Tools

`generate_image`

Generate an image from a text prompt using ComfyUI's FLUX.1-schnell workflow.

Full signature:

async def generate_image(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    steps: int = 4,
    model: str = "flux1-schnell.safetensors",
    seed: int = -1,
    negative_prompt: str = "",
    output_dir: str = "",
) -> list[TextContent | ImageContent]

What it does:

Loads the bundled flux_schnell.json ComfyUI API workflow template
Injects your prompt, dimensions, seed, model into the correct workflow nodes
Submits the workflow to ComfyUI via POST /api/prompt
Polls /api/queue every 2 seconds until the job leaves the queue
Fetches history via /api/history/{prompt_id} to find the output filename
Downloads the PNG from /api/view
Saves the PNG to disk as YYYYMMDD_HHMMSS_{seed}.png
Returns [TextContent(path + metadata), ImageContent(base64 PNG)]
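
The injection step above can be sketched as follows. The miniature template and its node IDs are hypothetical stand-ins for the bundled flux_schnell.json, which has more nodes and may use different IDs (the test suite only confirms that the negative prompt goes to node 33).

```python
import copy

# Hypothetical miniature template in ComfyUI API format.
TEMPLATE = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
    "33": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
    "27": {"class_type": "EmptySD3LatentImage",
           "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "30": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": ""}},
    "31": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 4}},
}


def build_workflow(prompt, width=1024, height=1024, steps=4,
                   model="flux1-schnell.safetensors", seed=0, negative_prompt=""):
    """Return a copy of TEMPLATE with the caller's parameters injected."""
    wf = copy.deepcopy(TEMPLATE)  # never mutate the shared template
    wf["6"]["inputs"]["text"] = prompt
    wf["33"]["inputs"]["text"] = negative_prompt
    wf["27"]["inputs"].update(width=width, height=height)
    wf["30"]["inputs"]["ckpt_name"] = model
    wf["31"]["inputs"].update(seed=seed, steps=steps)
    return wf


# ComfyUI expects the workflow under the "prompt" key of the POST body.
payload = {"prompt": build_workflow("futuristic city at sunset", seed=42)}
```

Copying the template on every call keeps concurrent or repeated generations from leaking parameters into each other.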

`list_available_models`

List all checkpoint models currently available in ComfyUI.

async def list_available_models() -> list[str]

Calls /object_info/CheckpointLoaderSimple and extracts the checkpoint name list. Use this to discover what models are installed before passing a model name to generate_image.

Example return:

["flux1-schnell.safetensors", "flux1-dev.safetensors", "sd_xl_base_1.0.safetensors"]
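
Extracting the names from the /object_info response might look like the sketch below. The nested layout (a list of filenames as the first element of ckpt_name) is an assumption about the ComfyUI payload and may differ between ComfyUI versions; extract_checkpoints is a hypothetical helper name.

```python
def extract_checkpoints(object_info: dict) -> list[str]:
    """Pull checkpoint filenames out of an /object_info response."""
    try:
        spec = object_info["CheckpointLoaderSimple"]["input"]["required"]["ckpt_name"]
        names = spec[0]
    except (KeyError, IndexError, TypeError):
        return []
    # Guard against a layout that does not put a list of names first.
    if not isinstance(names, list):
        return []
    return [name for name in names if isinstance(name, str)]


sample = {
    "CheckpointLoaderSimple": {
        "input": {"required": {"ckpt_name": [["flux1-schnell.safetensors",
                                              "flux1-dev.safetensors"]]}}
    }
}
print(extract_checkpoints(sample))
```

Returning an empty list for an unexpected payload keeps the tool from crashing when the response shape changes.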

`get_generation_status`

Check the status of a queued or running generation job.

async def get_generation_status(prompt_id: str) -> dict

Return values:

`status`	Meaning
`"pending"`	Job is in the queue, not yet started
`"running"`	Job is currently being processed
`"completed"`	Job finished — image is in ComfyUI's history
`"not_found"`	Unknown prompt_id — may have expired from history
`"error"`	ComfyUI was unreachable

Useful when generate_image times out (default 120s) — the job may still be running in ComfyUI.
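
The status mapping in the table can be sketched as a pure function over the queue and history payloads. It assumes each queue entry is a list whose second element is the prompt_id and that history is keyed by prompt_id; classify_status is a hypothetical name, and the "error" case (ComfyUI unreachable) would be handled by the caller.

```python
def classify_status(prompt_id: str, queue: dict, history: dict) -> str:
    """Map ComfyUI queue and history payloads to a status string."""
    def ids(entries):
        # Assumed layout: [number, prompt_id, ...] for every queue entry.
        return {entry[1] for entry in entries if len(entry) > 1}

    if prompt_id in ids(queue.get("queue_running", [])):
        return "running"
    if prompt_id in ids(queue.get("queue_pending", [])):
        return "pending"
    if prompt_id in history:
        return "completed"
    return "not_found"
```

Checking the queue before the history matches the order in the table: a job only counts as completed once it has left the queue and appears in history.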

`get_output_directory`

Return the absolute path where generated images will be saved.

def get_output_directory() -> str

Returns the expanded, absolute path derived from IMAGE_OUTPUT_DIR env var (or ~/Pictures/mcp-generated default). The directory may not exist yet — generate_image creates it on first use.
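
A minimal sketch of that resolution logic, assuming the precedence "output_dir parameter, then IMAGE_OUTPUT_DIR, then the default". The helper name resolve_output_dir is hypothetical, and like the tool it does not create the directory.

```python
import os


def resolve_output_dir(env=os.environ, override: str = "") -> str:
    """Return the absolute save directory without creating it."""
    raw = override or env.get("IMAGE_OUTPUT_DIR") or "~/Pictures/mcp-generated"
    # expanduser handles "~"; abspath normalises relative paths.
    return os.path.abspath(os.path.expanduser(raw))


print(resolve_output_dir({}))
```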

5. Parameters Reference

Full parameter table for generate_image:

Parameter	Type	Default	Description
`prompt`	`str`	(required)	Text description of the image. Goes into the positive CLIP text encoder node.
`width`	`int`	`1024`	Image width in pixels. FLUX.1-schnell: 512–2048 recommended.
`height`	`int`	`1024`	Image height in pixels. FLUX.1-schnell: 512–2048 recommended.
`steps`	`int`	`4`	Number of KSampler inference steps. FLUX.1-schnell is designed for 1–8 steps.
`model`	`str`	`"flux1-schnell.safetensors"`	Checkpoint model filename as listed by `list_available_models`.
`seed`	`int`	`-1`	RNG seed for reproducibility. `-1` = new random seed each call (0 to 2³²−1).
`negative_prompt`	`str`	`""`	Text description of things to exclude. Goes into the negative CLIP text encoder node.
`output_dir`	`str`	`""`	Override save directory. Empty = uses `IMAGE_OUTPUT_DIR` env var or default.

Recommended dimensions

Use case	Width	Height
Square (default)	1024	1024
Portrait	768	1024
Landscape	1024	768
Widescreen	1280	720
HD widescreen	1920	1080
Tall portrait	512	768

VRAM note: Patrick's RX 7900 XTX has 24GB VRAM. FLUX.1-schnell requires ~8GB, so you can comfortably run 1920×1080 and even larger. FLUX.1-dev requires ~12GB.

6. Output Format

generate_image returns a list with two items when successful:

Item 1 — `TextContent` (file path + metadata)

Generated: /home/pplate/Pictures/mcp-generated/20260404_121500_3847291045.png
Seed: 3847291045
Elapsed: 8.3s
Size: 1024x1024, Steps: 4, Model: flux1-schnell.safetensors

The filename format is YYYYMMDD_HHMMSS_{seed}.png. The seed is embedded so you can reproduce the image by passing it back as the seed parameter, together with the same prompt, dimensions, steps, and model.
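
The naming scheme can be illustrated with a pair of helpers, one to build the filename and one to read the seed back out. Both function names are hypothetical; only the YYYYMMDD_HHMMSS_{seed}.png pattern comes from this guide.

```python
import re
from datetime import datetime


def build_filename(seed: int, when: datetime) -> str:
    """Format a filename as YYYYMMDD_HHMMSS_{seed}.png."""
    return f"{when:%Y%m%d_%H%M%S}_{seed}.png"


def seed_from_filename(name: str) -> int:
    """Recover the seed embedded in a generated filename."""
    match = re.fullmatch(r"\d{8}_\d{6}_(\d+)\.png", name)
    if match is None:
        raise ValueError(f"not a generated image filename: {name!r}")
    return int(match.group(1))


name = build_filename(3847291045, datetime(2026, 4, 4, 12, 15, 0))
print(name, seed_from_filename(name))
```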

Item 2 — `ImageContent` (inline base64 PNG)

The image displays directly in Roo Code / Claude Desktop chat as an inline image — no need to open a file browser. The same PNG is also saved to disk at the path shown in the TextContent.

{
  "type": "image",
  "mimeType": "image/png",
  "data": "<base64-encoded PNG bytes>"
}
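
A client that receives this item as a plain dictionary could decode it as sketched below. The helper name decode_image_item is hypothetical; the field names are the ones shown in the JSON above, and the signature check uses the fixed eight-byte PNG header.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_image_item(item: dict) -> bytes:
    """Decode an image content item and verify it looks like a PNG."""
    if item.get("type") != "image" or item.get("mimeType") != "image/png":
        raise ValueError("not a PNG image content item")
    data = base64.b64decode(item["data"], validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("payload does not start with the PNG signature")
    return data
```

The returned bytes can be written straight to a .png file if the inline preview is not enough.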

Error responses

When ComfyUI is unreachable or an error occurs, only one TextContent is returned (no ImageContent):

ComfyUI not reachable at http://localhost:8188. Start it with: python main.py --listen

Generation timed out after 120s. prompt_id=abc-123 — use get_generation_status to check

7. Environment Variables

Configure via environment variables in .roo/mcp.json or shell:

Variable	Default	Description
`COMFYUI_URL`	`http://localhost:8188`	Base URL of the running ComfyUI REST API. Change this if ComfyUI runs on a different host or port.
`IMAGE_OUTPUT_DIR`	`~/Pictures/mcp-generated`	Directory where generated PNG files are saved. Supports `~` expansion. Created automatically on first generation.
`COMFYUI_TIMEOUT`	`120`	Maximum seconds to wait for a generation job before returning a timeout error. Increase for very large images or slow hardware.
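
Reading the URL and timeout with their defaults could look like the following sketch. The Settings class and load_settings function are hypothetical names, and the validation rules are assumptions rather than the server's documented behaviour.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    comfyui_url: str
    timeout_s: float


def load_settings(env=os.environ) -> Settings:
    """Read COMFYUI_URL and COMFYUI_TIMEOUT, falling back to the defaults."""
    url = env.get("COMFYUI_URL", "http://localhost:8188").rstrip("/")
    raw_timeout = env.get("COMFYUI_TIMEOUT", "120")
    try:
        timeout = float(raw_timeout)
    except ValueError:
        raise ValueError(f"COMFYUI_TIMEOUT must be a number, got {raw_timeout!r}") from None
    if timeout < 0:
        raise ValueError("COMFYUI_TIMEOUT must not be negative")
    return Settings(comfyui_url=url, timeout_s=timeout)
```

A timeout of 0 is accepted on purpose, since the test suite uses COMFYUI_TIMEOUT=0 to force the timeout path.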

Setting via shell

export COMFYUI_URL="http://localhost:8188"
export IMAGE_OUTPUT_DIR="/home/pplate/Pictures/ai-art"
export COMFYUI_TIMEOUT="300"
./run.sh

Setting via mcp.json env block

"mcp-image-gen": {
  "command": "uv",
  "args": ["--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen", "run", "src/server.py"],
  "env": {
    "COMFYUI_URL": "http://localhost:8188",
    "IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated",
    "COMFYUI_TIMEOUT": "120"
  }
}

8. Test Status

19 pytest tests — all passing. Tests mock all ComfyUI HTTP calls using respx. No running ComfyUI instance is needed to run the tests.

cd /home/pplate/pi_mcps/mcp/mcp-image-gen
uv run pytest tests/ -v

Test coverage breakdown

Test file	Tests	Coverage area
`tests/test_server.py`	19	All 4 tools + workflow builder

Test name	What it verifies
`test_build_flux_workflow_structure`	Workflow has correct node class_types
`test_build_flux_workflow_params_injected`	All params injected into correct nodes
`test_negative_prompt_included`	Negative prompt goes to node 33
`test_random_seed_generated`	`seed=-1` produces a valid integer in `_meta`
`test_list_available_models`	Returns model list from mocked `/object_info`
`test_list_available_models_comfyui_offline`	ConnectError → graceful error string
`test_get_generation_status_pending`	`prompt_id` in queue_pending → `"pending"`
`test_get_generation_status_running`	`prompt_id` in queue_running → `"running"`
`test_get_generation_status_complete`	Not in queue + in history → `"completed"`
`test_get_output_directory_default`	No env var → `~/Pictures/mcp-generated` expanded
`test_get_output_directory_custom`	Custom env var → that path returned
`test_generate_image_success`	Full lifecycle: queue→poll→history→view→save
`test_generate_image_comfyui_unavailable`	ConnectError → single TextContent error
`test_generate_image_timeout`	COMFYUI_TIMEOUT=0 → timeout TextContent
`test_generate_image_empty_prompt`	Empty string prompt → still succeeds
`test_generate_image_long_prompt`	500-char prompt → not truncated, succeeds
`test_generate_image_invalid_model`	404 from /prompt → error TextContent, no file saved
`test_generate_image_custom_output_dir`	Custom `output_dir` param → saved there, dir created
`test_generate_image_random_seed_variance`	`seed=-1` × 2 → different seeds, different filenames

Test mock stack

respx — HTTP-level mocking for all ComfyUI API endpoints
Pillow (in conftest) — generates real PNG bytes for image response fixtures
monkeypatch — env vars (IMAGE_OUTPUT_DIR, COMFYUI_URL, COMFYUI_TIMEOUT) and server module attributes

Real image generation requires ComfyUI to be running. The tests verify the tool logic at the protocol level only, against mocked HTTP responses.

9. Prompt Tips for FLUX.1-schnell

FLUX.1-schnell is a distilled model (trained with latent adversarial diffusion distillation, per its model card) designed for speed at 1–8 steps. It responds to prompts differently from SDXL or SD1.5.

Prompt structure that works well

[subject], [style/medium], [lighting], [camera/composition], [mood/atmosphere], [quality modifiers]

Example:

ancient library at night, oil painting, warm candlelight, wide angle, mysterious atmosphere, highly detailed, sharp focus
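
If you assemble prompts programmatically, the structure above maps onto a simple join. This is only an illustrative helper (build_prompt is a hypothetical name); the model just sees the final comma-separated string.

```python
def build_prompt(subject: str, *parts: str) -> str:
    """Join the subject and optional descriptors into one comma-separated prompt."""
    pieces = [subject, *parts]
    # Skip empty slots so unused categories do not leave stray commas.
    return ", ".join(p.strip() for p in pieces if p and p.strip())


print(build_prompt(
    "ancient library at night",
    "oil painting",
    "warm candlelight",
    "wide angle",
    "mysterious atmosphere",
    "highly detailed, sharp focus",
))
```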

Style keywords

Style	Prompt keywords
Photography	`cinematic photograph, DSLR, 85mm lens, shallow depth of field, bokeh`
Oil painting	`oil painting, thick brushstrokes, textured canvas, impressionist`
Watercolor	`watercolor painting, soft washes, paper texture, flowing colors`
Digital art	`digital art, concept art, artstation, octane render`
Anime/illustration	`anime style, cel shading, vibrant colors, clean linework`
Sketch	`pencil sketch, hand drawn, crosshatching, charcoal`

Lighting keywords

golden hour, blue hour, dramatic lighting, rim lighting
studio lighting, soft diffused light, volumetric light
neon glow, bioluminescent, moonlit, candlelight

What works well with FLUX.1-schnell

Clear subject + style — "red panda in a cozy library, watercolor style"
Landscape scenes — fjords, forests, cities, abstract environments
Portrait shots — animals and characters with descriptive appearance
Concept art — futuristic cities, sci-fi environments, fantasy scenes
Low step counts — 4 steps is designed to be near-optimal for this model

What to avoid

Booru-style tag dumps (FLUX handles natural language better than SD1.5)
Contradictory instructions — "dark AND bright", "realistic AND cartoon"
Overly complex scenes at very small resolutions

Using the negative prompt

FLUX.1-schnell is normally sampled at a very low CFG scale, so negative prompts have much less impact than in SDXL, and at a CFG scale of 1.0 they may have no effect at all. Use them for broad exclusions:

negative_prompt="blurry, out of focus, watermark, text, signature, low quality, artifacts"

Reproducibility

Always save the seed from the TextContent output if you want to reproduce a result:

Seed: 3847291045

Then pass it back: seed=3847291045

10. Known Limitations

ComfyUI must run locally

The MCP server connects to COMFYUI_URL (default: http://localhost:8188). ComfyUI is a self-hosted application with no hosted cloud API, so an instance must already be running at that URL, normally on the same machine, before you request image generation. The server returns a clear error message if ComfyUI is not reachable.

Model must be pre-loaded

ComfyUI loads checkpoint models into VRAM on first use. The first generation with a model takes longer as VRAM is allocated (FLUX.1-schnell: ~8GB). Subsequent generations with the same model are faster.

# Verify model is installed before generation
# → ask Lumen: "list available models in ComfyUI"

AMD ROCm setup complexity

AMD GPU support requires:

ROCm drivers installed (rocm-smi working)
PyTorch built with ROCm support (not the default CUDA build)
HSA_OVERRIDE_GFX_VERSION=11.0.0 for RX 7900 XTX (gfx1100)

Without these, ComfyUI will fall back to CPU — very slow (minutes per image vs. ~8 seconds on RX 7900 XTX).

Check GPU is being used:

# In another terminal while generating:
watch -n 1 rocm-smi
# VRAM usage should spike to ~8GB during generation

Timeout on large images

The default COMFYUI_TIMEOUT=120 (2 minutes) may not be enough for:

Very large resolutions (2048×2048+)
High step counts (20+)
First generation loading a new model

Increase via env var:

export COMFYUI_TIMEOUT=300  # 5 minutes

If generate_image returns a timeout error, the job may still be running in ComfyUI. Use get_generation_status(prompt_id) to check.
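
The interaction between polling and the timeout can be sketched as a small loop. The function name wait_until_done and the injectable clock and sleep parameters are assumptions made so the logic can be exercised without ComfyUI; the 2-second interval and 120-second default follow the values given in this guide.

```python
import time


def wait_until_done(get_status, timeout_s=120.0, interval_s=2.0,
                    clock=time.monotonic, sleep=time.sleep) -> str:
    """Poll get_status() until the job leaves the queue or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status not in ("pending", "running"):
            return status  # "completed", "not_found", or "error"
        if clock() >= deadline:
            return "timeout"  # the job may still be running in ComfyUI
        sleep(interval_s)
```

Because the loop only stops waiting and never cancels the job, a "timeout" result says nothing about whether the image will eventually be produced.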

Ollama image gen is macOS-only (April 2026)

Ollama launched experimental image generation in January 2026, but it is macOS-only as of April 2026. Linux support is announced as "coming soon." When Linux support arrives, the server can switch backends via IMAGE_BACKEND=ollama without changing any tool signatures.

ComfyUI history is ephemeral

ComfyUI keeps generation history in memory — it is lost on restart. The get_generation_status tool will return "not_found" for old prompt IDs after a ComfyUI restart. The saved PNG file on disk persists regardless.
