mcp-image-gen — Usage Guide
Comprehensive reference for using the ComfyUI-backed image generation MCP server
Table of Contents
- Prerequisites — ComfyUI Setup
- Quick Start — Running the MCP Server
- How to Ask Lumen to Generate Images
- Available Tools
- Parameters Reference
- Output Format
- Environment Variables
- Test Status
- Prompt Tips for FLUX.1-schnell
- Known Limitations
1. Prerequisites — ComfyUI Setup
ComfyUI must be running before any image generation tool call succeeds.
The MCP server connects to ComfyUI's REST API at http://localhost:8188. If ComfyUI is not running, generate_image and list_available_models will return a graceful error message — no crash.
Install ComfyUI
⚠️ ComfyUI is NOT on PyPI: pip install comfyui will fail with "No matching distribution found". It must be installed from source via git clone.
# Clone from source (the only correct installation method)
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
Install PyTorch with ROCm (AMD RX 7900 XTX)
Patrick's RX 7900 XTX (gfx1100, 24GB VRAM) uses the ROCm backend. Standard CUDA builds will not work on AMD hardware.
# PyTorch with ROCm 6.1 support
pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.1
ROCm version note: ROCm 7.2.1 is the current production release as of April 2026. Run rocm-smi to confirm your ROCm version before installing torch.
Download FLUX.1-schnell (Primary Model)
FLUX.1-schnell is the recommended model — fast (4 steps), Apache 2.0 licensed, excellent quality.
# Download (~8GB) — place in ComfyUI/models/checkpoints/
wget https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors \
-O ~/ComfyUI/models/checkpoints/flux1-schnell.safetensors
# Or use huggingface_hub:
huggingface-cli download black-forest-labs/FLUX.1-schnell \
flux1-schnell.safetensors \
--local-dir ~/ComfyUI/models/checkpoints/
You'll also need the CLIP and VAE models; see the ComfyUI FLUX guide for the full model list.
Start ComfyUI (AMD ROCm)
# Standard start — listens on all interfaces at port 8188
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen
# Or with explicit port
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py --listen --port 8188
HSA_OVERRIDE_GFX_VERSION=11.0.0 is required for the RX 7900 XTX (gfx1100). Without it, ROCm may fail to detect the GPU correctly. The variable tells the HIP runtime to treat the GPU as the gfx1100 architecture.
Verify ComfyUI is Running
curl -s http://localhost:8188/system_stats | python3 -m json.tool | head -20
The expected response includes a system object with python_version, pytorch_version, embedded_python, and comfyui_version.
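The same check can be scripted. The sketch below is an illustration, not part of the server: fetch_system_stats and summarize_system_stats are hypothetical helper names, and the field names are the ones listed above.

```python
import json
import urllib.request


def fetch_system_stats(base_url: str = "http://localhost:8188", timeout: float = 5.0) -> dict:
    """GET /system_stats from ComfyUI and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/system_stats", timeout=timeout) as resp:
        return json.load(resp)


def summarize_system_stats(stats: dict) -> str:
    """Build a one-line summary from the 'system' object of the response."""
    system = stats.get("system", {})
    fields = ("comfyui_version", "python_version", "pytorch_version")
    # Missing fields are reported as 'unknown' rather than raising.
    return ", ".join(f"{name}={system.get(name, 'unknown')}" for name in fields)
```

Calling summarize_system_stats(fetch_system_stats()) while ComfyUI is running prints the versions on one line; if ComfyUI is down, urlopen raises an OSError subclass that the caller can catch.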
2. Quick Start — Running the MCP Server
Via run.sh (recommended)
cd /home/pplate/pi_mcps/mcp/mcp-image-gen
./run.sh
run.sh automatically:
- Sets PATH to include ~/.local/bin for uv
- Creates IMAGE_OUTPUT_DIR (~/Pictures/mcp-generated) if it doesn't exist
- Launches the FastMCP server via uv run src/server.py (stdio transport)
Via uv directly
cd /home/pplate/pi_mcps/mcp/mcp-image-gen
uv run src/server.py
Wired into .roo/mcp.json
The server is already configured in .roo/mcp.json:
"mcp-image-gen": {
"command": "uv",
"args": [
"--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen",
"run", "src/server.py"
],
"env": {
"COMFYUI_URL": "http://localhost:8188",
"IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated"
}
}
Roo Code / Claude Desktop will auto-start the server when any image generation tool is invoked. The MCP server itself starts in ~1 second — ComfyUI must already be running separately.
Install dependencies (first time)
cd /home/pplate/pi_mcps/mcp/mcp-image-gen
uv sync
3. How to Ask Lumen to Generate Images
Just speak naturally. Lumen will call the appropriate MCP tool automatically.
Basic generation
"Generate an image of a futuristic city at sunset"
→ generate_image(prompt="futuristic city at sunset", width=1024, height=1024, steps=4)
Specific style and size
"Create a portrait of a red fox in watercolor style, 1024x1024"
→ generate_image(
prompt="portrait of a red fox, watercolor style, detailed fur, soft brushstrokes",
width=1024, height=1024
)
Reproducible with a fixed seed
"Make an image with seed 42 so I can reproduce it"
→ generate_image(prompt="...", seed=42)
The seed is reported in the text output so you can use the same seed again.
Landscape format
"Generate a wide cinematic landscape of a Norwegian fjord, 1920x1080"
→ generate_image(prompt="Norwegian fjord, cinematic, golden hour", width=1920, height=1080)
Excluding unwanted elements
"Generate a clean product photo of a coffee mug, no background clutter, no text"
→ generate_image(
prompt="product photo of a ceramic coffee mug, studio lighting, white background",
negative_prompt="clutter, text, watermark, blurry, shadows"
)
More inference steps for higher quality
"Generate a highly detailed oil painting of a medieval castle, use 20 steps"
→ generate_image(
prompt="oil painting of a medieval castle, highly detailed, dramatic lighting",
steps=20,
model="flux1-dev.safetensors" # FLUX.1-dev supports higher step counts better
)
Check what models are available
"List what models are available in ComfyUI"
→ list_available_models()
Check status of a long-running job
"What's the status of prompt ID abc-123?"
→ get_generation_status(prompt_id="abc-123")
Find out where images are saved
"Where are my generated images being saved?"
→ get_output_directory()
4. Available Tools
generate_image
Generate an image from a text prompt using ComfyUI's FLUX.1-schnell workflow.
Full signature:
async def generate_image(
prompt: str,
width: int = 1024,
height: int = 1024,
steps: int = 4,
model: str = "flux1-schnell.safetensors",
seed: int = -1,
negative_prompt: str = "",
output_dir: str = "",
) -> list[TextContent | ImageContent]
What it does:
- Loads the bundled flux_schnell.json ComfyUI API workflow template
- Injects your prompt, dimensions, seed, and model into the correct workflow nodes
- Submits the workflow to ComfyUI via POST /api/prompt
- Polls /api/queue every 2 seconds until the job leaves the queue
- Fetches history via /api/history/{prompt_id} to find the output filename
- Downloads the PNG from /api/view
- Saves the PNG to disk as YYYYMMDD_HHMMSS_{seed}.png
- Returns [TextContent(path + metadata), ImageContent(base64 PNG)]
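The polling step can be pictured as a loop with a deadline. This is a minimal sketch under assumptions, not the server's actual code: wait_until_done is a hypothetical name, and is_queued stands in for whatever call checks /api/queue for the prompt ID. The 2-second interval and 120-second default come from this guide.

```python
import time
from typing import Callable


def wait_until_done(
    is_queued: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once is_queued() reports False, or False if the deadline passes first."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        if not is_queued():
            return True  # job has left the queue; history can be fetched next
        sleep(interval_s)
    return False  # caller reports a timeout and includes the prompt_id
```

Passing sleep and clock as parameters keeps the loop testable without real waiting. With timeout_s=0 the loop body never runs and the function reports a timeout immediately.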
list_available_models
List all checkpoint models currently available in ComfyUI.
async def list_available_models() -> list[str]
Calls /object_info/CheckpointLoaderSimple and extracts the checkpoint name list. Use this to discover what models are installed before passing a model name to generate_image.
Example return:
["flux1-schnell.safetensors", "flux1-dev.safetensors", "sd_xl_base_1.0.safetensors"]
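As a rough illustration of the extraction step, the sketch below pulls the name list out of an /object_info/CheckpointLoaderSimple response. The nested shape (input, required, ckpt_name, with the name list as the first element) is an assumption about the ComfyUI response and may differ between versions; extract_checkpoint_names is a hypothetical helper.

```python
def extract_checkpoint_names(object_info: dict) -> list[str]:
    """Return checkpoint filenames from an /object_info/CheckpointLoaderSimple response."""
    node = object_info.get("CheckpointLoaderSimple", {})
    # Assumed shape: {"input": {"required": {"ckpt_name": [[names...], ...]}}}
    ckpt_spec = node.get("input", {}).get("required", {}).get("ckpt_name", [])
    if ckpt_spec and isinstance(ckpt_spec[0], list):
        return list(ckpt_spec[0])
    return []  # unexpected shape or no checkpoints installed
```

Returning an empty list on an unexpected shape keeps the caller's error handling simple.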
get_generation_status
Check the status of a queued or running generation job.
async def get_generation_status(prompt_id: str) -> dict
Return values:
| status | Meaning |
|---|---|
| "pending" | Job is in the queue, not yet started |
| "running" | Job is currently being processed |
| "completed" | Job finished; the image is in ComfyUI's history |
| "not_found" | Unknown prompt_id; it may have expired from history |
| "error" | ComfyUI was unreachable |
Useful when generate_image times out (default 120s) — the job may still be running in ComfyUI.
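One way to derive these statuses from the queue and history responses is sketched below. It is an assumption-laden illustration: classify_status is a hypothetical name, each queue entry is assumed to be a list whose second element is the prompt ID, and history is assumed to be keyed by prompt ID. The "error" status is left to the caller, since it depends on the HTTP request failing.

```python
def classify_status(prompt_id: str, queue: dict, history: dict) -> str:
    """Map a prompt_id to 'running', 'pending', 'completed', or 'not_found'."""

    def ids(entries: list) -> set:
        # Assumed entry layout: [number, prompt_id, ...]
        return {entry[1] for entry in entries if len(entry) > 1}

    if prompt_id in ids(queue.get("queue_running", [])):
        return "running"
    if prompt_id in ids(queue.get("queue_pending", [])):
        return "pending"
    if prompt_id in history:
        return "completed"
    return "not_found"
```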
get_output_directory
Return the absolute path where generated images will be saved.
def get_output_directory() -> str
Returns the expanded, absolute path derived from IMAGE_OUTPUT_DIR env var (or ~/Pictures/mcp-generated default). The directory may not exist yet — generate_image creates it on first use.
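The resolution order can be sketched as follows. resolve_output_dir is a hypothetical helper, and the precedence (explicit override, then IMAGE_OUTPUT_DIR, then the default) is inferred from the parameter and environment variable descriptions in this guide.

```python
import os
from pathlib import Path
from typing import Mapping

DEFAULT_OUTPUT_DIR = "~/Pictures/mcp-generated"


def resolve_output_dir(override: str = "", env: Mapping[str, str] = os.environ) -> Path:
    """Pick the save directory and return it as an expanded, absolute path."""
    raw = override or env.get("IMAGE_OUTPUT_DIR") or DEFAULT_OUTPUT_DIR
    # expanduser handles '~'; resolve makes the path absolute without requiring it to exist.
    return Path(raw).expanduser().resolve()
```

The function does not create the directory, which matches the note that it may not exist until the first generation.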
5. Parameters Reference
Full parameter table for generate_image:
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | (required) | Text description of the image. Goes into the positive CLIP text encoder node. |
| width | int | 1024 | Image width in pixels. FLUX.1-schnell: 512–2048 recommended. |
| height | int | 1024 | Image height in pixels. FLUX.1-schnell: 512–2048 recommended. |
| steps | int | 4 | Number of KSampler inference steps. FLUX.1-schnell is designed for 1–8 steps. |
| model | str | "flux1-schnell.safetensors" | Checkpoint model filename as listed by list_available_models. |
| seed | int | -1 | RNG seed for reproducibility. -1 = new random seed each call (0 to 2³²−1). |
| negative_prompt | str | "" | Text description of things to exclude. Goes into the negative CLIP encoder node. |
| output_dir | str | "" | Override save directory. Empty = uses IMAGE_OUTPUT_DIR env var or default. |
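The seed rule in the table can be illustrated with a short sketch. resolve_seed is a hypothetical helper, and rejecting out-of-range values is an assumption made for the example; the server may handle them differently.

```python
import random

MAX_SEED = 2**32 - 1


def resolve_seed(seed: int = -1) -> int:
    """Return the seed to use: a fresh random one for -1, otherwise the given value."""
    if seed == -1:
        return random.randint(0, MAX_SEED)  # inclusive on both ends
    if not 0 <= seed <= MAX_SEED:
        # Assumption for this sketch: out-of-range seeds are an error.
        raise ValueError(f"seed must be -1 or in 0..{MAX_SEED}, got {seed}")
    return seed
```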
Recommended dimensions
| Use case | Width | Height |
|---|---|---|
| Square (default) | 1024 | 1024 |
| Portrait | 768 | 1024 |
| Landscape | 1024 | 768 |
| Widescreen | 1280 | 720 |
| HD widescreen | 1920 | 1080 |
| Tall portrait | 512 | 768 |
VRAM note: Patrick's RX 7900 XTX has 24GB VRAM. FLUX.1-schnell requires ~8GB, so you can comfortably run 1920×1080 and even larger. FLUX.1-dev requires ~12GB.
6. Output Format
generate_image returns a list with two items when successful:
Item 1 — TextContent (file path + metadata)
Generated: /home/pplate/Pictures/mcp-generated/20260404_121500_3847291045.png
Seed: 3847291045
Elapsed: 8.3s
Size: 1024x1024, Steps: 4, Model: flux1-schnell.safetensors
The filename format is YYYYMMDD_HHMMSS_{seed}.png — the seed is embedded so you can reproduce the exact image by passing it back as the seed parameter.
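A small sketch of that naming scheme, including how to read the seed back out of a saved path. build_filename and seed_from_filename are hypothetical helpers written for illustration.

```python
from datetime import datetime
from pathlib import Path


def build_filename(seed: int, when: datetime) -> str:
    """Format a filename as YYYYMMDD_HHMMSS_{seed}.png."""
    return f"{when:%Y%m%d_%H%M%S}_{seed}.png"


def seed_from_filename(path: str) -> int:
    """Recover the seed from the last underscore-separated field of the file stem."""
    return int(Path(path).stem.rsplit("_", 1)[1])
```

With the example output above, seed_from_filename returns 3847291045, which can be passed straight back as the seed parameter.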
Item 2 — ImageContent (inline base64 PNG)
The image displays directly in Roo Code / Claude Desktop chat as an inline image — no need to open a file browser. The same PNG is also saved to disk at the path shown in the TextContent.
{
"type": "image",
"mimeType": "image/png",
"data": "<base64-encoded PNG bytes>"
}
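A client that wants the raw bytes can decode the item itself. The sketch below assumes the item arrives as a plain dict with the three keys shown above; decode_image_content is a hypothetical helper, and the signature check is only a light sanity test.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image_content(item: dict) -> bytes:
    """Decode a base64 ImageContent item and verify it looks like a PNG."""
    if item.get("type") != "image" or item.get("mimeType") != "image/png":
        raise ValueError("not a PNG ImageContent item")
    data = base64.b64decode(item["data"], validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("payload does not start with the PNG signature")
    return data
```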
Error responses
When ComfyUI is unreachable or an error occurs, only one TextContent is returned (no ImageContent):
ComfyUI not reachable at http://localhost:8188. Start it with: python main.py --listen
Generation timed out after 120s. prompt_id=abc-123 — use get_generation_status to check
7. Environment Variables
Configure via environment variables in .roo/mcp.json or shell:
| Variable | Default | Description |
|---|---|---|
| COMFYUI_URL | http://localhost:8188 | Base URL of the running ComfyUI REST API. Change this if ComfyUI runs on a different host or port. |
| IMAGE_OUTPUT_DIR | ~/Pictures/mcp-generated | Directory where generated PNG files are saved. Supports ~ expansion. Created automatically on first generation. |
| COMFYUI_TIMEOUT | 120 | Maximum seconds to wait for a generation job before returning a timeout error. Increase for very large images or slow hardware. |
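Reading the connection settings with their defaults might look like the sketch below. ComfyConfig and load_config are hypothetical names, and stripping the trailing slash and rejecting negative timeouts are choices made for the example, not documented server behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ComfyConfig:
    url: str
    timeout_s: float


def load_config(env: Mapping[str, str] = os.environ) -> ComfyConfig:
    """Read COMFYUI_URL and COMFYUI_TIMEOUT, falling back to the documented defaults."""
    url = env.get("COMFYUI_URL", "http://localhost:8188").rstrip("/")
    timeout_s = float(env.get("COMFYUI_TIMEOUT", "120"))
    if timeout_s < 0:
        raise ValueError("COMFYUI_TIMEOUT must not be negative")
    return ComfyConfig(url=url, timeout_s=timeout_s)
```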
Setting via shell
export COMFYUI_URL="http://localhost:8188"
export IMAGE_OUTPUT_DIR="/home/pplate/Pictures/ai-art"
export COMFYUI_TIMEOUT="300"
./run.sh
Setting via mcp.json env block
"mcp-image-gen": {
"command": "uv",
"args": ["--directory", "/home/pplate/pi_mcps/mcp/mcp-image-gen", "run", "src/server.py"],
"env": {
"COMFYUI_URL": "http://localhost:8188",
"IMAGE_OUTPUT_DIR": "/home/pplate/Pictures/mcp-generated",
"COMFYUI_TIMEOUT": "120"
}
}
8. Test Status
19 pytest tests — all passing. Tests mock all ComfyUI HTTP calls using respx. No running ComfyUI instance is needed to run the tests.
cd /home/pplate/pi_mcps/mcp/mcp-image-gen
uv run pytest tests/ -v
Test coverage breakdown
| Test file | Tests | Coverage area |
|---|---|---|
| tests/test_server.py | 19 | All 4 tools + workflow builder |

| Test name | What it verifies |
|---|---|
| test_build_flux_workflow_structure | Workflow has correct node class_types |
| test_build_flux_workflow_params_injected | All params injected into correct nodes |
| test_negative_prompt_included | Negative prompt goes to node 33 |
| test_random_seed_generated | seed=-1 produces a valid integer in _meta |
| test_list_available_models | Returns model list from mocked /object_info |
| test_list_available_models_comfyui_offline | ConnectError → graceful error string |
| test_get_generation_status_pending | prompt_id in queue_pending → "pending" |
| test_get_generation_status_running | prompt_id in queue_running → "running" |
| test_get_generation_status_complete | Not in queue + in history → "completed" |
| test_get_output_directory_default | No env var → ~/Pictures/mcp-generated expanded |
| test_get_output_directory_custom | Custom env var → that path returned |
| test_generate_image_success | Full lifecycle: queue→poll→history→view→save |
| test_generate_image_comfyui_unavailable | ConnectError → single TextContent error |
| test_generate_image_timeout | COMFYUI_TIMEOUT=0 → timeout TextContent |
| test_generate_image_empty_prompt | Empty string prompt → still succeeds |
| test_generate_image_long_prompt | 500-char prompt → not truncated, succeeds |
| test_generate_image_invalid_model | 404 from /prompt → error TextContent, no file saved |
| test_generate_image_custom_output_dir | Custom output_dir param → saved there, dir created |
| test_generate_image_random_seed_variance | seed=-1 × 2 → different seeds, different filenames |
Test mock stack
- respx: HTTP-level mocking for all ComfyUI API endpoints
- Pillow (in conftest): generates real PNG bytes for image response fixtures
- monkeypatch: env vars (IMAGE_OUTPUT_DIR, COMFYUI_URL, COMFYUI_TIMEOUT) and server module attributes
Real image generation requires ComfyUI to be running. Tests prove the tool logic is correct at the protocol level.
9. Prompt Tips for FLUX.1-schnell
FLUX.1-schnell is a guidance-distilled model designed for speed at 1–8 steps. It responds differently from SDXL or SD1.5.
Prompt structure that works well
[subject], [style/medium], [lighting], [camera/composition], [mood/atmosphere], [quality modifiers]
Example:
ancient library at night, oil painting, warm candlelight, wide angle, mysterious atmosphere, highly detailed, sharp focus
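If prompts are assembled programmatically, the structure above maps onto a simple join. build_prompt is a hypothetical helper shown only to illustrate the ordering; the server itself takes a single prompt string.

```python
def build_prompt(
    subject: str,
    style: str = "",
    lighting: str = "",
    composition: str = "",
    mood: str = "",
    quality: str = "",
) -> str:
    """Join the non-empty prompt components in the recommended order."""
    parts = [subject, style, lighting, composition, mood, quality]
    return ", ".join(part.strip() for part in parts if part and part.strip())
```

Empty components are skipped, so build_prompt("red fox", style="watercolor") yields a two-part prompt without stray commas.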
Style keywords
| Style | Prompt keywords |
|---|---|
| Photography | cinematic photograph, DSLR, 85mm lens, shallow depth of field, bokeh |
| Oil painting | oil painting, thick brushstrokes, textured canvas, impressionist |
| Watercolor | watercolor painting, soft washes, paper texture, flowing colors |
| Digital art | digital art, concept art, artstation, octane render |
| Anime/illustration | anime style, cel shading, vibrant colors, clean linework |
| Sketch | pencil sketch, hand drawn, crosshatching, charcoal |
Lighting keywords
- golden hour, blue hour, dramatic lighting, rim lighting
- studio lighting, soft diffused light, volumetric light
- neon glow, bioluminescent, moonlit, candlelight
What works well with FLUX.1-schnell
- Clear subject + style — "red panda in a cozy library, watercolor style"
- Landscape scenes — fjords, forests, cities, abstract environments
- Portrait shots — animals and characters with descriptive appearance
- Concept art — futuristic cities, sci-fi environments, fantasy scenes
- Low step counts — 4 steps is designed to be near-optimal for this model
What to avoid
- Booru-style tag dumps (FLUX handles natural language better than SD1.5)
- Contradictory instructions — "dark AND bright", "realistic AND cartoon"
- Overly complex scenes at very small resolutions
Using the negative prompt
FLUX.1-schnell is guidance-distilled and typically runs at a low CFG scale, so negative prompts have less impact than in SDXL. Use them for broad exclusions:
negative_prompt="blurry, out of focus, watermark, text, signature, low quality, artifacts"
Reproducibility
Always save the seed from the TextContent output if you want to reproduce a result:
Seed: 3847291045
Then pass it back: seed=3847291045
10. Known Limitations
ComfyUI must run locally
The MCP server connects to COMFYUI_URL (default: http://localhost:8188). ComfyUI is a local application — it does not have a cloud API. You must start it before requesting image generation. The server returns a clear error message if ComfyUI is not reachable.
Model must be pre-loaded
ComfyUI loads checkpoint models into VRAM on first use. The first generation with a model takes longer as VRAM is allocated (FLUX.1-schnell: ~8GB). Subsequent generations with the same model are faster.
# Verify model is installed before generation
# → ask Lumen: "list available models in ComfyUI"
AMD ROCm setup complexity
AMD GPU support requires:
- ROCm drivers installed (rocm-smi working)
- PyTorch built with ROCm support (not the default CUDA build)
- HSA_OVERRIDE_GFX_VERSION=11.0.0 for RX 7900 XTX (gfx1100)
Without these, ComfyUI will fall back to CPU — very slow (minutes per image vs. ~8 seconds on RX 7900 XTX).
Check GPU is being used:
# In another terminal while generating:
watch -n 1 rocm-smi
# VRAM usage should spike to ~8GB during generation
Timeout on large images
The default COMFYUI_TIMEOUT=120 (2 minutes) may not be enough for:
- Very large resolutions (2048×2048+)
- High step counts (20+)
- First generation loading a new model
Increase via env var:
export COMFYUI_TIMEOUT=300 # 5 minutes
If generate_image returns a timeout error, the job may still be running in ComfyUI. Use get_generation_status(prompt_id) to check.
Ollama image gen is macOS-only (April 2026)
Ollama launched experimental image generation in January 2026, but it is macOS-only as of April 2026. Linux support is announced as "coming soon." When Linux support arrives, the server can switch backends via IMAGE_BACKEND=ollama without changing any tool signatures.
ComfyUI history is ephemeral
ComfyUI keeps generation history in memory — it is lost on restart. The get_generation_status tool will return "not_found" for old prompt IDs after a ComfyUI restart. The saved PNG file on disk persists regardless.