Files
pi_mcps/mcp/mcp-image-gen/EXPAND_GENERATE_IMAGE_Assessment.md
T
Patrick Plate 79f1e6d65f feat(mcp-image-gen): add name and count params to generate_image
- Add name (str) param: filename prefix saved as {name}_{timestamp}_{seed}.png
- Add count (int, 1-10) param: generate N images in one call
- Extract _sanitize_name() helper: strips special chars, collapses underscores, caps at 64 chars
- Extract _build_filename() helper: pure function for testable filename construction
- Extract _generate_single() coroutine: clean loop body for batch generation
- Fixed seed batches increment seed per image (seed+i-1) for deterministic variation
- random seed (-1) batches give independent random seeds per image
- Partial batch failures continue (error TextContent in slot, remaining images proceed)
- Returns flat interleaved [Text1, Image1, Text2, Image2, ...] list
- 34/34 tests passing (was 19, added 15 new tests)
2026-04-06 07:45:37 +02:00

15 KiB
Raw Blame History

Assessment: Expand generate_image with name and count Parameters

Author: Lumen | Date: 2026-04-06 | Ticket: — BigMind Session: 00070c37-b013-4342-a8ae-f81da0e3180d Status: 🔵 DRAFT — awaiting Patrick review


1. Problem Statement

The current generate_image() tool generates a single image and saves it with an auto-generated filename of {timestamp}_{seed}.png. Two common workflows are not yet supported:

  1. Named outputs — When generating thematic sets (Lumen profile images, wiki banners, concept art), the caller wants a meaningful prefix in the filename (e.g., lumen_profile_20260406_140236_2409122067.png) rather than a bare timestamp. This also enables grouping output by purpose in the directory listing.

  2. Batch generation — Generating multiple variations of the same prompt in one tool call is a common creative workflow. Currently, the caller must invoke generate_image N times with separate tool calls, which is verbose and loses the semantic grouping.

Goal: Add two optional parameters — name (filename prefix string) and count (integer repetitions) — to generate_image with minimal disruption to existing behaviour and test coverage.


2. Requirements

2.1 Functional Requirements

ID Requirement
F-1 name parameter (default "") prepends a sanitized label to the output filename
F-2 When name="" (default), filename format is unchanged: {timestamp}_{seed}.png
F-3 When name="lumen_profile", filename format is: lumen_profile_{timestamp}_{seed}.png
F-4 count parameter (default 1) generates N images sequentially
F-5 When count=1 (default), return value is identical to the current [TextContent, ImageContent]
F-6 When count=N > 1, return value is a flat list: [Text1, Image1, Text2, Image2, ..., TextN, ImageN]
F-7 When count>1 and seed=-1, each image gets an independently random seed
F-8 When count>1 and a fixed seed is provided, images use seed, seed+1, seed+2, … to produce deterministic variation
F-9 count is capped at a maximum (proposed: 10) to prevent runaway generation
F-10 name is sanitized: non-alphanumeric characters (except - and _) are stripped/replaced; max 64 chars
F-11 Partial success: if one image in a batch fails, the error is returned as a TextContent error item in that position rather than aborting the whole batch
F-12 The TextContent for each image in a batch includes the 1-of-N index: [1/3] Generated: ...

2.2 Non-Functional Requirements

ID Requirement
NF-1 Sequential generation — no concurrent ComfyUI submissions (ComfyUI queues internally; parallel MCP submissions would complicate polling)
NF-2 Backward compatibility — all existing callers with no name/count args produce identical output
NF-3 All existing 19 tests must continue to pass without modification
NF-4 New tests must cover: name prefix in filename, count=2 success, count with fixed seed increments, count with partial failure, name sanitization, count cap enforcement
NF-5 MCP tool schema (visible in Claude/Roo Code) must surface clear descriptions for the new params

3. Affected Files

File Change Type Description
mcp/mcp-image-gen/src/server.py Modify Add name: str = "" and count: int = 1 params to generate_image(); add _sanitize_name() helper; extract _generate_single() inner logic
mcp/mcp-image-gen/tests/test_server.py Modify Add 6+ new test cases covering new parameters
mcp/mcp-image-gen/README.md Modify Update generate_image tool documentation table
docs/wiki/pages/mcp-image-gen.md Modify Update tool reference table with new parameters

No schema changes, no new dependencies, no workflow JSON changes.


4. Design Decisions

4.1 Filename Convention with name

Current: {timestamp}_{seed}.png Proposed: {sanitized_name}_{timestamp}_{seed}.png (when name is provided)

The name is placed as a prefix rather than suffix so directory ls output groups named sets together alphabetically:

lumen_profile_20260406_140236_2409122067.png
lumen_profile_20260406_140258_764633840.png
wiki_banner_20260406_141000_1234567.png

Sanitization rule: re.sub(r'[^a-zA-Z0-9_-]', '_', name)[:64] — replaces any character that is not alphanumeric, dash, or underscore with _, then truncates to 64 chars.

4.2 Seed Behaviour for Batch Generation

Scenario Behaviour
count=3, seed=-1 Each call to build_flux_workflow gets seed=-1 → 3 independent random seeds
count=3, seed=42 Seeds are 42, 43, 44 — deterministic, reproducible variation

This follows the convention of most image generation tools (e.g., ComfyUI's own batch seed increment).

4.3 Return Structure for count > 1

Return a flat interleaved list: [Text1, Image1, Text2, Image2]

Rationale: MCP content lists are flat arrays. Claude/Roo Code renders them sequentially — a flat list means each image appears immediately below its metadata line. A nested structure would require the caller to unwrap it.

For count=1 (default): Behaviour is identical to today — [TextContent, ImageContent]. No caller breakage.

4.4 Refactoring: Extract _generate_single()

The current generate_image function is 180+ lines of inline logic. To support count, the inner pipeline (queue → poll → history → download → save → encode) will be extracted to a private async def _generate_single(prompt, ..., index, total) coroutine. generate_image then loops count times calling _generate_single and accumulates results.

This refactoring:

  • Makes the count loop clean (results.extend(await _generate_single(...)))
  • Makes partial failure handling straightforward (catch per iteration)
  • Improves testability of the single-image path

4.5 Maximum Count Cap

Cap count at 10. Rationale:

  • FLUX.1-schnell takes ~1035s per image on RX 7900 XTX → 10 images ≈ 100350s maximum
  • MCP tool call timeout in Roo Code defaults to 5 minutes — 10 images is safe margin
  • ComfyUI queues them internally; the MCP server polls sequentially, not in parallel

When count > 10, the tool returns a single TextContent error immediately (no images generated) with message: "count={N} exceeds maximum of 10. Reduce count and retry."


5. Implementation Plan

Step 1 — Add _sanitize_name() helper

import re

def _sanitize_name(name: str) -> str:
    """Sanitize a name for use as a filename prefix."""
    sanitized = re.sub(r'[^a-zA-Z0-9_-]', '_', name)
    return sanitized[:64]

Location: server.py, after build_flux_workflow() (pure function section).

Step 2 — Extract _generate_single() coroutine

Extract the body of the current generate_image (lines 162310) into:

async def _generate_single(
    prompt: str,
    width: int,
    height: int,
    steps: int,
    model: str,
    seed: int,
    negative_prompt: str,
    resolved_output_dir: Path,
    filename_prefix: str,
    index: int,
    total: int,
) -> list:

The filename construction changes to:

filename = f"{filename_prefix}{timestamp}_{actual_seed}.png"
# where filename_prefix = f"{sanitized_name}_" if sanitized_name else ""

The TextContent text changes when total > 1:

prefix_label = f"[{index}/{total}] " if total > 1 else ""
text = f"{prefix_label}Generated: {out_path}\nSeed: ..."

Step 3 — Update generate_image() signature

@mcp.tool()
async def generate_image(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    steps: int = 4,
    model: str = "flux1-schnell.safetensors",
    seed: int = -1,
    negative_prompt: str = "",
    output_dir: str = "",
    name: str = "",
    count: int = 1,
) -> list:

Body of generate_image becomes:

# Validate count
MAX_COUNT = 10
if count < 1 or count > MAX_COUNT:
    return [TextContent(type="text", text=f"count={count} is invalid. Must be 1{MAX_COUNT}.")]

sanitized_name = _sanitize_name(name) if name else ""
filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
resolved_output_dir = Path(output_dir or IMAGE_OUTPUT_DIR).expanduser().resolve()

results = []
for i in range(1, count + 1):
    actual_seed = seed if seed == -1 else seed + (i - 1)
    items = await _generate_single(
        prompt=prompt, width=width, height=height, steps=steps,
        model=model, seed=actual_seed, negative_prompt=negative_prompt,
        resolved_output_dir=resolved_output_dir,
        filename_prefix=filename_prefix, index=i, total=count,
    )
    results.extend(items)

return results

Step 4 — Write new tests

Add to test_server.py:

Test Description
test_generate_image_with_name name="lumen" → filename starts with lumen_
test_generate_image_name_sanitization name="my image! v2"my_image__v2_ prefix
test_generate_image_count_2_success count=2 → 4 items in result, 2 files saved
test_generate_image_count_fixed_seed count=2, seed=42 → seeds 42 and 43 in filenames
test_generate_image_count_partial_failure count=2, second POST fails → 2 items (success) + 1 item (error)
test_generate_image_count_cap_exceeded count=11 → single TextContent error, no generation
test_generate_image_count_0_invalid count=0 → single TextContent error
test_generate_image_name_and_count_combined name="banner", count=2 → both files prefixed banner_

Step 5 — Update documentation

Step 6 — Run full test suite

cd mcp/mcp-image-gen && uv run pytest tests/ -v --tb=short

All 19 existing + 8 new = 27 tests must pass.

Step 7 — Commit and push

Branch: feat/mcp-image-gen/generate-image-name-count Commit: feat(mcp-image-gen): add name and count params to generate_image


6. Risks

Risk Likelihood Impact Mitigation
Partial batch failure leaves orphaned files on disk Medium Low Files for successful images are kept; error TextContent clearly identifies which index failed. No cleanup needed — partial results are useful.
count loop adds significant latency visible in Roo Code Medium Medium Document expected time: count × ~15s. MCP timeout is 5 min; max 10 images ≈ 150s. Still within limit.
Seed increment wraps around at 2^32 Very Low Low (seed + i - 1) % 2**32 — add modulo guard in _generate_single
_generate_single refactor introduces regression in existing tests Low High Existing test fixtures mock ComfyUI endpoints — as long as the HTTP call sequence is unchanged, respx mocks will match. Verify each existing test still passes before adding new ones.
name with only special chars becomes empty after sanitization Low Medium After sanitization, if result is empty string, treat as unnamed (no prefix). Add assertion in _sanitize_name to return "" for all-whitespace/special inputs.
MCP tool schema change breaks existing callers Very Low Low New params are optional with defaults — backward compatible. Roo Code re-reads schema on server restart.

7. Alternatives Considered

7.1 Separate generate_images_batch() Tool (Rejected)

Add a new tool instead of expanding generate_image.

Pros: Clean separation, no refactoring of existing tool. Cons: Two tools for the same backend; callers must learn two tool names; MCP tool list grows. The MCP convention favours extending existing tools with optional parameters rather than proliferating tools.

Verdict: Rejected. Optional parameters with backward-compatible defaults is the right pattern here.

7.2 Return Grouped List of Lists for count > 1 (Rejected)

Return [[Text1, Image1], [Text2, Image2]] for batch results.

Pros: Caller can index by image number cleanly. Cons: MCP content type is a flat list[ContentBlock]. FastMCP does not support nested lists in tool returns — they would be serialized as strings, not rendered. Roo Code renders content sequentially; flat interleaved is the idiomatic structure.

Verdict: Rejected. Flat interleaved list [Text1, Image1, Text2, Image2] is MCP-idiomatic.

7.3 Parallel ComfyUI Submission for Batch (Rejected)

Submit all count prompts to ComfyUI simultaneously (async tasks), then collect results in order.

Pros: Faster if ComfyUI supports parallel queue processing (it does). Cons: ComfyUI processes one job at a time on a single GPU regardless — parallel submission just fills the queue. Polling becomes complex (N polling loops). Error handling harder. Out-of-order completions break index alignment.

Verdict: Rejected for v1. Sequential submission is simpler, correct, and produces no worse throughput. Can revisit if ComfyUI gains true parallel processing support.

7.4 Name as Subdirectory Instead of Filename Prefix (Rejected)

When name="lumen", save to output_dir/lumen/ instead of output_dir/lumen_*.png.

Pros: Better directory organisation for large sets. Cons: Complicates the implementation (directory creation per name), changes the return path format, breaks callers who assume a flat output directory. Adds complexity for minimal gain at count ≤ 10.

Verdict: Rejected for v1. Prefix approach is simpler and equally readable.


8. Success Criteria

Criterion Measure
All 27 tests pass uv run pytest tests/ -v exits 0
name="lumen" → file starts with lumen_ Assert in test_generate_image_with_name
count=2 → 4 content items, 2 files Assert len(result) == 4, len(glob("*.png")) == 2
count=2, seed=42 → seeds 42 and 43 Assert seed values in TextContent
count=11 → error TextContent, no ComfyUI call Assert len(result) == 1, no /api/prompt mock hit
Backward compat: existing callers unaffected All 19 existing tests pass without modification
MCP tool schema shows name and count params Visible in Roo Code tool list after server restart

9. Open Questions

# Question Owner Priority
Q1 Should count=0 be an error, or silently return [] (empty list)? Patrick Low — assessment recommends error for clarity
Q2 Max count cap: 10 or higher? 10 ≈ 150s max at 15s/image — feels right, but could be raised to 20 for batch profile image sets. Patrick Medium
Q3 Should partial batch failure stop remaining iterations, or always complete all N? Patrick Medium — assessment recommends continue (partial success)
Q4 Should name parameter also tag the TextContent output text, e.g. [lumen_profile 1/3] Generated: ...? Patrick Low