Files

T

Patrick Plate 79f1e6d65f feat(mcp-image-gen): add name and count params to generate_image

- Add name (str) param: filename prefix saved as {name}_{timestamp}_{seed}.png
- Add count (int, 1-10) param: generate N images in one call
- Extract _sanitize_name() helper: strips special chars, collapses underscores, caps at 64 chars
- Extract _build_filename() helper: pure function for testable filename construction
- Extract _generate_single() coroutine: clean loop body for batch generation
- Fixed seed batches increment seed per image (seed+i-1) for deterministic variation
- random seed (-1) batches give independent random seeds per image
- Partial batch failures continue (error TextContent in slot, remaining images proceed)
- Returns flat interleaved [Text1, Image1, Text2, Image2, ...] list
- 34/34 tests passing (was 19, added 15 new tests)

2026-04-06 07:45:37 +02:00

15 KiB

Raw Blame History

Assessment: Expand `generate_image` with `name` and `count` Parameters

Author: Lumen | Date: 2026-04-06 | Ticket: — BigMind Session: 00070c37-b013-4342-a8ae-f81da0e3180d Status: 🔵 DRAFT — awaiting Patrick review

1. Problem Statement

The current generate_image() tool generates a single image and saves it with an auto-generated filename of {timestamp}_{seed}.png. Two common workflows are not yet supported:

Named outputs — When generating thematic sets (Lumen profile images, wiki banners, concept art), the caller wants a meaningful prefix in the filename (e.g., lumen_profile_20260406_140236_2409122067.png) rather than a bare timestamp. This also enables grouping output by purpose in the directory listing.
Batch generation — Generating multiple variations of the same prompt in one tool call is a common creative workflow. Currently, the caller must invoke generate_image N times with separate tool calls, which is verbose and loses the semantic grouping.

Goal: Add two optional parameters — name (filename prefix string) and count (integer repetitions) — to generate_image with minimal disruption to existing behaviour and test coverage.

2. Requirements

2.1 Functional Requirements

ID	Requirement
F-1	`name` parameter (default `""`) prepends a sanitized label to the output filename
F-2	When `name=""` (default), filename format is unchanged: `{timestamp}_{seed}.png`
F-3	When `name="lumen_profile"`, filename format is: `lumen_profile_{timestamp}_{seed}.png`
F-4	`count` parameter (default `1`) generates N images sequentially
F-5	When `count=1` (default), return value is identical to the current `[TextContent, ImageContent]`
F-6	When `count=N > 1`, return value is a flat list: `[Text1, Image1, Text2, Image2, ..., TextN, ImageN]`
F-7	When `count>1` and `seed=-1`, each image gets an independently random seed
F-8	When `count>1` and a fixed `seed` is provided, images use `seed`, `seed+1`, `seed+2`, … to produce deterministic variation
F-9	`count` is capped at a maximum (proposed: 10) to prevent runaway generation
F-10	`name` is sanitized: non-alphanumeric characters (except `-` and `_`) are stripped/replaced; max 64 chars
F-11	Partial success: if one image in a batch fails, the error is returned as a `TextContent` error item in that position rather than aborting the whole batch
F-12	The TextContent for each image in a batch includes the 1-of-N index: `[1/3] Generated: ...`

2.2 Non-Functional Requirements

ID	Requirement
NF-1	Sequential generation — no concurrent ComfyUI submissions (ComfyUI queues internally; parallel MCP submissions would complicate polling)
NF-2	Backward compatibility — all existing callers with no `name`/`count` args produce identical output
NF-3	All existing 19 tests must continue to pass without modification
NF-4	New tests must cover: name prefix in filename, count=2 success, count with fixed seed increments, count with partial failure, name sanitization, count cap enforcement
NF-5	MCP tool schema (visible in Claude/Roo Code) must surface clear descriptions for the new params

3. Affected Files

File	Change Type	Description
`mcp/mcp-image-gen/src/server.py`	Modify	Add `name: str = ""` and `count: int = 1` params to `generate_image()`; add `_sanitize_name()` helper; extract `_generate_single()` inner logic
`mcp/mcp-image-gen/tests/test_server.py`	Modify	Add 6+ new test cases covering new parameters
`mcp/mcp-image-gen/README.md`	Modify	Update `generate_image` tool documentation table
`docs/wiki/pages/mcp-image-gen.md`	Modify	Update tool reference table with new parameters

No schema changes, no new dependencies, no workflow JSON changes.

4. Design Decisions

4.1 Filename Convention with `name`

Current: {timestamp}_{seed}.png Proposed: {sanitized_name}_{timestamp}_{seed}.png (when name is provided)

The name is placed as a prefix rather than suffix so directory ls output groups named sets together alphabetically:

lumen_profile_20260406_140236_2409122067.png
lumen_profile_20260406_140258_764633840.png
wiki_banner_20260406_141000_1234567.png

Sanitization rule: re.sub(r'[^a-zA-Z0-9_-]', '_', name)[:64] — replaces any character that is not alphanumeric, dash, or underscore with _, then truncates to 64 chars.

4.2 Seed Behaviour for Batch Generation

Scenario	Behaviour
`count=3, seed=-1`	Each call to `build_flux_workflow` gets `seed=-1` → 3 independent random seeds
`count=3, seed=42`	Seeds are 42, 43, 44 — deterministic, reproducible variation

This follows the convention of most image generation tools (e.g., ComfyUI's own batch seed increment).

4.3 Return Structure for `count > 1`

Return a flat interleaved list: [Text1, Image1, Text2, Image2]

Rationale: MCP content lists are flat arrays. Claude/Roo Code renders them sequentially — a flat list means each image appears immediately below its metadata line. A nested structure would require the caller to unwrap it.

For count=1 (default): Behaviour is identical to today — [TextContent, ImageContent]. No caller breakage.

4.4 Refactoring: Extract `_generate_single()`

The current generate_image function is 180+ lines of inline logic. To support count, the inner pipeline (queue → poll → history → download → save → encode) will be extracted to a private async def _generate_single(prompt, ..., index, total) coroutine. generate_image then loops count times calling _generate_single and accumulates results.

This refactoring:

Makes the count loop clean (results.extend(await _generate_single(...)))
Makes partial failure handling straightforward (catch per iteration)
Improves testability of the single-image path

4.5 Maximum Count Cap

Cap count at 10. Rationale:

FLUX.1-schnell takes ~10–35s per image on RX 7900 XTX → 10 images ≈ 100–350s maximum
MCP tool call timeout in Roo Code defaults to 5 minutes — 10 images is safe margin
ComfyUI queues them internally; the MCP server polls sequentially, not in parallel

When count > 10, the tool returns a single TextContent error immediately (no images generated) with message: "count={N} exceeds maximum of 10. Reduce count and retry."

5. Implementation Plan

Step 1 — Add `_sanitize_name()` helper

import re

def _sanitize_name(name: str) -> str:
    """Sanitize a name for use as a filename prefix."""
    sanitized = re.sub(r'[^a-zA-Z0-9_-]', '_', name)
    return sanitized[:64]

Location: server.py, after build_flux_workflow() (pure function section).

Step 2 — Extract `_generate_single()` coroutine

Extract the body of the current generate_image (lines 162–310) into:

async def _generate_single(
    prompt: str,
    width: int,
    height: int,
    steps: int,
    model: str,
    seed: int,
    negative_prompt: str,
    resolved_output_dir: Path,
    filename_prefix: str,
    index: int,
    total: int,
) -> list:

The filename construction changes to:

filename = f"{filename_prefix}{timestamp}_{actual_seed}.png"
# where filename_prefix = f"{sanitized_name}_" if sanitized_name else ""

The TextContent text changes when total > 1:

prefix_label = f"[{index}/{total}] " if total > 1 else ""
text = f"{prefix_label}Generated: {out_path}\nSeed: ..."

Step 3 — Update `generate_image()` signature

@mcp.tool()
async def generate_image(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    steps: int = 4,
    model: str = "flux1-schnell.safetensors",
    seed: int = -1,
    negative_prompt: str = "",
    output_dir: str = "",
    name: str = "",
    count: int = 1,
) -> list:

Body of generate_image becomes:

# Validate count
MAX_COUNT = 10
if count < 1 or count > MAX_COUNT:
    return [TextContent(type="text", text=f"count={count} is invalid. Must be 1–{MAX_COUNT}.")]

sanitized_name = _sanitize_name(name) if name else ""
filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
resolved_output_dir = Path(output_dir or IMAGE_OUTPUT_DIR).expanduser().resolve()

results = []
for i in range(1, count + 1):
    actual_seed = seed if seed == -1 else seed + (i - 1)
    items = await _generate_single(
        prompt=prompt, width=width, height=height, steps=steps,
        model=model, seed=actual_seed, negative_prompt=negative_prompt,
        resolved_output_dir=resolved_output_dir,
        filename_prefix=filename_prefix, index=i, total=count,
    )
    results.extend(items)

return results

Step 4 — Write new tests

Add to test_server.py:

Test	Description
`test_generate_image_with_name`	`name="lumen"` → filename starts with `lumen_`
`test_generate_image_name_sanitization`	`name="my image! v2"` → `my_image__v2_` prefix
`test_generate_image_count_2_success`	`count=2` → 4 items in result, 2 files saved
`test_generate_image_count_fixed_seed`	`count=2, seed=42` → seeds 42 and 43 in filenames
`test_generate_image_count_partial_failure`	`count=2`, second POST fails → 2 items (success) + 1 item (error)
`test_generate_image_count_cap_exceeded`	`count=11` → single TextContent error, no generation
`test_generate_image_count_0_invalid`	`count=0` → single TextContent error
`test_generate_image_name_and_count_combined`	`name="banner", count=2` → both files prefixed `banner_`

Step 5 — Update documentation

Update generate_image docstring in server.py to document name and count
Update parameter table in README.md
Update tool reference in docs/wiki/pages/mcp-image-gen.md

Step 6 — Run full test suite

cd mcp/mcp-image-gen && uv run pytest tests/ -v --tb=short

All 19 existing + 8 new = 27 tests must pass.

Step 7 — Commit and push

Branch: feat/mcp-image-gen/generate-image-name-count Commit: feat(mcp-image-gen): add name and count params to generate_image

6. Risks

Risk	Likelihood	Impact	Mitigation
Partial batch failure leaves orphaned files on disk	Medium	Low	Files for successful images are kept; error TextContent clearly identifies which index failed. No cleanup needed — partial results are useful.
`count` loop adds significant latency visible in Roo Code	Medium	Medium	Document expected time: `count × ~15s`. MCP timeout is 5 min; max 10 images ≈ 150s. Still within limit.
Seed increment wraps around at `2^32`	Very Low	Low	`(seed + i - 1) % 2**32` — add modulo guard in `_generate_single`
`_generate_single` refactor introduces regression in existing tests	Low	High	Existing test fixtures mock ComfyUI endpoints — as long as the HTTP call sequence is unchanged, respx mocks will match. Verify each existing test still passes before adding new ones.
`name` with only special chars becomes empty after sanitization	Low	Medium	After sanitization, if result is empty string, treat as unnamed (no prefix). Add assertion in `_sanitize_name` to return `""` for all-whitespace/special inputs.
MCP tool schema change breaks existing callers	Very Low	Low	New params are optional with defaults — backward compatible. Roo Code re-reads schema on server restart.

7. Alternatives Considered

7.1 Separate `generate_images_batch()` Tool (Rejected)

Add a new tool instead of expanding generate_image.

Pros: Clean separation, no refactoring of existing tool. Cons: Two tools for the same backend; callers must learn two tool names; MCP tool list grows. The MCP convention favours extending existing tools with optional parameters rather than proliferating tools.

Verdict: Rejected. Optional parameters with backward-compatible defaults is the right pattern here.

7.2 Return Grouped List of Lists for `count > 1` (Rejected)

Return [[Text1, Image1], [Text2, Image2]] for batch results.

Pros: Caller can index by image number cleanly. Cons: MCP content type is a flat list[ContentBlock]. FastMCP does not support nested lists in tool returns — they would be serialized as strings, not rendered. Roo Code renders content sequentially; flat interleaved is the idiomatic structure.

Verdict: Rejected. Flat interleaved list [Text1, Image1, Text2, Image2] is MCP-idiomatic.

7.3 Parallel ComfyUI Submission for Batch (Rejected)

Submit all count prompts to ComfyUI simultaneously (async tasks), then collect results in order.

Pros: Faster if ComfyUI supports parallel queue processing (it does). Cons: ComfyUI processes one job at a time on a single GPU regardless — parallel submission just fills the queue. Polling becomes complex (N polling loops). Error handling harder. Out-of-order completions break index alignment.

Verdict: Rejected for v1. Sequential submission is simpler, correct, and produces no worse throughput. Can revisit if ComfyUI gains true parallel processing support.

7.4 Name as Subdirectory Instead of Filename Prefix (Rejected)

When name="lumen", save to output_dir/lumen/ instead of output_dir/lumen_*.png.

Pros: Better directory organisation for large sets. Cons: Complicates the implementation (directory creation per name), changes the return path format, breaks callers who assume a flat output directory. Adds complexity for minimal gain at count ≤ 10.

Verdict: Rejected for v1. Prefix approach is simpler and equally readable.

8. Success Criteria

Criterion	Measure
All 27 tests pass	`uv run pytest tests/ -v` exits 0
`name="lumen"` → file starts with `lumen_`	Assert in `test_generate_image_with_name`
`count=2` → 4 content items, 2 files	Assert `len(result) == 4`, `len(glob("*.png")) == 2`
`count=2, seed=42` → seeds 42 and 43	Assert seed values in TextContent
`count=11` → error TextContent, no ComfyUI call	Assert `len(result) == 1`, no `/api/prompt` mock hit
Backward compat: existing callers unaffected	All 19 existing tests pass without modification
MCP tool schema shows `name` and `count` params	Visible in Roo Code tool list after server restart

9. Open Questions

#	Question	Owner	Priority
Q1	Should `count=0` be an error, or silently return `[]` (empty list)?	Patrick	Low — assessment recommends error for clarity
Q2	Max count cap: 10 or higher? 10 ≈ 150s max at 15s/image — feels right, but could be raised to 20 for batch profile image sets.	Patrick	Medium
Q3	Should partial batch failure stop remaining iterations, or always complete all N?	Patrick	Medium — assessment recommends continue (partial success)
Q4	Should `name` parameter also tag the TextContent output text, e.g. `[lumen_profile 1/3] Generated: ...`?	Patrick	Low

15 KiB Raw Blame History Unescape Escape

Assessment: Expand generate_image with name and count Parameters