79f1e6d65f
- Add name (str) param: filename prefix saved as {name}_{timestamp}_{seed}.png
- Add count (int, 1-10) param: generate N images in one call
- Extract _sanitize_name() helper: strips special chars, collapses underscores, caps at 64 chars
- Extract _build_filename() helper: pure function for testable filename construction
- Extract _generate_single() coroutine: clean loop body for batch generation
- Fixed seed batches increment seed per image (seed+i-1) for deterministic variation
- random seed (-1) batches give independent random seeds per image
- Partial batch failures continue (error TextContent in slot, remaining images proceed)
- Returns flat interleaved [Text1, Image1, Text2, Image2, ...] list
- 34/34 tests passing (was 19, added 15 new tests)
320 lines
15 KiB
Markdown
320 lines
15 KiB
Markdown
# Assessment: Expand `generate_image` with `name` and `count` Parameters
|
||
|
||
*Author: Lumen | Date: 2026-04-06 | Ticket: —*
|
||
*BigMind Session: `00070c37-b013-4342-a8ae-f81da0e3180d`*
|
||
*Status: 🔵 DRAFT — awaiting Patrick review*
|
||
|
||
---
|
||
|
||
## 1. Problem Statement
|
||
|
||
The current [`generate_image()`](mcp/mcp-image-gen/src/server.py:133) tool generates a single image and saves it with an auto-generated filename of `{timestamp}_{seed}.png`. Two common workflows are not yet supported:
|
||
|
||
1. **Named outputs** — When generating thematic sets (Lumen profile images, wiki banners, concept art), the caller wants a meaningful prefix in the filename (e.g., `lumen_profile_20260406_140236_2409122067.png`) rather than a bare timestamp. This also enables grouping output by purpose in the directory listing.
|
||
|
||
2. **Batch generation** — Generating multiple variations of the same prompt in one tool call is a common creative workflow. Currently, the caller must invoke `generate_image` N times with separate tool calls, which is verbose and loses the semantic grouping.
|
||
|
||
**Goal:** Add two optional parameters — `name` (filename prefix string) and `count` (integer repetitions) — to `generate_image` with minimal disruption to existing behaviour and test coverage.
|
||
|
||
---
|
||
|
||
## 2. Requirements
|
||
|
||
### 2.1 Functional Requirements
|
||
|
||
| ID | Requirement |
|
||
|----|-------------|
|
||
| F-1 | `name` parameter (default `""`) prepends a sanitized label to the output filename |
|
||
| F-2 | When `name=""` (default), filename format is unchanged: `{timestamp}_{seed}.png` |
|
||
| F-3 | When `name="lumen_profile"`, filename format is: `lumen_profile_{timestamp}_{seed}.png` |
|
||
| F-4 | `count` parameter (default `1`) generates N images sequentially |
|
||
| F-5 | When `count=1` (default), return value is identical to the current `[TextContent, ImageContent]` |
|
||
| F-6 | When `count=N > 1`, return value is a flat list: `[Text1, Image1, Text2, Image2, ..., TextN, ImageN]` |
|
||
| F-7 | When `count>1` and `seed=-1`, each image gets an independently random seed |
|
||
| F-8 | When `count>1` and a fixed `seed` is provided, images use `seed`, `seed+1`, `seed+2`, … to produce deterministic variation |
|
||
| F-9 | `count` is capped at a maximum (proposed: 10) to prevent runaway generation |
|
||
| F-10 | `name` is sanitized: non-alphanumeric characters (except `-` and `_`) are stripped/replaced; max 64 chars |
|
||
| F-11 | Partial success: if one image in a batch fails, the error is returned as a `TextContent` error item in that position rather than aborting the whole batch |
|
||
| F-12 | The TextContent for each image in a batch includes the 1-of-N index: `[1/3] Generated: ...` |
|
||
|
||
### 2.2 Non-Functional Requirements
|
||
|
||
| ID | Requirement |
|
||
|----|-------------|
|
||
| NF-1 | Sequential generation — no concurrent ComfyUI submissions (ComfyUI queues internally; parallel MCP submissions would complicate polling) |
|
||
| NF-2 | Backward compatibility — all existing callers with no `name`/`count` args produce identical output |
|
||
| NF-3 | All existing 19 tests must continue to pass without modification |
|
||
| NF-4 | New tests must cover: name prefix in filename, count=2 success, count with fixed seed increments, count with partial failure, name sanitization, count cap enforcement |
|
||
| NF-5 | MCP tool schema (visible in Claude/Roo Code) must surface clear descriptions for the new params |
|
||
|
||
---
|
||
|
||
## 3. Affected Files
|
||
|
||
| File | Change Type | Description |
|
||
|------|-------------|-------------|
|
||
| [`mcp/mcp-image-gen/src/server.py`](mcp/mcp-image-gen/src/server.py:133) | Modify | Add `name: str = ""` and `count: int = 1` params to `generate_image()`; add `_sanitize_name()` helper; extract `_generate_single()` inner logic |
|
||
| [`mcp/mcp-image-gen/tests/test_server.py`](mcp/mcp-image-gen/tests/test_server.py:1) | Modify | Add 6+ new test cases covering new parameters |
|
||
| [`mcp/mcp-image-gen/README.md`](mcp/mcp-image-gen/README.md) | Modify | Update `generate_image` tool documentation table |
|
||
| [`docs/wiki/pages/mcp-image-gen.md`](docs/wiki/pages/mcp-image-gen.md) | Modify | Update tool reference table with new parameters |
|
||
|
||
No schema changes, no new dependencies, no workflow JSON changes.
|
||
|
||
---
|
||
|
||
## 4. Design Decisions
|
||
|
||
### 4.1 Filename Convention with `name`
|
||
|
||
**Current:** `{timestamp}_{seed}.png`
|
||
**Proposed:** `{sanitized_name}_{timestamp}_{seed}.png` (when `name` is provided)
|
||
|
||
The `name` is placed as a **prefix** rather than suffix so directory `ls` output groups named sets together alphabetically:
|
||
```
|
||
lumen_profile_20260406_140236_2409122067.png
|
||
lumen_profile_20260406_140258_764633840.png
|
||
wiki_banner_20260406_141000_1234567.png
|
||
```
|
||
|
||
**Sanitization rule:** `re.sub(r'[^a-zA-Z0-9_-]', '_', name)[:64]` — replaces any character that is not alphanumeric, dash, or underscore with `_`, then truncates to 64 chars.
|
||
|
||
### 4.2 Seed Behaviour for Batch Generation
|
||
|
||
| Scenario | Behaviour |
|
||
|----------|-----------|
|
||
| `count=3, seed=-1` | Each call to `build_flux_workflow` gets `seed=-1` → 3 independent random seeds |
|
||
| `count=3, seed=42` | Seeds are 42, 43, 44 — deterministic, reproducible variation |
|
||
|
||
This follows the convention of most image generation tools (e.g., ComfyUI's own batch seed increment).
|
||
|
||
### 4.3 Return Structure for `count > 1`
|
||
|
||
Return a **flat interleaved list**: `[Text1, Image1, Text2, Image2]`
|
||
|
||
**Rationale:** MCP content lists are flat arrays. Claude/Roo Code renders them sequentially — a flat list means each image appears immediately below its metadata line. A nested structure would require the caller to unwrap it.
|
||
|
||
**For `count=1` (default):** Behaviour is identical to today — `[TextContent, ImageContent]`. No caller breakage.
|
||
|
||
### 4.4 Refactoring: Extract `_generate_single()`
|
||
|
||
The current `generate_image` function is 180+ lines of inline logic. To support `count`, the inner pipeline (queue → poll → history → download → save → encode) will be extracted to a private `async def _generate_single(prompt, ..., index, total)` coroutine. `generate_image` then loops `count` times calling `_generate_single` and accumulates results.
|
||
|
||
This refactoring:
|
||
- Makes the count loop clean (`results.extend(await _generate_single(...))`)
|
||
- Makes partial failure handling straightforward (catch per iteration)
|
||
- Improves testability of the single-image path
|
||
|
||
### 4.5 Maximum Count Cap
|
||
|
||
Cap `count` at **10**. Rationale:
|
||
- FLUX.1-schnell takes ~10–35s per image on RX 7900 XTX → 10 images ≈ 100–350s maximum
|
||
- MCP tool call timeout in Roo Code defaults to 5 minutes — 10 images is safe margin
|
||
- ComfyUI queues them internally; the MCP server polls sequentially, not in parallel
|
||
|
||
When `count > 10`, the tool returns a single `TextContent` error immediately (no images generated) with message: `"count={N} exceeds maximum of 10. Reduce count and retry."`
|
||
|
||
---
|
||
|
||
## 5. Implementation Plan
|
||
|
||
### Step 1 — Add `_sanitize_name()` helper
|
||
|
||
```python
|
||
import re
|
||
|
||
def _sanitize_name(name: str) -> str:
|
||
"""Sanitize a name for use as a filename prefix."""
|
||
sanitized = re.sub(r'[^a-zA-Z0-9_-]', '_', name)
|
||
return sanitized[:64]
|
||
```
|
||
|
||
Location: [`server.py`](mcp/mcp-image-gen/src/server.py:95), after `build_flux_workflow()` (pure function section).
|
||
|
||
### Step 2 — Extract `_generate_single()` coroutine
|
||
|
||
Extract the body of the current `generate_image` (lines 162–310) into:
|
||
|
||
```python
|
||
async def _generate_single(
|
||
prompt: str,
|
||
width: int,
|
||
height: int,
|
||
steps: int,
|
||
model: str,
|
||
seed: int,
|
||
negative_prompt: str,
|
||
resolved_output_dir: Path,
|
||
filename_prefix: str,
|
||
index: int,
|
||
total: int,
|
||
) -> list:
|
||
```
|
||
|
||
The `filename` construction changes to:
|
||
```python
|
||
filename = f"{filename_prefix}{timestamp}_{actual_seed}.png"
|
||
# where filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
|
||
```
|
||
|
||
The `TextContent` text changes when `total > 1`:
|
||
```python
|
||
prefix_label = f"[{index}/{total}] " if total > 1 else ""
|
||
text = f"{prefix_label}Generated: {out_path}\nSeed: ..."
|
||
```
|
||
|
||
### Step 3 — Update `generate_image()` signature
|
||
|
||
```python
|
||
@mcp.tool()
|
||
async def generate_image(
|
||
prompt: str,
|
||
width: int = 1024,
|
||
height: int = 1024,
|
||
steps: int = 4,
|
||
model: str = "flux1-schnell.safetensors",
|
||
seed: int = -1,
|
||
negative_prompt: str = "",
|
||
output_dir: str = "",
|
||
name: str = "",
|
||
count: int = 1,
|
||
) -> list:
|
||
```
|
||
|
||
Body of `generate_image` becomes:
|
||
|
||
```python
|
||
# Validate count
|
||
MAX_COUNT = 10
|
||
if count < 1 or count > MAX_COUNT:
|
||
return [TextContent(type="text", text=f"count={count} is invalid. Must be 1–{MAX_COUNT}.")]
|
||
|
||
sanitized_name = _sanitize_name(name) if name else ""
|
||
filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
|
||
resolved_output_dir = Path(output_dir or IMAGE_OUTPUT_DIR).expanduser().resolve()
|
||
|
||
results = []
|
||
for i in range(1, count + 1):
|
||
actual_seed = seed if seed == -1 else seed + (i - 1)
|
||
items = await _generate_single(
|
||
prompt=prompt, width=width, height=height, steps=steps,
|
||
model=model, seed=actual_seed, negative_prompt=negative_prompt,
|
||
resolved_output_dir=resolved_output_dir,
|
||
filename_prefix=filename_prefix, index=i, total=count,
|
||
)
|
||
results.extend(items)
|
||
|
||
return results
|
||
```
|
||
|
||
### Step 4 — Write new tests
|
||
|
||
Add to [`test_server.py`](mcp/mcp-image-gen/tests/test_server.py:550):
|
||
|
||
| Test | Description |
|
||
|------|-------------|
|
||
| `test_generate_image_with_name` | `name="lumen"` → filename starts with `lumen_` |
|
||
| `test_generate_image_name_sanitization` | `name="my image! v2"` → `my_image__v2_` prefix |
|
||
| `test_generate_image_count_2_success` | `count=2` → 4 items in result, 2 files saved |
|
||
| `test_generate_image_count_fixed_seed` | `count=2, seed=42` → seeds 42 and 43 in filenames |
|
||
| `test_generate_image_count_partial_failure` | `count=2`, second POST fails → 2 items (success) + 1 item (error) |
|
||
| `test_generate_image_count_cap_exceeded` | `count=11` → single TextContent error, no generation |
|
||
| `test_generate_image_count_0_invalid` | `count=0` → single TextContent error |
|
||
| `test_generate_image_name_and_count_combined` | `name="banner", count=2` → both files prefixed `banner_` |
|
||
|
||
### Step 5 — Update documentation
|
||
|
||
- Update `generate_image` docstring in [`server.py`](mcp/mcp-image-gen/src/server.py:144) to document `name` and `count`
|
||
- Update parameter table in [`README.md`](mcp/mcp-image-gen/README.md)
|
||
- Update tool reference in [`docs/wiki/pages/mcp-image-gen.md`](docs/wiki/pages/mcp-image-gen.md)
|
||
|
||
### Step 6 — Run full test suite
|
||
|
||
```bash
|
||
cd mcp/mcp-image-gen && uv run pytest tests/ -v --tb=short
|
||
```
|
||
|
||
All 19 existing + 8 new = **27 tests** must pass.
|
||
|
||
### Step 7 — Commit and push
|
||
|
||
Branch: `feat/mcp-image-gen/generate-image-name-count`
|
||
Commit: `feat(mcp-image-gen): add name and count params to generate_image`
|
||
|
||
---
|
||
|
||
## 6. Risks
|
||
|
||
| Risk | Likelihood | Impact | Mitigation |
|
||
|------|------------|--------|------------|
|
||
| Partial batch failure leaves orphaned files on disk | Medium | Low | Files for successful images are kept; error TextContent clearly identifies which index failed. No cleanup needed — partial results are useful. |
|
||
| `count` loop adds significant latency visible in Roo Code | Medium | Medium | Document expected time: `count × ~15s`. MCP timeout is 5 min; max 10 images ≈ 150s. Still within limit. |
|
||
| Seed increment wraps around at `2^32` | Very Low | Low | `(seed + i - 1) % 2**32` — add modulo guard in `_generate_single` |
|
||
| `_generate_single` refactor introduces regression in existing tests | Low | High | Existing test fixtures mock ComfyUI endpoints — as long as the HTTP call sequence is unchanged, respx mocks will match. Verify each existing test still passes before adding new ones. |
|
||
| `name` with only special chars becomes empty after sanitization | Low | Medium | After sanitization, if result is empty string, treat as unnamed (no prefix). Add assertion in `_sanitize_name` to return `""` for all-whitespace/special inputs. |
|
||
| MCP tool schema change breaks existing callers | Very Low | Low | New params are optional with defaults — backward compatible. Roo Code re-reads schema on server restart. |
|
||
|
||
---
|
||
|
||
## 7. Alternatives Considered
|
||
|
||
### 7.1 Separate `generate_images_batch()` Tool (Rejected)
|
||
|
||
Add a new tool instead of expanding `generate_image`.
|
||
|
||
**Pros:** Clean separation, no refactoring of existing tool.
|
||
**Cons:** Two tools for the same backend; callers must learn two tool names; MCP tool list grows. The MCP convention favours extending existing tools with optional parameters rather than proliferating tools.
|
||
|
||
**Verdict:** Rejected. Optional parameters with backward-compatible defaults is the right pattern here.
|
||
|
||
### 7.2 Return Grouped List of Lists for `count > 1` (Rejected)
|
||
|
||
Return `[[Text1, Image1], [Text2, Image2]]` for batch results.
|
||
|
||
**Pros:** Caller can index by image number cleanly.
|
||
**Cons:** MCP content type is a flat `list[ContentBlock]`. FastMCP does not support nested lists in tool returns — they would be serialized as strings, not rendered. Roo Code renders content sequentially; flat interleaved is the idiomatic structure.
|
||
|
||
**Verdict:** Rejected. Flat interleaved list `[Text1, Image1, Text2, Image2]` is MCP-idiomatic.
|
||
|
||
### 7.3 Parallel ComfyUI Submission for Batch (Rejected)
|
||
|
||
Submit all `count` prompts to ComfyUI simultaneously (async tasks), then collect results in order.
|
||
|
||
**Pros:** Faster if ComfyUI supports parallel queue processing (it does).
|
||
**Cons:** ComfyUI processes one job at a time on a single GPU regardless — parallel submission just fills the queue. Polling becomes complex (N polling loops). Error handling harder. Out-of-order completions break index alignment.
|
||
|
||
**Verdict:** Rejected for v1. Sequential submission is simpler, correct, and produces no worse throughput. Can revisit if ComfyUI gains true parallel processing support.
|
||
|
||
### 7.4 Name as Subdirectory Instead of Filename Prefix (Rejected)
|
||
|
||
When `name="lumen"`, save to `output_dir/lumen/` instead of `output_dir/lumen_*.png`.
|
||
|
||
**Pros:** Better directory organisation for large sets.
|
||
**Cons:** Complicates the implementation (directory creation per name), changes the return path format, breaks callers who assume a flat output directory. Adds complexity for minimal gain at `count ≤ 10`.
|
||
|
||
**Verdict:** Rejected for v1. Prefix approach is simpler and equally readable.
|
||
|
||
---
|
||
|
||
## 8. Success Criteria
|
||
|
||
| Criterion | Measure |
|
||
|-----------|---------|
|
||
| All 27 tests pass | `uv run pytest tests/ -v` exits 0 |
|
||
| `name="lumen"` → file starts with `lumen_` | Assert in `test_generate_image_with_name` |
|
||
| `count=2` → 4 content items, 2 files | Assert `len(result) == 4`, `len(glob("*.png")) == 2` |
|
||
| `count=2, seed=42` → seeds 42 and 43 | Assert seed values in TextContent |
|
||
| `count=11` → error TextContent, no ComfyUI call | Assert `len(result) == 1`, no `/api/prompt` mock hit |
|
||
| Backward compat: existing callers unaffected | All 19 existing tests pass without modification |
|
||
| MCP tool schema shows `name` and `count` params | Visible in Roo Code tool list after server restart |
|
||
|
||
---
|
||
|
||
## 9. Open Questions
|
||
|
||
| # | Question | Owner | Priority |
|
||
|---|----------|-------|----------|
|
||
| Q1 | Should `count=0` be an error, or silently return `[]` (empty list)? | Patrick | Low — assessment recommends error for clarity |
|
||
| Q2 | Max count cap: 10 or higher? 10 ≈ 150s max at 15s/image — feels right, but could be raised to 20 for batch profile image sets. | Patrick | Medium |
|
||
| Q3 | Should partial batch failure stop remaining iterations, or always complete all N? | Patrick | Medium — assessment recommends continue (partial success) |
|
||
| Q4 | Should `name` parameter also tag the TextContent output text, e.g. `[lumen_profile 1/3] Generated: ...`? | Patrick | Low |
|