Compare commits

6 Commits

Author SHA1 Message Date
Patrick Plate 79f1e6d65f feat(mcp-image-gen): add name and count params to generate_image
- Add name (str) param: filename prefix saved as {name}_{timestamp}_{seed}.png
- Add count (int, 1-10) param: generate N images in one call
- Extract _sanitize_name() helper: strips special chars, collapses underscores, caps at 64 chars
- Extract _build_filename() helper: pure function for testable filename construction
- Extract _generate_single() coroutine: clean loop body for batch generation
- Fixed seed batches increment seed per image (seed+i-1) for deterministic variation
- random seed (-1) batches give independent random seeds per image
- Partial batch failures continue (error TextContent in slot, remaining images proceed)
- Returns flat interleaved [Text1, Image1, Text2, Image2, ...] list
- 34/34 tests passing (was 19, added 15 new tests)
2026-04-06 07:45:37 +02:00
Patrick Plate 79a2e1d10a Merge branch 'feat/roo/ollama-backed-modes' 2026-04-05 10:27:37 +02:00
Patrick Plate 78de59243c feat(roo): add Ollama-backed doc-writer and ask-lite modes 2026-04-05 10:27:26 +02:00
Patrick Plate db8505fef1 merge: docs/wiki/promote-webscraper-search-hint → main 2026-04-05 10:11:37 +02:00
Patrick Plate 4107b8ede2 docs: promote webscraper_search_hint in wiki and mode rules 2026-04-05 10:11:33 +02:00
Patrick Plate 4202094f01 merge: fix/webscraper/search-hint-quality → main 2026-04-05 09:57:47 +02:00
8 changed files with 1354 additions and 54 deletions
+159
@@ -0,0 +1,159 @@
# Ask Lite Mode — Behavior Rules
## Identity
You are Lumen, Patrick's AI colleague, operating in **Ask Lite** mode. Same personality, same BigMind integration — optimized for quick, direct answers to factual questions without burning Claude API budget. You answer questions about Patrick's tech stack concisely and accurately.
---
## 1. Model Awareness
This mode runs on a **local Ollama model (glm-4.7-flash, 30B params, 202k context)**. This model is excellent for:
- **Factual recall**: What does X do? What's the difference between A and B?
- **Concept explanation**: How does Y work? Explain Z.
- **How-to lookups**: How do I use W? What's the syntax for V?
- **Stack-specific Q&A**: Patrick's tools, libraries, and frameworks
It is NOT suitable for:
- Multi-step code debugging (use Debug mode)
- Code implementation tasks (use Code mode)
- System design decisions (use Architect mode)
- Deep reasoning chains that require Claude
**Redirect rule**: If answering requires writing or modifying code, analyzing a bug, or making architectural decisions → tell Patrick to switch modes (see §5).
---
## 2. BigMind Lite — Session Ritual
### Session Start (execute in order)
1. `memory_start_session()` — load prior context
2. `memory_list_hypotheses()` — review open hypotheses (rarely relevant for Q&A, but check)
3. `memory_announce_focus(session_id, "Quick Q&A session", [], ide_hint="VS Code")`
4. `memory_close_stale_sessions(session_id)` — clean orphaned sessions
### Before Answering Every Non-Trivial Question
Always search memory first — Patrick's preferences and stack details are often already stored:
- `memory_search_facts("2-3 focused keywords")` — user preferences, codebase facts
- `memory_search_chunks("related topic")` — past session context
**FTS5 rules**: Use 2-3 keywords max. Every token must match. If 0 results, drop the most specific word.
Example searches:
- `"FastMCP tool decorator"` → stored FastMCP patterns
- `"uv package management"` → how Patrick manages deps
- `"TrueNAS Docker"` → homelab infrastructure facts
Memory hits save tokens AND give Patrick's actual preferences, not generic answers.
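The "drop a word on zero results" rule can be sketched as a small retry loop. This is only an illustration: `search_with_fallback` and `fake_search` are hypothetical names, the in-memory `FACTS` list stands in for the real memory store, and treating the longest token as the "most specific" one is an assumption, not something the rules define.

```python
from typing import Callable, List


def search_with_fallback(query: str, search: Callable[[str], List[str]]) -> List[str]:
    """Retry an all-tokens-must-match search with progressively fewer keywords."""
    tokens = query.split()
    while tokens:
        hits = search(" ".join(tokens))
        if hits:
            return hits
        # Assumption: the longest token is a crude proxy for "most specific".
        tokens.remove(max(tokens, key=len))
    return []


# Toy in-memory stand-in for the real memory store (illustration only).
FACTS = ["FastMCP tool decorator patterns", "uv package management"]


def fake_search(query: str) -> List[str]:
    """Return facts that contain every query word, mimicking all-tokens matching."""
    words = query.lower().split()
    return [fact for fact in FACTS if all(w in fact.lower() for w in words)]


hits = search_with_fallback("FastMCP decorator lifespan", fake_search)
print(hits)
```

In practice the choice of which word to drop is a judgment call; the loop only shows the shape of the fallback.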
### Session End
`memory_end_session(session_id, one_liner, topics, outcome, summary, importance=2)`
Q&A sessions are typically importance 1-3.
---
## 3. Web Research First
For questions about external libraries, APIs, frameworks, error messages, or current documentation — **search before answering from memory**:
```
webscraper_search_hint("2-3 keyword query")
```
Then if needed:
```
webscraper_fetch(best_url, max_chars=8000)
```
### When to search
- "How do I use [library X]?" → search `"library X feature"`
- "What's the error [message]?" → search distinctive phrase from error
- "What's new in [framework] version Y?" → search `"framework Y changelog"`
- "What's the difference between A and B?" → often answerable from memory, but verify if unsure
### Query crafting
| ✅ Good | ❌ Bad |
|---------|--------|
| `"FastMCP lifespan"` | `"how to use FastMCP lifespan context manager in Python"` |
| `"SQLite WAL mode"` | `"sqlite performance concurrent reads write ahead logging"` |
| `"httpx async timeout"` | `"how to configure timeout settings in httpx library"` |
Use Brave Search — it works without API keys or CAPTCHAs. One search per question topic.
---
## 4. Response Style
### Structure
1. **Direct answer first** — no preamble, no "Great question!", no restating the question
2. Short paragraphs or bullet points as appropriate
3. Code snippets only when they materially clarify the answer
4. Cite source if you looked something up (e.g., "Per FastMCP docs:")
### Length
- Simple factual questions: 1-3 sentences
- Concept explanations: 3-10 sentences or a short bulleted list
- Comparative questions: a short table or two-column list
### Honesty
If unsure: say so clearly.
> "I'm not certain — you should verify with the docs at [URL]."
Never guess and present it as fact.
### Patrick's Stack (no lookup needed for these)
| Domain | Technologies |
|--------|-------------|
| Python MCP | FastMCP, uv, pytest, httpx, respx |
| Python general | SQLite, Flask, Pydantic, asyncio |
| Java | Spring Boot 3.x, Jakarta EE, JPA/EclipseLink, PrimeFaces, Maven |
| Java ADP | Paisy monorepo, euBP, EAU, FEX, Oracle DB |
| Containers | Docker, Docker Compose (on TrueNAS.local) |
| Version control | Git, Gitea (http://192.168.188.119:30008/) |
| Local AI | Ollama (local), ComfyUI (image gen, localhost:8188) |
| OS | Fedora Linux (workstation), TrueNAS SCALE (server) |
| IDE | VS Code + Roo Code extension |
---
## 5. Escalation Triggers
Tell Patrick to switch modes when:
| Situation | Recommended mode |
|-----------|-----------------|
| "Write me a function that..." | Code mode |
| "Fix this bug..." | Debug mode |
| "I'm getting this error..." | Debug mode |
| "Design a system for..." | Architect mode |
| "How should I architect..." | Architect mode |
| "ADP/Paisy/euBP/EAU Java..." | Paisy mode |
| "Write docs/README/wiki..." | Doc Writer mode |
| "My Docker container / TrueNAS..." | Homelab mode |
| "Add a feature to BigMind..." | BigMind mode |
| "Build an MCP server..." | MCP Builder mode |
**Escalation message format** (direct, not apologetic):
> "That needs Code mode — Ask Lite is for Q&A only."
---
## 6. No File Editing
Ask Lite **reads** files for context but **never modifies** them.
If Patrick asks you to make a change:
> "Ask Lite is read-only. Switch to Code or Doc Writer mode to make that change."
Reading files is fine — use targeted reads and memory to minimize token usage:
1. Check memory first
2. Use grep/search for specific patterns rather than reading entire files
3. Read file sections (line ranges) rather than full files
4. Log token savings with `memory_log_token_save` when you avoid full reads
---
Lumen's identity, BigMind rituals, and memory patterns are unchanged — they apply in every mode. See `.roo/rules/` for those constants.
@@ -0,0 +1,208 @@
# Doc Writer Mode — Behavior Rules
## Identity
You are Lumen, Patrick's AI colleague, operating in **Doc Writer** mode. Same personality, same BigMind integration — just focused exclusively on producing clear, well-structured documentation. You write for Patrick's projects: pi_mcps (FastMCP Python MCP servers), BigMind (Flask + SQLite memory server), Paisy/ADP (Java payroll compliance), and homelab (TrueNAS, Docker, Gitea).
---
## 1. Model Awareness
This mode runs on a **local Ollama model (glm-4.7-flash, 30B params, 202k context)**. Optimize accordingly:
- **Do**: Structured writing, markdown formatting, templates, outlines, prose, docstrings, changelogs
- **Do**: Follow documentation patterns and style guides precisely
- **Avoid**: Multi-step reasoning chains, complex debugging analysis, architectural decision-making
- **Avoid**: Tasks requiring Claude-level reasoning (code analysis, root cause investigation, system design)
If Patrick asks for something outside documentation scope (implement a feature, debug an error, design architecture):
> "This needs more than Doc Writer mode. Switch to Code/Debug/Architect mode for that."
---
## 2. BigMind Lite — Session Ritual
### Session Start (execute in order)
1. `memory_start_session()` — load context
2. `memory_list_hypotheses()` — review open hypotheses (skip hypothesis formation for doc tasks < 5 min effort)
3. `memory_announce_focus(session_id, description, files, ide_hint="VS Code")` — declare files you'll touch
4. `memory_close_stale_sessions(session_id)` — clean orphaned sessions
### Before Writing
Always search memory before writing anything substantial:
- `memory_search_facts("project doc conventions")` — picks up style preferences
- `memory_search_facts("readme wiki style")` — existing format decisions
- `memory_search_chunks("documentation format")` — past session context
This avoids re-reading files for context that's already stored.
### Session End
`memory_end_session(session_id, one_liner, topics, outcome, summary, importance=2)`
Doc sessions are typically importance 2-4 unless you wrote something architecturally significant.
---
## 3. Documentation Standards
### README Files
Structure (in order):
1. `# Title` — project name, one-line tagline
2. Badges (if applicable: build status, coverage, PyPI version)
3. **Description** — what it does and why it exists (3-5 sentences)
4. **Installation** — step-by-step, assume fresh environment
5. **Usage** — most common use case first, with code examples
6. **Configuration** — environment variables, config files (if applicable)
7. **Examples** — additional usage patterns
8. **Development** — how to run tests, contribute
9. **License** (if applicable)
Do NOT write marketing fluff. Be concise and technical.
### Wiki Pages (Gitea Format)
- Use standard GitHub/Gitea markdown
- Check `docs/wiki/pages/` for existing page examples before writing
- Header image convention: `![Banner](../images/pagename-banner.png)` at top
- Use `##` for main sections, `###` for subsections
- Sidebar links managed separately in `docs/wiki/pages/_Sidebar.md`
- Keep page titles matching filename (e.g., `MCP-Servers-Overview.md` → title `# MCP Servers Overview`)
- Wiki deploy workflow: edit `docs/wiki/pages/*.md` → run `./docs/wiki/deploy_wiki.sh`
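The filename-to-title convention above can be checked mechanically. The helper below is a hypothetical sketch (`wiki_title` is not part of the repository) and assumes hyphens are used only as word separators in wiki filenames.

```python
from pathlib import Path


def wiki_title(filename: str) -> str:
    """Derive the expected top-level heading from a wiki page filename."""
    # Assumption: every hyphen in the filename stands for a space in the title.
    return "# " + Path(filename).stem.replace("-", " ")


print(wiki_title("MCP-Servers-Overview.md"))
```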
### Python Docstrings (Google Style)
```python
def function_name(param1: str, param2: int) -> bool:
    """One-line summary.

    Longer description if needed. Explain what the function does,
    not how it does it.

    Args:
        param1: Description of param1.
        param2: Description of param2.

    Returns:
        True if successful, False otherwise.

    Raises:
        ValueError: If param1 is empty.
        RuntimeError: If the operation fails.

    Example:
        >>> function_name("hello", 42)
        True
    """
```
### Java Javadoc
```java
/**
* One-line summary.
*
* <p>Longer description if needed. Explain behavior and side effects.
*
* @param param1 description of param1
* @param param2 description of param2
* @return description of return value
* @throws IllegalArgumentException if param1 is null or empty
* @since 1.0
*/
```
### Changelogs (Keep a Changelog Format)
```markdown
# Changelog
## [Unreleased]
## [1.2.0] - 2026-04-05
### Added
- New feature description
### Changed
- Modified behavior description
### Fixed
- Bug fix description
### Removed
- Deprecated feature removed
```
Always use ISO 8601 dates (YYYY-MM-DD). Follow keepachangelog.com conventions exactly.
### Code Comments
- Explain **why**, not **what** — the code shows what; comments show intent
- Flag non-obvious behavior: `# Must flush before close — SQLite WAL mode requires it`
- Mark TODOs: `# TODO(pplate): migrate to async when FastMCP supports it`
- Keep inline comments short (< 80 chars); use block comments for complex logic
---
## 4. Output Directly
**Write the document. Don't explain what you're about to write.**
❌ Bad: "I'll write a README for your MCP server. Here's what I'll include..."
✅ Good: (write the README directly)
For very short tasks (< 10 lines), just output the result with no preamble at all.
For longer documents, a single intro line is acceptable:
✅ OK: "README for mcp-webscraper:"
Do NOT ask clarifying questions for straightforward doc tasks. Make reasonable assumptions based on what you read from the codebase and memory. If genuinely ambiguous (e.g., changelog format, license type), make a sensible choice and note it briefly at the end.
---
## 5. Token Efficiency
Before reading any file for context, check memory:
1. `memory_search_facts("project conventions")` — often has the answer
2. `memory_search_chunks("relevant topic")` — has past session context
When you avoid a file read via memory or targeted grep, log it:
```
memory_log_token_save(session_id, "Used stored conventions instead of reading README", 2000, "memory_hit")
```
When you must read files, prefer targeted reads:
- Read only the section you need (use line ranges)
- Use `grep` for specific patterns rather than reading entire files
---
## 6. File Restrictions
This mode edits **documentation files only**:
| File type | Examples | Allowed |
|-----------|----------|---------|
| Markdown | `README.md`, `CHANGELOG.md`, `docs/**/*.md` | ✅ |
| reStructuredText | `*.rst` | ✅ |
| Plain text | `*.txt` | ✅ |
| Python (docstrings only) | `*.py` | ✅ read + limited edit |
| Java (Javadoc only) | `*.java` | ✅ read + limited edit |
| Wiki pages | `docs/wiki/pages/*.md` | ✅ |
**Do NOT**:
- Implement features in `.py` or `.java` files
- Fix bugs in source code
- Modify configuration files (`.yaml`, `.json`, `.toml`, `pyproject.toml`)
- Make changes that affect runtime behavior
If asked to implement something: redirect to Code mode.
---
## 7. Project Context
| Project | Stack | Doc locations |
|---------|-------|--------------|
| pi_mcps | Python, FastMCP, uv | `mcp/*/README.md`, `docs/wiki/pages/` |
| BigMind | Python, Flask, SQLite | `mcp/bigmind/README.md`, wiki BigMind page |
| Paisy/ADP | Java, Maven, JPA | ADP internal (handle with care — confidential) |
| Homelab | TrueNAS, Docker, Gitea | `docs/wiki/pages/`, Gitea wiki |
Lumen's identity, BigMind rituals, and memory patterns are unchanged — they apply in every mode. See `.roo/rules/` for those constants.
+99
@@ -0,0 +1,99 @@
# Web Research Rules — Use webscraper_search_hint Proactively
## Rule: Search Before Asking
Before asking Patrick for information about a library, framework, API, technology, or error —
**always try `webscraper_search_hint` first**.
This applies to **all modes**: Architect, Code, Debug, MCP Builder, Homelab, Paisy.
### Why
- `webscraper_search_hint` uses Brave Search — no API key, no setup, always available
- Brave returns real results without CAPTCHA or consent walls (Google/DuckDuckGo both block)
- Handles special characters correctly (C++, &, %, etc. — URL-encoded automatically)
- The `hint` field gives immediately actionable title + URL + snippet without further calls
---
## The Two-Step Pattern
```
Step 1: webscraper_search_hint("2-3 keyword query") → structured results + hint string
Step 2: webscraper_fetch(best_url, max_chars=8000) → full page content
```
**Never skip Step 1.** It costs one tool call and often reveals the exact page to read.
### Step 1 Output
The tool returns:
- `hint` — pipe-separated `"Title (url): snippet[:120]"` — read this first
- `results[]` — array of `{title, url, snippet}` — pick the most relevant URL
- `search_url` — the Brave search URL used (useful for debugging)
- `result_count` — number of results returned
### Step 2 Output
`webscraper_fetch(url)` returns full page as Markdown. Use `max_chars` to control size
(default 5000; use 8000-12000 for deep doc reads).
---
## Mode-Specific Guidance
### 🏗️ Architect Mode
- Before designing any system or feature: search for existing patterns, reference architectures, and official docs
- Example: planning a new MCP server → `webscraper_search_hint("FastMCP server patterns 2025")`
- Example: choosing between two libraries → search both and read their official comparison pages
### 🪲 Debug Mode
- Search the **exact error message** before forming hypotheses
- Example: `webscraper_search_hint("sqlite3 ProgrammingError Cannot operate closed database Python")`
- If the error is long, take the most distinctive phrase (2-5 words) as the query
### 💻 Code Mode
- Before implementing a feature using an unfamiliar API: search the official docs URL pattern first
- Example: `webscraper_search_hint("httpx async client connection pool settings")`
### 🔧 MCP Builder Mode
- Check FastMCP changelog/docs before implementing new patterns
- Example: `webscraper_search_hint("FastMCP tool decorator async 2025")`
- Example: `webscraper_search_hint("FastMCP context lifespan")`
### 🏠 Homelab Mode
- Look up Docker/TrueNAS configs, package versions, service docs before asking Patrick
- Example: `webscraper_search_hint("Gitea webhook payload format")`
---
## Query Crafting Tips
| ✅ Good queries | ❌ Bad queries |
|---|---|
| `"httpx timeout settings"` | `"how do I configure httpx timeouts in Python async code"` |
| `"FastMCP tool decorator"` | `"mcp server python tool registration method"` |
| `"sqlite WAL mode enable"` | `"sqlite performance mode for concurrent reads"` |
| `"Brave Search API no key"` | `"search engine that works without api key or captcha"` |
- Use 2-4 keywords, not full sentences
- Prefer library/framework name + specific feature
- For errors: distinctive phrase from the message, not the full stack trace
---
## Known Limitations
- **Reddit / Stack Overflow snippets** — these platforms block snippet extraction; you may get empty snippets. The URL is still valid — fetch it directly if needed.
- **Brave CSS selector fragility** — Brave uses Svelte-generated class names that change. If `webscraper_search_hint` returns 0 results unexpectedly, the scraper's CSS selectors may need updating. Last verified working: 2026-04-05.
- **Use sparingly** — one search call per research task to orient; then fetch specific pages. Don't call it in a loop.
---
## Anti-Patterns to Avoid
- ❌ Asking Patrick "what's the FastMCP syntax for X?" before searching
- ❌ Designing architecture without looking up existing solutions first
- ❌ Forming a debug hypothesis without searching the error message
- ❌ Writing code against an API from memory without verifying current docs
- ❌ Calling `webscraper_search_hint` more than 2-3 times for the same topic (broaden/narrow the query instead)
@@ -145,6 +145,38 @@ Use the `new-mcp-server` Roo skill in MCP Builder mode for full scaffolding:
3. Roo will load the new-mcp-server skill and scaffold everything
```
## Web Research with mcp-webscraper
Before asking Patrick for information about a library, framework, API, or technology — **search first**.
The webscraper MCP server provides `webscraper_search_hint` (Brave Search, no API key, always available) as the entry point for all research tasks. Use the two-step pattern:
```
Step 1: webscraper_search_hint("topic or error message") → get candidate URLs
Step 2: webscraper_fetch(best_url) → read the full page
```
### When to search
| Situation | Action |
|---|---|
| Need docs for a library or framework | `webscraper_search_hint("library-name official docs")` |
| Investigating an error or stack trace | `webscraper_search_hint("exact error message language")` |
| Planning a feature — need design patterns | `webscraper_search_hint("pattern-name best practices")` |
| Checking latest version / changelog | `webscraper_search_hint("library-name changelog release")` |
| Looking up API contracts | `webscraper_fetch(official_docs_url)` directly |
### Especially useful in
- **🏗️ Architect mode** — look up patterns and docs *before* designing. Don't design blind.
- **🪲 Debug mode** — search the exact error message before forming hypotheses.
- **🔧 MCP Builder mode** — check FastMCP changelog for new patterns before implementing.
### Known caveats
- Reddit and Stack Overflow may return empty snippets (platform blocks)
- Brave uses Svelte CSS classes that can change — if `webscraper_search_hint` returns 0 results, selectors may need updating (last verified: 2026-04-05)
## Gitea Repository
Code is hosted at: `http://192.168.188.119:30008/pplate/pi_mcps`
+62 -9
@@ -25,20 +25,70 @@
- **Search backend:** Brave Search (`search.brave.com`) — works without CAPTCHA
- **SSL:** Custom cert bundle for Fedora 43 compatibility
## Search Hint Strategy
---
`webscraper_search_hint` uses Brave Search because:
## 🔍 Search: The Two-Step Research Pattern
`webscraper_search_hint` is the **entry point for all web research**. The recommended workflow is:
```
Step 1: webscraper_search_hint("your query") → get candidate URLs + snippets
Step 2: webscraper_fetch(best_url) → get full page content
```
This avoids scraping irrelevant pages and gives you an overview before committing to a deep read.
### Why Brave Search?
`webscraper_search_hint` uses Brave Search (`search.brave.com`) because:
- ✅ Returns real results without CAPTCHA or consent walls
- ✅ No API key required — works with plain HTTP GET
- ✅ Handles special characters (C++, &, %, etc.) via URL encoding
- ❌ Google blocks plain HTTP with 302 consent redirect
- ❌ DuckDuckGo blocks with CAPTCHA
Use it sparingly — once per research task — to get oriented before deep-scraping individual pages.
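The URL-encoding point above can be illustrated with the standard library. This is a sketch, not the server's actual code: `brave_search_url` is a hypothetical helper, and the `q` / `source=web` parameters are taken from the `search_url` example in this README.

```python
from urllib.parse import urlencode


def brave_search_url(query: str) -> str:
    """Build a Brave search URL with the query safely percent-encoded."""
    # urlencode applies quote_plus: spaces become "+", and "+", "&", "%" are escaped.
    return "https://search.brave.com/search?" + urlencode({"q": query, "source": "web"})


print(brave_search_url("FastMCP tool decorator"))
print(brave_search_url("C++ std::optional usage"))
```

Encoding the query this way keeps characters such as `+` and `&` from being misread as separators in the query string.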
### Return Value
The tool returns a structured dict:
```json
{
"query": "FastMCP tool decorator",
"search_url": "https://search.brave.com/search?q=FastMCP+tool+decorator&source=web",
"result_count": 5,
"hint": "FastMCP Docs (https://docs.fastmcp.dev): The @mcp.tool() decorator registers a function as... | PyPI FastMCP (https://pypi.org/project/fastmcp/): FastMCP 2.x — modern MCP server framework... | ...",
"results": [
{
"title": "FastMCP Docs",
"url": "https://docs.fastmcp.dev",
"snippet": "The @mcp.tool() decorator registers a function as an MCP tool..."
},
...
]
}
```
The `hint` field is a pipe-separated string of `"Title (url): snippet[:120]"` entries — immediately actionable for deciding which URL to fetch next.
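As a rough sketch of how such a `hint` string could be assembled from `results`, consider the helper below. `build_hint` is a hypothetical name, and the `" | "` separator with surrounding spaces is inferred from the example output rather than from the server source.

```python
from typing import Dict, List


def build_hint(results: List[Dict[str, str]], snippet_chars: int = 120) -> str:
    """Join results into one line of "Title (url): snippet" entries."""
    entries = [
        f"{r['title']} ({r['url']}): {r['snippet'][:snippet_chars]}"
        for r in results
    ]
    return " | ".join(entries)


sample = [
    {"title": "FastMCP Docs", "url": "https://docs.fastmcp.dev", "snippet": "x" * 200},
    {"title": "PyPI FastMCP", "url": "https://pypi.org/project/fastmcp/", "snippet": "short"},
]
hint = build_hint(sample)
print(hint)
```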
### Example: Two-Step Research Flow
```python
# Get top 5 results for a query
webscraper_search_hint("FastMCP tool decorator syntax", max_results=5)
# Step 1: Orient — what pages exist about this topic?
result = webscraper_search_hint("httpx async client timeout settings", max_results=5)
# hint: "HTTPX Docs (https://www.python-httpx.org/...): Configure timeout... | ..."
# Step 2: Deep-dive the most relevant result
content = webscraper_fetch("https://www.python-httpx.org/advanced/timeouts/", max_chars=8000)
```
### Known Limitations
- **Reddit / Stack Overflow snippets** may be empty — these platforms block snippet extraction
- **Brave CSS selectors** use Svelte-generated class names that may change. If you get 0 results, the scraper's selectors may need updating (last verified: 2026-04-05)
- **Use sparingly** — once per research task to get oriented, not for every query
---
## SSL Note — Fedora 43 Comodo Root CA
Fedora 43 is missing the **Comodo AAA Services Root CA** needed for Cloudflare-protected sites. The fix is bundled at [`mcp/webscraper/certs/comodo-aaa-services-root.pem`](../src/branch/main/mcp/webscraper/certs/).
@@ -58,13 +108,16 @@ uv run python src/server.py
```bash
cd mcp/webscraper
uv run pytest tests/ -v
# 23/23 tests passing
# 28/28 tests passing
```
## Usage Examples
```python
# Fetch a page as Markdown
# Step 1: Search — get candidate URLs for a topic
webscraper_search_hint("FastMCP tool decorator syntax", max_results=5)
# Step 2: Deep-dive the most relevant URL
webscraper_fetch("https://docs.fastmcp.dev", max_chars=10000)
# Extract all links from Gitea repo
@@ -79,6 +132,6 @@ webscraper_fetch_meta("https://github.com/comfyanonymous/ComfyUI")
# Fetch specific section by CSS selector
webscraper_fetch_section("https://docs.python.org", "#content")
# Quick search orientation
webscraper_search_hint("Gitea wiki git clone", max_results=3)
# Search with special characters (C++, &, % all work)
webscraper_search_hint("C++ std::optional usage", max_results=3)
```
@@ -0,0 +1,319 @@
# Assessment: Expand `generate_image` with `name` and `count` Parameters
*Author: Lumen | Date: 2026-04-06 | Ticket: —*
*BigMind Session: `00070c37-b013-4342-a8ae-f81da0e3180d`*
*Status: 🔵 DRAFT — awaiting Patrick review*
---
## 1. Problem Statement
The current [`generate_image()`](mcp/mcp-image-gen/src/server.py:133) tool generates a single image and saves it with an auto-generated filename of `{timestamp}_{seed}.png`. Two common workflows are not yet supported:
1. **Named outputs** — When generating thematic sets (Lumen profile images, wiki banners, concept art), the caller wants a meaningful prefix in the filename (e.g., `lumen_profile_20260406_140236_2409122067.png`) rather than a bare timestamp. This also enables grouping output by purpose in the directory listing.
2. **Batch generation** — Generating multiple variations of the same prompt in one tool call is a common creative workflow. Currently, the caller must invoke `generate_image` N times with separate tool calls, which is verbose and loses the semantic grouping.
**Goal:** Add two optional parameters — `name` (filename prefix string) and `count` (integer repetitions) — to `generate_image` with minimal disruption to existing behaviour and test coverage.
---
## 2. Requirements
### 2.1 Functional Requirements
| ID | Requirement |
|----|-------------|
| F-1 | `name` parameter (default `""`) prepends a sanitized label to the output filename |
| F-2 | When `name=""` (default), filename format is unchanged: `{timestamp}_{seed}.png` |
| F-3 | When `name="lumen_profile"`, filename format is: `lumen_profile_{timestamp}_{seed}.png` |
| F-4 | `count` parameter (default `1`) generates N images sequentially |
| F-5 | When `count=1` (default), return value is identical to the current `[TextContent, ImageContent]` |
| F-6 | When `count=N > 1`, return value is a flat list: `[Text1, Image1, Text2, Image2, ..., TextN, ImageN]` |
| F-7 | When `count>1` and `seed=-1`, each image gets an independently random seed |
| F-8 | When `count>1` and a fixed `seed` is provided, images use `seed`, `seed+1`, `seed+2`, … to produce deterministic variation |
| F-9 | `count` is capped at a maximum (proposed: 10) to prevent runaway generation |
| F-10 | `name` is sanitized: non-alphanumeric characters (except `-` and `_`) are stripped/replaced; max 64 chars |
| F-11 | Partial success: if one image in a batch fails, the error is returned as a `TextContent` error item in that position rather than aborting the whole batch |
| F-12 | The TextContent for each image in a batch includes the 1-of-N index: `[1/3] Generated: ...` |
### 2.2 Non-Functional Requirements
| ID | Requirement |
|----|-------------|
| NF-1 | Sequential generation — no concurrent ComfyUI submissions (ComfyUI queues internally; parallel MCP submissions would complicate polling) |
| NF-2 | Backward compatibility — all existing callers with no `name`/`count` args produce identical output |
| NF-3 | All 19 existing tests must continue to pass without modification |
| NF-4 | New tests must cover: name prefix in filename, count=2 success, count with fixed seed increments, count with partial failure, name sanitization, count cap enforcement |
| NF-5 | MCP tool schema (visible in Claude/Roo Code) must surface clear descriptions for the new params |
---
## 3. Affected Files
| File | Change Type | Description |
|------|-------------|-------------|
| [`mcp/mcp-image-gen/src/server.py`](mcp/mcp-image-gen/src/server.py:133) | Modify | Add `name: str = ""` and `count: int = 1` params to `generate_image()`; add `_sanitize_name()` helper; extract `_generate_single()` inner logic |
| [`mcp/mcp-image-gen/tests/test_server.py`](mcp/mcp-image-gen/tests/test_server.py:1) | Modify | Add 6+ new test cases covering new parameters |
| [`mcp/mcp-image-gen/README.md`](mcp/mcp-image-gen/README.md) | Modify | Update `generate_image` tool documentation table |
| [`docs/wiki/pages/mcp-image-gen.md`](docs/wiki/pages/mcp-image-gen.md) | Modify | Update tool reference table with new parameters |
No schema changes, no new dependencies, no workflow JSON changes.
---
## 4. Design Decisions
### 4.1 Filename Convention with `name`
**Current:** `{timestamp}_{seed}.png`
**Proposed:** `{sanitized_name}_{timestamp}_{seed}.png` (when `name` is provided)
The `name` is placed as a **prefix** rather than suffix so directory `ls` output groups named sets together alphabetically:
```
lumen_profile_20260406_140236_2409122067.png
lumen_profile_20260406_140258_764633840.png
wiki_banner_20260406_141000_1234567.png
```
**Sanitization rule:** `re.sub(r'[^a-zA-Z0-9_-]', '_', name)[:64]` — replaces any character that is not alphanumeric, dash, or underscore with `_`, then truncates to 64 chars.
### 4.2 Seed Behaviour for Batch Generation
| Scenario | Behaviour |
|----------|-----------|
| `count=3, seed=-1` | Each call to `build_flux_workflow` gets `seed=-1` → 3 independent random seeds |
| `count=3, seed=42` | Seeds are 42, 43, 44 — deterministic, reproducible variation |
This follows the convention of most image generation tools (e.g., ComfyUI's own batch seed increment).
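The seed table can be expressed as a small pure function. This is a sketch under the rules stated above; `batch_seeds` is a hypothetical helper name and does not appear in `server.py`.

```python
from typing import List


def batch_seeds(seed: int, count: int) -> List[int]:
    """Return the seed to pass for each image in a batch.

    -1 is passed through unchanged so each image is randomized downstream;
    a fixed seed is incremented once per image.
    """
    if seed == -1:
        return [-1] * count
    return [seed + i for i in range(count)]


print(batch_seeds(42, 3))
print(batch_seeds(-1, 3))
```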
### 4.3 Return Structure for `count > 1`
Return a **flat interleaved list**: `[Text1, Image1, Text2, Image2]`
**Rationale:** MCP content lists are flat arrays. Claude/Roo Code renders them sequentially — a flat list means each image appears immediately below its metadata line. A nested structure would require the caller to unwrap it.
**For `count=1` (default):** Behaviour is identical to today — `[TextContent, ImageContent]`. No caller breakage.
### 4.4 Refactoring: Extract `_generate_single()`
The current `generate_image` function is 180+ lines of inline logic. To support `count`, the inner pipeline (queue → poll → history → download → save → encode) will be extracted to a private `async def _generate_single(prompt, ..., index, total)` coroutine. `generate_image` then loops `count` times calling `_generate_single` and accumulates results.
This refactoring:
- Makes the count loop clean (`results.extend(await _generate_single(...))`)
- Makes partial failure handling straightforward (catch per iteration)
- Improves testability of the single-image path
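A minimal sketch of the per-iteration error handling, assuming the single-image pipeline raises on failure. `Text`, `Image`, and `fake_generate_single` are hypothetical stand-ins for the MCP content types and the real `_generate_single`; image 2 fails on purpose to show the batch continuing.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Text:
    """Stand-in for MCP TextContent (illustration only)."""
    text: str


@dataclass
class Image:
    """Stand-in for MCP ImageContent (illustration only)."""
    data: str


async def fake_generate_single(index: int, total: int) -> list:
    """Hypothetical single-image pipeline; image 2 fails to simulate an error."""
    if index == 2:
        raise RuntimeError("ComfyUI queue rejected the prompt")
    return [Text(f"[{index}/{total}] Generated: image_{index}.png"), Image("<base64>")]


async def generate_batch(count: int) -> list:
    """Run the single-image pipeline `count` times, one after another."""
    results: list = []
    for i in range(1, count + 1):
        try:
            results.extend(await fake_generate_single(i, count))
        except Exception as exc:
            # One failed image must not abort the batch: record the error in its slot.
            results.append(Text(f"[{i}/{count}] Error: {exc}"))
    return results


batch = asyncio.run(generate_batch(3))
for item in batch:
    print(type(item).__name__, getattr(item, "text", ""))
```

A failed image contributes one text item instead of a text/image pair, so callers should not assume the result length is always `2 * count`.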
### 4.5 Maximum Count Cap
Cap `count` at **10**. Rationale:
- FLUX.1-schnell takes ~10-35s per image on RX 7900 XTX → 10 images ≈ 100-350s maximum
- MCP tool call timeout in Roo Code defaults to 5 minutes (300s), so 10 images fits at typical speeds; at ~35s per image a full batch (~350s) would exceed it
- ComfyUI queues them internally; the MCP server polls sequentially, not in parallel
When `count > 10`, the tool returns a single `TextContent` error immediately (no images generated) with message: `"count={N} exceeds maximum of 10. Reduce count and retry."`
---
## 5. Implementation Plan
### Step 1 — Add `_sanitize_name()` helper
```python
import re


def _sanitize_name(name: str) -> str:
    """Sanitize a name for use as a filename prefix."""
    sanitized = re.sub(r'[^a-zA-Z0-9_-]', '_', name)
    return sanitized[:64]
```
Location: [`server.py`](mcp/mcp-image-gen/src/server.py:95), after `build_flux_workflow()` (pure function section).
### Step 2 — Extract `_generate_single()` coroutine
Extract the body of the current `generate_image` (lines 162-310) into:
```python
async def _generate_single(
    prompt: str,
    width: int,
    height: int,
    steps: int,
    model: str,
    seed: int,
    negative_prompt: str,
    resolved_output_dir: Path,
    filename_prefix: str,
    index: int,
    total: int,
) -> list:
```
The `filename` construction changes to:
```python
filename = f"{filename_prefix}{timestamp}_{actual_seed}.png"
# where filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
```
The `TextContent` text changes when `total > 1`:
```python
prefix_label = f"[{index}/{total}] " if total > 1 else ""
text = f"{prefix_label}Generated: {out_path}\nSeed: ..."
```
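The two string rules from this step can be exercised in isolation. The helper names below (`planned_filename`, `planned_label`) are hypothetical and exist only for this sketch; the expressions inside them are the ones given above.

```python
def planned_filename(sanitized_name: str, timestamp: str, actual_seed: int) -> str:
    """Build the output filename using the prefix rule from Step 2."""
    filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
    return f"{filename_prefix}{timestamp}_{actual_seed}.png"


def planned_label(index: int, total: int) -> str:
    """Batch label for the TextContent text; empty for single-image calls."""
    return f"[{index}/{total}] " if total > 1 else ""


named = planned_filename("lumen", "20260406_120000", 12345)
unnamed = planned_filename("", "20260406_120000", 12345)
first_of_three = planned_label(1, 3)
single = planned_label(1, 1)
```

Keeping the empty-name and single-image cases free of any prefix means existing callers see the same filenames and text as before.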
### Step 3 — Update `generate_image()` signature
```python
@mcp.tool()
async def generate_image(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    steps: int = 4,
    model: str = "flux1-schnell.safetensors",
    seed: int = -1,
    negative_prompt: str = "",
    output_dir: str = "",
    name: str = "",
    count: int = 1,
) -> list:
    ...
```
Body of `generate_image` becomes:
```python
# Validate count
MAX_COUNT = 10
if count < 1 or count > MAX_COUNT:
    return [TextContent(type="text", text=f"count={count} is invalid. Must be 1–{MAX_COUNT}.")]

sanitized_name = _sanitize_name(name) if name else ""
filename_prefix = f"{sanitized_name}_" if sanitized_name else ""
resolved_output_dir = Path(output_dir or IMAGE_OUTPUT_DIR).expanduser().resolve()

results = []
for i in range(1, count + 1):
    # seed=-1 stays random per image; a fixed seed is incremented per image.
    actual_seed = seed if seed == -1 else seed + (i - 1)
    items = await _generate_single(
        prompt=prompt, width=width, height=height, steps=steps,
        model=model, seed=actual_seed, negative_prompt=negative_prompt,
        resolved_output_dir=resolved_output_dir,
        filename_prefix=filename_prefix, index=i, total=count,
    )
    results.extend(items)
return results
```
### Step 4 — Write new tests
Add to [`test_server.py`](mcp/mcp-image-gen/tests/test_server.py:550):
| Test | Description |
|------|-------------|
| `test_generate_image_with_name` | `name="lumen"` → filename starts with `lumen_` |
| `test_generate_image_name_sanitization` | `name="my image! v2"` → `my_image__v2_` prefix |
| `test_generate_image_count_2_success` | `count=2` → 4 items in result, 2 files saved |
| `test_generate_image_count_fixed_seed` | `count=2, seed=42` → seeds 42 and 43 in filenames |
| `test_generate_image_count_partial_failure` | `count=2`, second POST fails → 2 items (success) + 1 item (error) |
| `test_generate_image_count_cap_exceeded` | `count=11` → single TextContent error, no generation |
| `test_generate_image_count_0_invalid` | `count=0` → single TextContent error |
| `test_generate_image_name_and_count_combined` | `name="banner", count=2` → both files prefixed `banner_` |
### Step 5 — Update documentation
- Update `generate_image` docstring in [`server.py`](mcp/mcp-image-gen/src/server.py:144) to document `name` and `count`
- Update parameter table in [`README.md`](mcp/mcp-image-gen/README.md)
- Update tool reference in [`docs/wiki/pages/mcp-image-gen.md`](docs/wiki/pages/mcp-image-gen.md)
### Step 6 — Run full test suite
```bash
cd mcp/mcp-image-gen && uv run pytest tests/ -v --tb=short
```
All 19 existing + 8 new = **27 tests** must pass.
### Step 7 — Commit and push
Branch: `feat/mcp-image-gen/generate-image-name-count`
Commit: `feat(mcp-image-gen): add name and count params to generate_image`
---
## 6. Risks
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Partial batch failure leaves orphaned files on disk | Medium | Low | Files for successful images are kept; error TextContent clearly identifies which index failed. No cleanup needed — partial results are useful. |
| `count` loop adds significant latency visible in Roo Code | Medium | Medium | Document expected time: `count × ~15s`. MCP timeout is 5 min; max 10 images ≈ 150s. Still within limit. |
| Seed increment wraps around at `2^32` | Very Low | Low | `(seed + i - 1) % 2**32` — add modulo guard in `_generate_single` |
| `_generate_single` refactor introduces regression in existing tests | Low | High | Existing test fixtures mock ComfyUI endpoints — as long as the HTTP call sequence is unchanged, respx mocks will match. Verify each existing test still passes before adding new ones. |
| `name` with only special chars becomes empty after sanitization | Low | Medium | After sanitization, if result is empty string, treat as unnamed (no prefix). Add assertion in `_sanitize_name` to return `""` for all-whitespace/special inputs. |
| MCP tool schema change breaks existing callers | Very Low | Low | New params are optional with defaults — backward compatible. Roo Code re-reads schema on server restart. |
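The modulo guard mentioned for the seed wrap-around risk could look like the sketch below. The helper name `batch_seed` is hypothetical, and the `2**32` range is taken from the risk table rather than from ComfyUI documentation.

```python
SEED_MODULUS = 2**32  # range assumed in the risk table


def batch_seed(seed: int, index: int) -> int:
    """Seed for the image at 1-based `index` within a batch.

    -1 is passed through so each image still gets an independent random
    seed; a fixed seed is incremented per image and wrapped at 2**32.
    """
    if seed == -1:
        return -1
    return (seed + index - 1) % SEED_MODULUS


seeds = [batch_seed(42, i) for i in range(1, 4)]
wrapped = batch_seed(2**32 - 1, 2)
```

Computing the seed in one small function keeps the wrap-around rule testable without touching the network path.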
---
## 7. Alternatives Considered
### 7.1 Separate `generate_images_batch()` Tool (Rejected)
Add a new tool instead of expanding `generate_image`.
**Pros:** Clean separation, no refactoring of existing tool.
**Cons:** Two tools for the same backend; callers must learn two tool names; MCP tool list grows. The MCP convention favours extending existing tools with optional parameters rather than proliferating tools.
**Verdict:** Rejected. Optional parameters with backward-compatible defaults is the right pattern here.
### 7.2 Return Grouped List of Lists for `count > 1` (Rejected)
Return `[[Text1, Image1], [Text2, Image2]]` for batch results.
**Pros:** Caller can index by image number cleanly.
**Cons:** MCP content type is a flat `list[ContentBlock]`. FastMCP does not support nested lists in tool returns — they would be serialized as strings, not rendered. Roo Code renders content sequentially; flat interleaved is the idiomatic structure.
**Verdict:** Rejected. Flat interleaved list `[Text1, Image1, Text2, Image2]` is MCP-idiomatic.
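A caller that does want per-image grouping can rebuild it from the flat list. The sketch below assumes every slot starts with a text block, as described in this assessment; `Block` is a hypothetical stand-in for the MCP content types and carries only the `type` field used for grouping.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for TextContent / ImageContent."""
    type: str      # "text" or "image"
    payload: str


def group_by_image(blocks: list) -> list:
    """Split a flat interleaved list into one group per requested image.

    A text block opens a new group; an image block joins the current one.
    A failed slot therefore appears as a group holding a single text block.
    """
    groups: list = []
    for block in blocks:
        if block.type == "text":
            groups.append([block])
        elif groups:
            groups[-1].append(block)
    return groups


flat = [
    Block("text", "[1/3] Generated"), Block("image", "png-1"),
    Block("text", "[2/3] error"),
    Block("text", "[3/3] Generated"), Block("image", "png-3"),
]
grouped = group_by_image(flat)
```

Index alignment survives partial failures because the error text still occupies its own slot.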
### 7.3 Parallel ComfyUI Submission for Batch (Rejected)
Submit all `count` prompts to ComfyUI simultaneously (async tasks), then collect results in order.
**Pros:** Potentially faster, since ComfyUI accepts several queued prompts at once.
**Cons:** ComfyUI processes one job at a time on a single GPU regardless — parallel submission just fills the queue. Polling becomes complex (N polling loops). Error handling harder. Out-of-order completions break index alignment.
**Verdict:** Rejected for v1. Sequential submission is simpler, correct, and produces no worse throughput. Can revisit if ComfyUI gains true parallel processing support.
### 7.4 Name as Subdirectory Instead of Filename Prefix (Rejected)
When `name="lumen"`, save to `output_dir/lumen/` instead of `output_dir/lumen_*.png`.
**Pros:** Better directory organisation for large sets.
**Cons:** Complicates the implementation (directory creation per name), changes the return path format, breaks callers who assume a flat output directory. Adds complexity for minimal gain at `count ≤ 10`.
**Verdict:** Rejected for v1. Prefix approach is simpler and equally readable.
---
## 8. Success Criteria
| Criterion | Measure |
|-----------|---------|
| All 27 tests pass | `uv run pytest tests/ -v` exits 0 |
| `name="lumen"` → file starts with `lumen_` | Assert in `test_generate_image_with_name` |
| `count=2` → 4 content items, 2 files | Assert `len(result) == 4`, `len(glob("*.png")) == 2` |
| `count=2, seed=42` → seeds 42 and 43 | Assert seed values in TextContent |
| `count=11` → error TextContent, no ComfyUI call | Assert `len(result) == 1`, no `/api/prompt` mock hit |
| Backward compat: existing callers unaffected | All 19 existing tests pass without modification |
| MCP tool schema shows `name` and `count` params | Visible in Roo Code tool list after server restart |
---
## 9. Open Questions
| # | Question | Owner | Priority |
|---|----------|-------|----------|
| Q1 | Should `count=0` be an error, or silently return `[]` (empty list)? | Patrick | Low — assessment recommends error for clarity |
| Q2 | Max count cap: 10 or higher? 10 ≈ 150s max at 15s/image — feels right, but could be raised to 20 for batch profile image sets. | Patrick | Medium |
| Q3 | Should partial batch failure stop remaining iterations, or always complete all N? | Patrick | Medium — assessment recommends continue (partial success) |
| Q4 | Should `name` parameter also tag the TextContent output text, e.g. `[lumen_profile 1/3] Generated: ...`? | Patrick | Low |
+156 -44
@@ -6,6 +6,7 @@ import copy
import json
import os
import random
import re
import time
from datetime import datetime
from pathlib import Path
@@ -22,6 +23,9 @@ COMFYUI_URL = os.environ.get("COMFYUI_URL", "http://localhost:8188").rstrip("/")
IMAGE_OUTPUT_DIR = os.environ.get("IMAGE_OUTPUT_DIR", "~/Pictures/mcp-generated")
COMFYUI_TIMEOUT = int(os.environ.get("COMFYUI_TIMEOUT", "120"))
# Maximum number of images allowed in a single batch call
MAX_COUNT = 10
# Path to the bundled FLUX.1-schnell workflow template
_WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
@@ -126,46 +130,59 @@ def build_flux_workflow(
# ---------------------------------------------------------------------------
# Tools
# Helpers
# ---------------------------------------------------------------------------
@mcp.tool()
async def generate_image(
prompt: str,
width: int = 1024,
height: int = 1024,
steps: int = 4,
model: str = "flux1-schnell.safetensors",
seed: int = -1,
negative_prompt: str = "",
output_dir: str = "",
) -> list:
"""Generate an image from a text prompt using ComfyUI.
def _sanitize_name(name: str) -> str:
"""Sanitize a user-provided name for safe use in filenames.
Returns both a file path (for persistence) and an inline base64 image
(for display in Claude / Roo Code chat).
Replaces whitespace with underscores, strips any characters that are not
alphanumeric, underscores, or hyphens, and collapses consecutive
underscores/hyphens. Returns empty string if nothing usable remains.
"""
name = name.strip()
name = re.sub(r"\s+", "_", name) # spaces → underscores
name = re.sub(r"[^\w\-]", "", name) # strip non-alphanum/underscore/hyphen
name = re.sub(r"[_\-]{2,}", "_", name) # collapse runs
name = name.strip("_-") # trim leading/trailing separators
return name[:64] # cap at 64 chars
def _build_filename(name: str, timestamp: str, actual_seed: int) -> str:
"""Build an output filename from optional name, timestamp and seed."""
sanitized = _sanitize_name(name)
if sanitized:
return f"{sanitized}_{timestamp}_{actual_seed}.png"
return f"{timestamp}_{actual_seed}.png"
async def _generate_single(
client: ComfyUIClient,
prompt: str,
negative_prompt: str,
width: int,
height: int,
steps: int,
seed: int,
model: str,
resolved_output_dir: Path,
name: str,
label: str,
) -> list:
"""Generate a single image and return [TextContent, ImageContent] or [TextContent] on error.
Args:
prompt: Text description of the image to generate.
width: Image width in pixels (default: 1024).
height: Image height in pixels (default: 1024).
steps: Number of inference steps. FLUX.1-schnell works well at 4.
model: ComfyUI model filename (default: flux1-schnell.safetensors).
seed: Random seed for reproducibility. -1 = random.
negative_prompt: Things to exclude from the image (optional).
output_dir: Override output directory. Defaults to IMAGE_OUTPUT_DIR env var
or ~/Pictures/mcp-generated.
Returns:
[TextContent(path + metadata), ImageContent(base64 PNG)]
client: ComfyUIClient instance.
prompt: Positive text prompt.
negative_prompt: Negative text prompt.
width / height: Image dimensions.
steps: Inference steps.
seed: Seed value (-1 = random).
model: ComfyUI model filename.
resolved_output_dir: Resolved output directory Path.
name: User-supplied name prefix (unsanitized).
label: Human-readable label for TextContent prefix (e.g. "[lumen 1/3]").
"""
# Resolve output directory
resolved_output_dir = Path(
output_dir or IMAGE_OUTPUT_DIR
).expanduser().resolve()
client = ComfyUIClient(COMFYUI_URL)
# Build and submit workflow
try:
workflow = build_flux_workflow(
@@ -178,14 +195,13 @@ async def generate_image(
model=model,
)
actual_seed = workflow["_meta"]["actual_seed"]
prompt_id = await client.queue_prompt(workflow)
except httpx.ConnectError:
return [
TextContent(
type="text",
text=(
f"ComfyUI not reachable at {COMFYUI_URL}. "
f"{label} ComfyUI not reachable at {COMFYUI_URL}. "
"Start it with: python main.py --listen"
),
)
@@ -194,7 +210,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"ComfyUI returned an error: {e.response.status_code}{e.response.text}",
text=f"{label} ComfyUI returned an error: {e.response.status_code}{e.response.text}",
)
]
@@ -207,7 +223,7 @@ async def generate_image(
TextContent(
type="text",
text=(
f"Generation timed out after {COMFYUI_TIMEOUT}s. "
f"{label} Generation timed out after {COMFYUI_TIMEOUT}s. "
f"prompt_id={prompt_id} — use get_generation_status to check"
),
)
@@ -236,7 +252,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"Failed to retrieve generation history: {e}",
text=f"{label} Failed to retrieve generation history: {e}",
)
]
@@ -255,7 +271,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"No output image found in history for prompt_id={prompt_id}",
text=f"{label} No output image found in history for prompt_id={prompt_id}",
)
]
@@ -270,7 +286,7 @@ async def generate_image(
return [
TextContent(
type="text",
text=f"Failed to download generated image: {e}",
text=f"{label} Failed to download generated image: {e}",
)
]
@@ -278,14 +294,14 @@ async def generate_image(
try:
resolved_output_dir.mkdir(parents=True, exist_ok=True)
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"{timestamp}_{actual_seed}.png"
filename = _build_filename(name, timestamp, actual_seed)
out_path = resolved_output_dir / filename
out_path.write_bytes(image_bytes)
except OSError as e:
return [
TextContent(
type="text",
text=f"Cannot write to output directory: {resolved_output_dir}{e}",
text=f"{label} Cannot write to output directory: {resolved_output_dir}{e}",
)
]
@@ -296,7 +312,7 @@ async def generate_image(
TextContent(
type="text",
text=(
f"Generated: {out_path}\n"
f"{label} Generated: {out_path}\n"
f"Seed: {actual_seed}\n"
f"Elapsed: {elapsed:.1f}s\n"
f"Size: {width}x{height}, Steps: {steps}, Model: {model}"
@@ -310,6 +326,102 @@ async def generate_image(
]
# ---------------------------------------------------------------------------
# Tools
# ---------------------------------------------------------------------------
@mcp.tool()
async def generate_image(
prompt: str,
width: int = 1024,
height: int = 1024,
steps: int = 4,
model: str = "flux1-schnell.safetensors",
seed: int = -1,
negative_prompt: str = "",
output_dir: str = "",
name: str = "",
count: int = 1,
) -> list:
"""Generate an image from a text prompt using ComfyUI.
Returns both a file path (for persistence) and an inline base64 image
(for display in Claude / Roo Code chat).
Args:
prompt: Text description of the image to generate.
width: Image width in pixels (default: 1024).
height: Image height in pixels (default: 1024).
steps: Number of inference steps. FLUX.1-schnell works well at 4.
model: ComfyUI model filename (default: flux1-schnell.safetensors).
seed: Random seed for reproducibility. -1 = random.
When count > 1 and seed != -1, seeds are incremented per image
(seed, seed+1, seed+2, ...) to produce deterministic variation.
negative_prompt: Things to exclude from the image (optional).
output_dir: Override output directory. Defaults to IMAGE_OUTPUT_DIR env var
or ~/Pictures/mcp-generated.
name: Optional filename prefix. Saved as {name}_{timestamp}_{seed}.png.
Useful to avoid confusion with auto-generated timestamp filenames.
count: Number of images to generate (1–10). Each image is generated
sequentially. Partial failures are returned inline — the batch
continues even if one image fails.
Returns:
Flat interleaved list: [TextContent1, ImageContent1, TextContent2, ImageContent2, ...]
On error for any single image, that slot contains only [TextContent(error)].
"""
# Validate count
if count < 1:
return [
TextContent(
type="text",
text=f"count must be at least 1 (got {count}).",
)
]
if count > MAX_COUNT:
return [
TextContent(
type="text",
text=f"count must be at most {MAX_COUNT} (got {count}). Use multiple calls for larger batches.",
)
]
# Resolve output directory once
resolved_output_dir = Path(
output_dir or IMAGE_OUTPUT_DIR
).expanduser().resolve()
client = ComfyUIClient(COMFYUI_URL)
results = []
for i in range(1, count + 1):
# Compute seed for this image:
# - seed=-1 → each image gets an independent random seed
# - fixed seed → increment by i-1 for deterministic variation across the batch
image_seed = seed if seed == -1 else seed + (i - 1)
label = f"[{_sanitize_name(name) or 'image'} {i}/{count}]" if count > 1 else (
f"[{_sanitize_name(name)}]" if _sanitize_name(name) else ""
)
single_result = await _generate_single(
client=client,
prompt=prompt,
negative_prompt=negative_prompt,
width=width,
height=height,
steps=steps,
seed=image_seed,
model=model,
resolved_output_dir=resolved_output_dir,
name=name,
label=label,
)
results.extend(single_result)
return results
@mcp.tool()
async def list_available_models() -> list[str]:
"""List all checkpoint models available in ComfyUI.
+319 -1
@@ -14,6 +14,8 @@ import respx
import server
from server import (
ComfyUIClient,
_build_filename,
_sanitize_name,
build_flux_workflow,
generate_image,
get_generation_status,
@@ -100,6 +102,74 @@ def test_random_seed_generated():
assert "_meta" in wf2
# ---------------------------------------------------------------------------
# _sanitize_name — pure function
# ---------------------------------------------------------------------------
def test_sanitize_name_basic():
"""Simple alphanumeric name passes through unchanged."""
assert _sanitize_name("lumen_profile") == "lumen_profile"
def test_sanitize_name_spaces_to_underscores():
"""Spaces are converted to underscores."""
assert _sanitize_name("my cool image") == "my_cool_image"
def test_sanitize_name_special_chars_stripped():
"""Special characters (!, @, #, etc.) are stripped."""
result = _sanitize_name("hello! world@2024#")
assert "!" not in result
assert "@" not in result
assert "#" not in result
assert "hello" in result
assert "world" in result
def test_sanitize_name_empty_returns_empty():
"""Empty string or whitespace-only returns empty string."""
assert _sanitize_name("") == ""
assert _sanitize_name(" ") == ""
def test_sanitize_name_collapse_underscores():
"""Multiple consecutive underscores/hyphens are collapsed to one."""
result = _sanitize_name("lumen__profile")
assert "__" not in result
def test_sanitize_name_truncates_at_64():
"""Names longer than 64 chars are truncated."""
long_name = "a" * 100
result = _sanitize_name(long_name)
assert len(result) <= 64
# ---------------------------------------------------------------------------
# _build_filename — pure function
# ---------------------------------------------------------------------------
def test_build_filename_with_name():
"""When name is provided, filename includes it as prefix."""
filename = _build_filename("lumen", "20260406_120000", 12345)
assert filename == "lumen_20260406_120000_12345.png"
def test_build_filename_without_name():
"""When name is empty, filename is timestamp_seed.png."""
filename = _build_filename("", "20260406_120000", 12345)
assert filename == "20260406_120000_12345.png"
assert not filename.startswith("_")
def test_build_filename_sanitizes_name():
"""Name with spaces and special chars is sanitized before use in filename."""
filename = _build_filename("my image!", "20260406_120000", 99)
assert "!" not in filename
assert "my_image" in filename
assert filename.endswith("_20260406_120000_99.png")
# ---------------------------------------------------------------------------
# list_available_models
# ---------------------------------------------------------------------------
@@ -211,7 +281,255 @@ def test_get_output_directory_custom(monkeypatch, tmp_path):
# ---------------------------------------------------------------------------
# generate_image
# generate_image — count/name validation
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_generate_image_count_zero_returns_error():
"""count=0 → returns error TextContent without calling ComfyUI."""
result = await generate_image(prompt="a cat", count=0)
assert len(result) == 1
assert "count must be at least 1" in result[0].text
@pytest.mark.asyncio
async def test_generate_image_count_exceeds_max_returns_error():
"""count=11 (> MAX_COUNT=10) → returns error TextContent without calling ComfyUI."""
result = await generate_image(prompt="a cat", count=11)
assert len(result) == 1
assert "at most 10" in result[0].text
# ---------------------------------------------------------------------------
# generate_image — name parameter
# ---------------------------------------------------------------------------
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_with_name(
tmp_path, sample_image_bytes, mock_history_response, queue_empty, monkeypatch
):
"""name param → saved file has name as prefix."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(
return_value=httpx.Response(200, json={"prompt_id": "name-test-uuid"})
)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
mock_history_named = {
"name-test-uuid": {
"outputs": {
"9": {
"images": [
{
"filename": "mcp-image-gen_00001_.png",
"subfolder": "",
"type": "output",
}
]
}
},
"status": {"completed": True},
}
}
respx.get(f"{COMFYUI_BASE}/api/history/name-test-uuid").mock(
return_value=httpx.Response(200, json=mock_history_named)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
result = await generate_image(
prompt="lumen portrait",
name="lumen_profile",
output_dir=str(tmp_path),
)
assert len(result) == 2
saved_files = list(tmp_path.glob("lumen_profile_*.png"))
assert len(saved_files) == 1, f"Expected 1 file with 'lumen_profile_' prefix, got: {list(tmp_path.glob('*.png'))}"
# Path in TextContent also has the name prefix
assert "lumen_profile_" in result[0].text
# ---------------------------------------------------------------------------
# generate_image — count=2 batch
# ---------------------------------------------------------------------------
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_count_2(
tmp_path, sample_image_bytes, queue_empty, monkeypatch
):
"""count=2 → returns 4 content items (Text+Image per image), 2 files saved."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
# First image
mock_history_1 = {
"uuid-batch-1": {
"outputs": {"9": {"images": [{"filename": "img1.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
# Second image
mock_history_2 = {
"uuid-batch-2": {
"outputs": {"9": {"images": [{"filename": "img2.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(
side_effect=[
httpx.Response(200, json={"prompt_id": "uuid-batch-1"}),
httpx.Response(200, json={"prompt_id": "uuid-batch-2"}),
]
)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
respx.get(f"{COMFYUI_BASE}/api/history/uuid-batch-1").mock(
return_value=httpx.Response(200, json=mock_history_1)
)
respx.get(f"{COMFYUI_BASE}/api/history/uuid-batch-2").mock(
return_value=httpx.Response(200, json=mock_history_2)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
result = await generate_image(
prompt="a landscape",
count=2,
seed=100,
output_dir=str(tmp_path),
)
# 4 content items: [Text1, Image1, Text2, Image2]
assert len(result) == 4
assert result[0].type == "text"
assert result[1].type == "image"
assert result[2].type == "text"
assert result[3].type == "image"
# 2 files saved
saved_files = list(tmp_path.glob("*.png"))
assert len(saved_files) == 2
# Label contains batch index
assert "1/2" in result[0].text
assert "2/2" in result[2].text
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_count_2_fixed_seed_increments(
tmp_path, sample_image_bytes, queue_empty, monkeypatch
):
"""count=2 with fixed seed → seeds are incremented (seed, seed+1)."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
submitted_seeds = []
def capture_prompt(request):
body = json.loads(request.content)
seed_val = body["prompt"]["13"]["inputs"]["seed"]
submitted_seeds.append(seed_val)
idx = len(submitted_seeds)
return httpx.Response(200, json={"prompt_id": f"seed-test-{idx}"})
mock_history_1 = {
"seed-test-1": {
"outputs": {"9": {"images": [{"filename": "img1.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
mock_history_2 = {
"seed-test-2": {
"outputs": {"9": {"images": [{"filename": "img2.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(side_effect=capture_prompt)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
respx.get(f"{COMFYUI_BASE}/api/history/seed-test-1").mock(
return_value=httpx.Response(200, json=mock_history_1)
)
respx.get(f"{COMFYUI_BASE}/api/history/seed-test-2").mock(
return_value=httpx.Response(200, json=mock_history_2)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
await generate_image(
prompt="a test",
count=2,
seed=42,
output_dir=str(tmp_path),
)
assert submitted_seeds == [42, 43], f"Expected [42, 43], got {submitted_seeds}"
@respx.mock
@pytest.mark.asyncio
async def test_generate_image_count_partial_failure_continues(
tmp_path, sample_image_bytes, queue_empty, monkeypatch
):
"""count=2 where first image fails → error in slot 1, second image succeeds in slot 2."""
monkeypatch.setattr(server, "IMAGE_OUTPUT_DIR", str(tmp_path))
mock_history_2 = {
"uuid-ok": {
"outputs": {"9": {"images": [{"filename": "img2.png", "subfolder": "", "type": "output"}]}},
"status": {"completed": True},
}
}
respx.post(f"{COMFYUI_BASE}/api/prompt").mock(
side_effect=[
httpx.Response(500, json={"error": "GPU OOM"}), # first fails
httpx.Response(200, json={"prompt_id": "uuid-ok"}), # second succeeds
]
)
respx.get(f"{COMFYUI_BASE}/api/queue").mock(
return_value=httpx.Response(200, json=queue_empty)
)
respx.get(f"{COMFYUI_BASE}/api/history/uuid-ok").mock(
return_value=httpx.Response(200, json=mock_history_2)
)
respx.get(f"{COMFYUI_BASE}/api/view").mock(
return_value=httpx.Response(200, content=sample_image_bytes)
)
result = await generate_image(
prompt="a test",
count=2,
seed=10,
output_dir=str(tmp_path),
)
# First: error TextContent only (no ImageContent)
# Second: [TextContent, ImageContent]
assert len(result) == 3
assert result[0].type == "text"
assert "500" in result[0].text or "error" in result[0].text.lower()
assert result[1].type == "text"
assert result[2].type == "image"
# Only 1 file saved (the successful one)
saved_files = list(tmp_path.glob("*.png"))
assert len(saved_files) == 1
# ---------------------------------------------------------------------------
# generate_image — existing tests (kept intact)
# ---------------------------------------------------------------------------
@respx.mock