chore(homelab): add homelab plans, frpc deploy script, odysseus workspace, heretic docs
# Task: Add ESRGAN Upscaler to mcp-image-gen

**Date:** 2026-04-10
**Status:** Ready to implement
**Depends on:** mcp-image-gen working ✅, FLUX.2 Klein Heretic working ✅

---

## Goal

Add an `upscale_image()` MCP tool that takes an existing PNG path (from a previous `generate_image()` call) and upscales it 2× or 4× using a Real-ESRGAN model. There is **no diffusion re-generation**, only fast post-processing (~5–10s).

Result: a 1024×1024 → 4096×4096 pipeline in two tool calls:
```python
result = generate_image("...", model="flux-2-klein-4b.safetensors", steps=20)
# → ~/Pictures/mcp-generated/foo_20260410_123456_12345.png

upscaled = upscale_image(
    input_path="~/Pictures/mcp-generated/foo_20260410_123456_12345.png",
    scale=4
)
# → ~/Pictures/mcp-generated/foo_20260410_123456_12345_4x.png (4096×4096)
```

---

## Why ESRGAN (Option B) over Latent Upscale

| Method | Time overhead | Quality | Requires diffusion? |
|--------|--------------|---------|---------------------|
| ESRGAN image upscale | ~5–10s | ✅ Very sharp details | ❌ No |
| Latent upscale + KSampler | ~50% extra gen time | ✅ Good, consistent style | ✅ Yes |
| UltimateSDUpscale (tiled) | ~4× gen time | ✅ Highest quality | ✅ Yes |

ESRGAN is the clear winner for "I want a bigger version of this image quickly."

---

## Model to Use

**`4x-UltraSharp.pth`** is a widely used community model for photorealistic upscaling.

- Source: https://huggingface.co/Kim2091/UltraSharp
- Download: `huggingface-cli download Kim2091/UltraSharp 4x-UltraSharp.pth --local-dir ~/ComfyUI/models/upscale_models/`
- Size: ~67MB
- Scale factor: 4× (for 2×, resize the 4× result afterwards)

Alternative: `RealESRGAN_x4plus.pth` (in ComfyUI's model downloader, general purpose)

---

## ComfyUI Workflow: `esrgan_upscale.json`

Minimal workflow with 4 nodes:

```
LoadImage → UpscaleModelLoader + ImageUpscaleWithModel → SaveImage
```

Node layout:

```json
{
  "1": {
    "class_type": "LoadImage",
    "inputs": {
      "image": "__INPUT_PATH__"
    }
  },
  "2": {
    "class_type": "UpscaleModelLoader",
    "inputs": {
      "model_name": "4x-UltraSharp.pth"
    }
  },
  "3": {
    "class_type": "ImageUpscaleWithModel",
    "inputs": {
      "upscale_model": ["2", 0],
      "image": ["1", 0]
    }
  },
  "4": {
    "class_type": "SaveImage",
    "inputs": {
      "images": ["3", 0],
      "filename_prefix": "__OUTPUT_PREFIX__"
    }
  }
}
```

**Note:** `LoadImage` in ComfyUI requires the image to be in `~/ComfyUI/input/`, so the workflow builder must copy the input file there first (or use `ETN_LoadImageBase64` if available). See "Implementation Notes" below.

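A minimal sketch of how the two placeholders in the node layout could be filled in before submission. The helper name `build_upscale_workflow` and the inline `ESRGAN_TEMPLATE` dict are hypothetical (the plan stores the template in `esrgan_upscale.json`). Wrapping the result as `{"prompt": ...}` assumes the workflow is submitted through ComfyUI's `POST /prompt` endpoint.

```python
import copy
import json

# Hypothetical in-memory copy of esrgan_upscale.json (same node layout as above).
ESRGAN_TEMPLATE = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "__INPUT_PATH__"}},
    "2": {"class_type": "UpscaleModelLoader", "inputs": {"model_name": "4x-UltraSharp.pth"}},
    "3": {"class_type": "ImageUpscaleWithModel",
          "inputs": {"upscale_model": ["2", 0], "image": ["1", 0]}},
    "4": {"class_type": "SaveImage",
          "inputs": {"images": ["3", 0], "filename_prefix": "__OUTPUT_PREFIX__"}},
}


def build_upscale_workflow(template: dict, input_name: str, output_prefix: str) -> dict:
    """Return a copy of the template with both placeholders replaced."""
    workflow = copy.deepcopy(template)  # never mutate the shared template
    workflow["1"]["inputs"]["image"] = input_name
    workflow["4"]["inputs"]["filename_prefix"] = output_prefix
    return workflow


workflow = build_upscale_workflow(ESRGAN_TEMPLATE, "foo.png", "foo_4x")
payload = json.dumps({"prompt": workflow})  # assumed request body for POST /prompt
```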

---

## MCP Tool Signature

Add to [`mcp/mcp-image-gen/src/server.py`](../mcp/mcp-image-gen/src/server.py):

```python
# Assumed to be imported in server.py already; listed here for completeness.
from typing import Annotated
from pydantic import Field


@mcp.tool()
async def upscale_image(
    input_path: Annotated[str, Field(description="Path to input PNG (absolute or ~-relative). Must be a file previously generated by generate_image().")],
    scale: Annotated[int, Field(description="Upscale factor: 2 or 4 (default: 4). 4x-UltraSharp always runs at 4x; scale=2 applies a 0.5 resize after.")] = 4,
    output_dir: Annotated[str, Field(description="Override output directory. Defaults to same dir as input_path.")] = "",
    name: Annotated[str, Field(description="Optional output filename prefix. Defaults to input filename + _4x or _2x.")] = "",
) -> list:
    """Upscale an existing image using Real-ESRGAN (4x-UltraSharp).

    No diffusion re-generation; pure post-processing (~5-10s).
    Input must be a PNG file. Output is saved alongside the input by default.

    Returns both a file path and an inline base64 image for display.
    """
```

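The signature implies a few checks before any workflow is submitted: the scale must be 2 or 4, the path must exist, and the file must be a PNG. The following is one possible sketch of that validation. The helper name `validate_upscale_request` is hypothetical, and raising `ValueError` is an assumption: the tool itself would presumably convert such failures into the error `TextContent` listed under Test Cases.

```python
from pathlib import Path

ALLOWED_SCALES = (2, 4)


def validate_upscale_request(input_path: str, scale: int) -> Path:
    """Resolve input_path and raise ValueError if the request cannot be served."""
    if scale not in ALLOWED_SCALES:
        raise ValueError(f"scale must be one of {ALLOWED_SCALES}, got {scale}")
    path = Path(input_path).expanduser().resolve()  # handles ~-relative paths
    if not path.is_file():
        raise ValueError(f"input file not found: {path}")
    if path.suffix.lower() != ".png":
        raise ValueError(f"input must be a PNG file, got: {path.name}")
    return path
```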

---

## Implementation Notes

### The `LoadImage` ComfyUI constraint

ComfyUI's built-in `LoadImage` node only accepts filenames relative to `~/ComfyUI/input/`, not arbitrary paths. Two solutions:

**Solution A (simplest):** Copy input to `~/ComfyUI/input/` before submitting workflow, use basename as `image` param, delete after.

**Solution B:** Use the `ETN_LoadImageBase64` node (part of the `ComfyUI-ETN` custom node extension, usually installed as `comfyui-tooling-nodes`), which accepts a base64-encoded image directly. Check if installed:
```bash
ls ~/ComfyUI/custom_nodes/ | grep -i -E 'etn|tooling'
```

**Recommended:** Start with Solution A (copy to input dir), which has no dependencies. If `ComfyUI-ETN` is present, prefer Solution B for cleanliness.

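Solution A can be sketched as a context manager, so the staged copy is removed even if the workflow fails. The name `staged_input` is hypothetical. The random suffix on the staged filename is an added assumption, not part of the plan; it avoids clobbering an unrelated file that already sits in the input directory.

```python
import shutil
import uuid
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def staged_input(src: Path, input_dir: Path):
    """Copy src into ComfyUI's input dir under a unique name; delete it on exit."""
    input_dir.mkdir(parents=True, exist_ok=True)
    staged = input_dir / f"{src.stem}_{uuid.uuid4().hex[:8]}{src.suffix}"
    shutil.copy2(src, staged)
    try:
        yield staged.name  # LoadImage expects the basename only
    finally:
        staged.unlink(missing_ok=True)  # clean up even when the workflow raises
```

Inside the tool this would wrap the workflow submission, for example `with staged_input(path, Path("~/ComfyUI/input").expanduser()) as image_name: ...`.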

### Scale=2 handling

`4x-UltraSharp.pth` always outputs 4×. For `scale=2`, upscale at 4×, then resize the result to 50% with PIL before saving. This is still sharper than native 2× bilinear upscaling.

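A small sketch of the `scale=2` path, assuming Pillow is installed. The names `target_size` and `downscale_to_2x` are hypothetical, and Lanczos resampling is an assumption; the plan only says "resize with PIL". Computing the target from the original size makes the expected output dimensions explicit.

```python
from __future__ import annotations


def target_size(original: tuple[int, int], scale: int) -> tuple[int, int]:
    """Final output size for a given scale, relative to the original input."""
    width, height = original
    return (width * scale, height * scale)


def downscale_to_2x(upscaled_path: str, original_size: tuple[int, int], out_path: str) -> None:
    """Resize a 4x result down to 2x of the original image."""
    from PIL import Image  # Pillow, imported lazily so target_size has no dependencies

    with Image.open(upscaled_path) as img:
        img.resize(target_size(original_size, 2), Image.LANCZOS).save(out_path)
```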

### Output filename convention

- Input: `foo_20260410_123456_12345.png`
- Output `scale=4`: `foo_20260410_123456_12345_4x.png`
- Output `scale=2`: `foo_20260410_123456_12345_2x.png`

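The convention above, combined with the `output_dir` and `name` parameters, might be derived as follows. The helper name `upscaled_output_path` is hypothetical, and using a custom `name` verbatim (with no scale suffix appended) is an assumption, since the plan does not say how the two interact.

```python
from pathlib import Path


def upscaled_output_path(input_path: str, scale: int, output_dir: str = "", name: str = "") -> Path:
    """Build the output path: <input stem>_<scale>x.png next to the input by default."""
    src = Path(input_path).expanduser()
    out_dir = Path(output_dir).expanduser() if output_dir else src.parent
    stem = name if name else f"{src.stem}_{scale}x"  # assumption: custom name is used as-is
    return out_dir / f"{stem}.png"
```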

---

## Files to Create/Modify

| File | Change |
|------|--------|
| [`mcp/mcp-image-gen/src/workflows/esrgan_upscale.json`](../mcp/mcp-image-gen/src/workflows/esrgan_upscale.json) | New: ESRGAN workflow |
| [`mcp/mcp-image-gen/src/server.py`](../mcp/mcp-image-gen/src/server.py) | Add `upscale_image()` tool + helpers |
| [`mcp/mcp-image-gen/tests/test_upscale.py`](../mcp/mcp-image-gen/tests/test_upscale.py) | New test file |

**No changes to:** workflow registry, existing tools, `generate_image()`.

---

## Pre-flight: Download Model

```bash
huggingface-cli download Kim2091/UltraSharp \
  4x-UltraSharp.pth \
  --local-dir ~/ComfyUI/models/upscale_models/
```

Verify ComfyUI sees it:
```bash
curl -s http://localhost:8188/object_info/UpscaleModelLoader | \
  python3 -c "import sys,json; d=json.load(sys.stdin); print('\n'.join(d['UpscaleModelLoader']['input']['required']['model_name'][0]))"
```

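The same check could run inside the server so that a missing model yields a clear error before a workflow is submitted. This is a sketch only: both function names are hypothetical, and the parsing assumes the response shape used by the one-liner above.

```python
from __future__ import annotations

import json
import urllib.request


def available_upscale_models(object_info: dict) -> list[str]:
    """Extract model filenames from an /object_info/UpscaleModelLoader response."""
    return list(object_info["UpscaleModelLoader"]["input"]["required"]["model_name"][0])


def fetch_object_info(base_url: str = "http://localhost:8188") -> dict:
    """Fetch the node description from a running ComfyUI instance."""
    url = f"{base_url}/object_info/UpscaleModelLoader"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Example with a canned response, so no running server is needed:
sample = {"UpscaleModelLoader": {"input": {"required": {
    "model_name": [["4x-UltraSharp.pth", "RealESRGAN_x4plus.pth"]]}}}}
model_ready = "4x-UltraSharp.pth" in available_upscale_models(sample)
```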

---

## Test Cases

| Test | Input | Expected |
|------|-------|----------|
| `test_upscale_4x` | 1024×1024 PNG | 4096×4096 PNG, `_4x.png` suffix |
| `test_upscale_2x` | 1024×1024 PNG | 2048×2048 PNG, `_2x.png` suffix |
| `test_invalid_path` | nonexistent path | Error TextContent returned |
| `test_output_dir_override` | valid PNG + `output_dir=/tmp` | saved to /tmp |
| `test_default_output_dir` | valid PNG, no output_dir | saved alongside input |

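The dimension checks in `test_upscale_4x` and `test_upscale_2x` do not need an image library, because a PNG stores its width and height in the IHDR chunk directly after the 8-byte signature. A sketch of such a test helper follows; the name `png_dimensions` is hypothetical. A test could then assert, for example, `png_dimensions(Path(output).read_bytes()[:24]) == (4096, 4096)`.

```python
from __future__ import annotations

import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then width and height.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])  # big-endian unsigned ints
    return width, height
```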

---

## Success Criteria

- [ ] `4x-UltraSharp.pth` present in `~/ComfyUI/models/upscale_models/`
- [ ] `upscale_image("path/to/1024.png", scale=4)` returns 4096×4096 PNG
- [ ] Output file saved with `_4x.png` suffix
- [ ] Inline base64 image returned for display in chat
- [ ] All 5 test cases pass
- [ ] No changes to existing `generate_image()` tests