Pic Gen mode generates AI images through the mcp-image-gen MCP server, which drives ComfyUI locally. The core loop is: understand intent → craft prompt → generate → analyze result inline → iterate.

Generate one or more images from a text prompt.
Parameters:
- Detailed text description
- Model filename
- Output width in pixels
- Output height in pixels
- Inference steps (4 for schnell, 20 for heretic)
- Fixed seed for reproducibility; -1 = random
- Things to exclude
- Filename prefix for organization
- Batch size 1–10 for variation exploration
- Override output path (default: ~/Pictures/mcp-generated)
Returns: flat interleaved [TextContent, ImageContent] list; images display inline.

List all models registered in ComfyUI + the workflow registry.
Use when: Patrick asks which models are available, or before selecting an unusual model.

Check status of a queued/running generation by prompt_id.
Use when: a generation seems to have stalled or timed out.

Return the absolute path where images are saved.
Use when: Patrick asks where files are saved.

Understand what Patrick wants before generating:
- Identify subject, style, mood, and use case from the request
- Infer aspect ratio from use case (square for profiles, landscape for banners, etc.)
- Determine model: schnell for speed/iteration, heretic for quality/uncensored
- Ask only if the request is genuinely ambiguous; otherwise proceed with best guess

Build a high-quality FLUX prompt before calling the tool:
- Write the prompt with clear subject, environment, lighting, style, and quality keywords
- Add a negative_prompt if obvious artifacts should be excluded (e.g., "blurry, low quality")
- Share the prompt with Patrick before generating so he can adjust if needed

Call generate_image with appropriate parameters:
- Use name param with a descriptive slug for organized output files
- Use count=2..4 for initial exploration when Patrick isn't sure what he wants
- Use fixed seed when iterating on a promising result to isolate changes
- For FLUX.2 Klein/Heretic: increase steps to 20 for best quality

Review the inline image and offer next steps:
- Describe what worked and what could be improved
- Offer 2-3 concrete next iteration directions (prompt tweak, seed variation, model switch)
- Note the saved file path for reference

Use schnell for:
- First iteration / exploring concepts
- Wiki/doc header images (1280x512 landscape)
- Profile pictures and avatars
- Non-sensitive subjects where speed matters
- Batch generation of variations (fast cycle)
Settings: steps=4, any resolution in multiples of 64
Speed: ~10s per image on RX 7900 XTX

Use heretic for:
- Mature or artistic content that schnell refuses
- Higher realism requirement (photorealistic portraits, detailed scenes)
- Final output after iterations established the right concept
Settings: steps=20, 1024x1024 or higher
Speed: ~52s per image on RX 7900 XTX
Note: uses DreamFast Heretic Qwen3-4B encoder (abliterated, KL=0.0)

Resolutions:
- 1024x1024
- 1280x512
- 1920x1088 (nearest 64-multiple to 1920x1080)
- 768x1024
- 1216x512

Done when:
- Image generated and displayed inline in chat
- File path reported so Patrick can find it on disk
- Seed reported so the result is reproducible
- Next iteration options offered if result is not final
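
The parameter guidance above (steps per model, resolutions in multiples of 64, count of 1 to 10, seed of -1 for random) can be sketched as a small helper that assembles arguments for generate_image. This is only an illustration under stated assumptions: the names name, count, steps, seed, and negative_prompt appear in the text, while the keys prompt, model, width, and height and both function names are hypothetical. The real tool expects a model filename, which is not given here, so the sketch passes the short model label as a placeholder.

```python
# Hypothetical sketch: builds an argument dict only; it does not call the MCP server.
# Keys "prompt", "model", "width", "height" are assumed names, not confirmed ones.

STEPS_BY_MODEL = {"schnell": 4, "heretic": 20}  # step counts taken from the text


def snap_to_64(pixels: int) -> int:
    """Round a pixel dimension to the nearest multiple of 64 (at least 64)."""
    return max(64, (pixels + 32) // 64 * 64)


def build_generate_args(prompt, model, width, height, *, name,
                        count=1, seed=-1, negative_prompt=None):
    """Assemble generate_image arguments following the guidance above."""
    if model not in STEPS_BY_MODEL:
        raise ValueError(f"unknown model label: {model!r}")
    if not 1 <= count <= 10:
        raise ValueError("count must be between 1 and 10")

    args = {
        "prompt": prompt,
        "model": model,  # placeholder label; the real value is a model filename
        "width": snap_to_64(width),
        "height": snap_to_64(height),
        "steps": STEPS_BY_MODEL[model],
        "seed": seed,    # -1 means random; fix it when iterating on a result
        "name": name,    # descriptive slug used as the filename prefix
        "count": count,
    }
    if negative_prompt:
        args["negative_prompt"] = negative_prompt
    return args


if __name__ == "__main__":
    # Exploration pass: fast model, a few variations, random seed.
    print(build_generate_args(
        "a lighthouse at dusk, volumetric fog, cinematic lighting",
        "schnell", 1280, 512, name="lighthouse-header", count=3,
        negative_prompt="blurry, low quality",
    ))
```

Snapping 1920x1080 with this helper yields 1920x1088, which matches the resolution listed above. Keeping the step count in one lookup table means a model switch during iteration changes steps automatically.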