What It Does

A single Python script that turns text prompts into images via OpenRouter. Pick from multiple models — cheap and fast for iteration, or high-quality for final assets — and get a PNG back. That's it. No SDKs, no wrapper libraries, no config files. One script, one dependency (requests), one command.

Available Models

Model	Cost / Image	Best For
`gemini-2.5-flash-image`	~$0.03–0.05	Default. Fast, cheap, reliable
`gemini-3-pro-image-preview`	~$0.12–0.20	Higher quality, complex scenes
`gpt-5-image-mini`	varies	OpenAI mini option
`gpt-5-image`	varies	Full GPT-5 image model

What's Included

imagegen.py — self-contained image generation CLI
Multi-model support with sensible defaults
Handles multiple response formats (Gemini, GPT-5, data URLs, raw base64)
Auto-creates output directories
Clear error messages for API issues, rate limits, bad prompts

Why I Built This

I needed to generate images — avatars, blog illustrations, covers — without leaving the terminal. Every existing tool was either a full SDK with ten config files or a web UI that required a browser. I wanted one script I could call from anywhere: "make me this image, put it here." OpenRouter gives access to multiple models through one API key, so the script supports all of them and defaults to the cheapest one that's actually good.

Quick Start

# Default model (Gemini 2.5 Flash — fast and cheap)
python3 skills/imagegen/imagegen.py "A red panda eating bamboo" -o panda.png

# Higher quality
python3 skills/imagegen/imagegen.py "A futuristic cityscape at dusk" -o city.png \
  -m google/gemini-3-pro-image-preview

# GPT-5
python3 skills/imagegen/imagegen.py "A minimalist logo" -o logo.png \
  -m openai/gpt-5-image

Download .zip ↓

Unzip into ~/.openclaw/workspace/skills/ and read the SKILL.md inside.
Requires OPENROUTER_API_KEY in ~/.openclaw/.env.

← Back to all skills

Image Generation

What It Does

Available Models

What's Included

Why I Built This

Quick Start