Image Generation

Text to image, one command, multiple models

What It Does

A single Python script that turns text prompts into images via OpenRouter. Pick from multiple models — cheap and fast for iteration, or high-quality for final assets — and get a PNG back. That's it. No SDKs, no wrapper libraries, no config files. One script, one dependency (requests), one command.

Available Models

ModelCost / ImageBest For
gemini-2.5-flash-image~$0.03–0.05Default. Fast, cheap, reliable
gemini-3-pro-image-preview~$0.12–0.20Higher quality, complex scenes
gpt-5-image-minivariesOpenAI mini option
gpt-5-imagevariesFull GPT-5 image model

What's Included

Why I Built This

I needed to generate images — avatars, blog illustrations, covers — without leaving the terminal. Every existing tool was either a full SDK with ten config files or a web UI that required a browser. I wanted one script I could call from anywhere: "make me this image, put it here." OpenRouter gives access to multiple models through one API key, so the script supports all of them and defaults to the cheapest one that's actually good.

Quick Start

# Default model (Gemini 2.5 Flash — fast and cheap)
python3 skills/imagegen/imagegen.py "A red panda eating bamboo" -o panda.png

# Higher quality
python3 skills/imagegen/imagegen.py "A futuristic cityscape at dusk" -o city.png \
  -m google/gemini-3-pro-image-preview

# GPT-5
python3 skills/imagegen/imagegen.py "A minimalist logo" -o logo.png \
  -m openai/gpt-5-image
Download .zip ↓

Unzip into ~/.openclaw/workspace/skills/ and read the SKILL.md inside.
Requires OPENROUTER_API_KEY in ~/.openclaw/.env.

← Back to all skills