A general-purpose image generation and editing skill supporting six providers: OpenAI, Gemini, Seedream (Volcengine Ark), Qwen (DashScope), MiniMax, and LinkAI. Configure any one provider’s key to start using it; configure multiple to enable automatic fallback.
Supported Models
| Provider | Models / Aliases | Notes |
|---|---|---|
| OpenAI | gpt-image-2, gpt-image-1 | General-purpose, high quality, supports quality parameter |
| Gemini Nano Banana | nano-banana-2, nano-banana-pro, nano-banana | Corresponds to the image variants of gemini-3.1-flash, gemini-3-pro, gemini-2.5-flash |
| Seedream (Volcengine Ark) | seedream-5.0-lite, seedream-4.5 | Native 2K–4K, up to 14 reference images for fusion |
| Qwen (DashScope) | qwen-image-2.0, qwen-image-2.0-pro | Strong with Chinese text rendering and text-image layouts |
| MiniMax | image-01 | Fast and simple |
| LinkAI | Any model | Universal gateway, used as fallback |
Model Selection
By default, “auto routing + automatic fallback” is used:
- Pick the first configured provider in this order: OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI
- On errors such as 401, model not enabled, or network issues, automatically switch to the next provider
- If the user specifies a model in the conversation (e.g. “use seedream to draw a cat”), the corresponding provider is promoted to the front
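The routing rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's actual implementation: the lowercase provider names, the `provider_order` and `generate_with_fallback` functions, the `ProviderError` exception, and the one-callable-per-provider layout are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

# Default priority order, taken from the list above.
DEFAULT_ORDER = ["openai", "gemini", "seedream", "qwen", "minimax", "linkai"]


class ProviderError(Exception):
    """Hypothetical error for a failed call (401, model not enabled, network issue)."""


def provider_order(configured: List[str], requested: Optional[str] = None) -> List[str]:
    """Return configured providers in priority order, promoting `requested` to the front."""
    order = [p for p in DEFAULT_ORDER if p in configured]
    if requested in order:
        order.remove(requested)
        order.insert(0, requested)
    return order


def generate_with_fallback(
    prompt: str,
    backends: Dict[str, Callable[[str], str]],
    requested: Optional[str] = None,
) -> str:
    """Try each configured provider in turn and return the first successful result."""
    errors = []
    for name in provider_order(list(backends), requested):
        try:
            return backends[name](prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # record the failure, then try the next one
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

With only Gemini, Seedream, and LinkAI configured, `provider_order` yields Gemini first. Passing `requested="seedream"` moves Seedream to the front while the rest keep their relative order.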
To pin a specific model:
{
  "skills": {
    "image-generation": {
      "model": "seedream-5.0-lite"
    }
  }
}
Configuring API Keys
It is recommended to configure providers from the “Model Management” page in the Web console. Chat model keys configured there are automatically reused by the image generation skill — no need to set them twice. You can also edit the configuration file manually or temporarily set keys in a conversation using the env_config tool.
Credentials are shared with the main model providers:
| Field | Provider |
|---|---|
| openai_api_key | OpenAI |
| gemini_api_key | Gemini |
| ark_api_key | Volcengine Ark (Seedream) |
| dashscope_api_key | Alibaba DashScope (Qwen) |
| minimax_api_key | MiniMax |
| linkai_api_key | LinkAI |
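As a rough illustration of how these fields map to providers, the sketch below works out which providers count as configured. It assumes the configuration has already been loaded into a flat dictionary. The `configured_providers` helper and the lowercase provider names are hypothetical and may not match how the skill reads its configuration.

```python
from typing import Dict, List

# Config field -> provider, as listed in the table above.
KEY_FIELDS = {
    "openai_api_key": "openai",
    "gemini_api_key": "gemini",
    "ark_api_key": "seedream",
    "dashscope_api_key": "qwen",
    "minimax_api_key": "minimax",
    "linkai_api_key": "linkai",
}


def configured_providers(config: Dict[str, str]) -> List[str]:
    """Return the providers whose key field is present and not blank."""
    return [
        provider
        for field, provider in KEY_FIELDS.items()
        if str(config.get(field) or "").strip()
    ]
```

A blank or whitespace-only key is treated as not configured in this sketch, so `{"ark_api_key": "x", "openai_api_key": "  "}` yields only Seedream.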
Enabling and Disabling
The skill automatically adjusts its status based on API keys:
- Key configured: the Agent calls the skill directly when it receives a drawing request
- Key not configured: the skill still appears in context (marked as “needs configuration”) — the Agent will guide the user to set up a key
To control it manually:
/skill disable image-generation # Disable
/skill enable image-generation # Re-enable
Equivalent terminal commands: cow skill disable image-generation / cow skill enable image-generation.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | — | Image description |
| image_url | string / list | No | null | Input image for editing (local path or URL); pass a list for multi-image fusion |
| quality | string | No | auto | low / medium / high, supported only by some providers |
| size | string | No | auto | 512 / 1K / 2K / 3K / 4K, or a pixel value such as 1024x1024 |
| aspect_ratio | string | No | null | 1:1 / 3:2 / 2:3 / 16:9 / 9:16 / 21:9; Gemini also supports 1:4 / 4:1 / 1:8 / 8:1 |
Higher quality and larger sizes cost more and take longer. For everyday conversations, use the defaults (auto) or quality=low + size=1K, which takes about 20 seconds per image. For posters, or when high resolution is explicitly requested, use quality=high + size=2K/4K, which may take 1–5 minutes.
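The two presets described above could be assembled as follows. This is a minimal sketch: `build_params` is a hypothetical helper, the prompts are invented examples, and the way the skill actually receives its parameters may differ. Only the parameter names and allowed quality values come from the table.

```python
from typing import Dict, Optional

# Allowed quality values from the parameter table, plus the "auto" default.
ALLOWED_QUALITY = {"auto", "low", "medium", "high"}


def build_params(
    prompt: str,
    quality: str = "auto",
    size: str = "auto",
    aspect_ratio: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble a parameter dict, rejecting an empty prompt or unknown quality."""
    if not prompt:
        raise ValueError("prompt is required")
    if quality not in ALLOWED_QUALITY:
        raise ValueError(f"unsupported quality: {quality}")
    params = {"prompt": prompt, "quality": quality, "size": size}
    if aspect_ratio is not None:
        params["aspect_ratio"] = aspect_ratio  # omitted when left at its null default
    return params


# Everyday request: fast and cheap.
everyday = build_params("a cat on a windowsill", quality="low", size="1K")

# Poster request: slower, higher resolution.
poster = build_params("concert poster, bold title", quality="high", size="4K", aspect_ratio="2:3")
```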
Common Use Cases
- Text-to-image: generate illustrations, posters, icons, avatars, storyboards, etc. from a description
- Image-to-image: change styles, swap elements, add decorations or text on an existing image
- Multi-image fusion: combine multiple reference images into one (outfit swaps, character group photos, etc.)
Notes
- Bash timeout should be set to 600 seconds: each provider has a 300-second HTTP timeout, and the script may try multiple providers sequentially
- Input images are automatically compressed to ≤ 4 MB with the longest edge ≤ 4096 px
- Gemini / Seedream / Qwen / MiniMax do not support the quality parameter
- Seedream defaults to 2K; seedream-5.0-lite supports up to 3K; seedream-4.5 supports up to 4K
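The provider constraints above can be expressed as a small normalization step. The sketch below is an assumption about how such rules might be applied, not the skill's real code: `normalize` and `fit_longest_edge` are hypothetical helpers, and clamping an oversized Seedream request down to the model's maximum (rather than rejecting it) is a choice made only for this example.

```python
from typing import Dict, Tuple

# Providers that do not support `quality`, per the notes above.
NO_QUALITY = {"gemini", "seedream", "qwen", "minimax"}
# Largest size label per Seedream model, per the notes above.
SEEDREAM_MAX = {"seedream-5.0-lite": "3K", "seedream-4.5": "4K"}
SIZE_RANK = {"512": 0, "1K": 1, "2K": 2, "3K": 3, "4K": 4}


def normalize(provider: str, model: str, params: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of params adjusted to the provider's documented limits."""
    out = dict(params)
    if provider in NO_QUALITY:
        out.pop("quality", None)  # drop a parameter the provider does not support
    if provider == "seedream":
        size = out.get("size", "auto")
        if size == "auto":
            out["size"] = "2K"  # documented Seedream default
        elif size in SIZE_RANK and model in SEEDREAM_MAX:
            cap = SEEDREAM_MAX[model]
            if SIZE_RANK[size] > SIZE_RANK[cap]:
                out["size"] = cap  # assumption: clamp instead of raising an error
    return out


def fit_longest_edge(width: int, height: int, limit: int = 4096) -> Tuple[int, int]:
    """Scale dimensions down proportionally so the longest edge is at most `limit`."""
    longest = max(width, height)
    if longest <= limit:
        return width, height
    scale = limit / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For instance, a `size` of 4K requested from seedream-5.0-lite would be reduced to 3K under this sketch, and an 8192×4096 input image would be scaled to 4096×2048 before upload. The 4 MB file-size limit is not modeled here because it depends on the image encoder.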