A general-purpose image generation and editing skill supporting six providers: OpenAI, Gemini, Seedream (Volcengine Ark), Qwen (DashScope), MiniMax, and LinkAI. Configure any one provider’s key to start using it; configure multiple to enable automatic fallback.

Supported Models

| Provider | Models / Aliases | Notes |
| --- | --- | --- |
| OpenAI | gpt-image-2, gpt-image-1 | General-purpose, high quality, supports the quality parameter |
| Gemini Nano Banana | nano-banana-2, nano-banana-pro, nano-banana | Correspond to the image variants of gemini-3.1-flash, gemini-3-pro, and gemini-2.5-flash, respectively |
| Seedream (Volcengine Ark) | seedream-5.0-lite, seedream-4.5 | Native 2K–4K, up to 14 reference images for fusion |
| Qwen (DashScope) | qwen-image-2.0, qwen-image-2.0-pro | Strong at Chinese text rendering and text-image layouts |
| MiniMax | image-01 | Fast and simple |
| LinkAI | Any model | Universal gateway, used as fallback |

Model Selection

By default, “auto routing + automatic fallback” is used:
  1. Pick the first configured provider in the order OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI
  2. On errors such as 401, model not enabled, or network issues, automatically switch to the next provider
  3. If the user specifies a model in the conversation (e.g. “use seedream to draw a cat”), the corresponding provider is promoted to the front
To pin a specific model:
{
  "skills": {
    "image-generation": {
      "model": "seedream-5.0-lite"
    }
  }
}
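The three routing steps described earlier in this section can be sketched roughly as follows. This is a minimal illustration and not the skill's actual code: the lowercase provider names, `provider_order`, `generate_with_fallback`, and the `call` callback are all hypothetical names introduced here.

```python
from typing import Callable, Optional

# Default priority order, taken from step 1 of the routing description.
DEFAULT_ORDER = ["openai", "gemini", "seedream", "qwen", "minimax", "linkai"]


def provider_order(configured: set[str], requested: Optional[str] = None) -> list[str]:
    """Return configured providers in priority order, promoting `requested` if given."""
    order = [p for p in DEFAULT_ORDER if p in configured]
    if requested in order:
        # Step 3: a provider named by the user moves to the front.
        order.remove(requested)
        order.insert(0, requested)
    return order


def generate_with_fallback(prompt: str, order: list[str],
                           call: Callable[[str, str], str]) -> str:
    """Try each provider in turn and fall through to the next one on any error."""
    errors = []
    for provider in order:
        try:
            return call(provider, prompt)  # hypothetical per-provider request
        except Exception as exc:  # e.g. 401, model not enabled, network issue
            errors.append(f"{provider}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With keys configured for OpenAI, Seedream, and Qwen, a request that names Seedream would be tried in the order seedream, openai, qwen under this sketch.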

Configuring API Keys

It is recommended to configure providers from the “Model Management” page in the Web console. Chat model keys configured there are automatically reused by the image generation skill — no need to set them twice. You can also edit the configuration file manually or temporarily set keys in a conversation using the env_config tool.
Credentials are shared with the main model providers:

| Field | Provider |
| --- | --- |
| openai_api_key | OpenAI |
| gemini_api_key | Gemini |
| ark_api_key | Volcengine Ark (Seedream) |
| dashscope_api_key | Alibaba DashScope (Qwen) |
| minimax_api_key | MiniMax |
| linkai_api_key | LinkAI |
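To check which providers are usable given the key fields listed above, something like the following could work. It is only a sketch: it assumes the configuration has been loaded into a flat Python dict keyed by those field names, which this page does not confirm, and `configured_providers` is a hypothetical helper.

```python
# Config field -> provider, copied from the credentials table.
KEY_FIELDS = {
    "openai_api_key": "OpenAI",
    "gemini_api_key": "Gemini",
    "ark_api_key": "Volcengine Ark (Seedream)",
    "dashscope_api_key": "Alibaba DashScope (Qwen)",
    "minimax_api_key": "MiniMax",
    "linkai_api_key": "LinkAI",
}


def configured_providers(config: dict) -> list[str]:
    """List providers whose key field holds a non-empty value (assumed flat config)."""
    return [
        name
        for field, name in KEY_FIELDS.items()
        if str(config.get(field) or "").strip()
    ]
```

An empty list here would correspond to the "needs configuration" state, while two or more entries would make automatic fallback possible.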

Enabling and Disabling

The skill automatically adjusts its status based on API keys:
  • Key configured: the Agent calls the skill directly when it receives a drawing request
  • Key not configured: the skill still appears in context (marked as “needs configuration”) — the Agent will guide the user to set up a key
To control it manually:
/skill disable image-generation    # Disable
/skill enable image-generation     # Re-enable
Equivalent terminal commands: cow skill disable image-generation / cow skill enable image-generation.

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | Yes | | Image description |
| image_url | string / list | No | null | Input image for editing, given as a local path or URL; pass a list for multi-image fusion |
| quality | string | No | auto | low / medium / high; supported only by some providers |
| size | string | No | auto | 512 / 1K / 2K / 3K / 4K, or a pixel value such as 1024x1024 |
| aspect_ratio | string | No | null | 1:1 / 3:2 / 2:3 / 16:9 / 9:16 / 21:9; Gemini also supports 1:4 / 4:1 / 1:8 / 8:1 |

Higher quality and larger sizes cost more and take longer. For everyday conversations, use the defaults (auto) or quality=low with size=1K, which takes about 20 seconds per image. For posters, or when high resolution is explicitly requested, use quality=high with size=2K or 4K, which may take 1–5 minutes.
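The quality and size guidance above could be turned into parameter presets along these lines. This is a hedged sketch: `build_params`, the `high_res` flag, and the lowercase provider names are hypothetical, and the set of providers without quality support is taken from the notes on this page.

```python
from typing import Optional, Union

# Providers that do not support `quality`, per the notes in this document.
NO_QUALITY = {"gemini", "seedream", "qwen", "minimax"}


def build_params(prompt: str, provider: str, high_res: bool = False,
                 image_url: Optional[Union[str, list]] = None) -> dict:
    """Assemble skill parameters for an everyday or a high-resolution request."""
    params = {"prompt": prompt}
    if image_url is not None:
        # A single path/URL edits one image; a list requests multi-image fusion.
        params["image_url"] = image_url
    params["size"] = "2K" if high_res else "1K"
    if provider not in NO_QUALITY:
        params["quality"] = "high" if high_res else "low"
    return params
```

For example, `build_params("a cat", "openai")` yields the low-cost preset, while the same call for Seedream omits `quality` entirely.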

Common Use Cases

  • Text-to-image: generate illustrations, posters, icons, avatars, storyboards, etc. from a description
  • Image-to-image: change styles, swap elements, add decorations or text on an existing image
  • Multi-image fusion: combine multiple reference images into one (outfit swaps, character group photos, etc.)

Notes

  • Set the Bash timeout to 600 seconds: each provider has a 300-second HTTP timeout, and the script may try several providers in sequence
  • Input images are automatically compressed to ≤ 4 MB, with the longest edge ≤ 4096 px
  • Gemini / Seedream / Qwen / MiniMax do not support the quality parameter
  • Seedream defaults to 2K; seedream-5.0-lite supports up to 3K; seedream-4.5 supports up to 4K
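To illustrate the 4096 px longest-edge limit on input images, the target dimensions can be computed as below. This sketch covers only the dimension arithmetic, assuming the aspect ratio is preserved; the actual resizing and the 4 MB compression step are not shown, and `fit_longest_edge` is a hypothetical helper.

```python
MAX_EDGE = 4096  # longest-edge limit stated in the notes


def fit_longest_edge(width: int, height: int, max_edge: int = MAX_EDGE) -> tuple[int, int]:
    """Scale (width, height) down so the longest edge is at most `max_edge`."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # already within the limit, leave unchanged
    scale = max_edge / longest
    # Round to whole pixels and never let an edge collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Under these assumptions an 8192x4096 input would be reduced to 4096x2048, and a 1024x768 input would pass through unchanged.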