Gemini - CowAgent

Google Gemini supports text chat, image understanding, and image generation (Nano Banana series). A single `gemini_api_key` enables all capabilities.

All capabilities below can be configured in one place via the “Model Management” page in the Web Console, with no need to manually edit the configuration file.

Text Chat

{
  "model": "gemini-3.5-flash",
  "gemini_api_key": "YOUR_API_KEY"
}

Parameter	Description
`model`	Recommended: `gemini-3.5-flash`. Also supported: `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, and others. See the official docs for the full list
`gemini_api_key`	Create one in Google AI Studio
`gemini_api_base`	Optional; defaults to `https://generativelanguage.googleapis.com`. Can be pointed at a third-party proxy instead
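
The parameters above can be assembled programmatically. The following is a minimal sketch, not CowAgent's own code: the helper name `build_text_chat_config` is hypothetical, and it assumes that omitting `gemini_api_base` is equivalent to using the documented default.

```python
import json

# Default endpoint as documented in the parameter table.
DEFAULT_API_BASE = "https://generativelanguage.googleapis.com"


def build_text_chat_config(api_key, model="gemini-3.5-flash", api_base=None):
    """Return a dict holding the text chat keys described in the table.

    Hypothetical helper for illustration; not part of CowAgent.
    """
    if not api_key:
        raise ValueError("gemini_api_key must not be empty")
    config = {"model": model, "gemini_api_key": api_key}
    # Assumption: leaving the key out is the same as using the default,
    # so gemini_api_base is written only when it points somewhere else.
    if api_base and api_base.rstrip("/") != DEFAULT_API_BASE:
        config["gemini_api_base"] = api_base.rstrip("/")
    return config


if __name__ == "__main__":
    # Print the fragment so it can be pasted into the configuration file.
    print(json.dumps(build_text_chat_config("YOUR_API_KEY"), indent=2))
```

Passing `api_base="https://proxy.example.com"` (a placeholder URL) would add a `gemini_api_base` entry to the result; with no `api_base`, the output matches the two-key example at the top of this section.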

Image Understanding

All Gemini models natively support vision. Once `gemini_api_key` is configured, the Agent’s Vision tool automatically uses the main model to recognize images, with no extra setup required. To use a different model for the Vision tool, set `tools.vision.model`:

{
  "tools": {
    "vision": {
      "model": "gemini-3.1-flash-lite-preview"
    }
  }
}
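
When the configuration is generated by a script, the override above amounts to setting one nested key. The sketch below shows one way to do that without disturbing other settings; `with_vision_model` is a hypothetical helper, not a CowAgent function.

```python
import copy


def with_vision_model(config, model):
    """Return a copy of config with tools.vision.model set to model.

    Hypothetical helper for illustration; not part of CowAgent.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    # Create the "tools" and "vision" levels only if they are missing.
    updated.setdefault("tools", {}).setdefault("vision", {})["model"] = model
    return updated


if __name__ == "__main__":
    base = {"model": "gemini-3.5-flash", "gemini_api_key": "YOUR_API_KEY"}
    print(with_vision_model(base, "gemini-3.1-flash-lite-preview"))
```

The main `model` stays in place for text chat; only the Vision tool is redirected to the model named in `tools.vision.model`.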

Image Generation

{
  "skills": {
    "image-generation": {
      "model": "gemini-3.1-flash-image-preview"
    }
  }
}

Model ID	Alias
`gemini-3.1-flash-image-preview`	Nano Banana 2
`gemini-3-pro-image-preview`	Nano Banana Pro
`gemini-2.5-flash-image`	Nano Banana
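
Because the configuration expects the model ID rather than the alias, a small lookup can help avoid typos. This is an illustrative sketch only: the mapping is copied from the table above, and `image_generation_config` is a hypothetical helper name.

```python
# Alias -> model ID, copied from the table above.
NANO_BANANA_MODELS = {
    "Nano Banana 2": "gemini-3.1-flash-image-preview",
    "Nano Banana Pro": "gemini-3-pro-image-preview",
    "Nano Banana": "gemini-2.5-flash-image",
}


def image_generation_config(name):
    """Build the image-generation fragment from an alias or a model ID.

    Hypothetical helper for illustration; not part of CowAgent.
    """
    # Accept either the alias or the model ID itself.
    model_id = NANO_BANANA_MODELS.get(name, name)
    if model_id not in NANO_BANANA_MODELS.values():
        raise ValueError(f"unknown image model: {name!r}")
    return {"skills": {"image-generation": {"model": model_id}}}


if __name__ == "__main__":
    print(image_generation_config("Nano Banana Pro"))
```

The check rejects any name that is not in the table, so a newly released model would need to be added to `NANO_BANANA_MODELS` first.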
