novita
google/gemma-4-26b-a4b-it
| Property | Value |
|---|---|
| Context | 256K tokens |
| Input | text, image |
| Output | text |
| Tool calling | Supported |
About this model
Gemma 4 26B A4B is an efficient, open-weights Mixture-of-Experts (MoE) multimodal model from Google DeepMind. With only 3.8B active parameters out of 25.2B total, it runs nearly as fast as a 4B model while delivering performance close to the 31B dense model. It features a 256K-token context window, supports interleaved text and image inputs, and excels at reasoning, coding, and agentic workflows.
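To make the active-versus-total parameter figures concrete, the short sketch below computes the share of parameters used per token. It only restates the two numbers quoted in the description; the function name `active_fraction` is illustrative and not part of any API.

```python
# Parameter counts (in billions) as quoted in the model description.
ACTIVE_PARAMS_B = 3.8
TOTAL_PARAMS_B = 25.2


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of an MoE model's parameters that is active per token."""
    return active_b / total_b


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active per token: {share:.1%} of total parameters")
```

Roughly 15% of the weights take part in each token's forward pass, which is why per-token speed can resemble a much smaller model even though all 25.2B parameters still have to be stored.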
Capabilities
| Capability | Availability |
|---|---|
| text | Available through the unified API |
| image | Available through the unified API |
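Since the model accepts both text and image input, a request body combining the two might look like the sketch below. The page does not document the image format, so the content-part layout (`"type": "text"` and `"type": "image_url"` with a base64 data URL) is an assumption borrowed from the OpenAI-style chat convention, and `build_image_message` is a hypothetical helper. Check the API docs before relying on it.

```python
import base64
import json

# Model ID as listed on this page.
MODEL_ID = "64ddc56799ba4c8b88fdcea066e92e22"


def build_image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a chat payload holding one text part and one inline image part.

    ASSUMPTION: the "text" / "image_url" content-part layout follows the
    OpenAI-style convention; this page does not confirm it for OneInfer.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:{mime};base64,{encoded}"},
                    },
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file's contents.
    payload = build_image_message("Describe this image.", b"placeholder-image-bytes")
    print(json.dumps(payload, indent=2))
```

The helper only builds the JSON body; it would be sent to the same chat completions endpoint as a text-only request.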
Quick start
curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "64ddc56799ba4c8b88fdcea066e92e22",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
Providers
Available routing options for this model through OneInfer.
| Provider | Model ID | Input | Output | Routing |
|---|---|---|---|---|
| novita | 64ddc56799ba4c8b88fdcea066e92e22 | $0.130 / 1M | $0.400 / 1M | OneInfer optimized |
Pricing
Current OneInfer pricing for this model.
| Usage | Price |
|---|---|
| Input tokens | $0.130 / 1M |
| Output tokens | $0.400 / 1M |
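The per-million-token prices above translate into a request cost with simple arithmetic. The sketch below assumes billing is linear in token count, with no minimums, caching discounts, or separate image pricing, none of which this page specifies. `estimate_cost` is an illustrative helper, not an official calculator.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the pricing table.
INPUT_PRICE_PER_M = Decimal("0.130")
OUTPUT_PRICE_PER_M = Decimal("0.400")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a request, assuming purely linear pricing."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / ONE_MILLION


if __name__ == "__main__":
    # Example: a 200K-token prompt with a 50K-token completion.
    cost = estimate_cost(200_000, 50_000)
    print(f"Estimated cost: ${cost:.4f}")
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding noise.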
Performance
Published evaluation results associated with this model.
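| Category | Benchmark | Score |
|---|---|---|
| Reasoning | MMLU Pro | 82.6 |
| Reasoning | GPQA Diamond | 82.3 |
| Reasoning | AIME 2026 | 88.3 |
| Vision | MMMU Pro | 73.8 |
| Vision | MATH-Vision | 82.4 |
| Agentic | Tau2-bench | 85.5 |
| Agentic | LiveCodeBench v6 | 77.1 |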
API example
curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "64ddc56799ba4c8b88fdcea066e92e22",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
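The same request can be issued from Python using only the standard library. The sketch below mirrors the curl call (same URL, headers, and body). The response schema is not documented on this page, so the sketch returns the decoded JSON as-is rather than assuming any field names. `build_request` and `send_chat` are illustrative helper names.

```python
import json
import urllib.request

API_URL = "https://api.oneinfer.ai/v1/chat/completions"
MODEL_ID = "64ddc56799ba4c8b88fdcea066e92e22"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request equivalent to the curl example."""
    body = json.dumps(
        {
            "model": MODEL_ID,
            "messages": [{"role": "user", "content": prompt}],
        }
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(prompt: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response unchanged."""
    request = build_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the key exported as in the curl example, `send_chat("Hello!", os.environ["ONEINFER_API_KEY"])` (after `import os`) performs the call. Keeping `build_request` separate makes the payload easy to inspect without touching the network.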