novita

google/gemma-4-26b-a4b-it

Context	256K tokens
Input	text, image
Output	text
Tool calling	Supported

About this model

Gemma 4 26B A4B is an efficient, open-weights Mixture-of-Experts (MoE) multimodal model from Google DeepMind. With only 3.8B active parameters out of 25.2B total (about 15% of the weights are used per token), it runs nearly as fast as a 4B model while delivering performance close to the 31B dense model. It features a 256K-token context window, supports interleaved text and image inputs, and excels at reasoning, coding, and agentic workflows.

Capabilities

text	Available through the unified API
image	Available through the unified API
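
Because the model accepts images alongside text, a single user message can carry both. The sketch below only builds the request body. It assumes the unified API follows the OpenAI-style chat format, in which `content` may be a list of typed parts; the `image_url` part shape, the helper name `build_image_message`, and the example URL are assumptions for illustration and are not confirmed by this page.

```python
import json

# Model ID as shown on this page.
MODEL_ID = "64ddc56799ba4c8b88fdcea066e92e22"


def build_image_message(prompt: str, image_url: str) -> dict:
    """Build a chat request body with one text part and one image part.

    Assumption: the endpoint accepts OpenAI-style content parts
    ({"type": "text"} and {"type": "image_url"}). Check the API docs
    before relying on this shape.
    """
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical image URL, used only to show the resulting JSON.
    body = build_image_message("Describe this image.", "https://example.com/cat.png")
    print(json.dumps(body, indent=2))
```

If the API expects a different image encoding (for example base64 data), only the second content part would need to change.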

Quick start

curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "64ddc56799ba4c8b88fdcea066e92e22",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
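
The same request can be sent from Python. This is a minimal sketch using only the standard library; the endpoint, headers, and body mirror the curl command above, while the helper names `build_request` and `send` are hypothetical. The response is returned as parsed JSON without assuming any particular field layout.

```python
import json
import urllib.request

API_URL = "https://api.oneinfer.ai/v1/chat/completions"
MODEL_ID = "64ddc56799ba4c8b88fdcea066e92e22"  # model ID from the curl example


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST request that the curl command sends."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a real network call, so it is left commented out):
# import os
# req = build_request(os.environ["ONEINFER_API_KEY"], "Hello!")
# print(json.dumps(send(req), indent=2))
```

Keeping request construction separate from sending makes it possible to inspect or test the payload without a network call or an API key.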

Providers

Available routing options for this model through OneInfer.

Provider	Model ID	Status	Input	Output	Routing
novita	64ddc56799ba4c8b88fdcea066e92e22	Available	$0.130 / 1M	$0.400 / 1M	OneInfer optimized

Pricing

Current OneInfer pricing for this model.

Usage	Price
Input tokens	$0.130 / 1M
Output tokens	$0.400 / 1M
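
To see what these rates mean for a single request, the arithmetic can be sketched as below. The two prices are taken from the table above; the function name `estimate_cost_usd` and the example token counts are illustrative assumptions, and the sketch ignores any fees or discounts not listed on this page.

```python
from decimal import Decimal

# Prices from the pricing table, in USD per one million tokens.
INPUT_USD_PER_MILLION = Decimal("0.130")
OUTPUT_USD_PER_MILLION = Decimal("0.400")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_USD_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: a 200,000-token prompt and a 2,000-token reply.
    cost = estimate_cost_usd(200_000, 2_000)
    print(f"Estimated cost: ${cost:.4f}")
```

For that hypothetical request the input side costs $0.026 and the output side $0.0008, about $0.0268 in total. `Decimal` is used so that small per-request amounts add up without binary floating-point rounding.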

Performance

Published evaluation results associated with this model.

Reasoning

MMLU Pro	82.6
GPQA Diamond	82.3
AIME 2026	88.3

Vision

MMMU Pro	73.8
MATH-Vision	82.4

Agentic

Tau2-bench	85.5
LiveCodeBench v6	77.1
