zai-org/glm-4.6v

Provider: novita
Context: 128K
Input: text, image
Output: text
Tool calling: Supported

About this model

GLM-4.6V is a vision-language model designed for deployment in the cloud and on high-performance clusters. It introduces native multimodal function calling, interleaved image-text generation, and advanced document understanding. It supports a 128K context window and is reported to achieve state-of-the-art (SoTA) visual understanding among models of similar scale.
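Because tool calling is listed as supported, a request that declares a tool might look like the sketch below. It assumes the endpoint accepts an OpenAI-style `tools` array, which this page does not confirm, and `get_weather` and `build_tool_payload` are hypothetical names used only for illustration.

```shell
# Sketch: build a chat-completions payload that declares one tool.
# ASSUMPTION: the API accepts an OpenAI-style "tools" schema.
# ASSUMPTION: get_weather is a hypothetical tool, not part of any real API.
MODEL_ID="915240c9cb044d678dd7dd8ae0386802"   # model ID from this page

# build_tool_payload PROMPT
# Prints the JSON request body. The prompt must not contain double quotes
# or backslashes, since this sketch does no JSON escaping.
build_tool_payload() {
  local prompt="$1"
  cat <<EOF
{
  "model": "${MODEL_ID}",
  "messages": [
    { "role": "user", "content": "${prompt}" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Hypothetical tool: look up current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    }
  ]
}
EOF
}

# Print the payload so it can be inspected before sending.
build_tool_payload "What is the weather in Paris?"
```

To send it, the output can be piped to `curl` with `-d @-` against the chat completions endpoint, using the same headers as the quick start.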

Capabilities

text: Available through the unified API
image: Available through the unified API

Quick start

curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "915240c9cb044d678dd7dd8ae0386802",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
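The request above sends text only. Since the model also accepts image input, a request that pairs a prompt with an image might be built as in the sketch below. It assumes the endpoint accepts OpenAI-style content parts (`text` and `image_url`), which this page does not confirm. The helper names `build_image_payload` and `send_image_request` are hypothetical.

```shell
# Sketch: build a chat-completions payload that pairs a text prompt with an image.
# ASSUMPTION: the endpoint accepts OpenAI-style content parts ("text", "image_url").
MODEL_ID="915240c9cb044d678dd7dd8ae0386802"   # model ID from this page

# build_image_payload PROMPT IMAGE_URL
# Prints the JSON request body. Inputs must not contain double quotes or
# backslashes, since this sketch does no JSON escaping.
build_image_payload() {
  local prompt="$1" image_url="$2"
  cat <<EOF
{
  "model": "${MODEL_ID}",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "${prompt}" },
        { "type": "image_url", "image_url": { "url": "${image_url}" } }
      ]
    }
  ]
}
EOF
}

# send_image_request PROMPT IMAGE_URL
# Posts the payload to the same endpoint as the text-only request.
# Defined but not called here, so running this block makes no network request.
send_image_request() {
  build_image_payload "$1" "$2" |
    curl -sS https://api.oneinfer.ai/v1/chat/completions \
      -H "Authorization: Bearer $ONEINFER_API_KEY" \
      -H "Content-Type: application/json" \
      -d @-
}
```

With `ONEINFER_API_KEY` set, calling `send_image_request "Describe this chart." "https://example.com/chart.png"` would send the request. The URL here is a placeholder.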

Providers

Available routing options for this model through OneInfer.

Provider	Model ID	Status	Input	Output	Routing
novita	915240c9cb044d678dd7dd8ae0386802	Available	$0.300 / 1M	$0.900 / 1M	OneInfer optimized

Pricing

Current OneInfer pricing for this model.

Usage	Price
Input tokens	$0.300 / 1M
Output tokens	$0.900 / 1M
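As a worked example of these rates, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost` is hypothetical, and the figures are the listed prices only, which may change.

```shell
# estimate_cost INPUT_TOKENS OUTPUT_TOKENS
# Prints the estimated cost in USD, using the per-1M-token prices listed on this page.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" 'BEGIN {
    input_price  = 0.300   # USD per 1M input tokens
    output_price = 0.900   # USD per 1M output tokens
    printf "%.6f\n", (in_tok * input_price + out_tok * output_price) / 1000000
  }'
}

# 10,000 input tokens and 2,000 output tokens:
# (10000 * 0.300 + 2000 * 0.900) / 1,000,000 = 0.0048 USD
estimate_cost 10000 2000   # prints 0.004800
```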

Performance

Published evaluation results associated with this model.

Benchmark	Score
MMMU	82.3
MathVista	87.8
RealWorldQA	83.7
