novita
zai-org/glm-4.6v
| Property | Value |
|---|---|
| Context | 128K |
| Input | text, image |
| Output | text |
| Tool calling | Supported |
About this model
GLM-4.6V is a vision-language model designed for cloud and high-performance cluster deployments. It introduces native multimodal function calling, interleaved image-text generation, and advanced document understanding. It supports a 128K context window and achieves state-of-the-art (SoTA) visual understanding among models of similar scale.
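As a rough illustration of multimodal function calling, the sketch below builds a request body that pairs an image with one callable tool. It assumes the endpoint accepts OpenAI-style `tools` and `image_url` content parts, which this page does not confirm. The `lookup_product` tool and the example image URL are hypothetical. The model ID is the one used in the Quick start request.

```python
import json

# Model ID as used in the Quick start request on this page.
MODEL_ID = "915240c9cb044d678dd7dd8ae0386802"


def build_tool_request(question: str, image_url: str) -> dict:
    """Build a chat request body that offers the model one callable tool."""
    # Hypothetical tool: its name, description, and parameters are illustrative only.
    lookup_tool = {
        "type": "function",
        "function": {
            "name": "lookup_product",
            "description": "Look up a product by the name visible in an image.",
            "parameters": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
                "required": ["name"],
            },
        },
    }
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                # Assumption: OpenAI-style content parts for mixed text and image input.
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        # Assumption: OpenAI-style "tools" field.
        "tools": [lookup_tool],
    }


if __name__ == "__main__":
    payload = build_tool_request(
        "Which product is this?", "https://example.com/shelf.jpg"
    )
    print(json.dumps(payload, indent=2))
```

The function only builds the JSON body. Whether the model returns a tool call, and in what response shape, depends on the API and is not covered here.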
Capabilities
| Capability | Availability |
|---|---|
| text | Available through the unified API |
| image | Available through the unified API |
Quick start
curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "915240c9cb044d678dd7dd8ae0386802",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
Providers
Available routing options for this model through OneInfer.
| Provider | Model ID | Input | Output | Routing |
|---|---|---|---|---|
| novita | 915240c9cb044d678dd7dd8ae0386802 | $0.300 / 1M | $0.900 / 1M | OneInfer optimized |
Pricing
Current OneInfer pricing for this model.
| Usage | Price |
|---|---|
| Input tokens | $0.300 / 1M |
| Output tokens | $0.900 / 1M |
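To make the per-token prices concrete, here is a minimal cost estimator based on the two rates in the table. The function name and the example token counts are illustrative. The page does not say how images are converted to tokens, so the sketch only works from token counts you already know.

```python
# Prices in USD per 1M tokens, copied from the pricing table on this page.
INPUT_PRICE_PER_M = 0.300
OUTPUT_PRICE_PER_M = 0.900


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 20,000 input tokens and 1,500 output tokens.
    print(f"${estimate_cost(20_000, 1_500):.6f}")
```

For that hypothetical request the input side contributes $0.006 and the output side $0.00135, for a total of about $0.00735.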
Performance
Published evaluation results associated with this model.
| Benchmark | Score |
|---|---|
| MMMU | 82.3 |
| MathVista | 87.8 |
| RealWorldQA | 83.7 |
API example
curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "915240c9cb044d678dd7dd8ae0386802",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
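The same request can be sketched in Python with only the standard library. The URL, headers, model ID, and text-only message mirror the curl command above. The optional image part is an assumption: it uses OpenAI-style `image_url` content parts, which this page does not document. The helper name `build_request` is illustrative.

```python
import json
import os
import urllib.request
from typing import Optional

API_URL = "https://api.oneinfer.ai/v1/chat/completions"
MODEL_ID = "915240c9cb044d678dd7dd8ae0386802"


def build_request(
    api_key: str, prompt: str, image_url: Optional[str] = None
) -> urllib.request.Request:
    """Build the POST request; pass image_url to attach an image to the prompt."""
    if image_url is None:
        # Text-only content, exactly as in the curl example.
        content = prompt
    else:
        # Assumption: OpenAI-style content parts for image input.
        content = [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]
    body = {"model": MODEL_ID, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    api_key = os.environ.get("ONEINFER_API_KEY")
    if not api_key:
        print("Set ONEINFER_API_KEY to send the request.")
    else:
        request = build_request(api_key, "Hello!")
        with urllib.request.urlopen(request, timeout=60) as response:
            # Print the raw JSON body; the response schema is not described on this page.
            print(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.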