zai-org/glm-4.7-flash

Provider	novita
Context	200K
Input	text
Output	text
Tool calling	Supported

About this model

GLM-4.7-Flash is a 30B-A3B Mixture-of-Experts (MoE) model built for lightweight deployment, balancing performance and efficiency. The 30B-A3B designation follows the usual MoE naming convention: roughly 30B total parameters, of which about 3B are active per token. It is positioned as the strongest model in the 30B class, particularly on reasoning and agentic benchmarks.

Capabilities

text

Available through the unified API

Quick start


curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "e751721d7289420d8f6269af08d096a9",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
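The same request can be sketched in Python using only the standard library. The URL, headers, model ID, and message body are copied from the curl command above; the `build_request` helper name, the 30-second timeout, and printing the raw JSON response are illustrative assumptions, not part of the OneInfer documentation.

```python
import json
import os
import urllib.request

# Endpoint and model ID taken from the curl example above.
API_URL = "https://api.oneinfer.ai/v1/chat/completions"
MODEL_ID = "e751721d7289420d8f6269af08d096a9"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that the curl command sends (hypothetical helper)."""
    body = json.dumps({
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("ONEINFER_API_KEY")
    if key:
        # The response schema is not described on this page, so print it as-is.
        with urllib.request.urlopen(build_request("Hello!", key), timeout=30) as resp:
            print(json.dumps(json.load(resp), indent=2))
    else:
        print("Set ONEINFER_API_KEY to send the request.")
```

Keeping request construction separate from sending makes it possible to inspect the payload without an API key or network access.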

Providers

Available routing options for this model through OneInfer.

Provider	Model ID	Status	Input	Output	Routing
novita	e751721d7289420d8f6269af08d096a9	Available	$0.070 / 1M	$0.400 / 1M	OneInfer optimized

Pricing

Current OneInfer pricing for this model.

Usage	Price
Input tokens	$0.070 / 1M
Output tokens	$0.400 / 1M
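As a rough illustration of how these rates translate into a per-request cost, the sketch below applies the two listed prices to a token count. The `estimate_cost` helper and the example token counts are hypothetical, and the calculation assumes billing is strictly linear in tokens, with no minimums, rounding rules, or other fees.

```python
from decimal import Decimal

# Prices per 1M tokens, copied from the pricing table above.
INPUT_PRICE_PER_M = Decimal("0.070")
OUTPUT_PRICE_PER_M = Decimal("0.400")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in dollars, assuming purely linear per-token billing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Hypothetical request: 100,000 input tokens and 20,000 output tokens.
    print(estimate_cost(100_000, 20_000))
```

`Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding error when summed.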

Performance

Published evaluation results associated with this model.

Benchmark	Score
AIME 25	91.6
GPQA	75.2
SWE-bench Verified	59.2
