novita

qwen/qwen3-30b-a3b-fp8

Context: 128K tokens
Input: text
Output: text
Tool calling: Supported

About this model

Qwen3-30B-A3B-FP8 is a 30B-parameter mixture-of-experts (MoE) model that activates about 3B parameters per token and is optimized with FP8 quantization. It aims to balance strong performance on reasoning, multilingual, and coding tasks with practical deployment requirements.

Capabilities

text

Available through the unified API

Quick start

curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "97ec49ff3b8649f2babf211a9abf58e2",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
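
Because this model is listed with tool calling supported, a request that declares a tool might look like the sketch below. This is an illustration under stated assumptions, not documented behavior: it assumes the endpoint accepts an OpenAI-style `tools` array, which this page does not confirm. The `get_weather` tool and the `build_tool_payload` helper are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch of a tool-calling request.
# ASSUMPTION: the endpoint accepts an OpenAI-style "tools" array. This page
# only states that tool calling is supported, not the request schema.
# get_weather is a hypothetical tool used purely for illustration.
build_tool_payload() {
  cat <<'EOF'
{
  "model": "97ec49ff3b8649f2babf211a9abf58e2",
  "messages": [
    { "role": "user", "content": "What is the weather in Paris?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    }
  ]
}
EOF
}

# Send the request only when an API key is configured; otherwise just
# print the payload so it can be inspected.
if [ -n "${ONEINFER_API_KEY:-}" ]; then
  build_tool_payload | curl https://api.oneinfer.ai/v1/chat/completions \
    -H "Authorization: Bearer $ONEINFER_API_KEY" \
    -H "Content-Type: application/json" \
    -d @-
else
  build_tool_payload
fi
```

Keeping the payload in a function makes it easy to inspect or validate the JSON before sending it. Check the API docs for the exact tool schema before relying on this shape.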

Providers

Available routing options for this model through OneInfer.

Provider: novita
Model ID: 97ec49ff3b8649f2babf211a9abf58e2
Status: Available
Input: $0.100 / 1M tokens
Output: $0.450 / 1M tokens
Routing: OneInfer optimized

Pricing

Current OneInfer pricing for this model.

Usage           Price
Input tokens    $0.100 / 1M tokens
Output tokens   $0.450 / 1M tokens
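
To turn the listed rates into a per-request estimate, multiply each token count by its rate and divide by one million. The sketch below does that arithmetic with `awk`; `estimate_cost` is a hypothetical helper name, and the rates are copied from the listing above, so they would need updating if pricing changes.

```shell
#!/bin/sh
# Estimate the USD cost of a request from its token counts.
# estimate_cost is a hypothetical helper, not part of any OneInfer tooling.
# Usage: estimate_cost <input_tokens> <output_tokens>
estimate_cost() {
  awk -v i="$1" -v o="$2" 'BEGIN {
    input_rate = 0.100    # USD per 1M input tokens (from the listing)
    output_rate = 0.450   # USD per 1M output tokens (from the listing)
    printf "%.6f\n", (i * input_rate + o * output_rate) / 1000000
  }'
}

# Example: 2,000 input tokens and 500 output tokens.
estimate_cost 2000 500   # prints 0.000425
```

For that example, 2,000 input tokens cost $0.000200 and 500 output tokens cost $0.000225, for a total of $0.000425.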

Performance

Published evaluation results associated with this model.

General

MMLU: 81.5
MMLU-Redux: 80.3
BBH: 75
AGIEval: 64.2

Mathematics and Science Tasks

GSM8K: 93.9
MATH: 47.6
GPQA: 36.8

Multilingual tasks

MGSM: 80
XCOPA: 87.4
Flores: 68.9

Code tasks

HumanEval: 75
MBPP: 70.8
MultiPL-E: 63.1
