novita

qwen/qwen3-30b-a3b-fp8

Context	128K tokens
Input	text
Output	text
Tool calling	Supported

About this model

Qwen3-30B-A3B-FP8 is a 30B-parameter Mixture-of-Experts (MoE) model that activates 3B parameters per token and is quantized to FP8. It balances high performance with practical deployment requirements across reasoning, multilingual, and coding tasks.

Capabilities

text

Available through the unified API

Quick start


curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "97ec49ff3b8649f2babf211a9abf58e2",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
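
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the URL, headers, and model ID are copied from the curl command above, while the helper names `build_chat_request` and `send` are hypothetical. The response is returned as decoded JSON without reading any particular field, because this page does not describe the response format.

```python
import json
import urllib.request

# Values copied from the curl quick start above.
API_URL = "https://api.oneinfer.ai/v1/chat/completions"
MODEL_ID = "97ec49ff3b8649f2babf211a9abf58e2"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON body.

    Assumption: the endpoint answers with a JSON object. Its fields are
    not documented on this page, so nothing is extracted from it here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To run it, read the key from the environment, for example `send(build_chat_request(os.environ["ONEINFER_API_KEY"], "Hello!"))` after `import os`. Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.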

Providers

Available routing options for this model through OneInfer.

Provider	novita
Model ID	97ec49ff3b8649f2babf211a9abf58e2
Status	Available
Input tokens	$0.100 / 1M
Output tokens	$0.450 / 1M
Routing	OneInfer optimized

Pricing

Current OneInfer pricing for this model.

Usage	Price
Input tokens	$0.100 / 1M
Output tokens	$0.450 / 1M
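
As a rough illustration of how these per-million rates translate into a per-request cost, the sketch below multiplies token counts by the listed prices. The function name `estimate_cost` is hypothetical, and the calculation assumes billing is strictly linear in token count with no minimums, rounding, or other fees, which this page does not confirm.

```python
from decimal import Decimal

# Prices from the table above, in USD per 1M tokens.
INPUT_PRICE_PER_MILLION = Decimal("0.100")
OUTPUT_PRICE_PER_MILLION = Decimal("0.450")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cost scales linearly with token counts; any rounding or
    minimum charge applied by the provider is not modeled.
    """
    input_cost = INPUT_PRICE_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_PRICE_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens costs 2,000 × $0.100 / 1M + 500 × $0.450 / 1M = $0.0002 + $0.000225 = $0.000425. `Decimal` is used instead of floats so that small amounts add up exactly.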

Performance

Published evaluation results associated with this model.

General

Benchmark	Score
MMLU	81.5
MMLU-Redux	80.3
BBH	75
AGIEval	64.2

Mathematics and science tasks

Benchmark	Score
GSM8K	93.9
MATH	47.6
GPQA	36.8

Multilingual tasks

Benchmark	Score
MGSM	80
XCOPA	87.4
Flores	68.9

Code tasks

Benchmark	Score
HumanEval	75
MBPP	70.8
MultiPL-E	63.1
