novita
qwen/qwen3.5-122b-a10b
Context: 262K
Input: text, image
Output: text
Tool calling: Supported
About this model
Qwen3.5-122B-A10B is a mid-tier vision-language mixture-of-experts (MoE) model built for natively multimodal agent applications. It combines a hybrid Gated DeltaNet and sparse MoE architecture with early fusion to process text, image, and video inputs, balancing performance with computational efficiency.
Capabilities
text: Available through the unified API
image: Available through the unified API
Quick start
curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fe61df11759e4899b5de8a7404bb5658",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
Providers
Available routing options for this model through OneInfer.
| Provider | Model ID | Input | Output | Routing |
|---|---|---|---|---|
| novita | fe61df11759e4899b5de8a7404bb5658 | $0.400 / 1M | $3.200 / 1M | OneInfer optimized |
Pricing
Current OneInfer pricing for this model.
| Usage | Price |
|---|---|
| Input tokens | $0.400 / 1M |
| Output tokens | $3.200 / 1M |
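As a rough sketch of how these per-token rates translate into the cost of a single request, the snippet below multiplies token counts by the listed prices. The helper name `estimate_cost_usd` and the example token counts are illustrative assumptions, and actual billing may count or round tokens differently.

```python
# Prices copied from the pricing table (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.400
OUTPUT_USD_PER_MILLION = 3.200


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper for illustration; not part of any OneInfer SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


# Example: a 2,000-token prompt with a 500-token reply.
print(f"${estimate_cost_usd(2_000, 500):.4f}")
```

At these rates an output token costs eight times as much as an input token, so long completions tend to dominate the bill.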
Performance
Published evaluation results associated with this model.
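| Category | Benchmark | Score |
|---|---|---|
| General | MMLU-Pro | 86.7 |
| General | GPQA Diamond | 86.6 |
| Agentic | BFCL-V4 | 72.2 |
| Agentic | TAU2-Bench | 79.5 |
| Vision | MathVision | 86.2 |
| Vision | MMMU-Pro | 76.9 |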
API example
curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fe61df11759e4899b5de8a7404bb5658",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
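The same request can be sketched in Python using only the standard library. The URL, headers, model ID, and message body mirror the curl command above. The function names `build_chat_request` and `send` are hypothetical, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

API_URL = "https://api.oneinfer.ai/v1/chat/completions"
MODEL_ID = "fe61df11759e4899b5de8a7404bb5658"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example (nothing is sent yet)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the body, assuming it is JSON.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires network access and a valid key in ONEINFER_API_KEY):
#   import os
#   reply = send(build_chat_request("Hello!", os.environ["ONEINFER_API_KEY"]))
#   print(reply)
```

Keeping request construction separate from sending makes it possible to inspect or log the payload without making a network call.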