groq

deepseek-r1-distill-llama-70b

Spec	Value
Context	128K
Input	text
Output	text
Tool calling	Supported

About this model

DeepSeek-R1-Distill-Llama-70B is a 70B-parameter model that combines DeepSeek-R1's reasoning capabilities with the architectural efficiency of Llama 70B. It was produced by knowledge distillation from DeepSeek-R1 onto the Llama 3 architecture. The listing reports 92% of R1's performance at 60% lower inference cost, along with enhanced tool integration.

Capabilities

text

Available through the unified API

Quick start

curl https://api.oneinfer.ai/v1/chat/completions \
  -H "Authorization: Bearer $ONEINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "d2053a988fdc4a789b16f9442ede5a83",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
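
The quick start above hard-codes the prompt. As a minimal sketch, the same request can be wrapped in a small shell function so the prompt becomes an argument. The helper names `json_escape`, `build_payload`, and `ask` are hypothetical and not part of OneInfer. The escaping covers only backslashes and double quotes, so prompts containing newlines or control characters would need a proper JSON encoder.

```shell
#!/bin/sh
# Model ID copied from the quick start request.
MODEL_ID="d2053a988fdc4a789b16f9442ede5a83"

# Hypothetical helper: escape backslashes and double quotes so the prompt
# can sit inside a JSON string. Newlines and control characters are NOT handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build the request body for a single user message.
build_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' \
    "$MODEL_ID" "$(json_escape "$1")"
}

# Hypothetical helper: send the prompt to the endpoint used in the quick start.
# Requires ONEINFER_API_KEY to be set in the environment.
ask() {
  curl -sS https://api.oneinfer.ai/v1/chat/completions \
    -H "Authorization: Bearer $ONEINFER_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1")"
}
```

With the functions loaded, `ask 'Hello!'` should send the same request as the quick start, and `build_payload 'Hello!'` prints the body without contacting the API.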

Providers

Available routing options for this model through OneInfer.

Provider	Model ID	Status	Input	Output	Routing
groq	d2053a988fdc4a789b16f9442ede5a83	Available	$0.750 / 1M	$0.990 / 1M	OneInfer optimized

Pricing

Current OneInfer pricing for this model.

Usage	Price
Input tokens	$0.750 / 1M
Output tokens	$0.990 / 1M
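
To make the per-million rates concrete, here is a rough sketch of a cost estimate for a single request. The function name `estimate_cost` is hypothetical, and the calculation assumes billing is strictly linear in token count, with no minimum charge or rounding, which the pricing table does not state.

```shell
#!/bin/sh
# Prices in USD per 1M tokens, copied from the pricing table.
INPUT_PRICE_PER_M=0.750
OUTPUT_PRICE_PER_M=0.990

# Hypothetical helper: estimate_cost INPUT_TOKENS OUTPUT_TOKENS
# Prints the estimated cost in USD with six decimal places.
# ASSUMPTION: cost scales linearly with tokens (no minimums or rounding).
estimate_cost() {
  LC_ALL=C awk -v in_tok="$1" -v out_tok="$2" \
      -v in_price="$INPUT_PRICE_PER_M" -v out_price="$OUTPUT_PRICE_PER_M" \
      'BEGIN { printf "%.6f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}
```

For example, `estimate_cost 2000 500` works out to (2000 × 0.750 + 500 × 0.990) / 1,000,000 = 0.001995 USD.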

Performance

Published evaluation results associated with this model.

Reasoning Capabilities

MMLU	81.4
GSM8K	94.3
ARC-Challenge	89.7
TheoremQA	54.8

Efficiency Metrics

Tokens/sec	280
Latency (ms)	180
VRAM Utilization	128.6
Tokens/$	38.4

Knowledge Retention

TruthfulQA	88.7
Natural Questions	84.3
Knowledge Recall	92.6
Hallucination Rate	1.2

Tool Integration

API Call Success	93.7
Code Execution	87.9
Tool Composition	85.4
Plugin Compatibility	28
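
The tool-integration figures concern requests that carry tool definitions. The sketch below shows what such a request might look like. It assumes the endpoint accepts an OpenAI-style `tools` array, which this listing does not confirm (it only states that tool calling is supported). The `get_weather` tool and the `call_with_tools` function are hypothetical and exist purely for illustration.

```shell
#!/bin/sh
# ASSUMPTION: the endpoint accepts an OpenAI-style "tools" array.
# get_weather is a hypothetical tool, not something OneInfer provides.
TOOL_PAYLOAD=$(cat <<'EOF'
{
  "model": "d2053a988fdc4a789b16f9442ede5a83",
  "messages": [
    { "role": "user", "content": "What is the weather in Paris?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }
  ]
}
EOF
)

# Hypothetical helper: send the tool-enabled request to the chat completions endpoint.
# Requires ONEINFER_API_KEY to be set in the environment.
call_with_tools() {
  curl -sS https://api.oneinfer.ai/v1/chat/completions \
    -H "Authorization: Bearer $ONEINFER_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$TOOL_PAYLOAD"
}
```

If the assumption holds, the response would name the tool and its arguments instead of returning plain text, and the caller would be responsible for running the tool and sending the result back in a follow-up message.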
