Models

Cut token costs on agent workloads by 50% compared to real-time inference. Prices and models are subject to change.

Model                          Input cost (real-time → batch)    Output cost (real-time → batch)
GLM-5.1                        $1.05 → $0.525 /1M tokens         $3.00 → $1.50 /1M tokens
Kimi-K2.6                      $0.745 → $0.372 /1M tokens        $4.655 → $2.328 /1M tokens
MiniMax-M2.5                   $0.30 → $0.15 /1M tokens          $1.20 → $0.60 /1M tokens
Qwen3-8B                       $0.05 → $0.025 /1M tokens         $0.40 → $0.20 /1M tokens
DeepSeek-R1-Distill-Qwen-32B   $0.29 → $0.145 /1M tokens         $0.29 → $0.145 /1M tokens
Mixtral-8x22B-Instruct         $2.00 → $1.00 /1M tokens          $6.00 → $3.00 /1M tokens
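As a rough sketch of what the 50% discount means for a workload, the snippet below compares real-time and batch cost for a hypothetical agent run (10M input tokens, 2M output tokens) using the MiniMax-M2.5 prices from the table above. The token counts and the helper function are illustrative assumptions, not part of any API.

```python
def cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Total cost in USD, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# MiniMax-M2.5 prices from the table above (USD per 1M tokens)
REALTIME_IN, REALTIME_OUT = 0.30, 1.20
BATCH_IN, BATCH_OUT = 0.15, 0.60  # 50% of the real-time price

# Hypothetical agent workload: 10M input tokens, 2M output tokens
realtime = cost_usd(10_000_000, 2_000_000, REALTIME_IN, REALTIME_OUT)
batch = cost_usd(10_000_000, 2_000_000, BATCH_IN, BATCH_OUT)
print(f"real-time: ${realtime:.2f}, batch: ${batch:.2f}")  # ≈ $5.40 vs ≈ $2.70
```

Since both the input and output rates are halved, total batch cost is exactly half the real-time cost regardless of the input/output token mix.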

Best for agent workloads.

Aquaduck is ideal for long-running, high-throughput use cases where latency requirements are flexible.

Sign up for early access