NVIDIA-Nemotron-3-Super-120B-A12B
NVIDIA Nemotron 3 Super is a hybrid Mixture-of-Experts (MoE) model engineered for high compute efficiency and accuracy in multi-agent applications and specialized agentic systems. It is optimized to run many collaborating agents per application on a single GPU while delivering high accuracy in reasoning, tool use, and instruction following.
bfloat16
256k
$0.10 cached input, $0.10 input, $0.50 output per 1M tokens
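
As a rough illustration of how the listed rates turn into a per-request cost, here is a minimal sketch. It assumes the prices are in USD per one million tokens and that cached tokens are a subset of the input tokens, billed at the cached rate instead of the regular input rate. The listing does not state either point explicitly, and the names `PRICE_PER_MILLION` and `estimate_cost` are hypothetical.

```python
# Rates copied from the listing above, assumed to be USD per 1M tokens.
PRICE_PER_MILLION = {"cached": 0.10, "input": 0.10, "output": 0.50}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the portion of input_tokens served from
    cache, billed at the cached rate rather than the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cached"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a request that fills most of the context and returns a short answer.
    cost = estimate_cost(input_tokens=200_000, output_tokens=4_000, cached_tokens=150_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Because the cached and regular input rates are identical in this listing, caching does not change the estimate here; the split is kept so the sketch still works if the rates differ.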