Published on 2026.05.25 by DeepInfra
GLM-5.1 Pricing Guide: API Cost Comparison & Analysis
Provider choice for GLM-5.1 is a real economic decision. Across 10 benchmarked API providers, blended pricing runs from $0.74 to $1.70 per 1M tokens, output speed from 33.8 to 175.2 t/s, and the fastest provider is 5.2x quicker than the slowest. For teams deploying at scale, that spread determines whether this model fits a production […]
Published on 2026.05.25 by DeepInfra
GLM-5.1 on DeepInfra: Z.AI’s Agentic Engineering Model
Z.AI’s GLM-5.1 scores 58.4 on SWE-Bench Pro, ahead of both Claude Opus 4.6 (57.3) and GPT-5.4 (57.7) on real-world software engineering tasks. It’s the direct successor to GLM-5, designed for agentic engineering: long-horizon coding tasks, terminal operations, and repository-level work. The core design premise is that previous models, including GLM-5, tend to plateau after […]
Published on 2026.05.04 by Yessen Kanapin
DeepInfra Raises $107M Series B to Scale Inference Infrastructure
DeepInfra has raised $107 million in Series B funding to scale its inference cloud, expand global capacity, and support the next generation of open-source and agentic AI workloads.
Published on 2026.04.30 by DeepInfra
Kimi K2.6 Model Overview: Architecture, Features & Capabilities
Kimi K2.6 is Moonshot AI’s latest flagship open-source model, released on April 20, 2026 under a Modified MIT license. It is a native multimodal agentic model built on a 1-trillion parameter Mixture-of-Experts (MoE) architecture, with 32 billion parameters activated per token. The model is designed for long-horizon coding, autonomous execution, and multi-agent orchestration, and is […]
Published on 2026.04.30 by DeepInfra
Kimi K2.6 API Benchmarks: Latency, TPS & Cost Analysis (2026)
About Kimi K2.6
Kimi K2.6 is an open-source frontier model from Moonshot AI, released on April 20, 2026. It is a native multimodal agentic model built for long-horizon coding, autonomous execution, and swarm-based task orchestration. The model uses a Mixture-of-Experts (MoE) architecture with 1 trillion total parameters and 32 billion activated parameters per token, using […]
Published on 2026.04.30 by DeepInfra
Kimi K2.6 Pricing Guide 2026: Compare Costs & Deployment Strategies
Kimi K2.6 matters because it sits in a rare spot: open weights, broad provider availability, and a real spread in pricing and runtime performance depending on where you buy it. Artificial Analysis tracks the model across nine API providers, with blended pricing ranging from $1.15 to $2.15 per 1M tokens and major differences in throughput […]
© 2026 DeepInfra. All rights reserved.