LLaMa 2 is a collections of LLMs trained by Meta. This is the 70B chat optimized version. This endpoint has per token pricing.
LLaMa 2 is a collections of LLMs trained by Meta. This is the 70B chat optimized version. This endpoint has per token pricing.
Run models at scale with our fully managed GPU infrastructure, delivering enterprise-grade uptime at the industry's best rates.