Arcee AI is a US-based AI research lab focused on building state-of-the-art open-weight foundation models and developer tooling. Arcee's models help customers progress from using AI to owning their AI, providing an alternative to closed model APIs and to Chinese open-weight models; they can be deployed on secure infrastructure and customized on customer data.
Arcee works with ISVs, the public sector, and regulated industries. Its models are commonly deployed for coding and agentic workflows.
Before deploying on Clarifai, Arcee faced a critical infrastructure challenge: operating and scaling GPU-backed inference reliably while maintaining the rapid release velocity that differentiated them in the market.
As a research lab that releases foundation models faster than any other, Arcee needed infrastructure that could match its pace. This meant getting new models from development to production quickly without sacrificing performance. As adoption of Trinity Mini grew across OpenRouter and Arcee's native API, the model needed to handle rapidly increasing production traffic without performance degradation.
Supporting this growth required GPU infrastructure that could scale dynamically to handle usage spikes while enabling zero-downtime deployments, all without introducing operational complexity that would slow down their release cycles.

Arcee deployed Trinity Mini, its 26B-parameter sparse mixture-of-experts language model with 3B active parameters, on Clarifai's Compute Orchestration to support production inference at scale. As the sole hosting partner for Trinity Mini, Clarifai enables Arcee to serve the model across OpenRouter and its native API platform.
Clarifai's Compute Orchestration provides fully managed GPU infrastructure with autoscaling that dynamically provisions resources based on real-time traffic patterns. The platform enables Arcee to evaluate and deploy different GPU instance types to optimize for performance and cost, eliminating the operational overhead of manual infrastructure management.
Since deploying Trinity Mini on Clarifai, Arcee has scaled production inference across OpenRouter and its native API platform without performance degradation.
From January 6 to February 5, usage on OpenRouter increased 6× month over month. During this surge, Clarifai's Compute Orchestration maintained stable throughput of 170 to 195 tokens per second, demonstrating the infrastructure's ability to handle rapid growth while preserving performance.