February 13, 2026

Clarifai 12.1: Building Production-Ready Agentic AI at Scale

This blog post focuses on new features and improvements. For a comprehensive list, including bug fixes, please see the release notes.

Building Production-Ready Agentic AI at Scale

Agentic AI systems are moving from research prototypes to production workloads. These systems don't just generate responses. They reason over multi-step tasks, call external tools, interact with APIs, and execute long-running workflows autonomously.

But production agentic AI requires more than powerful models. It requires infrastructure that can deploy agents reliably, manage the tools they depend on, handle state across complex workflows, and scale across cloud, on-prem, or hybrid environments without vendor lock-in.

Clarifai's Compute Orchestration was built for this. It provides the infrastructure layer to deploy any model on any compute, at any scale, with built-in autoscaling, multi-environment support, and centralized control. This release extends those capabilities specifically for agentic workloads, making it easier to build, deploy, and manage production agentic AI systems.

With Clarifai 12.1, you can now deploy public MCP (Model Context Protocol) servers directly on the platform, giving agentic models access to browsing capabilities, real-time data, and developer tools without managing server infrastructure. Combined with support for custom MCP servers and agentic model uploads, Clarifai provides a complete orchestration layer for agentic AI: from development to production deployment.

This release also introduces Artifacts, a versioned storage system for files produced by pipelines, and Pipeline UI improvements that streamline monitoring and control of long-running workflows.

Let's walk through what's new and how to get started.

Deploying Public MCP Servers for Agentic AI

Agentic AI systems break when models can't access the tools they need. A reasoning model might know how to browse the web, execute code, or query a database, but without the infrastructure to actually call those tools, it's limited to generating text.

Model Context Protocol (MCP) servers solve this. They're specialized web services that expose tools, data sources, and APIs to LLMs in a standardized way. An MCP server acts as the bridge between a model's reasoning capabilities and real-world actions, like fetching live weather data, navigating web pages, or interacting with external systems.

Clarifai already supports custom MCP servers, allowing teams to build their own tool servers and run them on the platform using Compute Orchestration. This gives full control over what tools agents can access, but it requires writing and maintaining custom server code.

With 12.1, we're making it easier to get started by adding support for public MCP servers. These are open-source, community-maintained MCP servers that you can deploy on Clarifai with a simple configuration, without writing or hosting the server yourself.

How Public MCP Servers Work

Public MCP servers are deployed as models on the Clarifai platform. Once deployed, they run as managed API endpoints on Compute Orchestration infrastructure, handling tool execution and returning results to agentic models during inference.

Here's what the workflow looks like (steps 3 and 4 are sketched in code right after the list):

  1. Deploy a public MCP server as a model on Clarifai using the CLI or SDK
  2. Connect it to an agentic model that supports tool calling and MCP integration
  3. The model discovers available tools from the MCP server during inference
  4. The model calls tools as needed, and the MCP server executes them and returns results
  5. The model uses those results to continue reasoning or complete the task
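
To make steps 3 and 4 concrete, here's a minimal sketch of tool discovery and execution from a client's point of view, using the open-source fastmcp library. The endpoint URL and the get_weather tool name are placeholders, and authentication is omitted, so treat this as an illustration rather than a Clarifai-specific recipe:

  # Hedged sketch: discovering and calling tools on a deployed MCP
  # endpoint with the open-source fastmcp client. The URL and tool
  # name are placeholders; authentication is omitted for brevity.
  import asyncio
  from fastmcp import Client

  MCP_URL = "https://your-mcp-endpoint.example/mcp"  # placeholder endpoint

  async def main():
      async with Client(MCP_URL) as client:
          tools = await client.list_tools()            # step 3: tool discovery
          print("available tools:", [t.name for t in tools])
          result = await client.call_tool(             # step 4: tool execution
              "get_weather", {"location": "Berlin"}    # assumed tool name/args
          )
          print(result)

  asyncio.run(main())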

The entire flow is managed by Compute Orchestration. The MCP server runs as a containerized deployment, scales based on demand, and can be deployed across any compute environment (cloud, on-prem, or hybrid) just like any other model on the platform.

Available Public MCP Servers

We've published several open-source MCP servers on the Clarifai Community that you can deploy today:

Browser MCP Server
Gives agentic models the ability to navigate web pages, extract content, take screenshots, and interact with web forms. Useful for research tasks, data gathering, or any workflow that requires real-time web interaction.

Weather MCP Server
Provides real-time weather data lookup by location. A simple example of how MCP servers can connect models to external APIs without requiring the model to handle authentication or API-specific logic.

These servers are already deployed and running on the platform. You can use them directly with any agentic model, or reference them as examples when deploying your own public MCP servers.

Deploying Your Own Public MCP Server

If you want to deploy an open-source MCP server from the community, the process is straightforward. You provide a configuration pointing to the MCP server repository, and Clarifai handles containerization, deployment, and scaling.

Here's an example of deploying the Browser MCP server using the same workflow as uploading a custom model. The full example is available in the Clarifai runners-examples repository.

The configuration follows the same structure as any other model upload on Clarifai. You define the server's runtime, dependencies, and compute requirements in a config.yaml.
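
A minimal config.yaml might look like the sketch below (the model_type_id value and compute settings are assumptions for illustration; the runners-examples repository contains the real configuration):

  # config.yaml (illustrative sketch; field values are assumptions)
  model:
    id: "browser-mcp-server"
    user_id: "your-user-id"
    app_id: "your-app-id"
    model_type_id: "mcp"        # assumed type identifier for MCP servers
  build_info:
    python_version: "3.12"
  inference_compute_info:
    cpu_limit: "1"
    cpu_memory: "2Gi"
    num_accelerators: 0

With the configuration in place, upload it from the model directory using the CLI: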

clarifai model upload

Once deployed, the MCP server becomes a callable API endpoint.

Using MCP Servers with Agentic Models

Several models on the Clarifai platform natively support agentic capabilities and can integrate with MCP servers during inference. These models are built with tool calling and iterative reasoning, allowing them to discover, call, and process results from MCP servers without additional configuration.

When you call one of these models through the Clarifai API, you can specify which MCP servers it should have access to. The model handles tool discovery and execution during inference, iterating until the task is complete.
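
As a rough sketch, such a call might look like the following. Clarifai exposes an OpenAI-compatible endpoint for inference; the way MCP servers are attached here (an mcp_servers field passed via extra_body) is an assumed parameter shape for illustration, not confirmed API:

  # Hedged sketch: calling an agentic model through Clarifai's
  # OpenAI-compatible endpoint. The mcp_servers field passed via
  # extra_body is an assumption for illustration, not documented API.
  import os
  from openai import OpenAI

  client = OpenAI(
      base_url="https://api.clarifai.com/v2/ext/openai/v1",
      api_key=os.environ["CLARIFAI_PAT"],
  )

  response = client.chat.completions.create(
      model="https://clarifai.com/your-user/your-app/models/agentic-model",  # placeholder
      messages=[{"role": "user", "content": "What's the weather in Berlin right now?"}],
      extra_body={"mcp_servers": [
          "https://clarifai.com/your-user/your-app/models/weather-mcp"      # placeholder
      ]},
  )
  print(response.choices[0].message.content)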

You can also upload your own agentic models with MCP support using the AgenticModelClass. This extends the standard model upload workflow with built-in support for tool discovery and execution. A complete example is available in the agentic-gpt-oss-20b repository, showing how to upload an agentic reasoning model that integrates with MCP servers.

Why This Matters for Production Agentic AI

Deploying MCP servers on Compute Orchestration means you get the same infrastructure benefits as any other workload on the platform:

  • Deploy anywhere: MCP servers can run on Clarifai's shared compute, dedicated instances, or your own infrastructure (VPC, on-prem, air-gapped)
  • Autoscaling: Servers scale up or down based on demand, with support for scale-to-zero when idle
  • Centralized control: Monitor performance, manage costs, and control access through the Clarifai Control Center
  • No vendor lock-in: Run the same MCP servers across different environments without reconfiguration

This is production-grade orchestration for agentic AI. MCP servers aren't just running locally or on a single cloud provider. They're deployed as managed services with the same reliability, scaling, and control you'd expect from any enterprise AI infrastructure.

For a step-by-step guide on deploying public MCP servers, connecting them to agentic models, and building your own tool-enabled workflows, check out the Clarifai MCP documentation and the examples in the runners-examples repository.

Artifacts: Versioned Storage for Pipeline Outputs

Clarifai Pipelines, introduced in 12.0, allow you to define and execute long-running, multi-step AI workflows directly on the platform. These workflows handle tasks like model training, batch processing, evaluations, and data preprocessing as containerized steps that run asynchronously on Clarifai's infrastructure.

Pipelines are currently in Public Preview as we continue iterating based on user feedback.

Pipelines produce files: model checkpoints, training logs, evaluation metrics, preprocessed datasets, configuration files. These outputs are valuable, but until now there was no standardized way to store, version, and retrieve them within the platform.

With 12.1, we're introducing Artifacts, a versioned storage system designed specifically for files produced by pipelines or user workloads.

What Are Artifacts?

An Artifact is a container for any binary or structured file. Each Artifact can have multiple ArtifactVersions, capturing distinct snapshots over time. Every version is immutable and references the actual file stored in object storage, while metadata like timestamps, descriptions, and visibility settings are tracked in the control plane.

This separation keeps lookups fast and storage costs low.

Why Artifacts Matter

Reproducibility: Save the exact files (weights, checkpoints, configs, logs) that produced results, making experiments reproducible and auditable.

Resume and checkpointing: Pipelines can resume from stored checkpoints instead of recomputing, saving time and cost on long-running jobs.

Version control: Track how model checkpoints evolve over time or compare outputs across different pipeline runs.

Using Artifacts with the CLI

The Clarifai CLI provides a simple interface for managing artifacts, modeled after familiar commands like cp for upload and download. The core operations are listed below, followed by a sketch of each:

Upload a file as an artifact:

Upload with description and visibility:

Download the latest version:

Download a specific version:

List all artifacts in an app:

List versions of a specific artifact:
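
Here's a sketch of what these operations look like. The exact subcommand names, argument order, and flags below are assumptions based on the cp-style interface described above, so check the CLI help output and the Artifacts documentation for the authoritative syntax:

  # Hedged sketch; subcommands and flags are assumptions, not confirmed CLI.

  # Upload a file as an artifact
  clarifai artifact upload ./model_epoch_10.pt users/me/apps/my-app/artifacts/checkpoints

  # Upload with description and visibility
  clarifai artifact upload ./model_epoch_10.pt users/me/apps/my-app/artifacts/checkpoints \
      --description "Checkpoint after epoch 10" --visibility private

  # Download the latest version
  clarifai artifact download users/me/apps/my-app/artifacts/checkpoints ./checkpoint.pt

  # Download a specific version
  clarifai artifact download users/me/apps/my-app/artifacts/checkpoints ./checkpoint.pt \
      --version <version-id>

  # List all artifacts in an app
  clarifai artifact list users/me/apps/my-app

  # List versions of a specific artifact
  clarifai artifact list-versions users/me/apps/my-app/artifacts/checkpoints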

The CLI handles multipart uploads for large files automatically, ensuring efficient transfers even for multi-gigabyte checkpoints.

Using Artifacts with the Python SDK

The SDK provides programmatic access to artifact management, useful for integrating artifact uploads and downloads directly into training scripts or pipeline steps. The core operations are listed below, with a sketch following the list:

Upload a file:

Download a specific version:

List all versions of an artifact:
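
A sketch of these operations follows. The import path, class name, and method signatures here are assumptions for illustration, since the Artifacts SDK surface is new in this release; check the Python SDK reference for the actual API:

  # Hedged sketch; the import path and method names are assumptions.
  import os

  from clarifai.client import Artifact  # assumed import path

  artifact = Artifact(
      url="https://clarifai.com/your-user/your-app/artifacts/checkpoints",  # placeholder
      pat=os.environ["CLARIFAI_PAT"],
  )

  # Upload a file
  artifact.upload("./model_epoch_10.pt", description="Checkpoint after epoch 10")

  # Download a specific version
  artifact.download("./checkpoint.pt", version_id="the-version-id")

  # List all versions of the artifact
  for version in artifact.list_versions():
      print(version.id, version.created_at)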

Artifact Use Cases

Model training workflows: Upload model checkpoints after each training epoch. If training is interrupted, resume from the last saved checkpoint instead of restarting from scratch; a sketch of this pattern follows the examples below.

Pipeline outputs: Store evaluation metrics, preprocessed embeddings, or serialized configurations produced by pipeline steps. Reference these artifacts in downstream steps or share them across teams.

Experiment tracking: Version control for all outputs related to an experiment. Track how model performance evolves across training runs or compare artifacts produced by different hyperparameter configurations.
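
As one concrete illustration of the training workflow above, the checkpoint/resume pattern might look roughly like this, reusing the assumed Artifact interface from the earlier SDK sketch (the version ordering and the training helpers are placeholders):

  # Hedged sketch of checkpoint/resume built on the assumed Artifact API above.
  versions = artifact.list_versions()
  if versions:
      # Resume: pull the most recent checkpoint before training continues.
      latest = versions[-1]                      # assumes oldest-first ordering
      artifact.download("./checkpoint.pt", version_id=latest.id)

  for epoch in range(start_epoch, num_epochs):
      train_one_epoch(model, data)               # placeholder training step
      save_checkpoint(model, "./checkpoint.pt")  # placeholder serialization
      artifact.upload("./checkpoint.pt", description=f"checkpoint after epoch {epoch}")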

Artifacts are scoped to apps, just like Pipelines and Models. This means access control, versioning, and lifecycle policies follow the same patterns you're already using for other Clarifai resources.

Pipeline UI Improvements

Managing long-running workflows requires visibility into what's running, what's queued, and what failed. With this release, we've added several UI improvements to make it easier to monitor and control pipeline execution directly from the platform.

What's New

Pipelines List
View all pipelines in your app from a single interface. You can see pipeline metadata, creation dates, and quickly navigate to specific pipelines without needing to use the CLI or API.

Pipeline Versions List
Each pipeline can have multiple versions, representing different configurations or iterations of the workflow. The new Versions view lets you browse all versions of a pipeline, compare configurations, and select which version to run.

Pipeline Version Runs View
This is where you monitor active and completed runs. The Runs view shows execution status, timestamps, and logs for each run, making it easier to debug failures or track progress on long-running jobs.

Quick switching between pipelines and versions
Navigate between pipelines, their versions, and individual runs without leaving the UI. This makes it faster to compare results across different pipeline configurations or troubleshoot specific runs.

Start / Pause / Cancel Runs
You can now start, pause, or cancel pipeline runs directly from the UI. Previously, this required CLI or API calls. Now, you can stop a run that's consuming resources unnecessarily or pause execution to inspect intermediate state.

View run logs
Logs are streamed directly into the UI, so you can monitor execution in real time. This is especially useful for debugging failures or understanding what happened during a specific step in a multi-step workflow.

These improvements make pipelines more accessible for teams that prefer working through the UI rather than exclusively through the CLI or SDK. You still have full programmatic access through the API, but now you can also manage and monitor workflows visually.

Pipelines remain in Public Preview. We're actively iterating based on feedback, so if you're using pipelines and have suggestions for how the UI or execution model could be improved, we'd love to hear from you.

For a step-by-step guide on defining, uploading, and running pipelines, check out the Pipelines documentation.

Additional Changes

Retirement of the Community Plan

We've retired the Community Plan and migrated all users to our new Pay-As-You-Go plan, which provides a more sustainable and competitive pricing model.

All users who verify their phone number receive a $5 welcome bonus to get started. The Pay-As-You-Go plan has no monthly minimums and far fewer feature gates, making it easier to test and scale AI workloads without upfront commitments.

For more details on the new pricing structure, see our recent announcement on Pay-As-You-Go credits.

Python SDK Updates

We've made several updates to the Python SDK to improve reliability, developer experience, and compatibility with agentic workflows.

  • Added the load_concepts_from_config() method to VisualDetectorClass and VisualClassifierClass to load concepts from config.yaml.
  • Added a Dockerfile template that conditionally installs packages required for video streaming.
  • Fixed deployment cleanup logic to ensure it targets only failed model deployments.
  • Implemented an automatic retry mechanism for OpenAI API calls to gracefully handle transient httpx.ConnectError exceptions.
  • Fixed attribute access for OpenAI response objects in agentic transport by using hasattr() checks instead of dictionary .get() methods.

For a complete list of SDK updates, see the Python SDK changelog.

Ready to Start Building?

You can start deploying public MCP servers today to give agentic models access to browsing capabilities, real-time data, and developer tools. Deploy them on Clarifai's shared compute, dedicated instances, or your own infrastructure using the same orchestration layer as your models.

If you're running long-running workflows, use Artifacts to store and version files produced by pipelines. Upload checkpoints, logs, and outputs directly through the CLI or SDK, and resume execution from saved state when needed.

For teams managing complex pipelines, the new UI improvements make it easier to monitor runs, view logs, and control execution without leaving the platform.

Pipelines and public MCP server support are available in Public Preview. We'd love your feedback as you build.

Sign up here to get started with Clarifai, or check out the documentation. If you have questions or need help while building, join us on Discord. Our community and team are there to help.