
This blog post focuses on new features and improvements. For a comprehensive list, including bug fixes, please see the release notes.
A new Python-based method for model uploading and inference
We have completely revamped the way models are uploaded and used for inference with a new Python-based method that prioritizes simplicity, speed, and developer experience.
Built with a Python-first, user-centric design, this flexible approach simplifies the process of working with models. It allows users to focus more on building and iterating, and less on navigating API mechanics. The new method streamlines inference, accelerates development, and significantly improves overall usability.
Model Upload
The Clarifai Python SDK now makes it even easier to upload custom models. Whether you're using a pre-trained model from Hugging Face or OpenAI, or one you've developed from scratch, integration is seamless. Once uploaded, your model can immediately take advantage of Clarifai's robust platform features.
After import, your model is automatically deployed and ready for use. You can evaluate it, connect it with other models and agent operators in a workflow, or serve inference requests directly.
As part of this release, we’ve significantly simplified how you define the model.py file for custom model uploads. The new ModelClass pattern allows you to implement predict, generate, and streaming methods without extra abstraction or boilerplate, so you can get started in just a few lines of code.
Here’s a quick example: a simple method that appends “Hello World” to any input text, with built-in support for different types of streaming responses. Check out the full documentation here.
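As a rough plain-Python sketch of the behavior described (in a real model.py you would subclass the SDK’s ModelClass and register these methods as shown in the documentation; the class and parameter names below are illustrative, not the SDK’s own), the three method styles look like this:

```python
from typing import Iterator

class MyModel:  # illustrative; a real model.py subclasses Clarifai's ModelClass
    def predict(self, text1: str = "") -> str:
        """Unary call: return the input with 'Hello World' appended."""
        return f"{text1} Hello World"

    def generate(self, text1: str = "") -> Iterator[str]:
        """Server streaming: yield the response one chunk at a time."""
        for token in f"{text1} Hello World".split():
            yield token + " "

    def stream(self, input_stream: Iterator[str]) -> Iterator[str]:
        """Bidirectional streaming: transform each incoming chunk as it arrives."""
        for text in input_stream:
            yield f"{text} Hello World"
```

Each method stays plain Python; the platform handles serialization and routing, so no extra wrapper classes or request-parsing boilerplate are needed.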
Inference
The new inference approach offers an efficient, scalable, and simplified way to run predictions with your models.
Designed with a Python-first, developer-friendly focus, it reduces complexity so you can spend more time building and iterating, and less time dealing with low-level API details.
Below is an example of how to make a client-side predict call that corresponds to the predict method defined in the previous section. Check out the docs here.
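To illustrate the idea that the client call mirrors the server-side method signature, here is a self-contained stand-in (LocalClient and HelloWorldModel are hypothetical names for this sketch; a real call goes through the Clarifai SDK’s client over the network):

```python
class HelloWorldModel:
    """Stand-in for the model defined in model.py."""
    def predict(self, text1: str) -> str:
        return f"{text1} Hello World"

class LocalClient:
    """Stand-in for an RPC client: forwards keyword arguments to the
    matching server-side method, just as the SDK maps a client-side
    predict(...) call onto the model's predict(...) definition."""
    def __init__(self, model):
        self._model = model

    def predict(self, **kwargs):
        # In production these kwargs are serialized, sent to the deployed
        # model, and the runner invokes the matching method remotely.
        return self._model.predict(**kwargs)

client = LocalClient(HelloWorldModel())
print(client.predict(text1="Hi"))  # → "Hi Hello World"
```

The design choice this mirrors: because the client call shape is derived from your own method signature, there is no separate request schema to learn or keep in sync.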
New Published Models
- Published Llama-4-Scout-17B-16E-Instruct, a powerful model in the Llama 4 series featuring 17 billion parameters and 16 experts for advanced instruction tuning. It supports a native 10 million-token context window (currently 8k supported on Clarifai), making it ideal for multi-document analysis, complex codebase understanding, and personalized, intelligent workflows.
- Published Qwen3-30B-A3B-GGUF, the latest addition to the Qwen series. This new release features both dense and mixture-of-experts (MoE) models, with significant improvements in reasoning, instruction-following, agent-based tasks, and multilingual capabilities. The Qwen3-30B-A3B outperforms larger models like QwQ-32B, leveraging fewer active parameters while maintaining strong performance across coding and reasoning benchmarks.

- Published OpenAI’s latest o3 model, a powerful and well-rounded LLM that sets a new standard for performance across math, science, coding, and visual reasoning tasks. It is built for complex, multi-step thinking and excels at technical problem-solving, interpreting visual data such as charts and diagrams, high-stakes decision-making, and creative ideation.
- Published o4-mini, a smaller model optimized for fast, cost-efficient reasoning. Despite its compact size, o4-mini delivers impressive accuracy on math and coding benchmarks like AIME 2025. It is ideal for use cases that require strong reasoning capabilities while keeping latency and cost low. Both models are also available in the Playground. Try them out here.
Enhanced the Playground experience
- Added automatic mode detection based on the selected model — now intelligently switches between Chat and Vision modes for predictions.
- Improved model search and identification for a faster, more accurate selection experience.
- Introduced a Personal Access Token (PAT) dropdown, enabling users to easily insert their PATs into code snippets.

- Implemented dynamic pricing display that updates based on the selected deployment.
- The selected deployment ID is now automatically injected into the inference code.
Enhanced the Control Center
- We’re bringing a major update to the Control Center with detailed Compute time metrics for Compute Orchestration. This enhancement gives you deeper visibility into how compute resources are used and billed across your workflows:
1. Added Compute Hours in the Overview tab.
2. Added Compute Hours costs in the Costs tab.
3. Added Compute Hours usage details in the Usage tab.

- Added Compute Orchestration operations to audit logging: Operations related to clusters, nodepools, and model deployments are now tracked and visible in the Teams & Logs tab within the Control Center.
- Introduced new, more efficient and stable chart types with improved tooltips for better data visualization and user experience.
- Enhanced the design of the "Total Model Predictions by ID" chart by making the chart clickable, allowing users to navigate directly to the corresponding model. Also introduced other UI refinements for a more intuitive experience.
- Adjusted hover cards on charts to stay within the viewport by dynamically lowering their position and adding scrollbars when content exceeds the visible area.
Improved the Community platform
- Revamped the Explore page with refreshed visual designs, a featured models showcase, and categorized use cases such as LLMs and VLMs.
- Updated the individual model viewer page with an improved UI, direct access to the Playground, deployment listings, and additional enhancements.

Additional Changes
- The Home page is now accessible to all users, with sections requiring login automatically hidden for non-logged-in users. A new "Recent Activity" section shows users their most recent actions and operations. We also made improvements to usability, performance, and overall user experience.
- New organization accounts now start on the Community plan by default, instead of inheriting the user’s personal plan. This change applies to users on the Community, Essential, and Professional plans. Enterprise users are not affected. The "Member Since" column now shows when a member joined the organization, and Settings pages are hidden from users without the required permissions.
- The billing section has been redesigned for a more intuitive credit card management experience. We've added validation to prevent duplicate card entries and support for setting or changing the default credit card.
- The Python SDK now supports Pythonic models for a more native experience. We fixed failing tests to improve stability. The CLI is now ~20x faster for most operations, includes config contexts, improved error messages, and corrected return arguments in the model builder. Learn more here.
Ready to start building?
With this Python-first release, uploading and running inference on your custom models is now faster, simpler, and more intuitive than ever. Whether you're integrating a pre-trained model or deploying one you've built from scratch, the Clarifai Python SDK gives you the tools to move from prototype to production with minimal overhead.
Explore the documentation and start building today.