Blog
Nebius welcomes Clarifai’s core team and licenses inference IP to strengthen Nebius Token Factory.
Home
About
Blog
Docs
Login
Contact us
WELCOME
CLARIFAI BLOG
Read about our announcements, events, engineering advancements, product tutorials, and Featured Hacks.
Clarifai Blog
The Next Chapter: Clarifai Compute Orchestration and Reasoning Engine Joins Nebius
Benchmarking Gemma-3-4B, MiniCPM-o 2.6, and Qwen2.5-VL-7B-Instruct for latency, throughput, and scalability.
Models
NVIDIA Nemotron 3 Nano Omni on Clarifai Reasoning Engine: Zero Day Support at 400 Tokens Per Second
Clarifai 12.3: Introducing KV Cache-Aware Routing
Clarifai 12.3 introduces KV Cache-Aware Routing. Routes requests to replicas with relevant cache state for ...
Clarifai API
Models
Inference
Run Gemma 4 Locally: Deploy Frontier AI on Your Hardware with Public API Access
Run Google's Gemma 4 models on your own hardware while exposing them via public API using Clarifai Local ...
What Is Kimi K2.5? Architecture, Benchmarks & AI Infra Guide
Deploy Public MCP servers as an API endpoint and integrate its tools into LLM workflows using function ...
llama.cpp: Fast Local LLM Inference, Hardware Choices & Tuning
Flash Attention 2: Reducing GPU Memory and Accelerating Transformers
Clarifai Reasoning Engine Achieves 414 Tokens Per Second on Kimi K2.5
Clarifai achieves 414 tokens per second on Kimi K2.5, one of the first providers to reach 400+ TPS on a ...
Clarifai 12.2: Three-Command CLI Workflow for Model Deployment
Clarifai 12.2 introduces a three-command CLI workflow for model deployment. Initialize, test locally, and ...
What is LPU? Language Processing Units | The Future of AI Inference
Clarifai vs Other Inference Providers: Groq, Fireworks, Together AI
vLLM vs Triton vs TGI: Choosing the Right LLM Serving Framework
Posts by Tag
Agentic AI (14)
AI Fundamentals (19)
AI in 5 (1)
AI Infrastructure (43)
AI SaaS (1)
Applied AI (6)
Automated Visual Inspection (1)
Business News (3)
Clarifai API (9)
Company News (12)
Compute Orchestration (1)
Content Moderation (6)
Customer Stories (3)
Data Labeling (2)
Digital Asset Management (4)
Edge AI (1)
Events (1)
Face Recognition (8)
gpu (10)
Image Recognition (30)
Industry News (11)
Inference (30)
llms (1)
Machine Learning (18)
MLOps (9)
Models (14)
NLP (5)
Other (2)
Platform (11)
Product Releases (49)
Public Sector (1)
Tutorials (26)
Visual Search (6)