October 24, 2025

How to Use the DeepSeek API | Run DeepSeek Models Seamlessly

How to Use DeepSeek via API – Developer Guide

TL;DR

DeepSeek models, including DeepSeek‑R1 and DeepSeek‑V3.1, are accessible directly through the Clarifai platform. You can get started without needing a separate DeepSeek API key or endpoint.

  • Experiment in the Playground: Sign up for a Clarifai account and open the Playground. This lets you test prompts interactively, adjust parameters, and understand the model behavior before integration.
  • Integrate via API: Call models through Clarifai’s OpenAI-compatible endpoint by specifying the model URL and authenticating with your Personal Access Token (PAT).


Base URL: https://api.clarifai.com/v2/ext/openai/v1

Authenticate with your Personal Access Token (PAT) and specify the model URL, such as DeepSeek‑R1 or DeepSeek‑V3.1.

Clarifai handles all hosting, scaling, and orchestration, letting you focus purely on building your application and using the model’s reasoning and chat capabilities.

DeepSeek in 90 Seconds—What and Why

DeepSeek encompasses a range of large language models (LLMs) designed with diverse architectural strategies to optimize performance across various tasks. While some models employ a Mixture-of-Experts (MoE) approach, others utilize dense architectures to balance efficiency and capability.

1. DeepSeek-R1

DeepSeek-R1 is a reasoning-focused model trained with large-scale reinforcement learning (RL) on top of the DeepSeek-V3 base model. It uses a Mixture-of-Experts transformer architecture augmented with Multi-Head Latent Attention (MLA) to improve context handling and reduce memory overhead. This training recipe enables high performance on tasks requiring deep reasoning, such as mathematics, logic, and code.

2. DeepSeek-V3

DeepSeek-V3 is a Mixture-of-Experts model with 671B total parameters, of which roughly 37B are activated per token. Routing each token to a small subset of specialized experts keeps inference cost close to that of a much smaller dense model while preserving capacity, making V3 efficient across a broad spectrum of conversational and reasoning tasks. DeepSeek-V3.1 extends this base with a hybrid inference mode: it can answer directly (non-thinking) or produce an explicit reasoning trace (thinking) within a single model.

3. Distilled Models

To provide more accessible options, DeepSeek offers distilled versions of its models, such as DeepSeek-R1-Distill-Qwen-7B. These models are smaller in size but retain much of the reasoning and coding capabilities of their larger counterparts. For instance, DeepSeek-R1-Distill-Qwen-7B is based on the Qwen 2.5 architecture and has been fine-tuned with reasoning data generated by DeepSeek-R1, achieving strong performance in mathematical reasoning and general problem-solving tasks.

How to Access DeepSeek on Clarifai

DeepSeek models can be accessed on Clarifai in three ways: through the Clarifai Playground UI, via the OpenAI-compatible API, or using the Clarifai SDK. Each method provides a different level of control and flexibility, allowing you to experiment, integrate, and deploy models according to your development workflow.

Clarifai Playground

The Playground provides a fast, interactive environment to test prompts and explore model behavior. 

You can select any DeepSeek model, including DeepSeek‑R1, DeepSeek‑V3.1, or distilled versions available in the Clarifai Community. You can input prompts, adjust parameters such as temperature and streaming, and immediately see the model’s responses. The Playground also lets you compare multiple models side by side to evaluate their responses.

Within the Playground itself, you have the option to view the API section, where you can access code snippets in multiple languages, including cURL, Java, JavaScript, Node.js, the OpenAI-compatible API, the Clarifai Python SDK, PHP, and more. 

You can select the language you need, copy the snippet, and directly integrate it into your applications. For more details on testing models and using the Playground, see the Clarifai Playground Quickstart.

Try it: The Clarifai Playground is the fastest way to test prompts. Navigate to the model page and click “Test in Playground”.

Via the OpenAI‑Compatible API

Clarifai provides a drop-in replacement for the OpenAI API, allowing you to use the same Python or TypeScript client libraries you are familiar with while pointing to Clarifai’s OpenAI-compatible endpoint. Once you have your PAT set as an environment variable, you can call any Clarifai-hosted DeepSeek model by specifying the model URL.

Python Example

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"]
)

response = client.chat.completions.create(
    model="https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-R1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a three sentence bedtime story about a unicorn."}
    ],
    max_completion_tokens=100,
    temperature=0.7
)

print(response.choices[0].message.content)

TypeScript Example

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.clarifai.com/v2/ext/openai/v1",
  apiKey: process.env.CLARIFAI_PAT,
});

const response = await client.chat.completions.create({
  model: "https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-R1",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Who are you?" }
  ],
});

console.log(response.choices?.[0]?.message.content);

Clarifai Python SDK

Clarifai’s Python SDK simplifies authentication and model calls, allowing you to interact with DeepSeek models using concise Python code. After setting your PAT, you can initialize a model client and make predictions with just a few lines.

import os
from clarifai.client import Model

model = Model(
    url="https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-V3_1",
    pat=os.environ["CLARIFAI_PAT"]
)

response = model.predict(
    prompt="What is the future of AI?",
    max_tokens=512,
    temperature=0.7,
    top_p=0.95,
    thinking="False"
)

print(response)

Vercel AI SDK

For modern web applications, the Vercel AI SDK provides a TypeScript toolkit to interact with Clarifai models. It supports the OpenAI-compatible provider, enabling seamless integration.

import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { generateText } from "ai";

const clarifai = createOpenAICompatible({
  name: "clarifai",
  baseURL: "https://api.clarifai.com/v2/ext/openai/v1",
  apiKey: process.env.CLARIFAI_PAT,
});

const model = clarifai("https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-R1");

const { text } = await generateText({
  model,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is photosynthesis?" }
  ],
});

console.log(text);

This SDK also supports streaming responses, tool calling, and other advanced features. In addition, DeepSeek models can be accessed via cURL, PHP, Java, and other languages. For a complete list of integration methods, supported providers, and advanced usage examples, refer to the documentation.

Advanced Inference Patterns

DeepSeek models on Clarifai support advanced inference features that make them suitable for production-grade workloads. You can enable streaming for low-latency responses, and tool calling to let the model interact dynamically with external systems or APIs. These capabilities work seamlessly through Clarifai’s OpenAI-compatible API.

Streaming Responses

Streaming returns model output token by token, improving responsiveness in real-time applications like chat interfaces or dashboards. The example below shows how to stream responses from a DeepSeek model hosted on Clarifai.

import os
from openai import OpenAI

# Initialize the OpenAI-compatible client for Clarifai
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"]
)

# Create a chat completion request with streaming enabled
response = client.chat.completions.create(
    model="https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-V3_1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain how transformers work in simple terms."}
    ],
    max_completion_tokens=150,
    temperature=0.7,
    stream=True
)

print("Assistant's Response:")
for chunk in response:
    if chunk.choices and chunk.choices[0].delta and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
print("\n")

Streaming helps you render partial responses as they arrive instead of waiting for the entire output, reducing perceived latency.
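If you also need the complete text after streaming finishes (for logging or caching), you can accumulate the deltas as they arrive. A minimal helper sketch, assuming the chunk shape produced by the OpenAI client above (`collect_stream` is an illustrative name, not part of any SDK):

```python
def collect_stream(stream) -> str:
    """Print each streamed delta as it arrives and return the full text.

    `stream` is the iterable of chunks returned by
    client.chat.completions.create(..., stream=True).
    """
    parts = []
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content is not None:
            delta = chunk.choices[0].delta.content
            print(delta, end="", flush=True)  # render immediately
            parts.append(delta)               # keep for the final string
    return "".join(parts)

# Usage with the streaming example above:
# full_text = collect_stream(response)
```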

Tool Calling

Tool calling enables a model to invoke external functions during inference, which is especially useful for building AI agents that can interact with APIs, fetch live data, or perform dynamic reasoning. DeepSeek-V3.1 supports tool calling, allowing your agents to make context-aware decisions. Below is an example of defining and using a tool with a DeepSeek model.

import os
from openai import OpenAI

# Initialize the OpenAI-compatible client for Clarifai
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"]
)

# Define a simple function the model can call
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Returns the current temperature for a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and country, for example 'New York, USA'"
                    }
                },
                "required": ["location"],
                "additionalProperties": False
            }
        }
    }
]

# Create a chat completion request with tool-calling enabled
response = client.chat.completions.create(
    model="https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-V3_1",
    messages=[
        {"role": "user", "content": "What is the weather like in New York today?"}
    ],
    tools=tools,
    tool_choice="auto"
)

# Print the tool call proposed by the model
tool_calls = response.choices[0].message.tool_calls
print("Tool calls:", tool_calls)
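The example above stops once the model proposes a tool call. In a real agent, you would execute the function locally and send its result back in a second request so the model can compose a final answer. A minimal sketch of that round trip, using a stubbed `get_weather` (the helper names here are illustrative, not part of any SDK):

```python
import json

def get_weather(location: str) -> str:
    # Stub: a real application would query an actual weather service here.
    return f"Sunny, 22°C in {location}"

def tool_result_message(tool_call) -> dict:
    """Run the function the model requested and wrap its output as a
    'tool' role message for the follow-up request."""
    args = json.loads(tool_call.function.arguments)
    result = get_weather(**args)
    return {"role": "tool", "tool_call_id": tool_call.id, "content": result}

# With `client`, the request messages, and `response` from the example above:
#
#   messages.append(response.choices[0].message)
#   for call in response.choices[0].message.tool_calls:
#       messages.append(tool_result_message(call))
#   final = client.chat.completions.create(
#       model="https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-V3_1",
#       messages=messages,
#   )
#   print(final.choices[0].message.content)
```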

For more advanced inference patterns, including multi-turn reasoning, structured output generation, and extended examples of streaming and tool calling, refer to the documentation.

Which DeepSeek Model Should I Pick?

Clarifai hosts multiple DeepSeek variants. Choosing the right one depends on your task:

  • DeepSeek‑R1 – use for reasoning and complex code. It excels at mathematical proofs, algorithm design, debugging, and logical inference. Expect slower responses and higher token usage due to its extended “thinking” mode.

  • DeepSeek‑V3.1 – use for general chat and lightweight coding. V3.1 is a hybrid: it can switch between non‑thinking mode (faster, cheaper) and thinking mode (deeper reasoning) within a single model. Ideal for summarization, Q&A, and everyday assistant tasks.

  • Distilled models (R1‑Distill‑Qwen‑7B, etc.) – these are smaller versions of the base models, offering lower latency and cost with slightly reduced reasoning depth. Use them when speed matters more than maximal performance.

At the time of writing, DeepSeek‑OCR has just been announced and is not yet available on Clarifai. Keep an eye on Clarifai’s model catalog for updates.

Frequently Asked Questions (FAQs)

Q1: Do I need a DeepSeek API key?
No. When using Clarifai, you only need a Clarifai Personal Access Token. Do not use or expose the DeepSeek API key unless you are calling DeepSeek directly (which this guide does not cover).

Q2: How do I switch between models in code?
Change the model value to the Clarifai model URL, such as https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-R1 for R1 or https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-V3_1 for V3.1.

Q3: What parameters can I tweak?
You can adjust temperature, top_p and max_tokens to control randomness, sampling breadth and output length. For streaming responses, set stream=True. Tool calling requires defining a tool schema.

Q4: Are there rate limits?
Clarifai enforces soft rate limits per PAT. Implement exponential backoff for 429 responses and avoid retrying other 4XX errors, which indicate a client-side problem. For high throughput, contact Clarifai to increase quotas.
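The backoff advice can be sketched as a small retry wrapper. This is an illustrative pattern, not a Clarifai API; it assumes the raised exception exposes a status_code attribute, as the OpenAI client’s APIStatusError does:

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: up to 1s, 2s, 4s, ... capped at `cap`."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retries(request_fn, max_attempts: int = 5):
    """Retry request_fn on rate limits (429) and server errors (5xx).

    Other 4XX errors indicate a client-side problem and are raised
    immediately rather than retried.
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            retryable = status == 429 or (status is not None and status >= 500)
            if not retryable or attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))

# Usage:
# response = call_with_retries(lambda: client.chat.completions.create(...))
```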

Q5: Is my data secure?
Clarifai processes requests in secure environments and complies with major data‑protection standards. Store your PAT securely and avoid including sensitive data in prompts unless necessary.

Q6: Can I fine‑tune DeepSeek models?
DeepSeek models are MIT‑licensed. Clarifai plans to offer private hosting and fine‑tuning for enterprise customers in the near future. Until then, you can download and fine‑tune the open‑source models on your own infrastructure.

Conclusion

You now have a fast, standard way to integrate DeepSeek models, including R1, V3.1, and distilled variants, into your applications. Clarifai handles all infrastructure, scaling, and orchestration. No separate DeepSeek key or complex setup is needed. Try the models today through the Clarifai Playground or API and integrate them into your applications.