
Ollama API

1. Start the Ollama Service


Ollama provides an HTTP-based API that allows developers to interact with models programmatically.


This article will detail the usage of the Ollama API, including request formats, response formats, and example code.


2. API Endpoints


Ollama provides the following main API endpoints:


Generate Text

  • Endpoint: POST /api/generate
  • Function: Send a prompt to the model and retrieve the generated text.
  • Request Format:
    {
      "model": "",            // model name
      "prompt": "",           // input prompt
      "stream": false,        // false returns one response object; streaming is the default
      "options": {            // optional parameters
        "temperature": 0.7,   // sampling temperature
        "num_predict": 100    // maximum number of tokens to generate
      }
    }
  • Response Format:
    {
      "response": "",  // generated text
      "done": true     // whether generation has finished
    }
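As a rough illustration of the request and response shapes above, the following Python sketch builds a /api/generate body and reads the text out of a non-streaming response. The helper names build_generate_payload and extract_text are hypothetical and not part of Ollama. The sketch uses num_predict, the option name Ollama documents for capping generated tokens.

```python
import json


def build_generate_payload(model, prompt, temperature=None, num_predict=None, stream=False):
    """Build the JSON body for POST /api/generate (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": stream}
    options = {}
    if temperature is not None:
        options["temperature"] = temperature
    if num_predict is not None:
        options["num_predict"] = num_predict
    # Only send "options" when at least one option was set.
    if options:
        payload["options"] = options
    return payload


def extract_text(response_body):
    """Return the generated text from a non-streaming response body (JSON string)."""
    data = json.loads(response_body)
    return data.get("response", "")


payload = build_generate_payload("deepseek-coder", "Hello", temperature=0.7, num_predict=100)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed as the json argument of requests.post, as in the Python examples later in this article.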

Chat

  • Endpoint: POST /api/chat
  • Function: Supports multi-turn conversations. The API itself is stateless, so the client sends the full message history with each request to give the model context.
  • Request Format:
    {
      "model": "",              // model name
      "messages": [             // message list, oldest first
        {
          "role": "user",       // user role
          "content": ""         // user input
        }
      ],
      "stream": false,          // false returns one response object; streaming is the default
      "options": {              // optional parameters
        "temperature": 0.7,
        "num_predict": 100      // maximum number of tokens to generate
      }
    }
  • Response Format:
    {
      "message": {
        "role": "assistant",    // assistant role
        "content": ""           // generated text
      },
      "done": true
    }
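Because every /api/chat request carries its own message list, a client that wants multi-turn behavior usually keeps the history itself and resends it on each turn. The sketch below shows one way to do that without contacting a server. The helpers add_message and build_chat_payload are hypothetical names, and the assistant reply is made-up sample data standing in for the "message" field of a real response.

```python
def add_message(history, role, content):
    """Return a new history list with one more message appended."""
    return history + [{"role": role, "content": content}]


def build_chat_payload(model, history, stream=False):
    """Build the JSON body for POST /api/chat from the accumulated history."""
    return {"model": model, "messages": list(history), "stream": stream}


history = []
history = add_message(history, "user", "Hello, can you help me write a Python code snippet?")

# Made-up reply; in practice this comes from the "message" field of the response.
reply = {"role": "assistant", "content": "Sure, what should it do?"}
history = add_message(history, reply["role"], reply["content"])

# The second user turn is sent together with everything that came before it.
history = add_message(history, "user", "Reverse a string.")
payload = build_chat_payload("deepseek-coder", history)
print([m["role"] for m in payload["messages"]])
```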

List Local Models

  • Endpoint: GET /api/tags
  • Function: List locally downloaded models.
  • Response Format:
    {
      "models": [
        {
          "name": "",         // model name
          "size": 0,          // model size in bytes
          "modified_at": ""   // last modified time
        }
      ]
    }
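To show how a /api/tags response might be consumed, here is a small Python sketch that turns the "models" list into readable lines. The helpers format_size and summarize_models are hypothetical, the sample response is made-up data, and the sketch assumes "size" arrives as an integer number of bytes.

```python
def format_size(num_bytes):
    """Format a byte count with binary (1024-based) units."""
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


def summarize_models(tags_response):
    """Return one 'name (size)' string per model in a /api/tags response."""
    return [
        f'{model["name"]} ({format_size(model["size"])})'
        for model in tags_response.get("models", [])
    ]


# Made-up sample shaped like the response format above.
sample = {
    "models": [
        {"name": "deepseek-coder", "size": 1073741824, "modified_at": "2024-01-01T00:00:00Z"}
    ]
}
for line in summarize_models(sample):
    print(line)
```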

Pull Model

  • Endpoint: POST /api/pull
  • Function: Pull a model from the model library.
  • Request Format:
    {
      "name": ""   // model name
    }
  • Response Format:
    {
      "status": "downloading",   // download status
      "digest": ""               // model digest
    }
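A pull can take a while, so a client may want to turn the status objects it receives into progress messages. The sketch below is one possible approach, not Ollama's own tooling. It assumes each status object arrives as one JSON line and that download lines may carry "total" and "completed" byte counts; those two field names, the helper describe_pull_line, and the sample lines are assumptions for illustration.

```python
import json


def describe_pull_line(line):
    """Turn one status line from /api/pull into a short readable string."""
    event = json.loads(line)
    status = event.get("status", "")
    total = event.get("total")          # assumed field: total bytes
    completed = event.get("completed")  # assumed field: bytes downloaded so far
    if total and completed is not None:
        percent = 100 * completed / total
        return f"{status}: {percent:.0f}%"
    return status


# Made-up sample lines; real status strings may differ.
sample_lines = [
    '{"status": "pulling manifest"}',
    '{"status": "downloading", "digest": "sha256:abc", "total": 200, "completed": 50}',
    '{"status": "success"}',
]
for line in sample_lines:
    print(describe_pull_line(line))
```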

3. Usage Examples


Generate Text

Send a request using curl:

Example

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder",
  "prompt": "Hello, can you help me write a code snippet?",
  "stream": false
}'

Multi-turn Conversation

Send a request using curl:

Example

curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-coder",
  "messages": [
    {
      "role": "user",
      "content": "Hello, can you help me write a Python code snippet?"
    }
  ],
  "stream": false
}'

List Local Models

Send a request using curl:

curl http://localhost:11434/api/tags

Pull Model

Send a request using curl:

Example

curl http://localhost:11434/api/pull -d '{
  "name": "deepseek-coder"
}'

4. Streaming Responses

Ollama supports streaming responses, which suit applications that display text as it is generated.

Enable Streaming Response

Set "stream": true in the request (or omit the field, since streaming is the default), and the API returns the generated text incrementally as a sequence of newline-delimited JSON objects.

Example

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder",
  "prompt": "Hello, can you help me write a code snippet?",
  "stream": true
}'

Response Format

Each line of the response is a JSON object:

Example

{
  "response": "",   // partially generated text
  "done": false     // true only in the final object
}
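To make the line-by-line format concrete, the following Python sketch joins the "response" fragments from a stream into the full text. The helper collect_stream is a hypothetical name, and the sample lines are made-up data in the format shown above rather than real model output.

```python
import json


def collect_stream(lines):
    """Join the "response" fragments of a streamed /api/generate reply."""
    parts = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final object marks the end of the stream
    return "".join(parts)


# Made-up sample stream in the documented format.
sample = [
    '{"response": "Hel", "done": false}',
    '{"response": "lo", "done": false}',
    '{"response": "", "done": true}',
]
print(collect_stream(sample))
```

With the requests library, the same function could likely be fed from requests.post(url, json=payload, stream=True).iter_lines(), since json.loads accepts the byte strings that iter_lines yields.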

5. Programming Language Examples

Python: Interact with the Ollama API using the requests library:

Example

import requests

# Generate text with a single, non-streaming request
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-coder",
        "prompt": "Hello, can you help me write a code snippet?",
        "stream": False,
    },
)
# The generated text is in the "response" field of the returned JSON
print(response.json())

Multi-turn conversation:

Example

import requests

# Send a chat request containing the conversation so far
response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "deepseek-coder",
        "messages": [
            {
                "role": "user",
                "content": "Hello, can you help me write a Python code snippet?",
            }
        ],
        "stream": False,
    },
)
# The reply is in the "message" field of the returned JSON
print(response.json())

JavaScript: Interact with Ollama using the fetch API:

Example

// Generate text
fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "deepseek-coder",
    prompt: "Hello, can you help me write a code snippet?",
    stream: false
  })
})
  .then(response => response.json())
  .then(data => console.log(data));

Multi-turn conversation:

Example

fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "deepseek-coder",
    messages: [
      {
        role: "user",
        content: "Hello, can you help me write a Python code snippet?"
      }
    ],
    stream: false
  })
})
  .then(response => response.json())
  .then(data => console.log(data));