Ollama Usage Guide: Two Ways to Use Ollama

This guide covers two different ways to interact with Ollama models: the native HTTP API and the OpenAI-compatible API.

Prerequisites

  1. Install Ollama

  2. Pull a model

  3. Verify Ollama is running
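Step 3 can be sanity-checked from Python before writing any real client code. A minimal sketch using only the standard library, assuming the default port 11434:

```python
# Check whether the Ollama server answers on its model-listing endpoint.
import json
import urllib.request

def ollama_is_running(host: str = "http://localhost:11434") -> bool:
    """Return True if the Ollama server responds on /api/tags."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=2) as resp:
            models = json.load(resp).get("models", [])
            print(f"Ollama is up; {len(models)} model(s) pulled.")
            return True
    except OSError:
        print("Ollama is not reachable; run `ollama serve` first.")
        return False

ollama_is_running()
```

If this prints that Ollama is not reachable, start the server (or the desktop app) before continuing.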


Method 1: Native Ollama API with requests.post()

Best for: Understanding HTTP APIs, Ollama-specific features, raw control

Installation
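The only dependency for this method is the `requests` package:

```shell
pip install requests
```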

Text Completion (single question / answer)
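A sketch of a single-turn completion against the native `/api/generate` endpoint. The model name `llama3.2` is an assumption; substitute any model you have pulled.

```python
import requests  # pip install requests

def generate(prompt: str, model: str = "llama3.2") -> str:
    """Return the full completion for one prompt (no chat history)."""
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

# Requires a running Ollama server:
# print(generate("Why is the sky blue?"))
```

Setting `"stream": False` returns one JSON object instead of a stream of partial chunks, which keeps the example simple.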

Text Completion Response Format
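With `"stream": false`, the endpoint returns a single JSON object. The key fields look roughly like this (values are illustrative; the duration fields are in nanoseconds):

```json
{
  "model": "llama3.2",
  "created_at": "2024-01-01T00:00:00Z",
  "response": "The sky appears blue because ...",
  "done": true,
  "total_duration": 1234567890,
  "prompt_eval_count": 12,
  "eval_count": 87
}
```

The generated text lives in `response`; `eval_count` is the number of tokens generated.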

Chat (role-based chat history)
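The `/api/chat` endpoint takes a list of role-tagged messages instead of a single prompt, so you control the full conversation history. A sketch, again assuming a pulled `llama3.2` model:

```python
import requests  # pip install requests

def chat(messages: list, model: str = "llama3.2") -> dict:
    """Send a role-based message list to the native /api/chat endpoint."""
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={"model": model, "messages": messages, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["message"]  # {"role": "assistant", "content": ...}

history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Why is the sky blue?"},
]
# Requires a running Ollama server:
# reply = chat(history); print(reply["content"])
```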

Chat Response Format
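The non-streaming chat response wraps the reply in a `message` object rather than a flat `response` string; roughly:

```json
{
  "model": "llama3.2",
  "created_at": "2024-01-01T00:00:00Z",
  "message": {
    "role": "assistant",
    "content": "The sky appears blue because ..."
  },
  "done": true
}
```

Append this `message` object to your history list to continue the conversation.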

Method 2: OpenAI-Compatible API

Best for: Portability, industry standard, switching between providers

Installation

Basic Chat Example
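A sketch of the official SDK pointed at Ollama's compatible endpoint. The `api_key` value is required by the SDK but ignored by Ollama; the import is kept inside the function so the sketch parses even before the package is installed.

```python
def ask(question: str, model: str = "llama3.2") -> str:
    """One chat turn via Ollama's OpenAI-compatible /v1 endpoint."""
    from openai import OpenAI  # pip install openai
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

# Requires a running Ollama server:
# print(ask("Why is the sky blue?"))
```

Note that only `base_url` and the dummy key differ from code targeting OpenAI's hosted API.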

With Parameters
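Common sampling parameters pass straight through the compatible endpoint. The values below are illustrative, not recommendations:

```python
params = {
    "model": "llama3.2",
    "temperature": 0.2,  # lower = more deterministic output
    "max_tokens": 256,   # cap on generated tokens
    "top_p": 0.9,        # nucleus sampling cutoff
}

def ask_with_params(question: str) -> str:
    from openai import OpenAI  # pip install openai
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    resp = client.chat.completions.create(
        messages=[{"role": "user", "content": question}], **params
    )
    return resp.choices[0].message.content
```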

With Tools (Function Calling)
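A sketch of function calling through the compatible endpoint. The `get_weather` tool below is hypothetical; the schema shape is the standard OpenAI function-calling format:

```python
import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

def ask_with_tools(question: str, model: str = "llama3.2"):
    from openai import OpenAI  # pip install openai
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    resp = client.chat.completions.create(
        model=model,
        temperature=0,  # steadier tool selection
        messages=[{"role": "user", "content": question}],
        tools=tools,
    )
    msg = resp.choices[0].message
    if msg.tool_calls:  # the model chose to call a tool
        call = msg.tool_calls[0]
        return call.function.name, json.loads(call.function.arguments)
    return None, msg.content  # plain-text answer, no tool used
```

Your code is responsible for actually running the named function and sending the result back in a follow-up `tool` message.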

Using LangChain's bind_tools (Alternative)

LangChain provides a more convenient way to work with tools using bind_tools():
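A sketch assuming the `langchain-ollama` package: the `@tool` decorator derives the JSON schema from the function signature and docstring, and `bind_tools()` attaches it to the model so you never write the schema by hand. The weather stub is hypothetical.

```python
def build_agent(model: str = "llama3.2"):
    from langchain_ollama import ChatOllama   # pip install langchain-ollama
    from langchain_core.tools import tool

    @tool
    def get_weather(city: str) -> str:
        """Get the current weather for a city."""
        return f"It is sunny in {city}."  # stub implementation

    llm = ChatOllama(model=model, temperature=0)
    return llm.bind_tools([get_weather])

# With a running server, invoking it typically yields an AIMessage whose
# .tool_calls list names the chosen tool and its parsed arguments:
# ai_msg = build_agent().invoke("What's the weather in Paris?")
```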

Note on model compatibility: not all Ollama models support tool calling well. Smaller or older models may not follow tool-calling instructions reliably, so check a model's page in the Ollama library for tool support before relying on it, and use temperature=0 for more consistent tool-calling behavior.

Response Format
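Responses from the `/v1` endpoint follow the standard OpenAI chat-completion shape, roughly:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "llama3.2",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The sky appears blue ..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 87, "total_tokens": 99}
}
```

The reply text is at `choices[0].message.content`, exactly where OpenAI-targeting code already expects it.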

Portability Example
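This is the payoff of the compatible endpoint: the same calling code can talk to local Ollama or a hosted provider, switched purely by configuration. A sketch using a hypothetical `USE_LOCAL` environment variable:

```python
import os

def make_client():
    """Build a client for local Ollama or a hosted OpenAI-compatible API."""
    from openai import OpenAI  # pip install openai
    if os.environ.get("USE_LOCAL", "1") == "1":
        # Local: point at Ollama; the key is ignored but must be non-empty.
        return OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    # Hosted: default base URL, real key from the environment.
    return OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# client = make_client()
# client.chat.completions.create(...)  # identical in both cases
```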


Comparison Table

| Feature          | requests.post()       | OpenAI SDK              |
|------------------|-----------------------|-------------------------|
| Installation     | pip install requests  | pip install openai      |
| Endpoint         | localhost:11434/api/* | localhost:11434/v1/*    |
| Code complexity  | Medium (manual HTTP)  | Low (standard SDK)      |
| Response format  | Ollama-specific       | OpenAI-standard         |
| Portability      | ❌ Ollama-only        | ✅ Works everywhere     |
| Model management | ✅ Full access        | ❌ Chat only            |
| Embeddings       | /api/embeddings       | /v1/embeddings          |
| Best for         | Learning HTTP APIs    | Production, portability |

Complete Working Examples

Example 1: Simple Q&A (Both Methods)
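Both methods side by side for the same one-shot question (a sketch; each requires a running Ollama server and a pulled model, `llama3.2` assumed):

```python
import requests  # pip install requests

QUESTION = "In one sentence, why is the sky blue?"
MODEL = "llama3.2"

def native_answer() -> str:
    """Method 1: raw HTTP against the native endpoint."""
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": MODEL, "prompt": QUESTION, "stream": False},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["response"]

def openai_answer() -> str:
    """Method 2: the OpenAI SDK against the compatible endpoint."""
    from openai import OpenAI  # pip install openai
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    r = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": QUESTION}]
    )
    return r.choices[0].message.content

# print(native_answer())
# print(openai_answer())
```

Both functions return the same kind of answer; only the transport and the response-unpacking differ.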

Example 2: Multi-turn Conversation
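Neither API keeps conversation state for you: multi-turn chat is just resending the growing message list on every turn. A sketch using the native `/api/chat` endpoint:

```python
import requests  # pip install requests

def converse(turns: list, model: str = "llama3.2") -> list:
    """Feed each user turn, appending replies so the model keeps context."""
    history = []
    for user_text in turns:
        history.append({"role": "user", "content": user_text})
        r = requests.post(
            "http://localhost:11434/api/chat",
            json={"model": model, "messages": history, "stream": False},
            timeout=120,
        )
        r.raise_for_status()
        history.append(r.json()["message"])  # assistant reply joins history
    return history

# Requires a running server; the second answer depends on the first turn:
# converse(["My name is Ada.", "What is my name?"])
```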

Troubleshooting

Model not found: pull the model first with ollama pull <model-name>, and make sure the name in your code matches the output of ollama list exactly.

Connection refused: the server is not running; start it with ollama serve (or launch the Ollama app) and confirm it answers on localhost:11434.

Import errors: install the client libraries, e.g. pip install requests openai.


Additional Resources


Summary

Both methods accomplish the same goal but with different tradeoffs:

Choose based on your needs: learning → requests, production/portability → OpenAI SDK.