Ollama Python Library

Overview of the Ollama Python library for running and interacting with large language models locally.

The ollama Python library is a client for interacting with local LLMs hosted via the Ollama runtime. It provides a simple API for pulling and managing models and for running chat or single-turn inference against them on your own machine.

Key features:
- Connects to the Ollama runtime, which must be installed and running (ollama serve)
- Works with a variety of models (e.g. llama2, mistral, gemma)
- Supports chat-style interaction, system prompts, and streaming responses
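
A minimal streaming sketch (assuming the llama2 model has already been pulled): passing stream=True makes ollama.chat return an iterator of partial responses instead of one final message.

import ollama

# Stream tokens as they are generated instead of waiting for the full reply
stream = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Explain photosynthesis.'}],
    stream=True
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)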

Installation:

pip install ollama

Example usage:

import ollama

response = ollama.chat(
    model='llama2',
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'Explain photosynthesis.'}
    ]
)

print(response['message']['content'])
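
A hedged error-handling sketch: the library raises ollama.ResponseError when a request fails, for example when the requested model has not been pulled yet. The 404 check below follows the library's documented behaviour and is an assumption about your setup.

import ollama

try:
    response = ollama.chat(
        model='llama2',
        messages=[{'role': 'user', 'content': 'Explain photosynthesis.'}]
    )
    print(response['message']['content'])
except ollama.ResponseError as e:
    print('Error:', e.error)
    if e.status_code == 404:   # model not found locally, so pull it first
        ollama.pull('llama2')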

Other methods:

- ollama.list(): list available models
- ollama.pull('mistral'): download a model
- ollama.create(): create a custom model from a Modelfile
- ollama.generate(): run single-turn inference without chat format
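
A quick sketch combining these calls (model names are only examples; the exact shape of the list() response varies between library versions, so printing it raw is the safe option):

import ollama

# See which models are already downloaded (prints the raw listing)
print(ollama.list())

# Download a model if it is missing
ollama.pull('mistral')

# Single-turn generation without the chat message format
result = ollama.generate(model='mistral', prompt='Explain photosynthesis in one sentence.')
print(result['response'])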

Docs: https://github.com/ollama/ollama-python

Issue with "Error: could not connect to ollama app, is it running?"

from: https://collabnix.com/running-ollama-with-docker-for-python-applications

Create a startup .sh script that brings up the Ollama server before the Python app:

#!/bin/sh

# Start Ollama in the background
ollama serve &

# Wait for Ollama to start
sleep 5

# Pull the required model(s)
ollama pull mistral

# Start your Python application
python app.py
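
If the Python app and Ollama run in separate containers, or the server listens on a non-default address, the module-level functions won't find it. A sketch using ollama.Client with an explicit host; the URL below is an assumption and depends on your Docker networking setup (the default is http://localhost:11434).

from ollama import Client

# Point the client at the Ollama server; replace the host with your
# container name or address as needed
client = Client(host='http://localhost:11434')

response = client.chat(
    model='mistral',
    messages=[{'role': 'user', 'content': 'Hello!'}]
)
print(response['message']['content'])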
