This is a cache of https://github.com/ollama/ollama/releases. It is a snapshot of the page as it appeared on 2025-08-09T12:12:30.339+0200.

Releases: ollama/ollama

v0.11.4

07 Aug 17:17

What's Changed

  • openai: allow for content and tool calls in the same message by @drifkin in #11759
  • openai: when converting role=tool messages, propagate the tool name by @drifkin in #11761
  • openai: always provide reasoning by @drifkin in #11765
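Taken together, these changes shape how a tool-call conversation looks on the OpenAI-compatible endpoint. A minimal sketch of the message list for one round trip; the tool, arguments, and ids are illustrative, and a local server at the default address is assumed:

```python
# Sketch of an OpenAI-compatible tool-call exchange. An assistant message
# may now carry both content and tool_calls, and the tool-result message
# propagates the tool's name.
assistant_turn = {
    "role": "assistant",
    "content": "Let me check the weather.",  # content ...
    "tool_calls": [{                         # ... and tool calls, together
        "id": "call_0",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}
tool_result = {
    "role": "tool",
    "name": "get_weather",  # tool name carried on the role=tool message
    "tool_call_id": "call_0",
    "content": '{"temp_c": 21}',
}
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    assistant_turn,
    tool_result,
]
```

To send it, POST the list together with a model name to http://localhost:11434/v1/chat/completions.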

Full Changelog: v0.11.3...v0.11.4

v0.11.3

06 Aug 01:29
4742e12

What's Changed

  • Fixed issue where gpt-oss would consume too much VRAM when split between GPU and CPU, or across multiple GPUs
  • Statically link C++ libraries on Windows for better compatibility

Full Changelog: v0.11.2...v0.11.3

v0.11.2

05 Aug 21:18

What's Changed

  • Fixed crash in gpt-oss when using KV cache quantization
  • Fixed gpt-oss bug where "currentDate" was not defined

Full Changelog: v0.11.1...v0.11.2

v0.11.0

05 Aug 16:56

Welcome OpenAI's gpt-oss models

Ollama has partnered with OpenAI to bring OpenAI's latest state-of-the-art open-weight models to Ollama. The two models, 20B and 120B, deliver a whole new local chat experience and are designed for powerful reasoning, agentic tasks, and versatile developer use cases.

Feature highlights

  • Agentic capabilities: Use the models' native support for function calling, web browsing (Ollama provides a built-in web search that can optionally be enabled to augment the models with up-to-date information), Python tool calls, and structured outputs.
  • Full chain-of-thought: Gain complete access to the model's reasoning process, facilitating easier debugging and increased trust in outputs.
  • Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
  • Fine-tunable: Fully customize models to your specific use case through parameter fine-tuning.
  • Permissive Apache 2.0 license: Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.
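As one concrete illustration of the structured-outputs capability, Ollama's /api/chat endpoint accepts a format field that can carry a JSON schema constraining the model's reply. A minimal sketch; the model name, schema, and prompt are illustrative:

```python
import json

# Sketch of a structured-outputs request for Ollama's /api/chat endpoint.
# The "format" field carries a JSON schema the model's output must satisfy.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temp_c": {"type": "number"},
    },
    "required": ["city", "temp_c"],
}

payload = {
    "model": "gpt-oss:20b",
    "messages": [{"role": "user", "content": "Weather in Paris, as JSON."}],
    "format": schema,
    "stream": False,
}

body = json.dumps(payload)  # POST this to http://localhost:11434/api/chat
```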

Quantization - MXFP4 format

OpenAI uses quantization to reduce the memory footprint of the gpt-oss models. The models are post-trained with the mixture-of-experts (MoE) weights quantized to the MXFP4 format, at 4.25 bits per parameter. The MoE weights account for more than 90% of the total parameter count, and quantizing them to MXFP4 enables the smaller model to run on systems with as little as 16 GB of memory and the larger model to fit on a single 80 GB GPU.
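The memory arithmetic can be checked back-of-the-envelope. A sketch using the nominal 20B and 120B parameter counts and the roughly 90% MoE share stated above; real totals are higher, since the remaining weights, KV cache, and activations also need memory:

```python
BITS_PER_MOE_PARAM = 4.25  # MXFP4: 4.25 bits per parameter
MOE_SHARE = 0.90           # MoE weights: roughly 90% of total parameters

def moe_weight_gb(total_params: float) -> float:
    """Approximate size of the MXFP4-quantized MoE weights, in GB."""
    return total_params * MOE_SHARE * BITS_PER_MOE_PARAM / 8 / 1e9

small = moe_weight_gb(20e9)   # gpt-oss:20b  -> about 9.6 GB
large = moe_weight_gb(120e9)  # gpt-oss:120b -> about 57.4 GB
```

That leaves headroom under the 16 GB and 80 GB figures for the remaining ~10% of weights and for runtime state.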

Ollama supports the MXFP4 format natively, without additional quantization or conversion. New kernels were developed for Ollama's new engine to support MXFP4.

Ollama collaborated with OpenAI to benchmark against their reference implementations and ensure Ollama's implementation matches their quality.

Get started

You can get started by downloading the latest Ollama version (v0.11).

The model can be downloaded directly in Ollama’s new app or via the terminal:

ollama run gpt-oss:20b

ollama run gpt-oss:120b
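Beyond the terminal, the models can also be driven through Ollama's local REST API. A minimal sketch against the /api/generate endpoint, assuming the default server address; the prompt is illustrative:

```python
import json
from urllib import request

# Build a request against Ollama's /api/generate endpoint.
payload = {
    "model": "gpt-oss:20b",
    "prompt": "Explain MXFP4 quantization in one sentence.",
    "stream": False,
}
req = request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# With a running server, the call would be:
#   resp = json.load(request.urlopen(req))
# and the generated text would be in resp["response"].
```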

What's Changed

Full Changelog: v0.10.1...v0.11.0

v0.10.1

31 Jul 04:39
ff89ba9

What's Changed

  • Fixed unicode character input for Japanese and other languages in Ollama's new app
  • Fixed AMD download URL in the logs for ollama serve

Full Changelog: v0.10.0...v0.10.1

v0.10.0

18 Jul 00:23
6dcc5df

Ollama's new app

Ollama's new app is available for macOS and Windows: Download Ollama


What's Changed

  • ollama ps will now show the context length of loaded models
  • Improved performance in gemma3n models by 2-3x
  • Parallel request processing now defaults to 1. For more details, see the FAQ
  • Fixed issue where tool calling would not work correctly with granite3.3 and mistral-nemo models
  • Fixed issue where Ollama's tool calling would not work correctly if one tool's name was part of another's, such as add and get_address
  • Improved performance when using multiple GPUs by 10-30%
  • Ollama's OpenAI-compatible API will now support WebP images
  • Fixed issue where ollama show would report an error
  • ollama run will more gracefully display errors

Full Changelog: v0.9.6...v0.10.0

v0.9.6

08 Jul 01:26
43107b1

What's Changed

  • Fixed styling issue in launch screen
  • tool_name can now be provided in messages with "role": "tool" using the /api/chat endpoint
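A minimal sketch of the new field; the model name, tool name, and contents are illustrative:

```python
# Sketch of a /api/chat message list using the new tool_name field
# on a role="tool" message.
messages = [
    {"role": "user", "content": "What is 2 + 2?"},
    {
        "role": "tool",
        "tool_name": "calculator",  # identifies which tool produced this result
        "content": "4",
    },
]
payload = {"model": "llama3.1", "messages": messages, "stream": False}
```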

Full Changelog: v0.9.5...v0.9.6

v0.9.5

02 Jul 18:39
5d8c173

Updates to Ollama for macOS and Windows

A new version of Ollama's macOS and Windows applications is now available. New improvements to the apps will be introduced over the coming releases.


New features

Expose Ollama on the network

Ollama can now be exposed on the network, allowing other devices, or even clients over the internet, to access it. This is useful for running Ollama on a powerful Mac, PC, or Linux computer while making it accessible from less powerful devices.

Model directory

The directory in which models are stored can now be changed, allowing models to be kept in directories other than the default, such as on external hard drives.

Smaller footprint and faster starting on macOS

The macOS app is now a native application and starts much faster while requiring a much smaller installation.

Additional changes in 0.9.5

  • Fixed issue where the ollama CLI would not be installed by Ollama on macOS on startup
  • Fixed issue where files in ollama-darwin.tgz were not notarized
  • Add NativeMind to Community Integrations by @xukecheng in #11242
  • Ollama for macOS now requires macOS 12 (Monterey) or newer

v0.9.4

27 Jun 02:45
44b17d2


What's Changed

  • Reduced download size and startup time for Ollama on macOS
  • Tool calling with empty parameters will now work correctly
  • Fixed issue when quantizing models with the Gemma 3n architecture
  • Ollama for macOS should no longer ask for root privileges when updating unless required
  • Ollama for macOS now requires macOS 12 (Monterey) or newer

Full Changelog: v0.9.3...v0.9.4

v0.9.3

25 Jun 16:55
ba04902

Gemma 3n


Ollama now supports Gemma 3n.

Gemma 3n models are designed for efficient execution on everyday devices such as laptops, tablets or phones. These models were trained with data in over 140 spoken languages.

Effective 2B

ollama run gemma3n:e2b

Effective 4B

ollama run gemma3n:e4b

What's Changed

  • Fixed issue where errors would not be properly reported on Apple Silicon Macs
  • Ollama will now limit context length to what the model was trained on, to avoid strange overflow behavior

Full Changelog: v0.9.2...v0.9.3