Spring AI Auto Pull Model

Ollama models and Spring AI are separate components that communicate over REST. The Spring AI auto pull mechanism fetches a model from the model repository when it is not available locally. This feature is particularly useful for development and testing, and when deploying the application to a new server.

Hugging Face provides thousands of free models in GGUF format.

The Spring AI auto pull feature supports three strategies.

Pull Strategy    Description
always           PullModelStrategy.ALWAYS pulls the model even if it is already available, which ensures the latest version is used.
when_missing     PullModelStrategy.WHEN_MISSING pulls the model only if it is not available. An older version of the model may be used if one is already present.
never            PullModelStrategy.NEVER never pulls the model automatically.
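
The decision each strategy implies can be sketched in plain Java. This is only an illustration: the PullDecision class, its Strategy enum and the shouldPull helper are hypothetical names written for this example, not Spring AI's own PullModelStrategy implementation.

```java
// Hypothetical sketch of the decision behind each pull strategy.
// It mirrors the strategy table above; it is not Spring AI's actual code.
class PullDecision {

    // Stand-in for the three values of PullModelStrategy.
    enum Strategy { ALWAYS, WHEN_MISSING, NEVER }

    // Returns true when a pull should be attempted at startup.
    static boolean shouldPull(Strategy strategy, boolean modelAvailableLocally) {
        switch (strategy) {
            case ALWAYS:
                return true;                    // refresh even if the model is present
            case WHEN_MISSING:
                return !modelAvailableLocally;  // pull only if the model is absent
            default:
                return false;                   // NEVER: the model must be provided manually
        }
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            System.out.println(s + " -> model present: " + shouldPull(s, true)
                    + ", model missing: " + shouldPull(s, false));
        }
    }
}
```

Only when_missing depends on what is already on the Ollama server, which is why it is the one strategy that can leave an older copy of a model in use.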

Auto pulling is not recommended in a production environment.

Spring AI also lets us configure the model-pulling properties, such as the timeout and the maximum number of retries. If required, additional models can be pulled as well.
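
To make the timeout and retry settings more concrete, here is a minimal sketch of a pull that is retried a limited number of times within an overall time budget. The PullRetry class, the pullWithRetries method and the BooleanSupplier standing in for the pull operation are all hypothetical; how Spring AI applies its own timeout and retry settings internally may differ.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "max retries" plus "timeout" around a pull operation.
class PullRetry {

    // Tries the pull once, then up to maxRetries more times. Gives up early
    // if the time budget is spent between attempts (a running attempt is not
    // interrupted). Returns the number of attempts made on success, or -1.
    static int pullWithRetries(BooleanSupplier pullAttempt, int maxRetries, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
            if (pullAttempt.getAsBoolean()) {
                return attempt;
            }
            if (System.nanoTime() - deadline >= 0) {
                break; // time budget exhausted
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated pull that fails twice and then succeeds.
        BooleanSupplier flakyPull = () -> ++calls[0] >= 3;
        int attempts = pullWithRetries(flakyPull, 5, Duration.ofMinutes(5));
        System.out.println("Succeeded after " + attempts + " attempts");
    }
}
```

With zero retries the pull gets a single attempt, so a flaky network connection on a fresh server is a reasonable case for allowing a few retries.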

Application initialization completes only after the model has been pulled, so the first startup may be slower than usual.

Along with the main model, Spring AI pulls additional models when the property spring.ai.ollama.init.chat.additional-models is set.
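
As a rough illustration of how a comma-separated value such as "llama3.2, qwen2.5" becomes the list of models to pull, the sketch below combines a main model with the additional ones. The ModelsToPull class and its resolve method are hypothetical helpers for this example; Spring Boot performs the real property binding itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: build the full list of models to pull at startup.
class ModelsToPull {

    // Combines the main model with a comma-separated list of extras,
    // trimming whitespace and dropping blanks and duplicates.
    static List<String> resolve(String mainModel, String additionalModels) {
        Set<String> models = new LinkedHashSet<>();
        models.add(mainModel.trim());
        for (String name : additionalModels.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                models.add(trimmed);
            }
        }
        return new ArrayList<>(models);
    }

    public static void main(String[] args) {
        // prints [mistral, llama3.2, qwen2.5]
        System.out.println(resolve("mistral", "llama3.2, qwen2.5"));
    }
}
```

Each extra model adds its own download to the first startup, so the list is best kept short.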

package com.example.springai.controller;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SpringAiController {
    private final ChatClient chatClient;

    // The ChatClient.Builder is injected by Spring and built once for this controller.
    public SpringAiController(ChatClient.Builder chatClient) {
        this.chatClient = chatClient.build();
    }

    // Sends a fixed prompt to the model and returns its reply as plain text.
    @GetMapping("/hello")
    String hello() {
        String helloPrompt = "Hello, I am learning Ai with Spring";
        return this.chatClient.prompt().user(helloPrompt).call().content();
    }
}

package com.example.springai;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringAiApplication {
    public static void main(String[] args) {
        SpringApplication.run(SpringAiApplication.class, args);
    }
}

# If running the Ollama Docker instance separately, then set this property
spring.ai.ollama.base-url=http://localhost:11434

# Auto-pulling models
spring.ai.ollama.init.pull-model-strategy=when_missing

# The default Ollama model in Spring AI is mistral, but it can be changed by setting the property below.
spring.ai.ollama.chat.options.model=mistral

# If additional models are required, then set this property
#spring.ai.ollama.init.chat.additional-models=llama3.2, qwen2.5

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama_container
    ports:
      - 11434:11434/tcp
    healthcheck:
      test: ollama --version || exit 1
    command: serve
    volumes:
      - ./ollama/ollama:/root/.ollama
      - ./entrypoint.sh:/entrypoint.sh
    pull_policy: missing
    tty: true
    restart: "no"
    entrypoint: [ "/usr/bin/bash", "/entrypoint.sh" ]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open_webui_container
    environment:
      WEBUI_AUTH: "false"
    ports:
      - "8081:8080"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - open-webui:/app/backend/data
    restart: "no"

volumes:
  open-webui:

#!/bin/bash

# Start Ollama in the background.
/bin/ollama serve &
# Record the process ID.
pid=$!

# Pause for Ollama to start.
sleep 5

# The default Ollama model in Spring AI is mistral, but it can be changed in the application properties file. Make sure to download the same model here.
echo "🔴 Retrieve mistral model..."
ollama pull mistral
echo "🟢 Done!"

# Wait for the Ollama process to finish.
wait $pid

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <name>Auto-pulling Models</name>
    <description>Demo project for Spring Boot</description>

    <repositories>
        <repository>
            <name>Spring Milestones</name>
        </repository>
        <repository>
            <name>Spring Snapshots</name>
        </repository>
    </repositories>
</project>

Run the following curl command to see the Spring AI auto pull model in action.

curl --location 'http://localhost:8080/hello'
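
The same request can also be sent from Java. The sketch below uses the JDK's built-in HTTP client (Java 11 or later) and assumes the application is running locally on port 8080 with the /hello endpoint used in the curl command; the HelloClient class and its helloRequest method are hypothetical names for this example.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical Java equivalent of the curl call to the /hello endpoint.
class HelloClient {

    // Builds the same GET request that the curl command sends.
    static HttpRequest helloRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/hello")).GET().build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Requires the Spring Boot application to be up; the first call may
        // only succeed once the model has been pulled and startup has finished.
        HttpResponse<String> response = client.send(
                helloRequest("http://localhost:8080"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

The response body is the model's reply to the fixed prompt in the controller, so the text will vary from run to run.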

