Ollama provides a platform to run large language models (LLMs) locally and exposes HTTP API endpoints for interacting with them. The API responds in JSON format and can also stream responses. Let's learn about Ollama LLM API integration.
Load Model in Memory
By default, Ollama keeps an LLM model in memory for only 5 minutes after the last request. Sending a generate request with just the model name (and no prompt) loads the model into memory without generating a response.
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1"
}'
To load the LLM model into memory and keep it there indefinitely, set keep_alive to -1.
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"keep_alive": -1
}'
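The keep_alive parameter also accepts duration strings such as "10m" or "24h" to control how long the model stays loaded after each request:
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"keep_alive": "10m"
}'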
To unload the LLM model from memory immediately, set keep_alive to 0.
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"keep_alive": 0
}'
Get a List of Models
curl --location 'http://localhost:11434/api/tags' \
--header 'Content-Type: application/json'
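A trimmed, representative response (the exact fields vary by Ollama version and by the models installed):
{
  "models": [
    {
      "name": "llama3.1:latest",
      "modified_at": "2024-08-04T14:56:49Z",
      "size": 4661224676,
      "digest": "..."
    }
  ]
}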
Get Details of Running Models
curl --location 'http://localhost:11434/api/ps' \
--header 'Content-Type: application/json'
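A trimmed, representative response; the expires_at field reflects the keep_alive setting:
{
  "models": [
    {
      "name": "llama3.1:latest",
      "size": 6654289920,
      "size_vram": 6654289920,
      "expires_at": "2024-08-04T15:01:49Z"
    }
  ]
}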
Get Model Details
curl --location 'http://localhost:11434/api/show' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1"
}'
Write the First Prompt
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"prompt": "Why is the sky blue?"
}'
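By default the response is streamed as newline-delimited JSON, one object per token, ending with a final object where done is true. A trimmed, representative stream:
{"model":"llama3.1","created_at":"...","response":"The","done":false}
{"model":"llama3.1","created_at":"...","response":" sky","done":false}
...
{"model":"llama3.1","created_at":"...","response":"","done":true}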
Ollama LLM API Integration Without Streaming
To receive the whole response as a single JSON object instead of a stream, set stream to false.
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"prompt": "Why is the sky blue?",
"stream": false
}'
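With streaming disabled, the API returns one JSON object containing the full response along with timing statistics (trimmed, representative):
{
  "model": "llama3.1",
  "created_at": "2024-08-04T14:56:49Z",
  "response": "The sky appears blue because of Rayleigh scattering...",
  "done": true,
  "total_duration": 5043500667,
  "prompt_eval_count": 26,
  "eval_count": 290
}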
Response Format
To get the output as valid JSON, set format to json. It is also important to instruct the model in the prompt itself to respond in JSON; otherwise it may generate large amounts of whitespace.
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"prompt": "Why is the sky blue? Respond using JSON",
"stream": false,
"format": "json"
}'
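On newer Ollama releases, format can also be given as a JSON schema to constrain the output (structured outputs). A minimal sketch, assuming a version with structured-output support:
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"prompt": "Why is the sky blue? Respond using JSON",
"stream": false,
"format": {
"type": "object",
"properties": {
"answer": { "type": "string" }
},
"required": ["answer"]
}
}'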
Temperature
The temperature option controls randomness: 0 makes the output more deterministic, while higher values (e.g. 0.8) make it more creative.
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"prompt": "Why is the sky blue?",
"stream": false,
"options": {
"temperature": 0
}
}'
All Parameters
A single request showing the full set of options that can be passed:
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"prompt": "Why is the sky blue?",
"stream": false,
"options": {
"num_keep": 5,
"seed": 42,
"num_predict": 100,
"top_k": 20,
"top_p": 0.9,
"min_p": 0.0,
"tfs_z": 0.5,
"typical_p": 0.7,
"repeat_last_n": 33,
"temperature": 0.8,
"repeat_penalty": 1.2,
"presence_penalty": 1.5,
"frequency_penalty": 1.0,
"mirostat": 1,
"mirostat_tau": 0.8,
"mirostat_eta": 0.6,
"penalize_newline": true,
"stop": [
"\n",
"user:"
],
"numa": false,
"num_ctx": 1024,
"num_batch": 2,
"num_gpu": 1,
"main_gpu": 0,
"low_vram": false,
"f16_kv": true,
"vocab_only": false,
"use_mmap": true,
"use_mlock": false,
"num_thread": 8
}
}'
Chat
The /api/chat endpoint takes a messages array of role/content pairs instead of a single prompt, which makes it suitable for conversations.
curl --location 'http://localhost:11434/api/chat' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"stream": false,
"messages": [
{
"role": "user",
"content": "why is the sky blue?"
}
]
}'
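The chat endpoint is stateless, so to continue a conversation, send the previous messages (including the assistant's replies) back with each request:
curl --location 'http://localhost:11434/api/chat' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"stream": false,
"messages": [
{ "role": "user", "content": "why is the sky blue?" },
{ "role": "assistant", "content": "Because of Rayleigh scattering." },
{ "role": "user", "content": "how is that different at sunset?" }
]
}'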
Generate Embeddings
curl --location 'http://localhost:11434/api/embed' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"input": "Why is the sky blue?"
}'
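The input field also accepts an array of strings, so multiple texts can be embedded in a single request:
curl --location 'http://localhost:11434/api/embed' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"input": ["Why is the sky blue?", "Why is grass green?"]
}'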
Request Raw Prompt
Sometimes it is necessary to bypass the prompt template and provide a fully formed prompt. Ollama provides a raw parameter to disable the model's template.
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3.1",
"prompt": "[INST] why is the sky blue? [/INST]",
"raw": true,
"stream": false
}'
Show the Modelfile of the LLM Model
docker exec -it ollama_container ollama show --modelfile llama3.1
Create a New Model from an Existing Model
curl --location 'http://localhost:11434/api/create' \
--header 'Content-Type: application/json' \
--data '{
"name": "mario",
"modelfile": "FROM llama3\nSYSTEM You are mario from Super Mario Bros."
}'
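Note that on newer Ollama releases (roughly v0.5.5 and later) the create endpoint no longer accepts a raw modelfile field; if the call above fails, try the newer field names:
curl --location 'http://localhost:11434/api/create' \
--header 'Content-Type: application/json' \
--data '{
"model": "mario",
"from": "llama3.1",
"system": "You are Mario from Super Mario Bros."
}'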
Now the new model can be used:
curl --location 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data '{
"model": "mario",
"prompt": "Why is the sky blue?",
"stream": false
}'
Delete the Model
curl --location --request DELETE 'http://localhost:11434/api/delete' \
--header 'Content-Type: application/json' \
--data '{
"name": "mario"
}'