Ollama

Overview

This module adds support for selected Ollama models.

Maven Coordinates

In addition to the Helidon integration with LangChain4j core dependencies, you must add the following:

xml
<dependency>
    <groupId>io.helidon.integrations.langchain4j.providers</groupId>
    <artifactId>helidon-integrations-langchain4j-providers-ollama</artifactId>
</dependency>

Components

OllamaChatModel

To automatically create OllamaChatModel and add it to the service registry, add the following lines to application.yaml:

yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-chat-model:
      provider: ollama
      model-name: "llama3.1"

If enabled is set to false, the configuration is ignored, and the component is not created.
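As an illustration, the fragment below keeps the chat model definition in the file but switches it off. It is a sketch that assumes enabled is set next to model-name under the model entry; the key is listed in the table, but its placement is not shown here.

```yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-chat-model:
      provider: ollama
      model-name: "llama3.1"
      # Assumed placement: alongside the other model keys.
      # With false, the component should not be created.
      enabled: false
```

Keeping the entry and toggling enabled can be handy when the same file is shared between environments where Ollama may not be running.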

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied from LangChain4j is used. |
| enabled | boolean | If set to false, the component will not be available even if configured. |
| format | string | Specifies the structure or style of the text produced by the model, such as plain text, JSON, or a custom format. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | integer | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| num-predict | int | Length of the output generated by the model. |
| repeat-penalty | double | The penalty applied to repeated tokens during text generation. Higher values discourage the model from generating the same token multiple times, promoting more varied and natural output. A value of 1.0 applies no penalty (default behavior), while values greater than 1.0 reduce the likelihood of repetition. Excessively high values may overly penalize common phrases, leading to unnatural results. |
| seed | int | The seed for the random number generator used by the model. |
| stop | string[] | List of sequences where the API will stop generating further tokens. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| timeout | duration | The timeout setting for API requests. See here for the format. |
| top-k | int | Limits the token pool to the topK highest-probability tokens, controlling the balance between deterministic and diverse outputs. A smaller topK (e.g., 1) results in deterministic output, while a larger value (e.g., 50) allows for more variability and creativity. |
| top-p | double | Nucleus sampling value, where the model considers the results of the tokens with top_p probability mass. |
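The sampling-related keys above can be combined in one model entry. The following is a minimal sketch, not a recommended setup: the values are illustrative, and it assumes these keys sit under the model entry next to model-name.

```yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-chat-model:
      provider: ollama
      model-name: "llama3.1"
      # Illustrative values; placement under the model entry is an assumption.
      temperature: 0.2      # lower values give more focused output
      top-k: 40             # sample only from the 40 most likely tokens
      top-p: 0.9            # nucleus sampling probability mass
      repeat-penalty: 1.1   # values above 1.0 discourage repetition
      seed: 42              # fixed seed for the random number generator
      num-predict: 256      # length of the generated output
      stop:                 # generation stops at any of these sequences
        - "###"
```

A low temperature together with a fixed seed is a common choice when repeatable answers matter more than variety.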

OllamaEmbeddingModel

To automatically create OllamaEmbeddingModel and add it to the service registry, add the following lines to application.yaml:

yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-embedding-model:
      provider: ollama
      model-name: "nomic-embed-text"

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied from LangChain4j is used. |
| enabled | boolean | If set to false, the component will not be available even if configured. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | integer | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| timeout | duration | The timeout setting for API requests. See here for the format. |
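The request-handling keys can be sketched as follows for the embedding model. Two points are assumptions rather than facts taken from this page: that the keys are placed under the model entry, and that the timeout is written as an ISO 8601 duration such as PT30S.

```yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-embedding-model:
      provider: ollama
      model-name: "nomic-embed-text"
      # Illustrative values; placement under the model entry is an assumption.
      timeout: PT30S        # assumed ISO 8601 duration (30 seconds)
      max-retries: 3        # retry failed API requests up to three times
      log-requests: true    # log outgoing API requests
      log-responses: true   # log API responses
```

Request and response logging is usually most useful during development, since embedding payloads can be large.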

OllamaLanguageModel

To automatically create OllamaLanguageModel and add it to the service registry, add the following lines to application.yaml:

yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-language-model:
      provider: ollama
      model-name: "llama3.1"

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied from LangChain4j is used. |
| enabled | boolean | If set to false, the component will not be available even if configured. |
| format | string | Specifies the structure or style of the text produced by the model, such as plain text, JSON, or a custom format. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | integer | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| num-predict | int | Length of the output generated by the model. |
| repeat-penalty | double | The penalty applied to repeated tokens during text generation. Higher values discourage the model from generating the same token multiple times, promoting more varied and natural output. A value of 1.0 applies no penalty (default behavior), while values greater than 1.0 reduce the likelihood of repetition. Excessively high values may overly penalize common phrases, leading to unnatural results. |
| seed | int | The seed for the random number generator used by the model. |
| stop | string[] | List of sequences where the API will stop generating further tokens. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| timeout | duration | The timeout setting for API requests. See here for the format. |
| top-k | int | Limits the token pool to the topK highest-probability tokens, controlling the balance between deterministic and diverse outputs. A smaller topK (e.g., 1) results in deterministic output, while a larger value (e.g., 50) allows for more variability and creativity. |
| top-p | double | Nucleus sampling value, where the model considers the results of the tokens with top_p probability mass. |
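For the language model, the format key may be the most interesting one to try. The sketch below requests JSON output and caps the output length; the value "json" and the placement of the keys under the model entry are assumptions, not something this page confirms.

```yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-language-model:
      provider: ollama
      model-name: "llama3.1"
      # Illustrative values; placement under the model entry is an assumption.
      format: "json"     # assumed value asking the model for JSON output
      num-predict: 128   # limit the length of the generated output
```

When asking for JSON, it generally helps to also describe the expected structure in the prompt itself.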

OllamaStreamingChatModel

To automatically create OllamaStreamingChatModel and add it to the service registry, add the following lines to application.yaml:

yaml
langchain4j:
  providers:
    ollama:
      base-url: "http://localhost:11434"

  models:
    ollama-streaming-chat-model:
      provider: ollama
      model-name: "llama3.1"

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied from LangChain4j is used. |
| enabled | boolean | If set to false, the component will not be available even if configured. |
| format | string | Specifies the structure or style of the text produced by the model, such as plain text, JSON, or a custom format. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | integer | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| num-predict | int | Length of the output generated by the model. |
| repeat-penalty | double | The penalty applied to repeated tokens during text generation. Higher values discourage the model from generating the same token multiple times, promoting more varied and natural output. A value of 1.0 applies no penalty (default behavior), while values greater than 1.0 reduce the likelihood of repetition. Excessively high values may overly penalize common phrases, leading to unnatural results. |
| seed | int | The seed for the random number generator used by the model. |
| stop | string[] | List of sequences where the API will stop generating further tokens. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| timeout | duration | The timeout setting for API requests. See here for the format. |
| top-k | int | Limits the token pool to the topK highest-probability tokens, controlling the balance between deterministic and diverse outputs. A smaller topK (e.g., 1) results in deterministic output, while a larger value (e.g., 50) allows for more variability and creativity. |
| top-p | double | Nucleus sampling value, where the model considers the results of the tokens with top_p probability mass. |

Additional Information