Jlama

Overview

This module adds support for selected Jlama models.

Maven Coordinates

In addition to the Helidon integration with LangChain4J core dependencies, you must add the following:

```xml
<dependency>
    <groupId>io.helidon.integrations.langchain4j.providers</groupId>
    <artifactId>helidon-integrations-langchain4j-providers-jlama</artifactId>
</dependency>
```

Components

JlamaChatModel

To automatically create JlamaChatModel and add it to the service registry, add the following lines to application.yaml:

```yaml
langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-chat-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"
```

If enabled is set to false, the configuration is ignored, and the component is not created.
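
As an illustration of the enabled switch, the fragment below turns off the chat model from the example above while leaving its settings in place. The key itself is one of the documented properties; placing it on the model entry, next to model-name, is an assumption of this sketch.

```yaml
langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-chat-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"
      # Assumed placement: with enabled set to false, no JlamaChatModel is created
      enabled: false
```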

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| enabled | boolean | If set to false, the component will not be available even if configured. |
| model-name | string | The model name to use. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| working-quantized-type | enum | Quantize the model at runtime. Default quantization is Q4. |
| model-cache-path | Path | Path to a directory where the model will be cached once downloaded. |
| working-directory | Path | Path to a directory where persistent ChatMemory can be stored on disk for a given model instance. |
| auth-token | string | Token to use when fetching private models from Hugging Face. |
| max-tokens | integer | Maximum number of tokens to generate. |
| thread-count | integer | Number of threads to use. |
| quantize-model-at-runtime | boolean | Whether to quantize the model at runtime. |
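
A fuller configuration might combine several of the properties listed above. The sketch below is illustrative only: the values chosen for temperature, max-tokens, and thread-count and the cache directory are made-up examples, and putting the per-model keys on the model entry rather than on the provider is an assumption.

```yaml
langchain4j:
  providers:
    jlama:
      temperature: 0.7                       # within the documented 0 to 2 range

  models:
    jlama-chat-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"
      max-tokens: 256                        # hypothetical cap on generated tokens
      thread-count: 4                        # hypothetical; tune to the available cores
      model-cache-path: "/tmp/jlama-models"  # hypothetical cache directory
```

Setting model-cache-path to a stable directory should avoid downloading the model again on each start, since the model is cached there once downloaded.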

JlamaEmbeddingModel

To automatically create JlamaEmbeddingModel and add it to the service registry, add the following lines to application.yaml:

```yaml
langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-embedding-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"
```

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| enabled | boolean | If set to false, the component will not be available even if configured. |
| model-name | string | The model name to use. |
| model-cache-path | Path | Path to a directory where the model will be cached once downloaded. |
| working-directory | Path | Path to a directory where persistent ChatMemory can be stored on disk for a given model instance. |
| auth-token | string | Token to use when fetching private models from Hugging Face. |
| thread-count | integer | Number of threads to use. |
| pooling-type | enum | Method of embedding pooling. |
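
The embedding-specific pooling-type key can be set alongside the shared properties. In the sketch below, the value AVG is an assumption based on the pooling types Jlama defines (MODEL, AVG, SUM); the exact spelling and casing accepted by the configuration have not been verified here, and the thread-count value is a made-up example.

```yaml
langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-embedding-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"
      pooling-type: AVG      # assumed enum value: average the token embeddings
      thread-count: 2        # hypothetical value
```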

JlamaLanguageModel

To automatically create JlamaLanguageModel and add it to the service registry, add the following lines to application.yaml:

```yaml
langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-language-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"
```

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| enabled | boolean | If set to false, the component will not be available even if configured. |
| model-name | string | The model name to use. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| working-quantized-type | enum | Quantize the model at runtime. Default quantization is Q4. |
| model-cache-path | Path | Path to a directory where the model will be cached once downloaded. |
| working-directory | Path | Path to a directory where persistent ChatMemory can be stored on disk for a given model instance. |
| auth-token | string | Token to use when fetching private models from Hugging Face. |
| max-tokens | integer | Maximum number of tokens to generate. |
| thread-count | integer | Number of threads to use. |
| quantize-model-at-runtime | boolean | Whether to quantize the model at runtime. |

JlamaStreamingChatModel

To automatically create JlamaStreamingChatModel and add it to the service registry, add the following lines to application.yaml:

```yaml
langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-streaming-chat-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"
```

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

| Key | Type | Description |
|---|---|---|
| enabled | boolean | If set to false, the component will not be available even if configured. |
| model-name | string | The model name to use. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| working-quantized-type | enum | Quantize the model at runtime. Default quantization is Q4. |
| model-cache-path | Path | Path to a directory where the model will be cached once downloaded. |
| working-directory | Path | Path to a directory where persistent ChatMemory can be stored on disk for a given model instance. |
| auth-token | string | Token to use when fetching private models from Hugging Face. |
| max-tokens | integer | Maximum number of tokens to generate. |
| thread-count | integer | Number of threads to use. |
| quantize-model-at-runtime | boolean | Whether to quantize the model at runtime. |

Additional Information