Customization
Add an LLM, embedder, or document store that no built-in template covers, then wire it into the pipelines.
Most providers can be enabled by editing config.yaml alone. If you need a
provider that the LiteLLM layer does not already understand, you can add it
yourself by defining a small provider class and registering it in the
configuration. This page outlines that process.
1. Decide what you want to add
The Analytics Service builds its LLM, embedding, and document-store capabilities on top of the Haystack and LiteLLM frameworks. Together they cover a wide range of models and stores, so first check whether your target is already supported before writing anything new.
If you are adding an embedder, confirm that it is compatible with your document store. A vector store such as Qdrant creates each collection with a fixed vector size, so the embedder's output dimension must match the dimension the store is configured for.
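The compatibility check between embedder and store can be sketched as a small guard that runs before any documents are indexed. This is a minimal illustration, not the service's own code: `check_dimensions` and its arguments are hypothetical names, and it assumes you can obtain one sample vector from the embedder and know the vector size the store was configured with.

```python
from typing import Sequence


def check_dimensions(sample_vector: Sequence[float], store_dimension: int) -> None:
    """Raise if the embedder's output size differs from the store's vector size.

    Hypothetical helper: `sample_vector` is one embedding produced by the
    embedder under test; `store_dimension` is the size the document store
    was configured with.
    """
    actual = len(sample_vector)
    if actual != store_dimension:
        raise ValueError(
            f"Embedder returns {actual}-dimensional vectors, "
            f"but the document store expects {store_dimension}."
        )


# A 1536-dimensional vector passes against a store configured for 1536.
check_dimensions([0.0] * 1536, store_dimension=1536)
```

Running a check like this once at startup tends to surface a mismatch earlier than a failed insert or an empty search result would.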
2. Create a provider definition file
Place your new provider in the directory that matches its kind — llm,
embedder, or document_store:
src
|__ providers
    |__ llm
    |__ embedder
    |__ document_store

For example, to add a new LLM you would create a file such as mistral.py in
the llm package.
3. Implement the provider class
Inherit from the appropriate base class — LLMProvider, EmbedderProvider, or
DocumentStoreProvider — and implement the methods it requires. The existing
OpenAI provider is a useful reference:
OPENAI_API_BASE = "https://api.openai.com/v1"
GENERATION_MODEL_NAME = "gpt-4o-mini"
GENERATION_MODEL_KWARGS = {
    "temperature": 0,
    "n": 1,
    "max_tokens": 4096,
    "response_format": {"type": "json_object"},
}


@provider("openai_llm")
class OpenAILLMProvider(LLMProvider):
    def __init__(self, ...):
        ...

    def get_generator(self, ...):
        return AsyncGenerator(...)

Keep the following in mind:
- Implement the async versions of the methods to avoid blocking the service.
- The name you register with @provider(...) is the name you reference from config.yaml, so keep it consistent with the file name and its kind suffix (the reference provider above registers as openai_llm).
- Define your defaults (model name, endpoint, keyword arguments) in both the code and your environment configuration.
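The points above can be sketched end to end for the mistral.py example. This is a self-contained illustration under stated assumptions, not the service's implementation: the `provider` decorator, the `PROVIDERS` registry, and the `LLMProvider` base class below are hypothetical stand-ins for the real ones, the endpoint and model name are placeholders, and the generator fakes the network call instead of contacting a model.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict, Optional

# Hypothetical stand-ins for the service's registry and base class so the
# sketch runs on its own; in the real code base you would import them.
PROVIDERS: Dict[str, type] = {}


def provider(name: str) -> Callable[[type], type]:
    """Register a provider class under the name used in config.yaml."""

    def decorator(cls: type) -> type:
        PROVIDERS[name] = cls
        return cls

    return decorator


class LLMProvider(ABC):
    @abstractmethod
    def get_generator(self, **overrides: Any) -> Callable[[str], Awaitable[str]]:
        """Return an async callable that turns a prompt into a reply."""


# Defaults kept in code; mirror them in your environment configuration.
MISTRAL_API_BASE = "https://example.invalid/v1"  # placeholder, not a real endpoint
GENERATION_MODEL_NAME = "model_name"  # placeholder model name
GENERATION_MODEL_KWARGS = {"temperature": 0, "max_tokens": 4096}


@provider("mistral_llm")  # the same name you would reference from config.yaml
class MistralLLMProvider(LLMProvider):
    def __init__(
        self,
        model: str = GENERATION_MODEL_NAME,
        api_base: str = MISTRAL_API_BASE,
        kwargs: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.model = model
        self.api_base = api_base
        # Per-instance settings override the module-level defaults.
        self.kwargs = {**GENERATION_MODEL_KWARGS, **(kwargs or {})}

    def get_generator(self, **overrides: Any) -> Callable[[str], Awaitable[str]]:
        params = {**self.kwargs, **overrides}

        async def generate(prompt: str) -> str:
            # Stand-in for a non-blocking HTTP call to self.api_base.
            await asyncio.sleep(0)
            return f"[{self.model} temperature={params['temperature']}] {prompt}"

        return generate


# Look the class up by its registered name, as a config loader might.
llm = PROVIDERS["mistral_llm"]()
reply = asyncio.run(llm.get_generator()("ping"))
```

Because `generate` is a coroutine, the caller awaits it and the event loop stays free while the request is in flight, which is the reason for preferring the async methods.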
4. Register the provider in config.yaml
Once the class exists, declare it in config.yaml. Define a block for the LLM
and/or embedder, then reference them from the pipeline section.
LLM block:

type: llm
provider: custom_llm_name
models:
  - model: model_name
    kwargs:
      temperature: 0
      max_tokens: 4096
    api_base: api_endpoint

Embedder block:
type: embedder
provider: custom_embedder_name
models:
  - model: model_name
    dimension: 1536
    api_base: api_endpoint
    timeout: 30

Pipeline block:
type: pipeline
pipes:
  - name: pipeline_name
    llm: custom_llm_name.model_name
    embedder: custom_embedder_name.model_name

The configuration keys you use here must match the parameter names in your provider's constructor. If a value is ignored, a name mismatch between the YAML and the class signature is the usual cause.
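One way to catch such a mismatch early is to compare the keys of a model entry with the constructor's signature. The sketch below is illustrative only: `CustomLLMProvider` and `unknown_keys` are hypothetical names, the model entry is written as a plain dict rather than loaded from config.yaml, and it assumes the entry's keys are passed to the constructor as keyword arguments.

```python
import inspect
from typing import Any, Dict, List, Optional


class CustomLLMProvider:
    """Hypothetical provider whose constructor mirrors the LLM block."""

    def __init__(
        self,
        model: str,
        api_base: str,
        kwargs: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.model = model
        self.api_base = api_base
        self.kwargs = kwargs or {}


def unknown_keys(provider_cls: type, model_config: Dict[str, Any]) -> List[str]:
    """Return config keys that the provider's constructor does not accept."""
    params = inspect.signature(provider_cls.__init__).parameters
    # A constructor that declares **extra accepts any key, so nothing to report.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(key for key in model_config if key not in params)


# "kwarg" is a deliberate typo for "kwargs" to show what gets reported.
model_config = {
    "model": "model_name",
    "api_base": "api_endpoint",
    "kwarg": {"temperature": 0},
}
mismatches = unknown_keys(CustomLLMProvider, model_config)
```

Here `mismatches` contains only the misspelled key, which points directly at the line in the YAML that needs fixing.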
