Installation & Setup
How do I configure Azure OpenAI?
For a full walkthrough, see the Installation guide.

- Go to Azure AI Foundry → My Assets → Models + endpoints
- Select your deployed model, then open the Code tab
- Copy these values from the code example:
  - `azure_endpoint` → use as Base URL (e.g. `https://yourname.cognitiveservices.azure.com/`)
  - `api_version` → use as API Version (e.g. `2024-12-01-preview`)
  - API key shown in the example → use as API Key
- In HA, set API Provider to `Azure OpenAI` and paste the values above
- Enable Skip Authentication
- In the assistant options, enter the deployment name as the model name

Azure API Management domains (`azure-api.net`, `services.ai.azure.com`) are also supported in addition to `cognitiveservices.azure.com`.

To debug connectivity issues, add `openai: debug` to your HA logger config to see the exact URL being called.
Can I use Ollama or other local LLMs?
For a full walkthrough, see the Installation guide.

Yes. Use the `/v1` endpoint and enable Skip Authentication:

| Field | Value |
|---|---|
| API Provider | OpenAI |
| Base URL | `http://<HOST>:11434/v1` |
| API Key | Any value (e.g. `ollama`) |
| Skip Authentication | On |

Function calling support: Many local models do not reliably call tools. If the assistant verbally acknowledges your request but nothing actually happens, use a model with explicit tool-use support such as `qwen2.5:14b` or `llama3.1:8b`, and add an explicit tool-use instruction to your system prompt.

Device Control
The assistant talks about actions but never executes them
Several things can cause this:
- Entities not exposed — Go to Settings → Voice Assistants → Expose entities and make sure the relevant entities are toggled on.
- Model doesn’t support function/tool calling — Local models especially may not emit structured tool calls. Try other models, or add explicit instructions to the system prompt.
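For the second point, an explicit instruction in the system prompt can help. The wording below is a suggestion, not text from the integration:

```text
When the user asks you to control a device, respond with a function
call only. Never describe the action in plain text without calling
a function. If no available function can perform the action, say so.
```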
The AI can't find my entity or uses the wrong service name
- Entity not exposed: Confirm the entity is enabled under Settings → Voice Assistants → Expose entities.
- Wrong service name: LLMs sometimes invent service names. There are two ways to guide the model toward the correct service:
  - Option 1 — Text instruction in the system prompt.
  - Option 2 — Provide a concrete example in the system prompt. Showing the AI a full working example is often more effective than a text rule alone. For instance, show it how to start a timer correctly; the model will follow the pattern from the example rather than guessing the service name.
- Missing area context: Help the AI understand room assignments by adding `area_name()` to the entities CSV in your prompt.
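Two illustrative snippets for the points above. These are sketches: the `execute_services` function name, the `exposed_entities` prompt variable, and the entity IDs are assumptions about this integration's defaults, not confirmed API. A concrete example for starting a timer could be added to the system prompt like this:

```text
Example: to start a 10-minute kitchen timer, call execute_services with:
{"list": [{"domain": "timer", "service": "start",
  "service_data": {"entity_id": "timer.kitchen", "duration": "00:10:00"}}]}
```

And a CSV template that adds each entity's area via Home Assistant's built-in `area_name()` template function:

```text
entity_id,name,state,area
{% for entity in exposed_entities -%}
{{ entity.entity_id }},{{ entity.name }},{{ entity.state }},{{ area_name(entity.entity_id) }}
{% endfor -%}
```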
I get a 'max_tokens is not supported with this model' error
Newer OpenAI models (o1, o3, gpt-4.5, gpt-5, etc.) replaced `max_tokens` with `max_completion_tokens`. This was fixed in v2.0.0. Upgrade via HACS.
I get a context length exceeded error
You have too many tokens in the request. Options:
- Reduce exposed entities — Go to Settings → Voice Assistants → Expose entities and only expose what the AI actually needs to control.
- Filter function output — If a function (e.g. weather forecast) returns large data, use a template to filter the result before returning it to the model.
- Reduce the number of functions — Each function definition (its name, description, and parameter schema) is included in every request and counts toward the token limit. Remove functions you don’t actively use.
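As a sketch of the output-filtering idea, a template can trim a large payload before it reaches the model. The surrounding function config and the `result` variable name here are hypothetical; adapt this to however your function returns data:

```text
{# Hypothetical Jinja filter: return only the next three forecast
   entries instead of the full payload. #}
{{ result.forecast[:3] }}
```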
Advanced Usage
How do I give the assistant web search access?
The OpenAI API does not include browsing by default. Add a web search function using the Google Custom Search API. Also update your system prompt to allow the assistant to answer general questions beyond home control.
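A sketch of what such a function definition could look like. The `spec`/`function` layout and the `type: rest`, `resource_template`, and `value_template` keys reflect our understanding of this integration's function format; verify against the project README. The Google endpoint and its `key`, `cx`, and `q` parameters belong to the real Custom Search JSON API, while `YOUR_API_KEY` and `YOUR_CX` are placeholders:

```yaml
# Hypothetical web-search function definition; verify the schema
# against the integration's documentation before using.
- spec:
    name: search_web
    description: Search the web and return the top results.
    parameters:
      type: object
      properties:
        query:
          type: string
          description: The search query.
      required:
        - query
  function:
    type: rest
    resource_template: "https://www.googleapis.com/customsearch/v1?key=YOUR_API_KEY&cx=YOUR_CX&q={{ query }}"
    value_template: "{{ value_json['items'][:3] }}"
```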
Local LLM echoes my prompt or outputs weird tokens like <|recipient|>
This means the local model does not properly support OpenAI-compatible function/tool calling. It is outputting raw chat template tokens instead of structured tool calls.

Solutions:
- Switch to a model with explicit tool-use support
- For basic chat only (no device control), remove all functions from the configuration.
Debugging
How do I enable debug logging?
Add a `logger` entry to your `configuration.yaml` that sets the integration's log level to debug. To also see the raw payloads sent to the LLM, raise the log level further.

In the HA Logs UI (Settings → System → Logs), click Load Full Logs; the filtered view at the top only shows errors by default.

Don’t see your question here? Search the GitHub Issues or open a new one.
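The logger entry referenced in the debugging section might look like this. This is a sketch: the exact logger namespace for the integration is an assumption, so check the component's domain if debug output does not appear:

```yaml
# Sketch: enable debug logging for the integration in configuration.yaml.
# Adjust the logger key if the component logs under a different namespace.
logger:
  default: warning
  logs:
    custom_components.openai: debug
```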