# Using OpenAI Plugin Envs for Any OpenAI-Compatible Provider
The `plugin-openai` package in this project can connect to any OpenAI-compatible API provider, not just OpenAI itself! Thanks to flexible environment variable support, you can swap between providers like Azure, OpenRouter, or even your own local LLM with just a few tweaks.
## 🤔 What is an OpenAI-Compatible Provider?
An OpenAI-compatible provider is any service that implements the OpenAI API spec. Popular examples include:
- OpenRouter
- Ollama (via its OpenAI-compatible API, which also supports embeddings)
- Local LLMs with an OpenAI API wrapper
- Other cloud or self-hosted endpoints
> **Note:** If your provider supports the OpenAI API, this plugin can probably talk to it!
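To make the idea concrete, here is a minimal sketch of what "OpenAI-compatible" means in practice, using the official `openai` npm package rather than the plugin itself: only the base URL and key change between providers. The URL and model name below are placeholders, not values the plugin prescribes.

```ts
import OpenAI from "openai";

// "OpenAI-compatible" in practice: the official SDK works with any such
// provider once you override the base URL. URL and model are placeholders.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY ?? "sk-local-demo",
  baseURL: process.env.OPENAI_BASE_URL ?? "http://localhost:11434/v1",
});

const response = await client.chat.completions.create({
  model: "llama2", // placeholder; use a model your provider actually serves
  messages: [{ role: "user", content: "Say hello in one sentence." }],
});

console.log(response.choices[0].message.content);
```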
## 🛠️ Key Environment Variables
The following environment variables are supported by the OpenAI plugin:
| Variable | Purpose |
| --- | --- |
| `OPENAI_API_KEY` | The API key for authentication (required) |
| `OPENAI_BASE_URL` | The base URL for the API (override to use other providers) |
| `OPENAI_SMALL_MODEL` | Default small model name |
| `OPENAI_LARGE_MODEL` | Default large model name |
| `OPENAI_EMBEDDING_MODEL` | Embedding model name |
| `OPENAI_EMBEDDING_URL` | Base URL specifically for the embedding API endpoint |
| `OPENAI_EMBEDDING_DIMENSIONS` | Embedding vector dimensions |
| `SMALL_MODEL` | Fallback for `OPENAI_SMALL_MODEL` |
| `LARGE_MODEL` | Fallback for `OPENAI_LARGE_MODEL` |
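The `SMALL_MODEL`/`LARGE_MODEL` entries act as fallbacks, meaning the `OPENAI_`-prefixed variable wins when both are set. A rough sketch of that resolution order (the hard-coded defaults here are illustrative, not the plugin's actual defaults):

```ts
// Illustrative resolution order for the model-name variables; the hard-coded
// defaults here are placeholders, not the plugin's actual defaults.
function resolveModel(primary: string, fallback: string, defaultModel: string): string {
  return process.env[primary] ?? process.env[fallback] ?? defaultModel;
}

const smallModel = resolveModel("OPENAI_SMALL_MODEL", "SMALL_MODEL", "gpt-4o-mini");
const largeModel = resolveModel("OPENAI_LARGE_MODEL", "LARGE_MODEL", "gpt-4o");

console.log({ smallModel, largeModel });
```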
### Example: Connecting to OpenRouter
```env
OPENAI_API_KEY=your-openrouter-key
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_SMALL_MODEL=openrouter/gpt-3.5-turbo
OPENAI_LARGE_MODEL=openrouter/gpt-4
```
> **Warning:** OpenRouter does not currently support the `/v1/embeddings` endpoint. If you need embeddings, you must use a different provider for them. See the Handling Providers Without Embedding Support section below.
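If you are unsure whether a provider exposes `/v1/embeddings`, a quick probe like the following will tell you. This is a sketch using the official `openai` package; the embedding model name is a placeholder, and providers without the endpoint typically return a 404.

```ts
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
  baseURL: process.env.OPENAI_BASE_URL, // e.g. https://openrouter.ai/api/v1
});

try {
  const res = await client.embeddings.create({
    model: "text-embedding-3-small", // placeholder; pick a model your provider serves
    input: "probe",
  });
  console.log(`Embeddings OK: ${res.data[0].embedding.length} dimensions`);
} catch (err) {
  if (err instanceof OpenAI.APIError) {
    // Providers without /v1/embeddings typically return a 404 here.
    console.log(`No embedding support (HTTP ${err.status}); set OPENAI_EMBEDDING_URL to another provider.`);
  } else {
    throw err;
  }
}
```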
### Example: Connecting to Ollama
```env
OPENAI_API_KEY=ollama-local-demo
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_SMALL_MODEL=llama2
OPENAI_LARGE_MODEL=llama2:70b
```
### Example: Connecting to a Local LLM (llama.cpp)
```env
OPENAI_API_KEY=sk-local-demo
OPENAI_BASE_URL=http://localhost:8000/v1
OPENAI_SMALL_MODEL=llama-2-7b-chat
OPENAI_LARGE_MODEL=llama-2-13b-chat
```
### Example: Connecting to LM Studio
LM Studio is a popular desktop app for running large language models locally. It provides an OpenAI-compatible API server, so you can use it as a drop-in replacement for OpenAI or other providers.
To use LM Studio with the OpenAI plugin, start the LM Studio API server (default port 1234) and set your environment variables as follows:
```env
OPENAI_API_KEY=lmstudio-local-demo   # can be any non-empty string
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_SMALL_MODEL=your-model-name-here
OPENAI_LARGE_MODEL=your-model-name-here
```
- Make sure to use the model identifier exactly as listed in LM Studio for the `OPENAI_SMALL_MODEL` and `OPENAI_LARGE_MODEL` values (see the sketch after this list for one way to query them).
- LM Studio supports the `/v1/models`, `/v1/chat/completions`, `/v1/embeddings`, and `/v1/completions` endpoints.
- You can reuse any OpenAI-compatible SDK by pointing the base URL to `http://localhost:1234/v1`.
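Since LM Studio expects the exact model identifier it displays, you can also fetch the list programmatically via `/v1/models`. A minimal sketch with the official `openai` package (the API key value is arbitrary for LM Studio):

```ts
import OpenAI from "openai";

// LM Studio doesn't check the key, but the SDK requires a non-empty string.
const client = new OpenAI({
  apiKey: "lmstudio-local-demo",
  baseURL: "http://localhost:1234/v1",
});

// Prints the model identifiers LM Studio currently serves; use one of these
// for OPENAI_SMALL_MODEL / OPENAI_LARGE_MODEL.
const models = await client.models.list();
for (const model of models.data) {
  console.log(model.id);
}
```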
For more details, see the LM Studio OpenAI Compatibility API docs.
## Handling Providers Without Embedding Support
Some OpenAI-compatible providers (like OpenRouter) might not offer an embedding endpoint (`/v1/embeddings`). If you need embedding functionality (e.g., for memory or context retrieval), you can configure the OpenAI plugin to use a different provider specifically for embeddings using the `OPENAI_EMBEDDING_URL` environment variable.
### Example: OpenRouter for Chat, Ollama for Embeddings
Let's say you want OpenRouter's wide model selection for chat completions but prefer Ollama for embeddings (see the Ollama OpenAI docs).
```env
# General API settings (points to OpenRouter)
OPENAI_API_KEY=your-openrouter-key
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_SMALL_MODEL=openrouter/gpt-3.5-turbo
OPENAI_LARGE_MODEL=openrouter/gpt-4

# Embedding-specific settings (points to Ollama)
OPENAI_EMBEDDING_URL=http://localhost:11434/v1   # Your Ollama embedding endpoint
OPENAI_EMBEDDING_MODEL=all-minilm                # Ollama embedding model (e.g., all-minilm)
# OPENAI_EMBEDDING_DIMENSIONS=1536               # Optional: specify if needed for your model
```
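Under the hood, this split amounts to using two endpoints side by side. Here is a sketch of the same arrangement with the official `openai` package rather than the plugin; model names mirror the config above, and Ollama ignores the API key value (the SDK just requires a non-empty string):

```ts
import OpenAI from "openai";

// Chat requests go to OpenRouter...
const chat = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!, // your OpenRouter key
  baseURL: "https://openrouter.ai/api/v1",
});

// ...while embeddings go to Ollama, which ignores the key value.
const embed = new OpenAI({
  apiKey: "ollama",
  baseURL: "http://localhost:11434/v1",
});

const reply = await chat.chat.completions.create({
  model: "openrouter/gpt-4",
  messages: [{ role: "user", content: "In one line, why do base URLs matter?" }],
});

const vector = await embed.embeddings.create({
  model: "all-minilm",
  input: reply.choices[0].message.content ?? "",
});

console.log(`Got a ${vector.data[0].embedding.length}-dimensional embedding`);
```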