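AI DIAL Adapter is one of the AI tools featured in this edition of AI Skill Hub. It has an overall score of 8.0, indicating high overall quality. We strongly recommend adding it to your AI toolbox to improve your productivity.
Implements the AI DIAL API for Azure OpenAI language models
AI DIAL Adapter is an open-source tool written in Python, focused on AI, DIAL, and OpenAI. As an open-source project on GitHub, it has active community support and continuous releases; its code is fully transparent and auditable, and it supports local deployment to protect data privacy. Whether for personal use or for integration into enterprise workflows, it offers a stable and reliable solution.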
# Option 1: install with pip (recommended)
pip install ai-dial-adapter-openai
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install ai-dial-adapter-openai
# Option 3: install from source (to get the latest features)
git clone https://github.com/epam/ai-dial-adapter-openai
cd ai-dial-adapter-openai
pip install -e .
# Verify the installation
python -c "import ai_dial_adapter_openai; print('Installed successfully')"
# Command-line usage
ai-dial-adapter-openai --help
# Basic usage
ai-dial-adapter-openai input_file -o output_file
# Calling from Python code
import ai_dial_adapter_openai
# Example
result = ai_dial_adapter_openai.process("input")
print(result)
# ai-dial-adapter-openai configuration file example (config.yml)
app:
  name: "ai-dial-adapter-openai"
  debug: false
  log_level: "INFO"
# Specify the configuration file at run time
ai-dial-adapter-openai --config config.yml
# Or configure via environment variables
export AI_DIAL_ADAPTER_OPENAI_API_KEY="your-key"
export AI_DIAL_ADAPTER_OPENAI_OUTPUT_DIR="./output"
---
LLM Adapters unify the APIs of respective LLMs to align with the Unified Protocol of DIAL Core. Each Adapter operates within a dedicated container. Multi-modality allows supporting non-textual communications such as image-to-text, text-to-image, file transfers and more.
The project implements AI DIAL API for language models from Azure OpenAI.
---
The adapter is able to convert certain upstream APIs to the DIAL Chat Completions API (which is an extension of Azure OpenAI Chat Completions API).
Chat Completions deployments are exposed via the endpoint:
POST ${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions
Since: ai-dial-adapter-openai:0.38.0 AND ai-dial-core:0.42.0
The adapter is able to proxy requests to models supporting Responses API.
The following Responses API endpoints are exposed by the adapter:
POST ${ADAPTER_ORIGIN}/openai/v1/responses
Current limitations:
- The GET /responses/{response_id} endpoint isn't supported yet.

The adapter is able to convert certain upstream APIs to the DIAL Embeddings API (which is an extension of Azure OpenAI Embeddings API).
Embeddings deployments are exposed via the endpoint:
POST ${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings
The following variables group deployments that share the same API and the same tokenization algorithm.
| Variable | Default | Description |
|---|---|---|
| DALLE3_DEPLOYMENTS | (empty) | Comma-separated list of deployments that support DALL-E 3 API. Example: `dall-e-3,dalle3,dall-e` |
| DALLE3_AZURE_API_VERSION | 2024-02-01 | The API version for requests to the Azure DALL·E 3 API |
| GPT_IMAGE_1_DEPLOYMENTS | (empty) | Comma-separated list of deployments that support GPT-Image 1 API. Example: `gpt-image-1` |
| GPT_IMAGE_1_AZURE_API_VERSION | 2024-02-01 | The API version for requests to the Azure GPT-Image 1 API |
| MISTRAL_DEPLOYMENTS | (empty) | Comma-separated list of deployments that support Mistral Large Azure API. Example: `mistral-large-azure,mistral-large` |
| DATABRICKS_DEPLOYMENTS | (empty) | Comma-separated list of Databricks chat completion deployments. Example: `databricks-dbrx-instruct,databricks-mixtral-8x7b-instruct,databricks-llama-2-70b-chat` |
| GPT4O_DEPLOYMENTS | (empty) | Comma-separated list of GPT-4o chat completion deployments. Example: `gpt-4o-2024-05-13` |
| GPT4O_MINI_DEPLOYMENTS | (empty) | Comma-separated list of GPT-4o mini chat completion deployments. Example: `gpt-4o-mini-2024-07-18` |
| VLLM_DEPLOYMENTS | (empty) | Comma-separated list of deployments that use a vLLM OpenAI-compatible upstream. Example: `vllm-llama3,vllm-qwen2` |
| QWEN3_ASR_VLLM_DEPLOYMENTS | (empty) | Comma-separated list of [Qwen3-ASR deployments](#qwen3-asr) served via vLLM. Example: `qwen3-asr` |
| AZURE_AI_VISION_DEPLOYMENTS | (empty) | Comma-separated list of Azure AI Vision embedding deployments. The endpoint of the deployment is expected to point to the Azure service: `https://<service-name>.cognitiveservices.azure.com/` |
| AUDIO_AZURE_API_VERSION | 2025-03-01-preview | The API version for requests to the [Azure Audio API](#azure-audio-api) endpoints. |
Deployments that do not fall into any of the categories are considered to support text-to-text chat completion OpenAI API or text embeddings OpenAI API.
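The grouping rule can be illustrated with a minimal sketch. It is not the adapter's actual code: the function names `parse_deployments` and `categorize` are hypothetical, and returning the first matching variable is an assumption made for the example.

```python
import os

# Category variables named in the table; the lookup order is an assumption.
CATEGORY_VARIABLES = [
    "DALLE3_DEPLOYMENTS",
    "GPT_IMAGE_1_DEPLOYMENTS",
    "MISTRAL_DEPLOYMENTS",
    "DATABRICKS_DEPLOYMENTS",
    "GPT4O_DEPLOYMENTS",
    "GPT4O_MINI_DEPLOYMENTS",
    "VLLM_DEPLOYMENTS",
    "QWEN3_ASR_VLLM_DEPLOYMENTS",
    "AZURE_AI_VISION_DEPLOYMENTS",
]


def parse_deployments(value):
    """Split a comma-separated list, dropping blanks and surrounding spaces."""
    return {item.strip() for item in value.split(",") if item.strip()}


def categorize(deployment_id, env=None):
    """Return the first category variable that lists the deployment, else None."""
    env = os.environ if env is None else env
    for variable in CATEGORY_VARIABLES:
        if deployment_id in parse_deployments(env.get(variable, "")):
            return variable
    # Not listed anywhere: treated as plain text-to-text chat or text embeddings.
    return None
```

With `DALLE3_DEPLOYMENTS="dall-e-3,dalle3"`, calling `categorize("dalle3")` returns `"DALLE3_DEPLOYMENTS"`, while an unlisted id returns `None`.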
Copy .env.example to .env and customize it for your environment.
Certain models support configuration via the $ADAPTER_ORIGIN/openai/deployments/$DEPLOYMENT_NAME/configuration endpoint.
GET request to this endpoint returns the schema of the model configuration in JSON Schema format.
Such models expect the custom_fields.configuration field of the chat/completions request to contain a JSON value conforming to that schema. The custom_fields.configuration field is optional if and only if every field in the schema is also optional.
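The optionality rule can be checked on the client side. The sketch below assumes the schema is a flat JSON Schema object whose mandatory fields are listed under `required`; the helper name and the sample schemas are hypothetical.

```python
def configuration_is_optional(schema):
    """True when the JSON Schema object declares no required fields."""
    return len(schema.get("required", [])) == 0


# Hypothetical schemas, shaped like what the configuration endpoint might return.
all_optional = {
    "type": "object",
    "properties": {"reasoning": {"type": "object"}},
}
with_required = {
    "type": "object",
    "properties": {"size": {"type": "string"}},
    "required": ["size"],
}
```

Here `configuration_is_optional(all_optional)` is true, so a request may omit `custom_fields.configuration`, whereas `with_required` forces the client to send it.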
The configuration can be preset in the DIAL Core config via the defaults parameter:
<details><summary>DIAL Core Config</summary>
{
"models": {
"my-deployment-id": {
"type": "chat",
"endpoint": "$ADAPTER_ORIGIN/openai/deployments/my-deployment-id/chat/completions",
"upstreams": [
{
"endpoint": "$AZURE_OPENAI_SERVICE_ORIGIN/openai/deployments/openai-deployment-id/chat/completions"
}
],
"defaults": {
"custom_fields": {
"configuration": $MODEL_CONFIGURATION_OBJECT
}
}
}
}
}
</details>
This is convenient when major model features can be enabled via configuration (e.g., web search or reasoning) and you want a deployment where these features are permanently enabled.
DIAL Core will enrich requests with the configuration specified in defaults, so the client doesn’t need to provide it with each chat completion request.
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/deployments/${AZURE_OPENAI_DEPLOYMENT_ID}/chat/completions",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
There are three free variables in the config related to deployment ids. Each of these variables corresponds to an HTTP request initiated by the DIAL client:
- DIAL_DEPLOYMENT_ID - the deployment id visible to the DIAL Client via DIAL deployment listing. The client uses this id to call the model by sending the request POST ${DIAL_CORE_ORIGIN}/openai/deployments/${DIAL_DEPLOYMENT_ID}/chat/completions
- ADAPTER_DEPLOYMENT_ID - the deployment id the OpenAI adapter receives when DIAL Core calls POST ${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions. Use this identifier in environment variables that define deployment categories.
- AZURE_OPENAI_DEPLOYMENT_ID - the Azure OpenAI deployment called by the OpenAI adapter.

Typically these three variables share the same value (the Azure OpenAI deployment name). They may differ if you expose multiple DIAL deployments that call the same Azure OpenAI endpoint but are configured differently.
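To make the role of each id concrete, the following sketch generates the DIAL Core entry from the three ids. The function `build_model_config` and its parameter names are hypothetical; the endpoint shapes are the ones shown in the config.

```python
def build_model_config(dial_id, adapter_id, azure_id,
                       adapter_origin, azure_service, api_key=None):
    """Build a DIAL Core 'models' entry that wires the three deployment ids."""
    upstream = {
        "endpoint": (
            f"https://{azure_service}.openai.azure.com"
            f"/openai/deployments/{azure_id}/chat/completions"
        )
    }
    if api_key is not None:  # the key is optional in the upstream config
        upstream["key"] = api_key
    return {
        "models": {
            dial_id: {
                "type": "chat",
                "endpoint": (
                    f"{adapter_origin}/openai/deployments/{adapter_id}"
                    "/chat/completions"
                ),
                "upstreams": [upstream],
            }
        }
    }
```

Passing the same value for all three ids reproduces the typical setup; passing different `dial_id` values with the same `azure_id` gives several DIAL deployments backed by one Azure deployment.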
The DefaultAzureCredential is used to authenticate requests to Azure when an API key is not provided in the upstream configuration.
The Next generation API (aka v1 API) doesn't include the deployment id in the URL:
- Deployment-scoped API: POST https://SERVICE_NAME.openai.azure.com/openai/deployments/gpt-4o/chat/completions
- v1 API: POST https://SERVICE_NAME.openai.azure.com/openai/v1/chat/completions

The DIAL configuration changes accordingly:
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/chat/completions",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
Because the deployment ID is not included in the upstream URL, specify it in the overrideName field. If this field is missing, the model name takes the value of the model field from the original chat completion request (if present), otherwise ${ADAPTER_DEPLOYMENT_ID}.
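The fallback order for the upstream model name can be written as a short sketch. The function is hypothetical and only mirrors the order stated above; how `overrideName` actually reaches the adapter is not covered here.

```python
def resolve_model_name(override_name, request, adapter_deployment_id):
    """Pick the upstream model name using the documented fallback order."""
    if override_name:            # 1. overrideName from the DIAL Core config
        return override_name
    model = request.get("model")
    if model:                    # 2. "model" field of the original request
        return model
    return adapter_deployment_id  # 3. the adapter deployment id
```

For example, with no `overrideName` and a request body of `{"model": "gpt-4o"}`, the resolved name is `gpt-4o`; with neither present, it is the adapter deployment id.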
Certain advanced features of OpenAI models, such as reasoning summary, are only accessible via Responses API and not accessible via Chat Completions API.
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/responses",
"key": "${API_KEY}"
}
]
}
}
}
</details>
As in other cases where the upstream URL omits a deployment id, specify it in the overrideName field.
The previous generation of the API is also supported via URLs in the following format:
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/responses"
Deployments backed by the Azure OpenAI Responses API support the Web Search tool, which can be enabled by passing a static function called web_search as one of the tools:
<details> <summary>Example request</summary>
{
"model": "upstream-model-name",
"messages": [
{
"role": "user",
"content": "What is the weather in Kyiv now? Include source links."
}
],
"tools": [
{
"type": "static_function",
"static_function": {
"name": "web_search",
"configuration": {
"search_context_size": "high"
}
}
}
],
"stream": true
}
</details>
Each Web Search tool call is translated into a DIAL stage, and URL citations are mirrored as DIAL attachments:
<details> <summary>Example response:</summary>
{
"choices": [
{
"message": {
"content": "Kyiv weather is mild.",
"custom_content": {
"stages": [
{
"name": "Web Search",
"status": "completed",
"content": "Search 'weather Kyiv'"
}
],
"attachments": [
{
"type": "text/markdown",
"title": "Kyiv weather source",
"url": "https://example.com/weather/kyiv"
}
]
}
}
}
]
}
</details>
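A client can read the search stages and citations back out of such a response. The sketch below assumes the non-streaming response shape from the example; the helper name `summarize_web_search` is hypothetical.

```python
def summarize_web_search(response):
    """Collect completed stage names and attachment URLs for each choice."""
    results = []
    for choice in response.get("choices", []):
        custom = choice.get("message", {}).get("custom_content", {})
        stages = [
            stage.get("name")
            for stage in custom.get("stages", [])
            if stage.get("status") == "completed"
        ]
        urls = [
            attachment["url"]
            for attachment in custom.get("attachments", [])
            if "url" in attachment
        ]
        results.append({"stages": stages, "urls": urls})
    return results
```

Applied to the example response, this yields one entry with the stage `Web Search` and the single citation URL.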
Certain models, such as gpt-oss-120b or Mistral-Large-2411, can only be deployed to an Azure AI Foundry service. They are accessible via either of the following endpoints:
<details><summary>DIAL Core Config (Azure AI model inference endpoint)</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${AZURE_AI_FOUNDRY_DEPLOYMENT_ID}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME}.services.ai.azure.com/models/chat/completions",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
<details><summary>DIAL Core Config (Azure OpenAI endpoint)</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${AZURE_AI_FOUNDRY_DEPLOYMENT_ID}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME}.openai.azure.com/openai/deployments/gpt-oss-120b/chat/completions",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/deployments/${AZURE_OPENAI_DEPLOYMENT_ID}/images/generations",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
The supported upstream models are dall-e-3 and gpt-image-1. These are the values that AZURE_OPENAI_DEPLOYMENT_ID variable can take.
[!IMPORTANT] The DALL·E 3 adapter deployment must be declared in the DALLE3_DEPLOYMENTS env variable, and the GPT-Image 1 deployment in GPT_IMAGE_1_DEPLOYMENTS.
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "sora",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/sora/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/video/generations",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
The video generation models support configuration via the custom_fields.configuration field in the chat completion request:
{
"model": "sora",
"messages": [
{
"role": "user",
"content": "A cat playing with a ball of yarn"
}
],
"custom_fields": {
"configuration": {
"width": 480,
"height": 480,
"n_seconds": 5,
"n_variants": 1
}
}
}
Width and height default to 480x480 if not specified.
Find the details in the Azure API specification.
[!NOTE] n_variants>1 results in multiple video attachments to a single chat completion choice.
[!IMPORTANT] Prompt tokens in the usage are set to zero. Completion tokens are set to the overall number of seconds in the generated video(s).
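The usage rule can be expressed as a small calculation. This sketch assumes every variant has the requested duration `n_seconds`, and the `total_tokens` field is added for illustration only; the function name is hypothetical.

```python
def video_usage(n_seconds, n_variants=1):
    """Usage for a video request: zero prompt tokens, seconds as completion tokens."""
    completion_tokens = n_seconds * n_variants  # overall seconds across all videos
    return {
        "prompt_tokens": 0,
        "completion_tokens": completion_tokens,
        "total_tokens": completion_tokens,  # assumed to be the plain sum
    }
```

Under these assumptions, two 5-second variants are reported as 10 completion tokens.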
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "sora-2",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/sora-2/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/videos",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
The Sora 2 deployment works in either of the following modes: from a text prompt alone, or from a text prompt with a reference image.
<details> <summary>Chat completion request</summary>
{
"model": "sora-2",
"messages": [
{
"role": "system",
"content": "A system message that will be ignored"
},
{
"role": "user",
"content": "A cat playing with a ball of yarn"
}
]
}
</details>
<details> <summary>Chat completion request</summary>
{
"model": "sora-2",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Animate the image"},
{"type": "image_url", "image_url": {"url": "http://example.com/image.jpg"}}
]
}
]
}
</details>
Video remixing (video-to-video generation) isn't supported.
The Sora 2 deployment supports configuration via the custom_fields.configuration field in the chat completion request:
{
"model": "sora-2",
"messages": [
{
"role": "user",
"content": "A cat playing with a ball of yarn"
}
],
"custom_fields": {
"configuration": {
"seconds": 4,
"size": "720x1280",
"auto_crop_reference_images": true
}
}
}
If not specified, the size defaults to 720x1280 and the duration defaults to 4 seconds.
The auto cropping flag enables cropping of the input reference image to the output video size. It can be useful, since Sora 2 rejects any requests where the resolution of the source image and final video do not match. The flag defaults to False.
Find the details in the Azure Sora 2 API specification.
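The documented defaults can be captured in a small data class. This is a sketch of how a client might compute the effective configuration, not the adapter's own model; the class and function names are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Sora2Configuration:
    seconds: int = 4                          # default duration
    size: str = "720x1280"                    # default output size
    auto_crop_reference_images: bool = False  # default for the cropping flag


def effective_configuration(request):
    """Merge custom_fields.configuration from the request with the defaults."""
    provided = request.get("custom_fields", {}).get("configuration", {})
    # Unknown keys raise TypeError here; a real service may be more lenient.
    return asdict(Sora2Configuration(**provided))
```

A request that sets only `"seconds": 8` therefore resolves to an 8-second video at 720x1280 with auto cropping disabled.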
[!IMPORTANT] Prompt tokens in the usage are set to zero. Completion tokens are set to the overall number of seconds in the generated video(s).
The adapter supports models connected via Azure Audio API.
Set AZURE_DEPLOYMENT_ID variable to one of the text-to-speech models supported by Azure Audio API:
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${AZURE_AUDIO_API_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_SERVICE_NAME}.(openai|cognitiveservices).azure.com/openai/deployments/${AZURE_DEPLOYMENT_ID}/audio/speech",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
At the moment of writing, these are: tts, tts-hd, and gpt-4o-mini-tts.
The adapter takes the last user message as a text prompt and sends it to the upstream as the input parameter. The input text is limited to 4096 characters. The upstream model converts the text into speech audio, and the audio file is returned as an attachment in the chat completion response.
System instructions are used to set the tone of the synthesized speech.
The adapter supports the following configuration for the TTS models:
{
"instruction": "Speak in a cheerful tone.", # optional, sets the tone; appended to the system message from the chat completion request
"voice": "alloy", # one of the preset voices
"speed": 1.0, # speech speed multiplier
"response_format": "mp3" # one of the supported audio formats
}
Find the configuration details in the Azure specification or in the OpenAI Platform specification.
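The request handling for text-to-speech can be sketched as follows. The sketch assumes message content is a plain string, that the most recent message with the user role is the one used, and that system text and the `instruction` field are joined with a newline; both function names are hypothetical.

```python
MAX_INPUT_CHARS = 4096  # input limit for the speech request


def tts_input(messages):
    """Return the text of the last user message, enforcing the length limit."""
    for message in reversed(messages):
        if message.get("role") == "user":
            text = message.get("content") or ""
            if len(text) > MAX_INPUT_CHARS:
                raise ValueError(
                    f"input is {len(text)} characters; the limit is {MAX_INPUT_CHARS}"
                )
            return text
    raise ValueError("the request contains no user message")


def tts_instructions(messages, configuration):
    """Join system messages with the optional 'instruction' field (assumed order)."""
    parts = [
        m["content"]
        for m in messages
        if m.get("role") == "system" and m.get("content")
    ]
    if configuration.get("instruction"):
        parts.append(configuration["instruction"])
    return "\n".join(parts)
```

Validating the length on the client side avoids a round trip for prompts that the upstream would reject anyway.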
The usage is computed in the following way:
- gpt-4o-mini-tts - prompt tokens are computed using the gpt-4o tiktoken algorithm. Completion tokens are set to zero.
- tts and tts-hd - there is no official documentation on the pricing for these models. The tokenizer for the gpt-4o model is used by default to compute prompt tokens. Completion tokens are set to zero.

Set AZURE_DEPLOYMENT_ID variable to one of the speech-to-text models supported by Azure Audio API:
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${AZURE_AUDIO_API_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_SERVICE_NAME}.(openai|cognitiveservices).azure.com/openai/deployments/${AZURE_DEPLOYMENT_ID}/audio/transcriptions",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
At the moment of writing, these are: whisper, gpt-4o-transcribe, gpt-4o-mini-transcribe, and gpt-4o-transcribe-diarize.
The adapter takes an audio attachment from the last user message and passes it to the transcription model. The transcription is returned as text in the chat completion response.
System instructions are used to set the prompt parameter in the Transcription API request.
The usage is computed in the following way:
- gpt-4o-* models return audio tokens in the usage.prompt_tokens field and text tokens in usage.completion_tokens.
- whisper models return the duration of the given audio file in seconds in usage.prompt_tokens and zero in usage.completion_tokens.

<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${OPENAI_MODEL_NAME}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://api.openai.com/v1/chat/completions",
"key": "${API_KEY}"
}
]
}
}
}
</details>
Note the difference from the Azure OpenAI configuration:
- Use overrideName to specify the upstream OpenAI model name. The upstream URL does not include the model name (unlike Azure), so we pass it via overrideName. If this field is missing, the model name takes the value of the model field from the original chat completion request (if present), otherwise ${ADAPTER_DEPLOYMENT_ID}.

The adapter also supports the legacy Completions API for both Azure-style upstream endpoints and OpenAI Platform-style endpoints:
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${OPENAI_MODEL_NAME}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://api.openai.com/v1/completions",
"key": "${API_KEY}"
}
]
}
}
}
</details>
The Mistral Platform provides a Chat Completions API, so it can be connected via the adapter:
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${MISTRAL_MODEL_NAME}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${MISTRAL_MODEL_NAME}/chat/completions",
"upstreams": [
{
"endpoint": "https://api.mistral.ai/v1/chat/completions",
"key": "${MISTRAL_API_KEY}"
}
]
}
}
}
</details>
Where MISTRAL_MODEL_NAME is one of the available models on the Platform.
The deployment should be added to the environment variable MISTRAL_DEPLOYMENTS.
The adapter supports reasoning for Magistral models. The reasoning tokens are displayed in a dedicated stage titled Reasoning.
vLLM provides an OpenAI-compatible Chat Completions API and can be connected to the adapter.
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${VLLM_MODEL_NAME}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "${VLLM_ORIGIN}/v1/chat/completions"
}
]
}
}
}
</details>
Enable the vLLM-specific flow by adding ${ADAPTER_DEPLOYMENT_ID} to the environment variable VLLM_DEPLOYMENTS.
You can connect the Qwen3-ASR model served with vLLM to DIAL. This adapter provides first-class support for this integration scenario:
audio/*). The adapter converts them into the content parts expected by the vLLM Chat Completions API.
Language: English (or whichever language was detected).

[!NOTE] QWEN3_ASR_VLLM_DEPLOYMENTS is separate from VLLM_DEPLOYMENTS. Deployments listed in QWEN3_ASR_VLLM_DEPLOYMENTS receive the ASR language extraction post-processing, while regular VLLM_DEPLOYMENTS receive reasoning extraction instead.
The adapter supports Claude models deployed in Azure Foundry and exposing Anthropic Messages API:
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${ANTHROPIC_MODEL_NAME}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME}.services.ai.azure.com/anthropic/v1/messages",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
max_tokens for Claude models
Unlike OpenAI GPT models, Claude models require the max_tokens parameter in the chat completion request.
We recommend configuring max_tokens default value on a per-model basis in the DIAL Core Config, for example:
{
"models": {
"dial-claude-deployment-id": {
"type": "chat",
"description": "...",
"endpoint": "...",
"defaults": {
"max_tokens": 2048
}
}
}
}
If the default is missing in the DIAL Core Config, it will be taken from the CLAUDE_DEFAULT_MAX_TOKENS environment variable. However, we strongly recommend not to rely on this variable and instead configure the defaults in the DIAL Core Config. Such a per-model configuration is operationally cleaner since all the information relevant to tokens (like pricing and token limits) is kept in the same place.
The default value set in the DIAL Core Config takes precedence over the one configured in the adapter.
Make sure the default doesn't exceed Claude's max output tokens; otherwise, you will receive an error like this one: "max_tokens: 10000 > 8192, which is the maximum allowed number of output tokens for claude-...".
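The precedence described above can be sketched from the adapter's point of view. Because DIAL Core merges its defaults into the request before the adapter sees it, a value from the client or from the DIAL Core Config shows up as `max_tokens` in the request. The function name is hypothetical, and what happens when neither source provides a value is not specified here, so the sketch returns `None` in that case.

```python
import os


def resolve_max_tokens(request, env=None):
    """Prefer max_tokens from the request, then CLAUDE_DEFAULT_MAX_TOKENS."""
    env = os.environ if env is None else env
    if request.get("max_tokens") is not None:
        # Set by the client or injected by DIAL Core from the model defaults.
        return int(request["max_tokens"])
    fallback = env.get("CLAUDE_DEFAULT_MAX_TOKENS")
    if fallback is not None:
        return int(fallback)
    return None  # behaviour in this case is left open in this sketch
```

With `defaults.max_tokens` set to 2048 in the DIAL Core Config and the environment variable set to 4096, the request arrives with 2048 and that value wins.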
The adapter supports automatic prompt caching.
To enable it:
- Set defaults.custom_fields.cache_breakpoint in the DIAL Core configuration.
- Set the feature flag autoCachingSupported: true in the DIAL Core configuration.

<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${ANTHROPIC_MODEL_NAME}",
"defaults": {
"custom_fields": {
"cache_breakpoint": {}
}
},
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME1}.services.ai.azure.com/anthropic/v1/messages",
"key": "${OPTIONAL_API_KEY1}"
},
{
"endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME2}.services.ai.azure.com/anthropic/v1/messages",
"key": "${OPTIONAL_API_KEY2}"
}
],
"features": {
"autoCachingSupported": true
}
}
}
}
</details>
The adapter supports explicit cache breakpoints in system and user messages as well as in the tool definitions. Find the examples of requests in the Anthropic adapter documentation.
Set the feature flag cacheSupported: true in the DIAL Core configuration, when the DIAL deployment has multiple upstreams. This flag enables logic in DIAL Core that routes chat completions requests with the same prefixes to the same upstreams:
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${ANTHROPIC_MODEL_NAME}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
"upstreams": [
{
"endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME1}.services.ai.azure.com/anthropic/v1/messages",
"key": "${OPTIONAL_API_KEY1}"
},
{
"endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME2}.services.ai.azure.com/anthropic/v1/messages",
"key": "${OPTIONAL_API_KEY2}"
}
],
"features": {
"cacheSupported": true
}
}
}
}
</details>
Note that in the following DIAL Core config examples, the responsesEndpoint URL enables the Responses API in DIAL, whereas the endpoint URL is required and enables the Chat Completions API in DIAL.
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
"responsesEndpoint": "${ADAPTER_ORIGIN}/openai/v1/responses",
"upstreams": [
{
"responsesEndpoint": "https://${AZURE_OPENAI_SERVICE_NAME1}.openai.azure.com/openai/v1/responses",
"key": "${OPTIONAL_API_KEY1}"
},
{
"responsesEndpoint": "https://${AZURE_OPENAI_SERVICE_NAME2}.openai.azure.com/openai/v1/responses",
"key": "${OPTIONAL_API_KEY2}"
},
{
"responsesEndpoint": "https://${AZURE_OPENAI_SERVICE_NAME3}.openai.azure.com/openai/v1/responses",
"key": "${OPTIONAL_API_KEY3}"
}
]
}
}
}
</details>
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "chat",
"overrideName": "${OPENAI_PLATFORM_MODEL_NAME}",
"responsesEndpoint": "${ADAPTER_ORIGIN}/openai/v1/responses",
"upstreams": [
{
"responsesEndpoint": "https://api.openai.com/v1/responses",
"key": "${API_KEY}"
}
]
}
}
}
</details>
---
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "embedding",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/deployments/${AZURE_OPENAI_DEPLOYMENT_ID}/embeddings",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "embedding",
"overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
"upstreams": [
{
"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/embeddings",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
The adapter supports Azure Multimodal embeddings.
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "embedding",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
"upstreams": [
{
"endpoint": "https://${COMPUTER_VISION_SERVICE_NAME}.cognitiveservices.azure.com",
"key": "${OPTIONAL_API_KEY}"
}
]
}
}
}
</details>
[!IMPORTANT] ${ADAPTER_DEPLOYMENT_ID} must be added to the env variable AZURE_AI_VISION_DEPLOYMENTS to enable the embeddings deployment.
The multimodal embeddings model supports text and images as inputs.
Since the original OpenAI embeddings API only supports text inputs, the image inputs should be passed in the custom_input request field as a URL or in base64-encoded format:
curl -X POST "${DIAL_CORE_ORIGIN}/deployments/${DIAL_DEPLOYMENT_ID}/embeddings" -v \
-H "api-key:${DIAL_API_KEY}" \
-H "content-type:application/json" \
-d '{"input": ["cat", "fish"], "custom_input": [{"type": "image/png", "url": "https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png"}]}'
The response will contain three embedding vectors, each corresponding to one of the inputs in the original request.
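The same request body can be assembled in Python. The sketch below only builds the payload and counts the expected vectors, assuming one vector per text input and one per image input; the helper names are hypothetical.

```python
import json


def build_embeddings_payload(texts, images):
    """Build the request body; 'images' is a list of (mime_type, url) pairs."""
    return {
        "input": list(texts),
        "custom_input": [
            {"type": mime_type, "url": url} for mime_type, url in images
        ],
    }


def expected_vector_count(payload):
    """One embedding vector is expected per text input and per image input."""
    return len(payload.get("input", [])) + len(payload.get("custom_input", []))


payload = build_embeddings_payload(
    ["cat", "fish"],
    [("image/png", "https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png")],
)
body = json.dumps(payload)  # send as the JSON body of the POST request
```

For this payload `expected_vector_count(payload)` equals three, matching the two text inputs plus one image.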
<details><summary>DIAL Core Config</summary>
{
"models": {
"${DIAL_DEPLOYMENT_ID}": {
"type": "embedding",
"overrideName": "${OPENAI_MODEL_NAME}",
"endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
"upstreams": [
{
"endpoint": "https://api.openai.com/v1/embeddings",
"key": "${API_KEY}"
}
]
}
}
}
</details>
---
The Responses API provides more features than the Chat Completions API. Some of these features can be enabled via configuration fields in the chat completions request.
The JSON schema of the configuration is open, which enables forward compatibility with future developments in the Responses API.
[!NOTE] Such a configuration is only possible for the models that are configured in the DIAL Core config to use Responses API upstream endpoints.
The reasoning and the reasoning summary could be enabled via the configuration like this one:
<details><summary>Request</summary>
{
"model": "gpt-5-2025-08-07",
"messages": [
{
"role": "user",
"content": "Write a bash script that takes a matrix represented as a string with format \"[1,2],[3,4],[5,6]\" and prints the transpose in the same format."
}
],
"custom_fields": {
"configuration": {
"reasoning": {
"effort": "medium",
"summary": "auto"
}
}
}
}
</details>
Here custom_fields.configuration.reasoning is an object that is passed to the Responses API as the reasoning parameter.
[!IMPORTANT] Not all models support reasoning. Consult the documentation before enabling reasoning.
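On the client side, the reasoning configuration can be attached to an existing request with a small helper. This is an illustrative sketch: `with_reasoning` and `reasoning_parameter` are hypothetical names, and the second function only mirrors the pass-through described above rather than the adapter's real code.

```python
import copy


def with_reasoning(chat_request, effort, summary):
    """Return a copy of the request with reasoning enabled via configuration."""
    request = copy.deepcopy(chat_request)  # leave the caller's request untouched
    custom_fields = request.setdefault("custom_fields", {})
    configuration = custom_fields.setdefault("configuration", {})
    configuration["reasoning"] = {"effort": effort, "summary": summary}
    return request


def reasoning_parameter(chat_request):
    """Value that would be forwarded as the Responses API 'reasoning' parameter."""
    configuration = chat_request.get("custom_fields", {}).get("configuration", {})
    return configuration.get("reasoning")
```

Because the configuration schema is open, the same pattern applies to other Responses API features exposed through `custom_fields.configuration`.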
---
The adapter provides an Azure flavor of the OpenAI Chat Completions API.
Azure’s API is a variant of the OpenAI Platform API. The key differences are the deployment ID in the path and the required api-version query parameter:
OpenAI Platform: POST https://api.openai.com/v1/chat/completions
Azure OpenAI: POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01
The api-version parameter tracks API changes, and the OpenAI SDK requires it.
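The two URL shapes can be compared with a short sketch. The function names are hypothetical; the URL formats are the ones listed above.

```python
from urllib.parse import urlencode


def openai_platform_url():
    """OpenAI Platform: no deployment in the path, no api-version parameter."""
    return "https://api.openai.com/v1/chat/completions"


def azure_openai_url(resource_name, deployment_name, api_version):
    """Azure OpenAI: deployment id in the path plus a required api-version."""
    query = urlencode({"api-version": api_version})
    return (
        f"https://{resource_name}.openai.azure.com"
        f"/openai/deployments/{deployment_name}/chat/completions?{query}"
    )
```

Keeping the API version as an explicit argument makes the pinned version visible at the call site, which helps when upgrading it later.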
Consider an application calling Azure OpenAI via DIAL. You typically pin an Azure OpenAI API version (usually the latest). Over time, new API versions ship with new features, and SDKs add support for them. This
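A high-quality AI tool integration project
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and performing the necessary security assessment.
✅ Apache 2.0: a permissive open-source license that allows commercial use, requires retaining the copyright notice and the NOTICE file, and includes a patent grant.
Based on our overall assessment, AI DIAL Adapter performs solidly among AI tools and is of excellent quality. If you already have a clear use case, you can start using it right away; if you are still evaluating, we recommend comparing it with similar tools before deciding.
| Field | Value |
|---|---|
| Original name | ai-dial-adapter-openai |
| Topics | AI, DIAL, OpenAI, Python |
| GitHub | https://github.com/epam/ai-dial-adapter-openai |
| License | Apache-2.0 |
| Language | Python |
Listed: 2026-06-10 · Updated: 2026-06-10 · License: Apache-2.0 · AI Skill Hub makes no legal endorsement of the accuracy of third-party content.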