Capability tags

🌐 Translation 🐳 Docker 🔗 REST API 🧬 Embedding 🖼 Vision 🔊 TTS 🎙 STT 🧠 Claude ✨ GPT 🎬 Video generation

AI tool

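AI DIAL Adapter

Python-based · free and open source, locally deployable, data fully under your control

Project name: ai-dial-adapter-openai

⭐ 17 Stars 🍴 8 Forks 💻 Python 📄 Apache-2.0 🏷 AI score 8.0

Tags: AI, DIAL, OpenAI, Python

✦ AI Skill Hub recommendation

AI DIAL Adapter is one of the AI tools featured in this issue of AI Skill Hub. With an overall score of 8.0, it rates as a high-quality project. We strongly recommend adding it to your AI toolkit to improve your productivity.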

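📚 Deep dive

AI DIAL Adapter is an open-source tool written in Python with 17 stars on GitHub, listed under the AI, DIAL, OpenAI, and Python topics. The main advantage of an open-source tool is that the code is fully transparent: you can audit every line for security and adapt or extend it to your own needs.

**Why use an open-source tool instead of a commercial SaaS?**
For individual developers and users with privacy requirements, a locally deployed open-source tool runs on your own machine and is not bound by a third-party provider's data policy (the adapter itself still forwards requests to the upstream APIs you configure). Open-source tools also usually have no usage caps or monthly fees: you install once and keep using them, so for high-frequency workloads the total cost of ownership (TCO) is well below that of subscription-based commercial tools.

**Installation and environment setup**
AI DIAL Adapter requires a Python runtime. Use pyenv to manage Python versions and avoid polluting the global environment. New users should first create a virtual environment (python -m venv venv && source venv/bin/activate) and then install the dependencies; if anything goes wrong, the virtual environment can be deleted and recreated without affecting the system.

**Community and maintenance**
GitHub Issues and Discussions are the fastest channels for getting help. Before asking, search the closed issues, since most common problems already have answers. When reporting a bug, include the output of pip list, the full stack trace, and a minimal reproducible example to speed up the maintainers' response. AI Skill Hub will keep tracking AI DIAL Adapter releases and report important feature changes.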

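📋 Tool overview

Implements the AI DIAL API for Azure OpenAI language models

AI DIAL Adapter is an open-source tool written in Python, focused on AI, DIAL, and OpenAI. As a GitHub open-source project, it has community support and ongoing releases, its code is fully transparent and auditable, and it supports local deployment to protect data privacy. It is suitable both for individual use and for integration into enterprise workflows.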

GitHub Stars: ⭐ 17
Language: Python
Supported platforms: Windows / macOS / Linux
Maintenance status: Lightweight project, updated as needed
License: Apache-2.0
AI overall score: 8.0
Tool type: AI tool
Forks: 8

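📖 Documentation

The following content was compiled automatically by AI Skill Hub from project information. For the complete original documentation, see the "Original source" link at the bottom.

Implements the AI DIAL API for Azure OpenAI language models

📌 Key features

Free and open source, supports local deployment, data fully under your control
Open-source community on GitHub with ongoing updates
Documentation and usage examples are provided, beginner friendly
Supports custom configuration to fit different environments
Can be integrated into an existing technology stack as a base component or extended through further development

🎯 Main use cases

Run locally to protect data privacy and meet compliance requirements
Integrate into existing systems to extend the capabilities of the technology stack
Use as an open-source base component for commercial derivative development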

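The following installation commands were generated automatically from the project's language and type rather than taken from the README; treat the official README as authoritative.

Installation commands

# Option 1: install with pip (recommended)
pip install ai-dial-adapter-openai

# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install ai-dial-adapter-openai

# Option 3: install from source (to get the latest features)
git clone https://github.com/epam/ai-dial-adapter-openai
cd ai-dial-adapter-openai
pip install -e .

# Verify the installation
python -c "import ai_dial_adapter_openai; print('Installation successful')"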

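📋 Installation steps

Visit the GitHub repository page
Install the dependencies as described in the README
Complete the initial configuration for your system environment
Start from the official examples or documentation
If you run into problems, search GitHub Issues for answers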

The following usage examples were compiled by AI Skill Hub and cover the most common scenarios.

Common commands / code examples

# Command-line usage
ai-dial-adapter-openai --help

# Basic usage
ai-dial-adapter-openai input_file -o output_file

# Calling from Python code
import ai_dial_adapter_openai

# Example
result = ai_dial_adapter_openai.process("input")
print(result)

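The following configuration example was generated from a typical usage scenario; adjust the specific parameters according to the official documentation.

Configuration example

# Example configuration file for ai-dial-adapter-openai (config.yml)
app:
  name: "ai-dial-adapter-openai"
  debug: false
  log_level: "INFO"

# Specify the configuration file at run time
ai-dial-adapter-openai --config config.yml

# Or configure through environment variables
export AI_DIAL_ADAPTER_OPENAI_API_KEY="your-key"
export AI_DIAL_ADAPTER_OPENAI_OUTPUT_DIR="./output"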

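📑 README deep dive · Documentation completeness: 44/100 · Includes workflow diagram

The following content was parsed directly from the GitHub README, preserving code blocks, tables, and list structure.

Introduction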

DIAL OpenAI Adapter

Overview
Chat Completions API deployments
Supported upstream chat APIs
Azure OpenAI Chat Completions API (Last generation API)
Azure OpenAI Chat Completions API (Next generation API)
Azure OpenAI Responses API (Next generation API)
Web Search Tool
Azure AI Foundry Chat Completions API
Azure OpenAI Images API
Azure OpenAI Video API (Sora 1 API)
Azure OpenAI Sora 2 API
Azure Audio API
Text-to-speech models (TTS)
Speech-to-text models (STT)
OpenAI Platform Chat Completions API
OpenAI Completions API
Mistral Chat Completion API
vLLM Chat Completion API
Qwen3-ASR
Anthropic Messages API
Default max_tokens for Claude models
Automatic prompt caching
Explicit prompt caching
Tokenization of chat completion requests/responses
How to minimize adapter-side tokenization
Tokenization algorithm
Text tokenization
Image tokenization
vLLM tokenization
Tokenize endpoint
DIAL Core configuration
Responses API deployments
Supported upstream Responses APIs
Azure OpenAI Responses API
OpenAI Platform Responses API
Embedding deployments
Supported upstream embedding APIs
Azure OpenAI Embeddings API (Last generation API)
Azure OpenAI Embeddings API (Next generation API)
Azure multimodal embeddings
OpenAI Platform Embeddings API
Environment Variables
Categories of deployments
Other variables
Configurable models
DALL-E / GPT Image 1
Forward compatibility
Models based on Responses API
Reasoning configuration
Load balancing
Upstream header proxying
Prompt caching
API versioning
Server performance configuration
Deployment
Private CAs and self-signed certificates
Docker
Development
Development Environment
Setup
IDE configuration
Make on Windows
Run
Lint
Test
Clean
Git hooks

---

Overview

LLM Adapters unify the APIs of respective LLMs to align with the Unified Protocol of DIAL Core. Each Adapter operates within a dedicated container. Multi-modality allows supporting non-textual communications such as image-to-text, text-to-image, file transfers and more.

The project implements AI DIAL API for language models from Azure OpenAI.

---

Chat Completions API deployments

The adapter is able to convert certain upstream APIs to the DIAL Chat Completions API (which is an extension of Azure OpenAI Chat Completions API).

Chat Completions deployments are exposed via the endpoint:

POST ${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions
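As a minimal sketch of how a client might address this endpoint, the snippet below builds (but does not send) a POST request using only the Python standard library. The origin http://localhost:5001, the deployment id gpt-4o, and the helper name build_chat_request are illustrative assumptions, not values from the README; authentication headers are omitted.

```python
import json
import urllib.request


def build_chat_request(
    adapter_origin: str, deployment_id: str, messages: list[dict]
) -> urllib.request.Request:
    """Build (without sending) a POST request to the chat completions endpoint."""
    url = (
        f"{adapter_origin.rstrip('/')}"
        f"/openai/deployments/{deployment_id}/chat/completions"
    )
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: a locally running adapter and a deployment named "gpt-4o".
request = build_chat_request(
    "http://localhost:5001",
    "gpt-4o",
    [{"role": "user", "content": "Hello"}],
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen once the adapter is reachable.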

Responses API deployments

Since: ai-dial-adapter-openai:0.38.0 AND ai-dial-core:0.42.0

The adapter is able to proxy requests to models supporting Responses API.

The following Responses API endpoints are exposed by the adapter:

POST ${ADAPTER_ORIGIN}/openai/v1/responses

Current limitations:

Background mode isn't supported since it makes use of the GET /responses/{response_id} endpoint which isn't supported yet.
WebSocket mode isn't supported.
Passing context from the previous response is limited to DIAL deployments with exactly one upstream.
References to DIAL files aren't supported.

Embedding deployments

The adapter is able to convert certain upstream APIs to the DIAL Embeddings API (which is an extension of Azure OpenAI Embeddings API).

Embeddings deployments are exposed via the endpoint:

POST ${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings

Categories of deployments

The following variables cluster all deployments into the groups of deployments which share the same API and the same tokenization algorithm.

Variable	Default	Description
DALLE3_DEPLOYMENTS	(empty)	Comma-separated list of deployments that support DALL-E 3 API. Example: dall-e-3,dalle3,dall-e
DALLE3_AZURE_API_VERSION	2024-02-01	The API version for requests to the Azure DALL·E 3 API
GPT_IMAGE_1_DEPLOYMENTS	(empty)	Comma-separated list of deployments that support GPT-Image 1 API. Example: gpt-image-1
GPT_IMAGE_1_AZURE_API_VERSION	2024-02-01	The API version for requests to the Azure GPT-Image 1 API
MISTRAL_DEPLOYMENTS	(empty)	Comma-separated list of deployments that support Mistral Large Azure API. Example: mistral-large-azure,mistral-large
DATABRICKS_DEPLOYMENTS	(empty)	Comma-separated list of Databricks chat completion deployments. Example: databricks-dbrx-instruct,databricks-mixtral-8x7b-instruct,databricks-llama-2-70b-chat
GPT4O_DEPLOYMENTS	(empty)	Comma-separated list of GPT-4o chat completion deployments. Example: gpt-4o-2024-05-13
GPT4O_MINI_DEPLOYMENTS	(empty)	Comma-separated list of GPT-4o mini chat completion deployments. Example: gpt-4o-mini-2024-07-18
VLLM_DEPLOYMENTS	(empty)	Comma-separated list of deployments that use a vLLM OpenAI-compatible upstream. Example: vllm-llama3,vllm-qwen2
QWEN3_ASR_VLLM_DEPLOYMENTS	(empty)	Comma-separated list of [Qwen3-ASR deployments](#qwen3-asr) served via vLLM. Example: qwen3-asr
AZURE_AI_VISION_DEPLOYMENTS	(empty)	Comma-separated list of Azure AI Vision embedding deployments. The endpoint of the deployment is expected to point to the Azure service: https://<service-name>.cognitiveservices.azure.com/
AUDIO_AZURE_API_VERSION	2025-03-01-preview	The API version for requests to the [Azure Audio API](#azure-audio-api) endpoints.

Deployments that do not fall into any of the categories are considered to support text-to-text chat completion OpenAI API or text embeddings OpenAI API.
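To make the grouping concrete, here is a rough sketch of how comma-separated category variables could be parsed and used to classify a deployment id. It is not the adapter's actual code: the helper names, the subset of variables, and the "DEFAULT" label are assumptions made for illustration.

```python
def parse_deployments(value: str) -> set[str]:
    """Split a comma-separated variable value into a set of deployment ids."""
    return {item.strip() for item in value.split(",") if item.strip()}


# A subset of the category variables from the table above.
CATEGORY_VARIABLES = [
    "DALLE3_DEPLOYMENTS",
    "GPT_IMAGE_1_DEPLOYMENTS",
    "MISTRAL_DEPLOYMENTS",
    "VLLM_DEPLOYMENTS",
]


def categorize(deployment_id: str, env: dict[str, str]) -> str:
    """Return the first category variable that lists the id, else "DEFAULT"."""
    for variable in CATEGORY_VARIABLES:
        if deployment_id in parse_deployments(env.get(variable, "")):
            return variable
    # Uncategorized ids are treated as plain text chat or text embeddings.
    return "DEFAULT"


# Hypothetical environment, reusing the example values from the table.
env = {
    "DALLE3_DEPLOYMENTS": "dall-e-3,dalle3,dall-e",
    "VLLM_DEPLOYMENTS": "vllm-llama3, vllm-qwen2",
}
print(categorize("dalle3", env))
print(categorize("vllm-qwen2", env))
print(categorize("gpt-4", env))
```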

Environment Variables

Copy .env.example to .env and customize it for your environment.

Configurable models

Certain models support configuration via the $ADAPTER_ORIGIN/openai/deployments/$DEPLOYMENT_NAME/configuration endpoint.

A GET request to this endpoint returns the schema of the model configuration in JSON Schema format.

Such models expect the custom_fields.configuration field of the chat/completions request to contain a JSON value conforming to that schema. The custom_fields.configuration field is optional if and only if every field in the schema is also optional.
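The optionality rule can be sketched as a simple check on the schema returned by the configuration endpoint. This is an illustration under assumptions: it only inspects the top-level required keyword of the JSON Schema, the function names are hypothetical, and full validation would normally be delegated to a JSON Schema validator.

```python
def configuration_is_optional(schema: dict) -> bool:
    """True when the schema lists no required top-level properties."""
    return not schema.get("required")


def check_request(request: dict, schema: dict) -> None:
    """Raise if the request omits a configuration that the schema requires."""
    configuration = request.get("custom_fields", {}).get("configuration")
    if configuration is None and not configuration_is_optional(schema):
        raise ValueError("custom_fields.configuration is required")


# Hypothetical schemas: one with only optional fields, one with a required field.
optional_schema = {
    "type": "object",
    "properties": {"search_context_size": {"type": "string"}},
}
required_schema = {
    "type": "object",
    "properties": {"width": {"type": "integer"}},
    "required": ["width"],
}

check_request({"messages": []}, optional_schema)  # accepted without configuration
try:
    check_request({"messages": []}, required_schema)
except ValueError as error:
    print(error)
```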

The configuration can be preset in the DIAL Core config via the defaults parameter:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "my-deployment-id": {
      "type": "chat",
      "endpoint": "$ADAPTER_ORIGIN/openai/deployments/my-deployment-id/chat/completions",
      "upstreams": [
        {
          "endpoint": "$AZURE_OPENAI_SERVICE_ORIGIN/openai/deployments/openai-deployment-id/chat/completions"
        }
      ],
      "defaults": {
        "custom_fields": {
            "configuration": $MODEL_CONFIGURATION_OBJECT
        }
      }
    }
  }
}

</details>

This is convenient when major model features can be enabled via configuration (e.g., web search or reasoning) and you want a deployment where these features are permanently enabled.

DIAL Core will enrich requests with the configuration specified in defaults, so the client doesn’t need to provide it with each chat completion request.

Supported upstream chat APIs

Azure OpenAI Chat Completions API (Last generation API)

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/deployments/${AZURE_OPENAI_DEPLOYMENT_ID}/chat/completions",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

There are three free variables in the config related to deployment ids. Each of these variables corresponds to an HTTP request initiated by the DIAL client:

DIAL_DEPLOYMENT_ID - it's the deployment id visible to the DIAL Client via DIAL deployment listing. The client will be using the id to call the model by sending the request POST ${DIAL_CORE_ORIGIN}/openai/deployments/${DIAL_DEPLOYMENT_ID}/chat/completions
ADAPTER_DEPLOYMENT_ID - the deployment id the OpenAI adapter receives when DIAL Core calls POST ${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions. Use this identifier in environment variables that define deployment categories.
AZURE_OPENAI_DEPLOYMENT_ID - the Azure OpenAI deployment called by the OpenAI adapter.

sequenceDiagram
    autonumber
    actor U as DIAL Client
    participant C as DIAL Core
    participant A as OpenAI Adapter
    participant AZ as Azure OpenAI
    participant OP as OpenAI Platform

    Note over U,C: DIAL_DEPLOYMENT_ID
    U->>C: POST /openai/deployments/${DIAL_DEPLOYMENT_ID}/chat/completions
    Note over C,A: ADAPTER_DEPLOYMENT_ID
    C->>A: POST ${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions
    alt Azure OpenAI upstream
        Note over A,AZ: AZURE_OPENAI_DEPLOYMENT_ID
        A->>AZ: POST https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/deployments/${AZURE_OPENAI_DEPLOYMENT_ID}/chat/completions
        Note right of A: Auth: api-key (if provided) or Azure AD via DefaultAzureCredential
        AZ-->>A: JSON or SSE stream
    else OpenAI Platform upstream
        A->>OP: POST https://api.openai.com/v1/chat/completions (with "model"=${OPENAI_MODEL_NAME}, api-key)
        OP-->>A: JSON or SSE stream
    end
    A-->>C: Normalized response (headers/stream)
    C-->>U: Response to client

Typically these three variables share the same value (the Azure OpenAI deployment name). They may differ if you expose multiple DIAL deployments that call the same Azure OpenAI endpoint but are configured differently.

The DefaultAzureCredential is used to authenticate requests to Azure when an API key is not provided in the upstream configuration.
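The choice between the two authentication paths could look roughly like the sketch below. The function name and the injected get_token callable are assumptions for illustration; in a real setup get_token might wrap azure.identity's DefaultAzureCredential, for example DefaultAzureCredential().get_token("https://cognitiveservices.azure.com/.default").token. The header names follow the usual Azure OpenAI conventions and are not quoted from the adapter's source.

```python
from typing import Callable, Optional


def upstream_auth_headers(
    api_key: Optional[str], get_token: Callable[[], str]
) -> dict[str, str]:
    """Prefer the upstream API key; otherwise fall back to an Azure AD token."""
    if api_key:
        return {"api-key": api_key}
    # No key configured: obtain a bearer token from the injected provider.
    return {"Authorization": f"Bearer {get_token()}"}


# Hypothetical token provider standing in for DefaultAzureCredential.
def fake_token() -> str:
    return "example-token"


print(upstream_auth_headers("secret-key", fake_token))
print(upstream_auth_headers(None, fake_token))
```

Injecting the token provider keeps the sketch free of third-party dependencies and makes the fallback easy to test.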

Azure OpenAI Chat Completions API (Next generation API)

The Next generation API (aka v1 API) doesn't include the deployment id in the URL:

Last generation API: POST https://SERVICE_NAME.openai.azure.com/openai/deployments/gpt-4o/chat/completions
Next generation API: POST https://SERVICE_NAME.openai.azure.com/openai/v1/chat/completions

The DIAL configuration changes accordingly:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/chat/completions",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

Because the deployment ID is not included in the upstream URL, specify it in the overrideName field. If this field is missing, the model name takes the value of the model field from the original chat completion request (if present), otherwise ${ADAPTER_DEPLOYMENT_ID}.
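The fallback order described above can be written down as a small function. This is a sketch of the stated precedence only; the function and parameter names are hypothetical, and it does not show where in DIAL Core or the adapter the resolution actually happens.

```python
from typing import Optional


def resolve_model_name(
    override_name: Optional[str],
    request_model: Optional[str],
    adapter_deployment_id: str,
) -> str:
    """Pick the upstream model name: overrideName, then request model, then id."""
    if override_name:
        return override_name
    if request_model:
        return request_model
    return adapter_deployment_id


# Hypothetical names illustrating each branch of the precedence.
print(resolve_model_name("gpt-4o-prod", "gpt-4o", "my-adapter-id"))
print(resolve_model_name(None, "gpt-4o", "my-adapter-id"))
print(resolve_model_name(None, None, "my-adapter-id"))
```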

Azure OpenAI Responses API (Next generation API)

Certain advanced features of OpenAI models, such as reasoning summary, are only accessible via Responses API and not accessible via Chat Completions API.

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/responses",
          "key": "${API_KEY}"
        }
      ]
    }
  }
}

</details>

As in other cases where the upstream URL omits a deployment id, specify it in the overrideName field.

The last generation API is also supported via URLs in the following format:

"endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/responses"

Web Search Tool

Deployments backed by the Azure OpenAI Responses API support the Web Search tool, which can be enabled by passing a static function called web_search as one of the tools:

<details> <summary>Example request</summary>

{
  "model": "upstream-model-name",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather in Kyiv now? Include source links."
    }
  ],
  "tools": [
    {
      "type": "static_function",
      "static_function": {
        "name": "web_search",
        "configuration": {
          "search_context_size": "high"
        }
      }
    }
  ],
  "stream": true
}

</details>

Each Web Search tool call is translated into a DIAL stage, and URL citations are mirrored as DIAL attachments:

<details> <summary>Example response:</summary>

{
  "choices": [
    {
      "message": {
        "content": "Kyiv weather is mild.",
        "custom_content": {
          "stages": [
            {
              "name": "Web Search",
              "status": "completed",
              "content": "Search 'weather Kyiv'"
            }
          ],
          "attachments": [
            {
              "type": "text/markdown",
              "title": "Kyiv weather source",
              "url": "https://example.com/weather/kyiv"
            }
          ]
        }
      }
    }
  ]
}

</details>
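A client could pull the stages and source links out of such a response along the following lines. The sketch assumes the non-streaming response shape shown in the example; the function name summarize_choice is hypothetical, and a streamed response would need its chunks merged first.

```python
def summarize_choice(response: dict) -> dict:
    """Extract the text, stage names with status, and attachment URLs."""
    message = response["choices"][0]["message"]
    custom = message.get("custom_content", {})
    return {
        "text": message.get("content", ""),
        "stages": [(s["name"], s["status"]) for s in custom.get("stages", [])],
        "links": [a["url"] for a in custom.get("attachments", []) if "url" in a],
    }


# The example response from the README, reduced to the fields used here.
response = {
    "choices": [
        {
            "message": {
                "content": "Kyiv weather is mild.",
                "custom_content": {
                    "stages": [
                        {
                            "name": "Web Search",
                            "status": "completed",
                            "content": "Search 'weather Kyiv'",
                        }
                    ],
                    "attachments": [
                        {
                            "type": "text/markdown",
                            "title": "Kyiv weather source",
                            "url": "https://example.com/weather/kyiv",
                        }
                    ],
                },
            }
        }
    ]
}

summary = summarize_choice(response)
print(summary["stages"], summary["links"])
```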

Azure AI Foundry Chat Completions API

Certain LLM models like gpt-oss-120b or Mistral-Large-2411 can only be deployed to an Azure AI Foundry service. They are accessible via:

Azure AI model inference endpoint or
Azure OpenAI endpoint

<details><summary>DIAL Core Config (Azure AI model inference endpoint)</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${AZURE_AI_FOUNDRY_DEPLOYMENT_ID}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME}.services.ai.azure.com/models/chat/completions",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

<details><summary>DIAL Core Config (Azure OpenAI endpoint)</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${AZURE_AI_FOUNDRY_DEPLOYMENT_ID}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME}.openai.azure.com/openai/deployments/gpt-oss-120b/chat/completions",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

Azure OpenAI Images API

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/deployments/${AZURE_OPENAI_DEPLOYMENT_ID}/images/generations",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

The supported upstream models are dall-e-3 and gpt-image-1. These are the values that AZURE_OPENAI_DEPLOYMENT_ID variable can take.

[!IMPORTANT] The DALL·E 3 adapter deployment must be declared in the DALLE3_DEPLOYMENTS environment variable, and the GPT-Image 1 deployment in GPT_IMAGE_1_DEPLOYMENTS.

Azure OpenAI Video API (Sora 1 API)

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "sora",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/sora/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/video/generations",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

The video generation models support configuration via the custom_fields.configuration field in the chat completion request:

{
  "model": "sora",
  "messages": [
    {
      "role": "user",
      "content": "A cat playing with a ball of yarn"
    }
  ],
  "custom_fields": {
    "configuration": {
      "width": 480,
      "height": 480,
      "n_seconds": 5,
      "n_variants": 1
    }
  }
}

Width and height default to 480x480 if not specified.

Find the details in the Azure API specification.

[!NOTE] n_variants>1 results in multiple video attachments to a single chat completion choice.

[!IMPORTANT] Prompt tokens in the usage are set to zero. Completion tokens are set to the overall number of seconds in the generated video(s).
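The usage rule can be illustrated with a short calculation. This is a sketch, not the adapter's code: the function name is hypothetical, and it assumes that every variant has the duration requested in n_seconds.

```python
def video_usage(n_seconds: int, n_variants: int = 1) -> dict[str, int]:
    """Usage for generated videos: zero prompt tokens, seconds as completion tokens."""
    durations = [n_seconds] * n_variants  # one entry per generated video
    return {"prompt_tokens": 0, "completion_tokens": sum(durations)}


# Two 5-second variants add up to 10 completion tokens.
usage = video_usage(n_seconds=5, n_variants=2)
print(usage)
```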

Azure OpenAI Sora 2 API

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "sora-2",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/sora-2/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/videos",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

The Sora 2 deployment works in either of the following modes:

text-to-video generation: the last user message is used as a text prompt sent to Sora 2

<details> <summary>Chat completion request</summary>

    {
      "model": "sora-2",
      "messages": [
        {
          "role": "system",
          "content": "A system message that will be ignored"
        },
        {
          "role": "user",
          "content": "A cat playing with a ball of yarn"
        }
      ]
    }

</details>

image-to-video generation: if the last user message has an attachment, this attachment is sent to Sora 2 as a reference source along with the last user message as a text prompt.

<details> <summary>Chat completion request</summary>

    {
      "model": "sora-2",
      "messages": [
        {
          "role": "user",
          "content": [
            {"type": "text", "text": "Animate the image"},
            {"type": "image_url", "image_url": {"url": "http://example.com/image.jpg"}}
          ]
        }
      ]
    }

</details>

Video remixing (video-to-video generation) isn't supported.

The Sora 2 deployment supports configuration via the custom_fields.configuration field in the chat completion request:

{
  "model": "sora-2",
  "messages": [
    {
      "role": "user",
      "content": "A cat playing with a ball of yarn"
    }
  ],
  "custom_fields": {
    "configuration": {
      "seconds": 4,
      "size": "720x1280",
      "auto_crop_reference_images": true
    }
  }
}

If not specified, the size defaults to 720x1280 and the duration defaults to 4 seconds.

The auto cropping flag enables cropping of the input reference image to the output video size. It can be useful, since Sora 2 rejects any requests where the resolution of the source image and final video do not match. The flag defaults to False.

Find the details in the Azure Sora 2 API specification.

[!IMPORTANT] Prompt tokens in the usage are set to zero. Completion tokens are set to the overall number of seconds in the generated video(s).

Azure Audio API

The adapter supports models connected via Azure Audio API.

Text-to-speech models (TTS)

Set the AZURE_DEPLOYMENT_ID variable to one of the text-to-speech models supported by the Azure Audio API:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${AZURE_AUDIO_API_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_SERVICE_NAME}.(openai|cognitiveservices).azure.com/openai/deployments/${AZURE_DEPLOYMENT_ID}/audio/speech",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

At the moment of writing, these are: tts, tts-hd, and gpt-4o-mini-tts.

The adapter takes the last user message as a text prompt and sends it to the upstream as the input parameter. The input text is limited to 4096 characters. The upstream model converts the text into speech audio, and the audio file is returned as an attachment in the chat completion response.
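Selecting the prompt text could be sketched as follows. The helper name is hypothetical, message content is assumed to be a plain string, and raising an error for over-long input is an assumption: the README states the 4096-character limit but not how the adapter reacts when it is exceeded.

```python
MAX_TTS_INPUT_CHARS = 4096  # limit stated in the README


def tts_input_from_messages(messages: list[dict]) -> str:
    """Return the text of the last user message, enforcing the length limit."""
    for message in reversed(messages):
        if message.get("role") == "user":
            text = message.get("content") or ""
            if len(text) > MAX_TTS_INPUT_CHARS:
                # Assumed behavior; the adapter may handle this differently.
                raise ValueError(
                    f"input has {len(text)} characters, limit is {MAX_TTS_INPUT_CHARS}"
                )
            return text
    raise ValueError("no user message found")


messages = [
    {"role": "system", "content": "Speak in a cheerful tone."},
    {"role": "user", "content": "First message"},
    {"role": "assistant", "content": "Audio attached."},
    {"role": "user", "content": "Read this sentence aloud."},
]
print(tts_input_from_messages(messages))
```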

System instructions are used to set the tone of the synthesized speech.

The adapter supports the following configuration for the TTS models:

{
  "instruction": "Speak in a cheerful tone.", # optional, sets the tone; appended to the system message from the chat completion request
  "voice": "alloy", # one of the preset voices
  "speed": 1.0, # speech speed multiplier
  "response_format": "mp3" # one of the supported audio formats
}

Find the configuration details in the Azure specification or in the OpenAI Platform specification.

The usage is computed in the following way:

gpt-4o-mini-tts - prompt tokens are computed using gpt-4o tiktoken algorithm. Completion tokens are set to zero.
tts and tts-hd - there is no official documentation on the pricing for these models. Tokenizer for gpt-4o model will be used as a default for prompt tokens calculation. Completion tokens are set to zero.

Speech-to-text models (STT)

Set the AZURE_DEPLOYMENT_ID variable to one of the speech-to-text models supported by the Azure Audio API:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${AZURE_AUDIO_API_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_SERVICE_NAME}.(openai|cognitiveservices).azure.com/openai/deployments/${AZURE_DEPLOYMENT_ID}/audio/transcriptions",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

At the moment of writing, these are: whisper, gpt-4o-transcribe, gpt-4o-mini-transcribe, and gpt-4o-transcribe-diarize.

The adapter takes an audio attachment from the last user message and passes it to the transcription model. The transcription is returned as text in the chat completion response.

System instructions are used to set the prompt parameter in the Transcription API request.

The usage is computed in the following way:

gpt-4o-* models return audio tokens in the usage.prompt_tokens field and text tokens - in usage.completion_tokens.
whisper models return duration of the given audio file in seconds in usage.prompt_tokens and zero in usage.completion_tokens.

OpenAI Platform Chat Completions API

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${OPENAI_MODEL_NAME}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://api.openai.com/v1/chat/completions",
          "key": "${API_KEY}"
        }
      ]
    }
  }
}

</details>

Note the difference from the Azure OpenAI configuration:

The API key is required.
Added overrideName to specify the upstream OpenAI model name. The upstream URL does not include the model name (unlike Azure), so we pass it via overrideName. If this field is missing, the model name takes the value of the model field from the original chat completion request (if present), otherwise ${ADAPTER_DEPLOYMENT_ID}.

OpenAI Completions API

The adapter also supports legacy Completions API both for Azure-style upstream endpoints and OpenAI Platform-style endpoints:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${OPENAI_MODEL_NAME}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://api.openai.com/v1/completions",
          "key": "${API_KEY}"
        }
      ]
    }
  }
}

</details>

Mistral Chat Completion API

The Mistral Platform provides a Chat Completions API, so it can be connected via the adapter:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${MISTRAL_MODEL_NAME}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${MISTRAL_MODEL_NAME}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://api.mistral.ai/v1/chat/completions",
          "key": "${MISTRAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

Where MISTRAL_MODEL_NAME is one of the available models on the Platform.

The deployment should be added to the environment variable MISTRAL_DEPLOYMENTS.

The adapter supports reasoning for Magistral models. The reasoning tokens are displayed in a dedicated stage titled Reasoning.

vLLM Chat Completion API

vLLM provides an OpenAI-compatible Chat Completions API and can be connected to the adapter.

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${VLLM_MODEL_NAME}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "${VLLM_ORIGIN}/v1/chat/completions"
        }
      ]
    }
  }
}

</details>

Enable the vLLM-specific flow by adding ${ADAPTER_DEPLOYMENT_ID} to the environment variable VLLM_DEPLOYMENTS.

Qwen3-ASR

You can connect the Qwen3-ASR model served with vLLM to DIAL. This adapter provides first-class support for this integration scenario:

Audio attachments: Clients send audio files as DIAL attachments (mime types audio/*). The adapter converts them into the content parts expected by the vLLM Chat Completions API.
ASR language metadata extraction: The adapter reports the detected language in a dedicated DIAL stage titled Language: English (or whichever language was detected).

[!NOTE] QWEN3_ASR_VLLM_DEPLOYMENTS is separate from VLLM_DEPLOYMENTS. Deployments listed in QWEN3_ASR_VLLM_DEPLOYMENTS receive the ASR language extraction post-processing, while regular VLLM_DEPLOYMENTS receive reasoning extraction instead.

Anthropic Messages API

The adapter supports Claude models deployed in Azure Foundry and exposing Anthropic Messages API:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${ANTHROPIC_MODEL_NAME}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME}.services.ai.azure.com/anthropic/v1/messages",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

Default `max_tokens` for Claude models

Unlike OpenAI GPT models, Claude models require the max_tokens parameter in the chat completion request.

We recommend configuring max_tokens default value on a per-model basis in the DIAL Core Config, for example:

{
    "models": {
        "dial-claude-deployment-id": {
            "type": "chat",
            "description": "...",
            "endpoint": "...",
            "defaults": {
                "max_tokens": 2048
            }
        }
    }
}

If the default is missing in the DIAL Core Config, it will be taken from the CLAUDE_DEFAULT_MAX_TOKENS environment variable. However, we strongly recommend not to rely on this variable and instead configure the defaults in the DIAL Core Config. Such a per-model configuration is operationally cleaner since all the information relevant to tokens (like pricing and token limits) is kept in the same place.

The default value set in the DIAL Core Config takes precedence over the one configured in the adapter.

Make sure the default doesn't exceed Claude's max output tokens; otherwise you will receive an error like this one: max_tokens: 10000 > 8192, which is the maximum allowed number of output tokens for claude-...
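The precedence described in this section can be sketched from the adapter's point of view, assuming the incoming request has already been enriched with any DIAL Core defaults. The function name and the model_limit parameter are hypothetical; in practice the limit is enforced by the upstream model, as the error message above shows.

```python
from typing import Mapping, Optional


def resolve_max_tokens(
    request: dict, env: Mapping[str, str], model_limit: Optional[int] = None
) -> int:
    """Use max_tokens from the request, else CLAUDE_DEFAULT_MAX_TOKENS."""
    value = request.get("max_tokens")
    if value is None:
        raw = env.get("CLAUDE_DEFAULT_MAX_TOKENS")
        if raw is None:
            raise ValueError("max_tokens is required for Claude models")
        value = int(raw)
    if model_limit is not None and value > model_limit:
        # Mirrors the kind of error the upstream reports for oversized values.
        raise ValueError(f"max_tokens: {value} > {model_limit}")
    return value


# A DIAL Core default of 2048 wins over the adapter-side variable.
print(resolve_max_tokens({"max_tokens": 2048}, {"CLAUDE_DEFAULT_MAX_TOKENS": "1024"}))
# Without a default in the request, the environment variable is used.
print(resolve_max_tokens({}, {"CLAUDE_DEFAULT_MAX_TOKENS": "1024"}))
```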

Automatic prompt caching

The adapter supports automatic prompt caching.

To enable it:

Configure a top-level cache breakpoint in the chat completion request via defaults.custom_fields.cache_breakpoint.
If the DIAL deployment uses multiple upstreams, set autoCachingSupported: true in the DIAL Core configuration.

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${ANTHROPIC_MODEL_NAME}",
      "defaults": {
        "custom_fields": {
          "cache_breakpoint": {}
        }
      },
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME1}.services.ai.azure.com/anthropic/v1/messages",
          "key": "${OPTIONAL_API_KEY1}"
        },
        {
          "endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME2}.services.ai.azure.com/anthropic/v1/messages",
          "key": "${OPTIONAL_API_KEY2}"
        }
      ],
      "features": {
        "autoCachingSupported": true
      }
    }
  }
}

</details>

Explicit prompt caching

The adapter supports explicit cache breakpoints in system and user messages as well as in the tool definitions. Find the examples of requests in the Anthropic adapter documentation.

Set the feature flag cacheSupported: true in the DIAL Core configuration, when the DIAL deployment has multiple upstreams. This flag enables logic in DIAL Core that routes chat completions requests with the same prefixes to the same upstreams:

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${ANTHROPIC_MODEL_NAME}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME1}.services.ai.azure.com/anthropic/v1/messages",
          "key": "${OPTIONAL_API_KEY1}"
        },
        {
          "endpoint": "https://${AZURE_AI_FOUNDRY_SERVICE_NAME2}.services.ai.azure.com/anthropic/v1/messages",
          "key": "${OPTIONAL_API_KEY2}"
        }
      ],
      "features": {
        "cacheSupported": true
      }
    }
  }
}

</details>
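Routing by prefix matters because a prompt cache lives on a single upstream: a request only hits the cache if it reaches the upstream that stored the prefix. The sketch below illustrates the idea with a hash of the leading messages. It is an assumption-laden stand-in, since the routing algorithm DIAL Core actually uses is not described here; `pick_upstream` and `prefix_len` are hypothetical names.

```python
import hashlib
import json


def pick_upstream(messages: list[dict], upstreams: list[str], prefix_len: int = 1) -> str:
    """Deterministically map the first `prefix_len` messages to one upstream."""
    # Serialize the prefix in a stable way so equal prefixes hash equally.
    prefix = json.dumps(messages[:prefix_len], sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(prefix).digest()
    index = int.from_bytes(digest[:8], "big") % len(upstreams)
    return upstreams[index]


upstreams = ["upstream-1", "upstream-2"]
system = {"role": "system", "content": "You are a helpful assistant."}

first = pick_upstream([system, {"role": "user", "content": "Hi"}], upstreams)
second = pick_upstream([system, {"role": "user", "content": "Bye"}], upstreams)
# Both requests share the same system message, so they land on the same upstream.
```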

Supported upstream Responses APIs

In the following DIAL Core config examples, the responsesEndpoint URL enables the Responses API in DIAL, whereas the endpoint URL is required and enables the Chat Completions API in DIAL.

Azure OpenAI Responses API

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
      "responsesEndpoint": "${ADAPTER_ORIGIN}/openai/v1/responses",
      "upstreams": [
        {
          "responsesEndpoint": "https://${AZURE_OPENAI_SERVICE_NAME1}.openai.azure.com/openai/v1/responses",
          "key": "${OPTIONAL_API_KEY1}"
        },
        {
          "responsesEndpoint": "https://${AZURE_OPENAI_SERVICE_NAME2}.openai.azure.com/openai/v1/responses",
          "key": "${OPTIONAL_API_KEY2}"
        },
        {
          "responsesEndpoint": "https://${AZURE_OPENAI_SERVICE_NAME3}.openai.azure.com/openai/v1/responses",
          "key": "${OPTIONAL_API_KEY3}"
        }
      ]
    }
  }
}

</details>

OpenAI Platform Responses API

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "chat",
      "overrideName": "${OPENAI_PLATFORM_MODEL_NAME}",
      "responsesEndpoint": "${ADAPTER_ORIGIN}/openai/v1/responses",
      "upstreams": [
        {
          "responsesEndpoint": "https://api.openai.com/v1/responses",
          "key": "${API_KEY}"
        }
      ]
    }
  }
}

</details>
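As a quick way to see which APIs a chat model entry exposes, the following sketch inspects the two fields discussed in this section. `enabled_apis` is a hypothetical helper that only looks at field presence and does not validate the config the way DIAL Core does. The URLs in the example are placeholders.

```python
def enabled_apis(model_config: dict) -> list[str]:
    """List the DIAL APIs implied by the endpoint fields of a chat model entry."""
    apis = []
    if "endpoint" in model_config:
        apis.append("chat_completions")
    if "responsesEndpoint" in model_config:
        apis.append("responses")
    return apis


# Placeholder origins; substitute your own adapter origin and deployment ID.
model_config = {
    "type": "chat",
    "endpoint": "http://adapter.example/openai/deployments/gpt/chat/completions",
    "responsesEndpoint": "http://adapter.example/openai/v1/responses",
}

apis = enabled_apis(model_config)
```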

---

Supported upstream embedding APIs

Azure OpenAI Embeddings API (Last generation API)

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "embedding",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/deployments/${AZURE_OPENAI_DEPLOYMENT_ID}/embeddings",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

Azure OpenAI Embeddings API (Next generation API)

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "embedding",
      "overrideName": "${AZURE_OPENAI_DEPLOYMENT_ID}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
      "upstreams": [
        {
          "endpoint": "https://${AZURE_OPENAI_SERVICE_NAME}.openai.azure.com/openai/v1/embeddings",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

Azure multimodal embeddings

The adapter supports Azure Multimodal embeddings.

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "embedding",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
      "upstreams": [
        {
          "endpoint": "https://${COMPUTER_VISION_SERVICE_NAME}.cognitiveservices.azure.com",
          "key": "${OPTIONAL_API_KEY}"
        }
      ]
    }
  }
}

</details>

[!IMPORTANT] ${ADAPTER_DEPLOYMENT_ID} must be added to the env variable AZURE_AI_VISION_DEPLOYMENTS to enable the embeddings deployment.

The multimodal embeddings model supports text and images as inputs.

Since the original OpenAI embeddings API supports only text inputs, image inputs should be passed in the custom_input request field, either as a URL or in base64-encoded format:

curl -X POST "${DIAL_CORE_ORIGIN}/deployments/${DIAL_DEPLOYMENT_ID}/embeddings" -v \
  -H "api-key:${DIAL_API_KEY}" \
  -H "content-type:application/json" \
  -d '{"input": ["cat", "fish"], "custom_input": [{"type": "image/png", "url": "https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png"}]}'

The response will contain three embedding vectors, each corresponding to one of the inputs in the original request.
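The same request can be assembled in Python with the standard library. This is a sketch under assumptions: `build_embeddings_request` is a hypothetical helper, the origin, deployment ID and API key below are placeholders, and the request is only built, not sent.

```python
import json
import urllib.request


def build_embeddings_request(
    origin: str,
    deployment_id: str,
    api_key: str,
    texts: list[str],
    images: list[dict],
) -> urllib.request.Request:
    """Build (but do not send) a DIAL embeddings request with text and image inputs."""
    payload = {"input": texts, "custom_input": images}
    return urllib.request.Request(
        url=f"{origin}/deployments/{deployment_id}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"api-key": api_key, "content-type": "application/json"},
        method="POST",
    )


req = build_embeddings_request(
    origin="https://dial.example.com",  # placeholder for ${DIAL_CORE_ORIGIN}
    deployment_id="multimodal-embeddings",  # placeholder for ${DIAL_DEPLOYMENT_ID}
    api_key="dial-api-key",  # placeholder for ${DIAL_API_KEY}
    texts=["cat", "fish"],
    images=[{"type": "image/png", "url": "https://example.com/picture.png"}],
)

# Two text inputs plus one image input: three embedding vectors are expected.
expected_vectors = 2 + 1

# To send the request: urllib.request.urlopen(req)
```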

OpenAI Platform Embeddings API

<details><summary>DIAL Core Config</summary>

{
  "models": {
    "${DIAL_DEPLOYMENT_ID}": {
      "type": "embedding",
      "overrideName": "${OPENAI_MODEL_NAME}",
      "endpoint": "${ADAPTER_ORIGIN}/openai/deployments/${ADAPTER_DEPLOYMENT_ID}/embeddings",
      "upstreams": [
        {
          "endpoint": "https://api.openai.com/v1/embeddings",
          "key": "${API_KEY}"
        }
      ]
    }
  }
}

</details>

---

Models based on Responses API

The Responses API provides more features than the Chat Completions API. Some of these features can be enabled via a configuration field in the chat completions request.

The JSON schema of the configuration is open, which keeps it forward compatible with future developments in the Responses API.

[!NOTE] Such a configuration is only possible for the models that are configured in the DIAL Core config to use Responses API upstream endpoints.

Reasoning configuration

Reasoning and the reasoning summary can be enabled via a configuration like this one:

<details><summary>Request</summary>

{
  "model": "gpt-5-2025-08-07",
  "messages": [
    {
      "role": "user",
      "content": "Write a bash script that takes a matrix represented as a string with format \"[1,2],[3,4],[5,6]\" and prints the transpose in the same format."
    }
  ],
  "custom_fields": {
    "configuration": {
      "reasoning": {
        "effort": "medium",
        "summary": "auto"
      }
    }
  }
}

</details>

Here custom_fields.configuration.reasoning is an object that is passed to the Responses API as the reasoning parameter.

[!IMPORTANT] Not all models support reasoning. Consult the model's documentation before enabling reasoning.
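A client can attach the reasoning configuration to an ordinary chat completions request programmatically. The sketch below only builds the request body. `with_reasoning` is a hypothetical helper, and the default values simply mirror the example request above.

```python
import copy


def with_reasoning(request: dict, effort: str = "medium", summary: str = "auto") -> dict:
    """Return a copy of `request` with custom_fields.configuration.reasoning set."""
    updated = copy.deepcopy(request)
    configuration = updated.setdefault("custom_fields", {}).setdefault("configuration", {})
    configuration["reasoning"] = {"effort": effort, "summary": summary}
    return updated


base_request = {
    "model": "gpt-5-2025-08-07",
    "messages": [{"role": "user", "content": "Transpose the matrix [1,2],[3,4]."}],
}

reasoning_request = with_reasoning(base_request, effort="medium", summary="auto")
```

The helper copies the request instead of mutating it, so the same base request can be reused with and without reasoning.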

---

API versioning

The adapter provides an Azure flavor of the OpenAI Chat Completions API.

Azure’s API is a variant of the OpenAI Platform API. The key differences are the deployment ID in the path and the required api-version query parameter:

OpenAI Platform: POST https://api.openai.com/v1/chat/completions
Azure OpenAI:    POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01

The api-version parameter tracks API changes, and the OpenAI SDK requires it.
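The difference between the two URL shapes can be made concrete with a small sketch. The function names are hypothetical; the URL patterns are the ones listed above.

```python
from urllib.parse import urlencode


def openai_platform_chat_url() -> str:
    """Chat completions URL on the OpenAI Platform: no deployment, no api-version."""
    return "https://api.openai.com/v1/chat/completions"


def azure_openai_chat_url(resource_name: str, deployment_name: str, api_version: str) -> str:
    """Chat completions URL on Azure OpenAI: deployment ID in the path, api-version in the query."""
    base = (
        f"https://{resource_name}.openai.azure.com"
        f"/openai/deployments/{deployment_name}/chat/completions"
    )
    return f"{base}?{urlencode({'api-version': api_version})}"


azure_url = azure_openai_chat_url("YOUR_RESOURCE_NAME", "YOUR_DEPLOYMENT_NAME", "2024-06-01")
```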

Consider an application calling Azure OpenAI via DIAL. You typically pin an Azure OpenAI API version (usually the latest). Over time, new API versions ship with new features, and SDKs add support for them. This

🎯 aiskill88 AI Review: Grade A, 2026-06-10

A high-quality AI tool integration project