Deploying OpenAI Text-to-Speech (TTS) Model on Azure: A Step-by-Step Guide
Azure Cognitive Services provides a straightforward way to deploy OpenAI models, including powerful text-to-speech capabilities. In this guide, I’ll demonstrate how to deploy a text-to-speech model using Azure CLI commands.
Prerequisites
- An Azure subscription
- Azure CLI installed and logged in (
az login
)
Step 1: Define Environment Variables
Set your environment variables to simplify and standardize deployments.
export AZURE_RESOURCE_GROUP="example-rg"
export AZURE_REGION="eastus"
export OPENAI_NAME="example-openai"
export AZURE_SUBSCRIPTION_ID="your-subscription-id"
# Keep these variables as is.
export AUDIO_MODEL="gpt-4o-mini-tts"
export AUDIO_MODEL_VERSION="2025-03-20"
Explanation:
AZURE_RESOURCE_GROUP
: Azure Resource Group name. Choose a unique name for your resource group to contain all related resources.AZURE_REGION
: Azure region where resources will be created. Common regions includeeastus
,westus
,westeurope
, etc.OPENAI_NAME
: Unique name for your OpenAI resource. This will be used to access your deployed model.AZURE_SUBSCRIPTION_ID
: Azure subscription identifier. You can find this in your Azure portal under “Subscriptions”.AUDIO_MODEL
: The specific model you want to deploy. For text-to-speech,gpt-4o-mini-tts
is a suitable choice.AUDIO_MODEL_VERSION
: The version of the model you wish to use. This should match the version available in Azure OpenAI service.
Step 2: Create Resource Group
Create a resource group to contain your OpenAI resources, if it doesn’t already exist, by running the following command:
az group create \
--name "${AZURE_RESOURCE_GROUP}" \
--location "${AZURE_REGION}" \
--subscription "${AZURE_SUBSCRIPTION_ID}"
Step 3: Create Cognitive Services Account
Create an Azure Cognitive Services account for hosting your OpenAI model:
az cognitiveservices account create \
--name "${OPENAI_NAME}" \
--resource-group "${AZURE_RESOURCE_GROUP}" \
--location "${AZURE_REGION}" \
--kind OpenAI \
--sku S0 \
--custom-domain "${OPENAI_NAME}" \
--subscription "${AZURE_SUBSCRIPTION_ID}"
Explanation:
--name
: Sets the unique name for the service.--kind
: Specifies the kind of service (OpenAI).--sku
: Selects the pricing tier (S0 is standard).--custom-domain
: Custom domain name for accessing the service.
Step 4: Deploy the OpenAI Model
Deploy your chosen model (gpt-4o-mini-tts
) to your account:
az cognitiveservices account deployment create \
--name "${OPENAI_NAME}" \
--resource-group "${AZURE_RESOURCE_GROUP}" \
--deployment-name "${OPENAI_NAME}-${AUDIO_MODEL}" \
--model-name "${AUDIO_MODEL}" \
--model-version "${AUDIO_MODEL_VERSION}" \
--model-format OpenAI \
--sku-capacity 1 \
--sku-name "GlobalStandard"
Explanation:
--deployment-name
: Unique name for this specific model deployment.--model-name
: The OpenAI model to deploy.--model-version
: Specific version of the model.--model-format
: Format of the model (OpenAI).--sku-capacity
: Number of instances allocated for the deployment.--sku-name
: SKU type that dictates the availability and performance.
Step 5: Retrieve API Key
Fetch the API key necessary for accessing your deployed model:
export AZURE_OPENAI_API_KEY=$(az cognitiveservices account keys list \
--name "${OPENAI_NAME}" \
--resource-group "${AZURE_RESOURCE_GROUP}" \
--query key1 -o tsv)
Explanation:
--query key1
: Retrieves the primary key of your Cognitive Services account.-o tsv
: Outputs the result in a tab-separated format for easy scripting.
Step 6: Set Endpoint URL
Create an environment variable for your model endpoint:
export OPENAI_ENDPOINT="https://${OPENAI_NAME}.openai.azure.com/openai/deployments/${OPENAI_NAME}-${AUDIO_MODEL}/audio/speech?api-version=2025-04-01-preview"
Verify Deployment with cURL
To verify your deployment, run the following cURL command:
curl "$OPENAI_ENDPOINT" \
-H "api-key: $AZURE_OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini-tts",
"input": "I am excited to try text to speech.",
"voice": "alloy",
"instructions": "Speak in a cheerful and positive tone."
}' --output speech.mp3
Explanation:
- Sends a POST request to the OpenAI endpoint.
-H
: Includes headers specifying API key and content type.-d
: Defines the JSON payload with model details, input text, voice choice, and speaking instructions.--output
: Saves the generated speech audio tospeech.mp3
.
Summary
You’ve successfully deployed an OpenAI text-to-speech model on Azure. You can now leverage Azure’s powerful infrastructure and your custom deployment to build advanced audio-based applications. Always ensure you securely handle your API keys in production environments.