
Overview

MiniMax Voice is an audio model for creating reusable custom voices. Use design to create a new voice from a text description, or use clone to create a reusable voice from one reference audio clip.
| Capability | Value |
| --- | --- |
| Model ID | minimax-voice |
| Modes | design, clone |
| Built-in voices | No |
| Custom voices | Yes |
| Voice ID source | Generated by the platform from voice_id_prefix |
| Design preview text length | 1-500 characters |
| Clone preview text length | 1-2000 characters when provided |
| Clone audio URLs | Exactly 1 URL |

Endpoint and authentication

Base URL:
https://api.apixo.ai/api/v1
| Method | Endpoint | Purpose |
| --- | --- | --- |
| POST | /generateTask/minimax-voice | Submit a design or clone task |
| GET | /statusTask/minimax-voice?taskId={taskId} | Poll task status and retrieve results |
All requests require your APIXO API key:
Authorization: Bearer YOUR_API_KEY
Submit requests also require:
Content-Type: application/json

Copy-paste async quickstart

This minimal request submits a clone task and returns a taskId.
curl -X POST "https://api.apixo.ai/api/v1/generateTask/minimax-voice" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request_type": "async",
    "input": {
      "mode": "clone",
      "voice_id_prefix": "warm001",
      "audio_urls": [
        "https://example.com/reference.wav"
      ],
      "preview_text": "Hello, this is a cloned voice preview.",
      "need_noise_reduction": true,
      "need_volume_normalization": true,
      "accuracy": 0.7,
      "language_boost": "Chinese"
    }
  }'
Successful response:
{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "processing",
    "voice_id": "warm001-a1b2c3",
    "voice_type": "preparing"
  }
}
Save the taskId; you need it to poll for the final result.
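The same submit call can be sketched in JavaScript. This is a minimal sketch, assuming Node.js 18 or later (for the global fetch); buildSubmitRequest and submitVoiceTask are illustrative names, not part of the API. Only the URL, headers, and body fields come from this page.

```javascript
// Hypothetical helpers; the endpoint, headers, and body shape follow the curl example above.
const APIXO_BASE_URL = "https://api.apixo.ai/api/v1";

// Build the fetch() arguments for a submit request without sending anything.
function buildSubmitRequest(apiKey, input) {
  return {
    url: `${APIXO_BASE_URL}/generateTask/minimax-voice`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ request_type: "async", input }),
    },
  };
}

// Send the request and return data.taskId from the response body.
async function submitVoiceTask(apiKey, input) {
  const { url, options } = buildSubmitRequest(apiKey, input);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Submit failed with HTTP ${res.status}`);
  const body = await res.json();
  return body.data.taskId;
}
```

Keeping the request builder separate from the network call makes the payload easy to inspect or log before anything is sent.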

Poll for result

curl -X GET "https://api.apixo.ai/api/v1/statusTask/minimax-voice?taskId=task_12345678" \
  -H "Authorization: Bearer YOUR_API_KEY"
Processing response:
{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "processing",
    "voice_id": "warm001-a1b2c3",
    "voice_type": "preparing",
    "createTime": 1781502331739
  }
}
Success response:
{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "success",
    "voice_id": "warm001-a1b2c3",
    "voice_type": "active",
    "resultJson": "{\"resultUrls\":[\"https://file.apixo.ai/temp/preview.mp3\"]}",
    "createTime": 1781502331739,
    "completeTime": 1781502332542,
    "costTime": 803
  }
}
Failed response:
{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "failed",
    "voice_id": "warm001-a1b2c3",
    "failCode": "VALIDATION_ERROR",
    "failMsg": "Reference audio validation failed.",
    "createTime": 1781502331739,
    "completeTime": 1781502332542
  }
}
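Parse resultJson after state becomes success:
// `data` is the data object from the status response body.
const payload = JSON.parse(data.resultJson); // resultJson is a JSON-encoded string
const audioUrls = payload.resultUrls; // array of preview audio URLs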
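A slightly more defensive version of this parsing step might check the task state first. The sketch below assumes the status payload shapes shown in the sample responses; extractPreviewUrls is an illustrative name.

```javascript
// Return preview audio URLs from a status response's data object.
// Returns an empty array while the task is still processing, and throws on failure.
function extractPreviewUrls(data) {
  if (data.state === "failed") {
    throw new Error(`${data.failCode}: ${data.failMsg}`);
  }
  if (data.state !== "success" || typeof data.resultJson !== "string") {
    return [];
  }
  // resultJson is itself a JSON string, so it needs its own parse step.
  const payload = JSON.parse(data.resultJson);
  return Array.isArray(payload.resultUrls) ? payload.resultUrls : [];
}
```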

Request body

Design

{
  "request_type": "async",
  "input": {
    "mode": "design",
    "voice_id_prefix": "warm001",
    "prompt": "A warm and calm female narration voice for product explainers.",
    "preview_text": "Hello, welcome to our product demo."
  }
}

Clone

{
  "request_type": "async",
  "input": {
    "mode": "clone",
    "voice_id_prefix": "warm001",
    "audio_urls": [
      "https://example.com/reference.wav"
    ],
    "preview_text": "Hello, this is a cloned voice preview.",
    "need_noise_reduction": true,
    "need_volume_normalization": true,
    "accuracy": 0.7,
    "language_boost": "Chinese"
  }
}

Parameters

request_type (string, required, default: "async")
  Result delivery mode. Use async for polling with statusTask, or callback for webhook delivery.
callback_url (string)
  Required when request_type is callback. Must be a public HTTPS URL that can receive the final task payload. See Webhooks.
input (object, required)
  MiniMax Voice input parameters.
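The relationship between request_type and callback_url can be sketched as a small body builder. This is an illustration only: buildRequestBody is a hypothetical name, and the HTTPS check covers just the URL scheme, not whether the URL is publicly reachable.

```javascript
// Build the top-level request body. Passing a callback URL switches to callback mode.
function buildRequestBody(input, callbackUrl) {
  if (callbackUrl === undefined) {
    return { request_type: "async", input };
  }
  // new URL() throws on malformed URLs; the protocol check enforces HTTPS.
  if (new URL(callbackUrl).protocol !== "https:") {
    throw new Error("callback_url must be an HTTPS URL");
  }
  return { request_type: "callback", callback_url: callbackUrl, input };
}
```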

Voice ID behavior

  • The final voice_id is generated automatically from voice_id_prefix.
  • A typical returned value looks like warm001-a1b2c3.
  • Use the returned voice_id in later MiniMax speech generation requests that support custom voices.
  • To keep a newly created custom voice available for long-term reuse, use its voice_id in a supported speech request after creation.
  • A newly created custom voice that is never used may become unavailable after 7 days.
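To track the 7-day window in your own system, a date calculation along these lines may help. This sketch assumes the window is measured from the task's createTime, which this page does not state explicitly; the function names are illustrative.

```javascript
// ASSUMPTION: the 7-day window starts at the task's createTime (Unix milliseconds).
const UNUSED_VOICE_WINDOW_MS = 7 * 24 * 60 * 60 * 1000;

// Latest time by which a never-used voice should receive its first speech request.
function firstUseDeadline(createTimeMs) {
  return new Date(createTimeMs + UNUSED_VOICE_WINDOW_MS);
}

// True when a never-used voice is already past the window.
function isPastFirstUseWindow(createTimeMs, nowMs = Date.now()) {
  return nowMs > createTimeMs + UNUSED_VOICE_WINDOW_MS;
}
```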

Validation and media rules

  • Only the fields documented on this page are part of the public request contract.
  • Do not send your own final voice_id in the request; the platform generates it for you.
  • Clone requires exactly 1 reference audio URL.
  • Reference audio URLs must be publicly reachable.
  • Supported reference audio formats include mp3, m4a, and wav.
  • Clear speech works best. Heavy background music, strong noise, long silence, or invalid audio content can cause the task to fail.
  • The preview synthesis model for clone mode is platform-managed and is not user-selectable.

Response format

Submit task response

POST /generateTask/minimax-voice returns a task ID when the task is accepted:
code (integer)
  API status code. 200 means the task was accepted.
message (string)
  Human-readable status message.
data.taskId (string)
  Unique task identifier used with the status endpoint.

Status response fields

taskId (string)
  Unique task identifier.
state (string)
  Current task state: processing, success, or failed.
voice_id (string)
  Reusable custom voice ID generated for this task.
voice_type (string)
  Voice readiness state. preparing means the custom voice is still being prepared; active means it was created successfully.
resultJson (string)
  JSON string containing preview audio result URLs. Present when preview audio is available.
failCode (string)
  Machine-readable failure code. Present when state is failed.
failMsg (string)
  Human-readable failure message. Present when state is failed.
createTime (integer)
  Task creation timestamp in Unix milliseconds.
completeTime (integer)
  Task completion timestamp in Unix milliseconds. Present after completion.
costTime (integer)
  Processing duration in milliseconds. Present after successful completion.
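Because the timestamps are Unix milliseconds, they can be passed directly to JavaScript's Date. The sketch below uses a hypothetical helper, summarizeTiming, to turn them into ISO strings and an elapsed time. In the sample success response the difference between completeTime and createTime equals costTime (803 ms); whether that always holds is not stated on this page.

```javascript
// Convert the millisecond timestamps in a status data object into readable values.
function summarizeTiming(data) {
  const summary = { createdAt: new Date(data.createTime).toISOString() };
  // completeTime is only present after the task has finished.
  if (typeof data.completeTime === "number") {
    summary.completedAt = new Date(data.completeTime).toISOString();
    summary.elapsedMs = data.completeTime - data.createTime;
  }
  return summary;
}
```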

Webhook callback mode

Use callback mode when your backend should receive the final result automatically instead of polling.
curl -X POST "https://api.apixo.ai/api/v1/generateTask/minimax-voice" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request_type": "callback",
    "callback_url": "https://your-server.com/webhooks/apixo",
    "input": {
      "mode": "design",
      "voice_id_prefix": "warm001",
      "prompt": "A warm and calm female narration voice for product explainers.",
      "preview_text": "Hello, welcome to our product demo."
    }
  }'
See Webhooks for delivery requirements and retry behavior.
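On the receiving side, a handler might look roughly like the sketch below. It assumes the callback body has the same shape as a final status response ({ code, message, data }), which this page does not confirm; check the Webhooks page before relying on it. interpretCallback and voiceCallbackListener are illustrative names.

```javascript
// ASSUMPTION: the callback payload mirrors the final status response shape.
function interpretCallback(rawBody) {
  const body = JSON.parse(rawBody);
  const data = body.data ?? body; // tolerate a payload that is the data object itself
  if (data.state === "success") {
    return { ok: true, taskId: data.taskId, voiceId: data.voice_id };
  }
  return { ok: false, taskId: data.taskId, reason: data.failMsg ?? "unknown" };
}

// Request listener in the (req, res) form used by Node's http.createServer().
// It collects the body, interprets it, and acknowledges with 200 (or 400 on bad JSON).
function voiceCallbackListener(req, res) {
  let raw = "";
  req.on("data", (chunk) => { raw += chunk; });
  req.on("end", () => {
    try {
      console.log(interpretCallback(raw));
      res.statusCode = 200;
    } catch {
      res.statusCode = 400;
    }
    res.end();
  });
}
```

The listener can be passed to Node's http.createServer() behind an HTTPS-terminating proxy, since callback_url must be a public HTTPS URL.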

Billing

MiniMax Voice is billed per request.
| Workflow | APIXO price |
| --- | --- |
| design | $0.50 / request |
| clone | $0.50 / request |
For current route and market comparison pricing, see Pricing.
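As a rough budgeting aid, the per-request prices above can be turned into a small estimator. This sketch hard-codes the listed prices, which may change, and does not model whether failed tasks are billed, since this page does not say; estimateCostUsd is an illustrative name.

```javascript
// Prices from the table above, in cents to avoid floating-point drift.
const PRICE_CENTS_PER_REQUEST = { design: 50, clone: 50 };

// counts is an object such as { design: 3, clone: 5 }.
function estimateCostUsd(counts) {
  let cents = 0;
  for (const [mode, n] of Object.entries(counts)) {
    if (!Object.prototype.hasOwnProperty.call(PRICE_CENTS_PER_REQUEST, mode)) {
      throw new Error(`Unknown mode: ${mode}`);
    }
    cents += PRICE_CENTS_PER_REQUEST[mode] * n;
  }
  return cents / 100;
}
```

For example, 3 design requests and 5 clone requests come to 8 × $0.50 = $4.00.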

Latency and polling

Voice creation usually takes longer than lightweight TTS requests because the custom voice must be prepared before it can be reused.
| Workflow | Recommended first poll | Poll interval |
| --- | --- | --- |
| design | 10s after task creation | 5s-10s |
| clone | 10s after task creation | 5s-10s |
Use callback mode for production voice-creation workflows so your backend does not need to keep polling.
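Where polling is still needed, the recommended timing can be sketched as a loop that waits 10 seconds before the first poll and 5 seconds between later polls. pollVoiceTask, its options, and the maxAttempts cap are assumptions made for illustration; fetchStatus is a caller-supplied function that returns the data object from the status endpoint.

```javascript
// Poll until the task reaches success or failed, following the recommended timing.
async function pollVoiceTask(fetchStatus, options = {}) {
  const {
    firstPollDelayMs = 10000, // recommended first poll: 10s after task creation
    intervalMs = 5000,        // recommended interval: 5s-10s
    maxAttempts = 60,         // hypothetical cap so the loop cannot run forever
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  await sleep(firstPollDelayMs);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const data = await fetchStatus();
    if (data.state === "success" || data.state === "failed") return data;
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error("Task still processing after the maximum number of polls");
}
```

In practice fetchStatus would wrap a GET to /statusTask/minimax-voice?taskId=... with the Authorization header and return the response's data object. The injectable sleep exists so the loop can be tested without real delays.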

Errors and troubleshooting

HTTP errors

| Code | Meaning | What to do |
| --- | --- | --- |
| 400 | Invalid request body or parameter shape | Fix the request before retrying |
| 401 | Missing or invalid API key | Check the Authorization header |
| 402 | Insufficient balance or quota | Add balance or switch account/key |
| 403 | Key or route cannot access the model | Check permissions and route strategy |
| 429 | Rate limit or concurrency limit reached | Retry with exponential backoff |
| 500 | Server error | Retry with backoff |
| 502 | Provider-side error during voice creation | Retry with backoff |
| 504 | Voice creation timeout | Retry or use callback mode |
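The retry guidance in the table could be encoded roughly as follows. The set of retryable codes mirrors the table; the 1-second base delay and 30-second cap are assumptions, since this page does not specify backoff values.

```javascript
// Status codes the table above marks as worth retrying.
const RETRYABLE_STATUS = new Set([429, 500, 502, 504]);

function isRetryableStatus(status) {
  return RETRYABLE_STATUS.has(status);
}

// Exponential backoff: baseMs * 2^attempt, capped at capMs. attempt starts at 0.
// ASSUMPTION: baseMs and capMs are illustrative defaults, not documented values.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Adding random jitter to each delay is a common refinement when many clients may retry at the same time.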

Common validation and failure cases

  • voice_id_prefix must start with a letter, contain only letters or digits, and be at least 6 characters long.
  • preview_text is required for design and cannot be empty.
  • audio_urls must contain exactly one non-empty URL for clone.
  • need_noise_reduction and need_volume_normalization must be boolean values when provided.
  • accuracy must be between 0 and 1.
  • language_boost must be one of the documented supported values.
  • Reference audio that is unreachable, empty, too noisy, or otherwise invalid can cause task failure.
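Several of these rules can be checked on the client before submitting. The sketch below is one possible pre-flight check, not the platform's validator: validateVoiceInput is an illustrative name, the accuracy bounds are treated as inclusive by assumption, text length is measured with JavaScript's string length (which may differ from how the API counts characters), and language_boost is not checked because its supported values are not listed here.

```javascript
// Return a list of problems found in an input object; an empty list means no issues detected.
function validateVoiceInput(input) {
  const errors = [];
  // Starts with a letter, letters or digits only, at least 6 characters in total.
  if (!/^[A-Za-z][A-Za-z0-9]{5,}$/.test(input.voice_id_prefix ?? "")) {
    errors.push("voice_id_prefix must start with a letter, use only letters or digits, and have at least 6 characters");
  }
  const text = input.preview_text;
  if (input.mode === "design") {
    if (typeof text !== "string" || text.length < 1 || text.length > 500) {
      errors.push("preview_text is required for design and must be 1-500 characters");
    }
  } else if (input.mode === "clone") {
    if (text !== undefined && (typeof text !== "string" || text.length < 1 || text.length > 2000)) {
      errors.push("preview_text must be 1-2000 characters when provided");
    }
    const urls = input.audio_urls;
    if (!Array.isArray(urls) || urls.length !== 1 || typeof urls[0] !== "string" || urls[0].trim() === "") {
      errors.push("audio_urls must contain exactly one non-empty URL");
    }
  } else {
    errors.push("mode must be design or clone");
  }
  for (const key of ["need_noise_reduction", "need_volume_normalization"]) {
    if (input[key] !== undefined && typeof input[key] !== "boolean") {
      errors.push(`${key} must be a boolean when provided`);
    }
  }
  // ASSUMPTION: "between 0 and 1" is inclusive.
  const acc = input.accuracy;
  if (acc !== undefined && !(typeof acc === "number" && acc >= 0 && acc <= 1)) {
    errors.push("accuracy must be a number between 0 and 1");
  }
  return errors;
}
```

Returning a list rather than throwing on the first problem lets a caller report every issue at once.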