MiniMax Speech 2.8 API

Overview

MiniMax Speech 2.8 is an async text-to-speech model for high-quality speech generation. APIXO supports built-in preset voices through the documented preset voice_id list, and also accepts custom voice_id values created by MiniMax Voice.

Capability	Value
Model ID	`minimax-speech-2-8`
Modes	`speech-turbo`, `speech-hd`
Built-in preset voices	Yes, via documented preset `voice_id` list
Custom voices	Yes
Prompt length	1-10000 characters
Pronunciation dictionary format	`Alias/Pronunciation`
Speed range	`0.5` to `2.0`
Volume range	`0.1` to `10.0`
Pitch range	`-12` to `12`

For the full built-in preset voice_id list, see MiniMax Speech 2.8 Preset Voices.

Endpoint and authentication

Base URL:

https://api.apixo.ai/api/v1

Method	Endpoint	Purpose
`POST`	`/generateTask/minimax-speech-2-8`	Submit a speech task
`GET`	`/statusTask/minimax-speech-2-8?taskId={taskId}`	Poll task status and retrieve results

All requests require your APIXO API key:

Authorization: Bearer YOUR_API_KEY

Submit requests also require:

Content-Type: application/json

Copy-paste async quickstart

This minimal request submits a speech task and returns a taskId.

curl -X POST "https://api.apixo.ai/api/v1/generateTask/minimax-speech-2-8" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request_type": "async",
    "input": {
      "mode": "speech-hd",
      "voice_id": "preset_general_wise_woman",
      "prompt": "Hello, welcome to APIXO.",
      "format": "mp3"
    }
  }'

Successful submit response:

{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "processing"
  }
}

Save the taskId; you need it to poll for the final result.
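The same submit call can be sketched in Python using only the standard library. The URL, headers, and body fields come from the curl example above; the helper names `build_submit_request` and `submit` are illustrative and not part of the APIXO API, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "https://api.apixo.ai/api/v1"
MODEL_ID = "minimax-speech-2-8"


def build_submit_request(api_key: str, input_params: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for an async speech task."""
    body = {"request_type": "async", "input": input_params}
    return urllib.request.Request(
        url=f"{BASE_URL}/generateTask/{MODEL_ID}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(api_key: str, input_params: dict) -> str:
    """Send the request and return the taskId from the submit response."""
    request = build_submit_request(api_key, input_params)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["data"]["taskId"]
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network.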

Poll for result

curl -X GET "https://api.apixo.ai/api/v1/statusTask/minimax-speech-2-8?taskId=task_12345678" \
  -H "Authorization: Bearer YOUR_API_KEY"

Processing response:

{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "processing",
    "createTime": 1781502331739
  }
}

Success response:

{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "success",
    "resultJson": "{\"resultUrls\":[\"https://file.apixo.ai/temp/output.mp3\"]}",
    "createTime": 1781502331739,
    "completeTime": 1781502332542,
    "costTime": 803
  }
}

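Failed response. Note that the top-level code is still 200 in this example; task failure is reported through data.state, failCode, and failMsg: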

{
  "code": 200,
  "message": "success",
  "data": {
    "taskId": "task_12345678",
    "state": "failed",
    "failCode": "VALIDATION_ERROR",
    "failMsg": "The parameter {{voice_id}} is invalid.",
    "createTime": 1781502331739,
    "completeTime": 1781502332542
  }
}

Parse resultJson after state becomes success:

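// `data` is the data object from a status response whose state is "success".
const payload = JSON.parse(data.resultJson); // resultJson is a JSON-encoded string
const audioUrls = payload.resultUrls; // array of audio file URLs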
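A Python sketch of the same step, extended to cover all three states shown in the sample responses. The function name `extract_audio_urls` and the `TaskFailed` exception are hypothetical helpers introduced for illustration; only the field names come from the responses above.

```python
import json


class TaskFailed(Exception):
    """Raised when a status response reports state == "failed"."""


def extract_audio_urls(status_response: dict):
    """Return the audio URLs on success, or None while still processing.

    Raises TaskFailed when the task failed.
    """
    data = status_response["data"]
    state = data["state"]
    if state == "success":
        # resultJson is a JSON-encoded string, so it needs its own parse step.
        return json.loads(data["resultJson"])["resultUrls"]
    if state == "failed":
        raise TaskFailed(f'{data.get("failCode")}: {data.get("failMsg")}')
    return None  # still processing
```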

Request body

{
  "request_type": "async",
  "input": {
    "mode": "speech-hd",
    "voice_id": "preset_general_wise_woman",
    "prompt": "Welcome to APIXO.",
    "pronunciation_dict": [
      "Omg/Oh my god"
    ],
    "speed": 1.0,
    "volume": 1.0,
    "pitch": 0,
    "emotion": "happy",
    "sample_rate": 24000,
    "bitrate": 128000,
    "channel": "stereo",
    "format": "mp3",
    "language_boost": "English"
  }
}

Parameters

request_type (string, required, default: "async")
Result delivery mode. Use async for polling with statusTask, or callback for webhook delivery.

callback_url (string)
Required when request_type is callback. Must be a public HTTPS URL that can receive the final task payload. See Webhooks.

input (object, required)
MiniMax Speech 2.8 input parameters. Its properties are listed below.

input.mode (string, required)
Speech quality mode. Supported values: speech-turbo, speech-hd.

input.voice_id (string, required)
Speech voice selector. Use either a preset voice_id from MiniMax Speech 2.8 Preset Voices, or a custom voice_id created by MiniMax Voice.

input.prompt (string, required)
Synthesis text. Supports 1-10000 characters. You can include pause control such as <#0.5#>, and expressive tags such as (laughs), (sighs), and (coughs) when needed.

input.pronunciation_dict (string[])
Optional pronunciation overrides. Each item must use Alias/Pronunciation format such as Omg/Oh my god.

input.speed (number)
Optional speaking-speed control. APIXO accepts either a number or numeric string, and the final value must be between 0.5 and 2.0.

input.volume (number)
Optional volume control. APIXO accepts either a number or numeric string, and the final value must be between 0.1 and 10.0.

input.pitch (number)
Optional pitch control. APIXO accepts either a number or numeric string, and the final value must be between -12 and 12.

input.emotion (string)
Optional emotion control. Supported values: happy, sad, angry, fearful, disgusted, surprised, neutral.

input.sample_rate (integer)
Optional output sample rate. APIXO accepts either an integer or numeric string. Supported values: 8000, 16000, 22050, 24000, 32000, 44100.

input.bitrate (integer)
Optional output bitrate. APIXO accepts either an integer or numeric string. Supported values: 32000, 64000, 128000, 256000.

input.channel (string)
Optional output channel selector. 1 = mono, 2 = stereo. Supported values: 1, 2, mono, stereo.

input.format (string)
Optional output format. Supported values: mp3, wav, pcm, flac.

input.language_boost (string)
Optional language hint. Supported values: Chinese, Chinese,Yue, English, Arabic, Russian, Spanish, French, Portuguese, German, Turkish, Dutch, Ukrainian, Vietnamese, Indonesian, Japanese, Italian, Korean, Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi, Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans, auto.
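One way to keep the required and optional fields from the request body straight in client code is a small dataclass that serializes only the fields you set. This is a sketch: the `SpeechInput` class and its `to_body` method are hypothetical names, and only the field names mirror the documented request body.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional, Union


@dataclass
class SpeechInput:
    # Required fields
    mode: str
    voice_id: str
    prompt: str
    # Optional fields; None means "leave out of the request"
    pronunciation_dict: Optional[List[str]] = None
    speed: Optional[Union[float, str]] = None
    volume: Optional[Union[float, str]] = None
    pitch: Optional[Union[float, str]] = None
    emotion: Optional[str] = None
    sample_rate: Optional[Union[int, str]] = None
    bitrate: Optional[Union[int, str]] = None
    channel: Optional[Union[int, str]] = None
    format: Optional[str] = None
    language_boost: Optional[str] = None

    def to_body(self, request_type: str = "async", callback_url: Optional[str] = None) -> dict:
        """Return the JSON-ready request body, omitting unset optional fields."""
        input_fields = {k: v for k, v in asdict(self).items() if v is not None}
        body = {"request_type": request_type, "input": input_fields}
        if callback_url is not None:
            body["callback_url"] = callback_url
        return body
```

For example, `SpeechInput(mode="speech-hd", voice_id="preset_general_wise_woman", prompt="Hi", speed=1.2).to_body()` produces a body with exactly four input fields.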

Voice ID rules

Use the preset voice_id list on MiniMax Speech 2.8 Preset Voices for built-in voices.
Use the returned custom voice_id from MiniMax Voice when you want a custom cloned or designed voice.
Custom voice_id values must be valid and available for the current account.
Only the documented fields on this page are part of the public request contract. Extra unsupported fields are ignored by APIXO unless they conflict with platform behavior.

Prompt behavior

prompt supports normal narration text.
You can use pause control such as <#0.5#> inside the text.
Inline expressive tags such as (laughs), (sighs), and (coughs) can also be included when useful.
pronunciation_dict is the right place for abbreviation or alias-to-pronunciation overrides.
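The pause and pronunciation syntax described above can be generated with two small helpers. Both function names are hypothetical, and rejecting a `/` inside the alias is an assumption made here so the Alias/Pronunciation split stays unambiguous; the page does not state how APIXO treats such input.

```python
def pause(seconds: float) -> str:
    """Return an inline pause marker such as <#0.5#>."""
    return f"<#{seconds}#>"


def pronunciation_entry(alias: str, pronunciation: str) -> str:
    """Return one pronunciation_dict item in Alias/Pronunciation format."""
    if "/" in alias:
        # Assumption: a slash in the alias would make the separator ambiguous.
        raise ValueError("alias must not contain '/'")
    return f"{alias}/{pronunciation}"


def build_example_input() -> dict:
    """Assemble prompt text with a pause, an expressive tag, and one override."""
    prompt = "Omg," + pause(0.5) + " (laughs) that was close."
    return {
        "prompt": prompt,
        "pronunciation_dict": [pronunciation_entry("Omg", "Oh my god")],
    }
```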

Pricing

Mode	Price
`speech-turbo`	`0.0600 USD / 1000 characters`
`speech-hd`	`0.1000 USD / 1000 characters`

Billing is based on the submitted prompt character count.
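As a rough illustration of the arithmetic, the sketch below estimates cost from the table above. It assumes characters are counted the way Python's `len()` counts them and that billing is strictly proportional with no rounding or minimum charge; the page does not specify either detail, nor whether pause markers and expressive tags count toward the total. The name `estimate_cost_usd` is illustrative.

```python
from decimal import Decimal

# Prices per 1000 characters, taken from the pricing table.
PRICE_PER_1000_CHARS = {
    "speech-turbo": Decimal("0.0600"),
    "speech-hd": Decimal("0.1000"),
}


def estimate_cost_usd(mode: str, prompt: str) -> Decimal:
    """Estimate the price in USD for one prompt (assumes proportional billing)."""
    return PRICE_PER_1000_CHARS[mode] * len(prompt) / 1000
```

Under these assumptions, a 2500-character prompt in speech-hd mode comes to 0.25 USD, and the same prompt in speech-turbo mode to 0.15 USD.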

Response format

Submit task response

POST /generateTask/minimax-speech-2-8 returns a task ID when the task is accepted:

code (integer)
API status code. 200 means the task was accepted.

message (string)
Human-readable status message.

data.taskId (string)
Unique task identifier used with the status endpoint.

Status response fields

data.taskId (string)
Unique task identifier.

data.state (string)
Current task state: processing, success, or failed.

data.resultJson (string)
JSON string containing audio result URLs. Present when state is success.

data.failCode (string)
Machine-readable failure code. Present when state is failed.

data.failMsg (string)
Human-readable failure message. Present when state is failed.

data.createTime (integer)
Task creation timestamp in Unix milliseconds.

data.completeTime (integer)
Task completion timestamp in Unix milliseconds. Present after completion.

data.costTime (integer)
Processing duration in milliseconds. Present after successful completion.

Webhook callback mode

Use callback mode when your backend should receive the final result automatically instead of polling.

curl -X POST "https://api.apixo.ai/api/v1/generateTask/minimax-speech-2-8" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request_type": "callback",
    "callback_url": "https://your-server.com/webhooks/apixo",
    "input": {
      "mode": "speech-turbo",
      "voice_id": "preset_general_friendly_person",
      "prompt": "Hello, this result will be delivered by webhook."
    }
  }'

See Webhooks for delivery requirements and retry behavior.
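A client-side sketch for assembling a callback-mode body. The helper `build_callback_body` is a hypothetical name; it only checks that the URL is absolute and uses HTTPS, and cannot verify that the address is publicly reachable, which the callback_url requirement also calls for.

```python
from urllib.parse import urlparse


def build_callback_body(callback_url: str, input_params: dict) -> dict:
    """Return a callback-mode request body after a basic URL sanity check."""
    parsed = urlparse(callback_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("callback_url must be an absolute HTTPS URL")
    return {
        "request_type": "callback",
        "callback_url": callback_url,
        "input": input_params,
    }
```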

Latency and polling

Actual latency varies by text length, selected mode, queue load, and route health.

Mode	Typical generation time	Recommended first poll	Poll interval
`speech-turbo`	5s-30s	5s after task creation	3s-5s
`speech-hd`	5s-30s	5s after task creation	3s-5s
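The recommended timings can be turned into a simple polling loop. This is a sketch under stated assumptions: `fetch_status` is a caller-supplied function that performs the GET on the status endpoint and returns the parsed JSON, the 4-second interval is one value inside the recommended 3s-5s range, and the 120-second timeout is an arbitrary safety limit that does not come from this page.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    first_delay: float = 5.0,
    interval: float = 4.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until state is success or failed, then return the final data object."""
    sleep(first_delay)  # recommended first poll: 5s after task creation
    waited = first_delay  # counts sleep time only, not request time
    while True:
        data = fetch_status()["data"]
        if data["state"] in ("success", "failed"):
            return data
        if waited + interval > timeout:
            raise TimeoutError(f"task still processing after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter lets tests substitute a fake and run instantly.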

Errors and troubleshooting

HTTP errors

Code	Meaning	What to do
`400`	Invalid request body or parameter shape	Fix the request before retrying
`401`	Missing or invalid API key	Check the `Authorization` header
`402`	Insufficient balance or quota	Add balance or switch account/key
`403`	Key or route cannot access the model	Check permissions and route strategy
`429`	Rate limit or concurrency limit reached	Retry with exponential backoff
`500`	Server error	Retry with backoff
`502`	Temporary model service error	Retry with backoff
`504`	Processing timeout	Retry or use callback mode
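The table above separates errors worth retrying (429, 500, 502, 504) from those that need a fix first (400, 401, 402, 403). A minimal sketch of that split with exponential backoff follows; the function names, the 1-second base, the 30-second cap, and the use of random jitter are choices made for this example, not values published by APIXO.

```python
import random

# Status codes the table marks as retryable.
RETRYABLE_STATUS = {429, 500, 502, 504}


def should_retry(status_code: int) -> bool:
    """Return True when the table recommends retrying this HTTP status."""
    return status_code in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Return a random delay in [0, min(cap, base * 2**attempt)] seconds.

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Randomizing the delay helps avoid many clients retrying at the same instant after a shared rate-limit event.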

Common validation rules

mode, voice_id, and prompt are required.
Built-in voices must use the documented preset voice_id list.
Custom voice_id values must be valid and available for the current account.
pronunciation_dict must be a string array, and each item must use Alias/Pronunciation format.
speed, volume, and pitch can be numeric strings, but the final values must stay within the documented ranges.
sample_rate and bitrate can be numeric strings, but only the documented option values are accepted.
channel accepts 1, 2, mono, or stereo, where 1=mono and 2=stereo.
emotion, format, and language_boost must match the documented supported values.
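These rules can be checked on the client before submitting, which avoids spending a round trip on an avoidable 400 or failed task. The sketch below is not APIXO's validator: `validate_input` is a hypothetical helper, it skips the language_boost list and preset voice_id lookup for brevity, and the server may apply checks that are not reproduced here.

```python
MODES = {"speech-turbo", "speech-hd"}
RANGES = {"speed": (0.5, 2.0), "volume": (0.1, 10.0), "pitch": (-12, 12)}
OPTIONS = {
    "emotion": {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"},
    "sample_rate": {"8000", "16000", "22050", "24000", "32000", "44100"},
    "bitrate": {"32000", "64000", "128000", "256000"},
    "channel": {"1", "2", "mono", "stereo"},
    "format": {"mp3", "wav", "pcm", "flac"},
}


def _in_range(value, low, high) -> bool:
    """Accept numbers or numeric strings that fall inside [low, high]."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    return low <= number <= high


def validate_input(params: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    for name in ("mode", "voice_id", "prompt"):
        if not params.get(name):
            errors.append(f"{name} is required")
    if params.get("mode") and params["mode"] not in MODES:
        errors.append("mode must be speech-turbo or speech-hd")
    if len(params.get("prompt") or "") > 10000:
        errors.append("prompt must be 1-10000 characters")
    for item in params.get("pronunciation_dict") or []:
        if not isinstance(item, str) or "/" not in item:
            errors.append(f"pronunciation_dict item {item!r} is not Alias/Pronunciation")
    for name, (low, high) in RANGES.items():
        if name in params and not _in_range(params[name], low, high):
            errors.append(f"{name} must be between {low} and {high}")
    for name, allowed in OPTIONS.items():
        # Compare as strings so 24000 and "24000" are treated alike.
        if name in params and str(params[name]) not in allowed:
            errors.append(f"{name} must be one of {sorted(allowed)}")
    return errors
```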
