2-9 AI Audio
Choose any voice to turn your text into speech and unleash your creativity!
Text-to-Speech (TTS) technology brings your text to life, transforming written content into vibrant, spoken audio. With just a click, you can listen to your documents, books, or any written material as if someone is speaking directly to you. Ideal for multitasking, learning on the go, or simply making information more accessible, TTS opens up a world of possibilities, enabling you to hear the future of reading. Experience the freedom and flexibility to absorb content wherever, whenever.
Page Entrance:

How to Use “Text to Speech”
Click on AI Audio to enter the audio community.
Select an audio you like, click "Generate," and enter your text (English, Japanese, and Chinese are currently supported). Then click "Generate" again.


* You can view previously generated audio in the history record on the right.
If none of the available audios suits you, you can train a custom audio of your own.

Three steps to train your audio
I. Workflow Overview
1) Fill in audio details → 2) Upload audio → 3) Click “Train Now” and review the result
II. Step-by-Step Instructions
Step 1: Enter Basic Audio Information

Cover Image: 1 × 1 ratio, ≤ 2 MB
Audio Name: 1 – 20 characters
Model: Choose the training model (default: SeaArt-speech-01-hd; more versions may be added)
Gender / Age / Tone: Select according to the voice you upload
Language: Must match the uploaded audio; currently supports Japanese, English, Chinese and Korean
Text-to-Audio Sample: A sample line for the model, ≤ 50 characters
Tags: 0 – 5 keywords for easy search
Public or Private:
◦ Public: the trained voice will be published to the community
◦ Private: only you can access it
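The field limits in Step 1 can be checked before you open the form. The sketch below is only an illustration of those limits and is not part of the product: `VoiceDetails` and `check_voice_details` are hypothetical names, and it assumes that "characters" are counted the way Python's `len()` counts them, which may differ from how the page counts them.

```python
from dataclasses import dataclass, field

# Training languages listed in Step 1.
SUPPORTED_LANGUAGES = {"Japanese", "English", "Chinese", "Korean"}


@dataclass
class VoiceDetails:
    """Hypothetical container for the text fields of the Step 1 form."""
    name: str
    language: str
    sample_text: str
    tags: list[str] = field(default_factory=list)
    public: bool = False


def check_voice_details(details: VoiceDetails) -> list[str]:
    """Return a list of problems; an empty list means every limit is met."""
    problems = []
    if not 1 <= len(details.name) <= 20:
        problems.append("Audio Name must be 1-20 characters")
    if details.language not in SUPPORTED_LANGUAGES:
        problems.append(f"Language '{details.language}' is not supported for training")
    if len(details.sample_text) > 50:
        problems.append("Text-to-Audio Sample must be 50 characters or fewer")
    if len(details.tags) > 5:
        problems.append("Use at most 5 tags")
    return problems
```

For example, `check_voice_details(VoiceDetails(name="Narrator", language="English", sample_text="Hello there."))` returns an empty list, while a 25-character name adds one problem to the list.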
Step 2: Upload Audio

Accepted formats: mp3 / wav / aac
Length limit: ≤ 30 seconds (10 s of clean audio is enough for fast training)
File size: ≤ 20 MB
Quality tips:
✓ Use pure speech with no music, reverb or background noise
✓ Choose a clip with clear vocal characteristics and stable emotion
✗ Avoid music or clips with background tracks, as they greatly reduce quality
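The upload limits above (format, length, file size) can be checked locally before uploading. This is a minimal sketch using only the Python standard library, not an official tool: `check_clip` is a hypothetical helper, it assumes 1 MB means 1024 × 1024 bytes, and it measures duration only for WAV files because the standard library cannot read MP3 or AAC durations. It does not assess audio quality.

```python
import os
import wave

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".aac"}
MAX_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_SECONDS = 30.0


def check_clip(path: str) -> list[str]:
    """Return a list of problems; an empty list means no limit was exceeded."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 20 MB")
    if ext == ".wav":
        # Duration = number of frames / frames per second.
        with wave.open(path, "rb") as clip:
            seconds = clip.getnframes() / clip.getframerate()
        if seconds > MAX_SECONDS:
            problems.append(f"clip is {seconds:.1f} s long; the limit is 30 s")
    return problems
```

Running `check_clip("my_voice.wav")` on a 12-second, 1 MB recording returns an empty list. For MP3 and AAC files, check the duration in your audio player or editor instead.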
Step 3: Click “Train Now”

Cost: 28 (displayed in real time)
Progress & results:
◦ Click “Training Records” (top-right) to track all runs
◦ When finished, you can play, rename or delete the result in the list
◦ If set to Public, the audio will also appear on your profile > Audio Works

III. FAQ & Tips
Why 10 – 20 seconds?
A short, clean clip lets the model finish in minutes while still capturing voice features.
Can I upload multiple segments at once?
Not yet. Please merge them offline into a single clip before uploading.
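One way to merge segments offline, if they are WAV files, is Python's standard `wave` module. The sketch below is an assumption about a workable approach rather than a recommended or official method: `merge_wav_clips` is a hypothetical helper, and it only joins clips that share the same channel count, sample width, and sample rate. Remember that the merged clip must still be 30 seconds or shorter.

```python
import wave


def merge_wav_clips(inputs: list[str], output: str) -> float:
    """Concatenate WAV clips in order and return the merged length in seconds."""
    if not inputs:
        raise ValueError("no input clips given")
    fmt = None          # (channels, sample width in bytes, sample rate)
    chunks = []
    total_frames = 0
    for path in inputs:
        with wave.open(path, "rb") as clip:
            this_fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                raise ValueError(f"{path} has a different audio format from the first clip")
            frame_count = clip.getnframes()
            chunks.append(clip.readframes(frame_count))
            total_frames += frame_count
    with wave.open(output, "wb") as merged:
        merged.setnchannels(fmt[0])
        merged.setsampwidth(fmt[1])
        merged.setframerate(fmt[2])
        for chunk in chunks:
            merged.writeframes(chunk)
    return total_frames / fmt[2]
```

Calling `merge_wav_clips(["part1.wav", "part2.wav"], "merged.wav")` writes a single file and returns its duration, which you can compare against the 30-second limit before uploading. Clips in other formats, or with mismatched sample rates, need an audio editor such as Audacity.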
Poor recording quality?
• Use software such as Audition or Audacity to remove background noise, then re-upload.
Training fails or stalls?
• Check your network connection.
• Confirm the audio meets the length/format limits.