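XTTSv2 is a very cool text-to-speech model that can clone a voice into different languages from a short audio clip of about 3 seconds. It builds on Tortoise, with significant model changes that make cross-language voice cloning and multilingual speech generation straightforward, and it does not need many hours of training data.
XTTSv2 can be used for both voice cloning and speech generation.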

Installation

pip install TTS

Python interface

Running the program automatically downloads any missing models. The download directory is:

~/.local/share/tts/
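
To check whether a model has already been downloaded, you can look for its folder under that directory. The snippet below is a minimal sketch, not part of the TTS library: `model_cache_dir` is a hypothetical helper, and it assumes the folder is named after the model with each "/" replaced by "--". Verify that against the contents of your own download directory.

```python
from pathlib import Path


def model_cache_dir(model_name: str, base_dir: str = "~/.local/share/tts") -> Path:
    """Return the folder where a downloaded model is expected to live.

    Assumption: the folder name is the model name with "/" replaced by "--".
    """
    return Path(base_dir).expanduser() / model_name.replace("/", "--")


if __name__ == "__main__":
    path = model_cache_dir("tts_models/multilingual/multi-dataset/xtts_v2")
    status = "already downloaded" if path.is_dir() else "not downloaded yet"
    print(f"{path}: {status}")
```

If the folder is missing, the first call that loads the model triggers the download.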

import os
import torch
from playsound import playsound
from TTS.api import TTS


# Play the audio file
def play_audio(audio_file):
    if os.path.exists(audio_file):
        playsound(audio_file)
    else:
        print(f'{audio_file} does not exist')


def tts_xtts(text, audio_file):
    # List the available models
    for m in TTS().list_models().list_models():
        print(m)
    
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tts = TTS(model_name="tts_models/multilingual/multi-dataset/xtts_v2", progress_bar=True).to(device)
    tts.tts_to_file(text=text, speaker_wav="Xiaoxiao.wav", language="zh-cn", file_path=audio_file)


if __name__ == '__main__':
    audio_file = 'output.wav'
    tts_xtts('你好，文本转语音可以将文字转成语音输出', audio_file)
    play_audio(audio_file)

Reference audio:

Xiaoxiao.wav
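
Since cloning relies on a reference clip of roughly 3 seconds, it can help to check the clip's length before synthesis. The following is a minimal sketch using only the standard library `wave` module. The helper names and the 3-second default are illustrative assumptions taken from the figure quoted above, not a limit enforced by the TTS library, and the code only handles uncompressed PCM WAV files.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, min_seconds: float = 3.0) -> bool:
    """Check that the clip lasts at least min_seconds (assumed threshold)."""
    return wav_duration_seconds(path) >= min_seconds
```

For example, calling `is_long_enough("Xiaoxiao.wav")` before `tts_to_file` lets you warn about a clip that is too short instead of getting a poor clone.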

Output audio:

output.wav
