10-Minute Tutorial - How to Use ChatTTS on Your Colab
Welcome to our ChatTTS Colab example and tutorial. The good news is that ChatTTS runs on Colab's free plan, though generation is a bit slow (about 90 seconds).
In short, we first clone the official ChatTTS repository into Colab using a terminal command in the notebook (a `!command` cell).
Then we install the necessary dependencies of ChatTTS in Colab. After that, we can start using this powerful text-to-speech repository in Colab. That may sound a bit rough, so let's dive in and follow our ChatTTS Colab tutorial step by step.
Open your Colab, and let's get started!
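Generation speed depends heavily on whether your runtime has a GPU attached. A quick way to check from a notebook cell is the sketch below; this helper is our own, not part of ChatTTS, and it simply asks whether the `nvidia-smi` tool is present and runs:

```python
import shutil
import subprocess

def gpu_available():
    """Rough check: True if the nvidia-smi tool exists and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    return subprocess.run([exe], capture_output=True).returncode == 0

print("GPU runtime:", gpu_available())
```

If this prints False, the steps below still work on the CPU runtime, just more slowly.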
1. Clone ChatTTS from GitHub
Clone the official ChatTTS repository into Colab, then move its contents to the working directory so that the `ChatTTS` Python package can be imported directly:
!git clone https://github.com/2noise/ChatTTS
!mv ChatTTS empty   # rename the cloned folder out of the way
!mv empty/* .       # move its contents (including the ChatTTS package) into the working directory
You can find the files of the ChatTTS open-source repository in the "Files" tab on the left side of Colab.
2. Install the Necessary Dependencies in Colab
We only need to install the following dependencies in Colab to get our example demo running properly.
If you want the complete list of dependencies, check the requirements.txt file; it is shown in the ChatTTS repository and in the Files tab on Colab.
!pip install omegaconf -q
!pip install vocos -q
!pip install vector_quantize_pytorch -q
!pip install nemo_text_processing -q
!pip install WeTextProcessing -q
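After the installs finish, it can be worth sanity-checking that everything is importable before moving on. The helper below is our own sketch, not part of ChatTTS, and it assumes the import names match the pip package names (usually, but not always, the case):

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package names that are not importable."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# The packages the ChatTTS demo needs (per the pip installs above).
required = ["omegaconf", "vocos", "vector_quantize_pytorch",
            "nemo_text_processing", "WeTextProcessing"]
print(missing_packages(required))  # an empty list means you are ready to go
```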
3. Initialize ChatTTS on Colab
As the code block below shows, we initialize ChatTTS and load its models in Colab.
import torch
from ChatTTS.core import Chat
from IPython.display import Audio
chat = Chat()
chat.load_models()
4. Define the Text to Be Processed
You can process multiple texts at the same time; just put them in the texts list.
To start, let's set up two examples, one in English and one in Chinese. You can change them as you wish.
texts=["ChatTTS is a text-to-speech model designed specifically for dialogue scenario such as LLM assistant. It supports both English and Chinese languages.",
"ChatTTS是专门为对话场景设计的文本转语音模型,例如LLM助手对话任务。它支持英文和中文两种语言。"]
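If you have a long passage rather than a few short strings, it can help to split it into sentences and pass the pieces as one batch. Below is a naive splitter of our own (not part of ChatTTS), assuming the text uses standard English or Chinese sentence-final punctuation:

```python
import re

def split_sentences(text):
    """Naively split text after sentence-final punctuation (English and Chinese)."""
    parts = re.split(r"(?<=[.!?。!?])\s*", text.strip())
    return [p for p in parts if p]

long_text = "ChatTTS supports English. 它也支持中文!"
print(split_sentences(long_text))
```

The resulting list can be passed to chat.infer in the next step just like texts above.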
5. Generate Speech by ChatTTS
Setting use_decoder=True makes ChatTTS generate better-quality speech.
Inference takes about two minutes on the Colab free plan.
wavs = chat.infer(texts, use_decoder=True)
6. Play the Speech
Done! Now we can play the speech directly in Colab.
We use IPython's Audio class to play the audio.
from IPython.display import Audio
Then play the generated speech:
Audio(wavs[0], rate=24_000, autoplay=True)
Or play the second one:
Audio(wavs[1], rate=24_000, autoplay=True)
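If you also want to keep the audio, you can write it to a WAV file. Here is a minimal sketch using the standard wave module plus NumPy, under the assumption that each element of wavs is a float waveform sampled at 24 kHz with values roughly in [-1, 1]:

```python
import wave

import numpy as np

def save_wav(samples, path, rate=24_000):
    """Write a float waveform (values roughly in [-1, 1]) as 16-bit mono PCM."""
    pcm = np.clip(np.asarray(samples, dtype=np.float32).squeeze(), -1.0, 1.0)
    pcm = (pcm * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)    # mono
        f.setsampwidth(2)    # 16-bit samples
        f.setframerate(rate)
        f.writeframes(pcm.tobytes())

# e.g. save_wav(wavs[0], "english.wav") after running chat.infer above
```

You can then download the file from Colab's Files tab.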
Complete Code
!git clone https://github.com/2noise/ChatTTS
!mv ChatTTS empty
!mv empty/* .
!pip install omegaconf -q
!pip install vocos -q
!pip install vector_quantize_pytorch -q
!pip install nemo_text_processing -q
!pip install WeTextProcessing -q
import torch
from ChatTTS.core import Chat
from IPython.display import Audio
chat = Chat()
chat.load_models()
texts=["ChatTTS is a text-to-speech model designed specifically for dialogue scenario such as LLM assistant. It supports both English and Chinese languages.",
"ChatTTS是专门为对话场景设计的文本转语音模型,例如LLM助手对话任务。它支持英文和中文两种语言。"]
wavs = chat.infer(texts, use_decoder=True)
from IPython.display import Audio
Audio(wavs[0], rate=24_000, autoplay=True)
# Audio(wavs[1], rate=24_000, autoplay=True)