10-Minute Tutorial - How to Use ChatTTS on Your Colab

Welcome to our ChatTTS Colab example and tutorial. The good news is that ChatTTS works on the free (basic) plan of Colab, though it can be a bit slow (about 90 seconds).

In simple terms, we first clone the official ChatTTS repository into Colab using shell commands in the notebook (lines prefixed with !).

Then we install the necessary ChatTTS dependencies in Colab. After that, we can start using this powerful text-to-speech repository in Colab. That may sound a bit rough, so let's dive in and follow our ChatTTS Colab tutorial step by step.

Open your Colab, and let's get started!

1. Clone ChatTTS from GitHub

Clone the official ChatTTS repository into Colab, then move its contents to the root directory:

!git clone https://github.com/2noise/ChatTTS
!mv ChatTTS empty    # rename the cloned repository folder
!mv empty/* .        # move its contents (including the ChatTTS package) into the working directory

You can find the files of the ChatTTS open-source repository in the Files tab on the left side of Colab.
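If you would rather not open the Files tab, a quick shell listing works as well. This is just an optional sanity check, not part of the original steps; it should show the ChatTTS package folder and requirements.txt in the working directory:

!ls
!ls ChatTTS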

2. Download Necessary Dependencies into Colab

We only need to install the following dependencies in Colab to get our example demo running properly.

Check the requirements.txt file for the complete list of dependencies if you want to. It is available in the ChatTTS repository and in the Files tab on Colab.

!pip install omegaconf -q
!pip install vocos -q
!pip install vector_quantize_pytorch -q
!pip install nemo_text_processing -q
!pip install WeTextProcessing -q
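As an optional check before moving on, you can try importing the core packages. This is a minimal sketch; the module names below assume the standard import names of these PyPI packages:

import omegaconf                 # configuration handling used by the models
import vocos                     # neural vocoder
import vector_quantize_pytorch   # vector quantization layers
print("Core dependencies imported successfully")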

3. Initialize ChatTTS on Colab

As the code block below shows, we initialize ChatTTS and load the models in Colab.

import torch
from ChatTTS.core import Chat
from IPython.display import Audio

chat = Chat()
chat.load_models()
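Generation is much faster on a GPU runtime (Runtime > Change runtime type). The line below is just an optional convenience check using plain PyTorch, not something ChatTTS requires:

# Optional: check whether the Colab runtime has a GPU
print("CUDA available:", torch.cuda.is_available())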

4. Define the Text to Be Processed

You can process multiple texts at the same time; just put them in the texts list.

To start, let's set up two examples, one in English and one in Chinese. You can change them as you wish.

texts=["ChatTTS is a text-to-speech model designed specifically for dialogue scenario such as LLM assistant. It supports both English and Chinese languages.",
    "ChatTTS是专门为对话场景设计的文本转语音模型,例如LLM助手对话任务。它支持英文和中文两种语言。"]

5. Generate Speech with ChatTTS

Setting use_decoder=True produces better-quality speech.

It takes about two minutes on the Colab free plan.

wavs = chat.infer(texts, use_decoder=True)
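The exact return format can vary between ChatTTS versions; assuming wavs is a list with one waveform per input text (as the playback step below implies), a quick inspection sketch looks like this:

# Optional: inspect the output, one waveform per input text
import numpy as np
print(len(wavs))                    # expected: 2, one per text
print(np.asarray(wavs[0]).shape)    # e.g. (1, num_samples) at 24 kHz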

6. Play the Speech

Done! Now we can play it directly in Colab.

We use the IPython Audio class to play the audio.

from IPython.display import Audio

Then play our generated speech!

Audio(wavs[0], rate=24_000, autoplay=True)

Or play the second clip:

Audio(wavs[1], rate=24_000, autoplay=True)
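If you also want to keep the result, you can write it to a WAV file and download it from the Files tab. This is a hedged sketch that assumes each entry in wavs is a float NumPy waveform sampled at 24 kHz (the same rate passed to Audio above):

import numpy as np
from scipy.io import wavfile

# Squeeze drops a possible leading channel axis, e.g. (1, N) -> (N,)
audio = np.squeeze(np.asarray(wavs[0])).astype(np.float32)
wavfile.write("chattts_output.wav", 24_000, audio)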

Complete Code

!git clone https://github.com/2noise/ChatTTS
!mv ChatTTS empty
!mv empty/* .


!pip install omegaconf -q
!pip install vocos -q
!pip install vector_quantize_pytorch -q
!pip install nemo_text_processing -q
!pip install WeTextProcessing -q


import torch
from ChatTTS.core import Chat
from IPython.display import Audio

chat = Chat()
chat.load_models()


texts=["ChatTTS is a text-to-speech model designed specifically for dialogue scenario such as LLM assistant. It supports both English and Chinese languages.",
    "ChatTTS是专门为对话场景设计的文本转语音模型,例如LLM助手对话任务。它支持英文和中文两种语言。"]

wavs = chat.infer(texts, use_decoder=True)


from IPython.display import Audio

Audio(wavs[0], rate=24_000, autoplay=True)
# Audio(wavs[1], rate=24_000, autoplay=True)
