
MoneyPrinterPlus integrates GPT_SoVITS!

2024-08-14 14:36:36

Many excellent text-to-speech (TTS) tools have appeared recently. MoneyPrinterPlus already integrates ChatTTS and fasterWhisper, and at the request of our friends it now also integrates GPT_SoVITS, an excellent speech synthesis tool.

Today I will explain in detail how to use GPT_SoVITS in MoneyPrinterPlus.

Software preparation

The prerequisite, of course, is that you have downloaded MoneyPrinterPlus.

Download Address:/ddean2009/MoneyPrinterPlus

If you find it useful, please give it a star to show your support. After v4.4, MoneyPrinterPlus supports GPT_SoVITS as a local TTS voice service.

Start GPT_SoVITS

How to install GPT_SoVITS is not covered here.

Let's talk about how to configure it for use with MoneyPrinterPlus.

GPT_SoVITS has many functions, including speech training, model fine-tuning, TTS inference, and voice changing.

Here we use its core function, TTS inference.

First we start GPT_SoVITS:

In the webUI, go to 1-GPT-SoVITS-TTS -> 1C-Inference -> Open TTS inference webUI.

image-20240814101451460

You will then see the TTS inference interface, which looks like this:

image-20240814103843764

Here you can choose the reference audio, the text of the reference audio, and the language of the reference audio.

Then you can enter the text to be synthesized and set the language of the synthesized speech, the speech rate, top_k, top_p and temperature.

None of this matters for our purposes, though. We do not call GPT_SoVITS through the webUI; what we need is to interact with GPT_SoVITS through its API.

Starting the GPT_SoVITS API

If you downloaded the GPT_SoVITS integrated package, you can run the following command directly to start the GPT_SoVITS API:

Start api: runtime\

After launching, you will see the following:

image-20240814002404640

The command above is the simplest way to start the API, with no reference audio specified. If you want to specify the reference audio, run the following command:

python -dr "" -dt "one, two, three." -dl "zh"

The available startup parameters are as follows:

`-dr` - default reference audio path
`-dt` - default reference audio text
`-dl` - default reference audio language: "Chinese", "English", "Japanese", "Korean", "Cantonese", "zh", "en", "ja", "ko", "yue"

`-d` - inference device: "cuda", "cpu"
`-a` - bind address, default "127.0.0.1"
`-p` - bind port, default 9880, can be specified in
`-fp` - override to use full precision
`-hp` - override to use half precision
`-sm` - streaming return mode, not enabled by default: "close", "c", "normal", "n", "keepalive", "k"
`-mt` - encoding format of the returned audio, default ogg when streaming and wav when not streaming: "wav", "ogg", "aac"
`-cp` - text-splitting punctuation, empty by default, passed in as a string such as ",.，。"

`-hb` - cnhubert path
`-b` - bert path

Choose the options that fit your needs.
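If you start the API from a script, the options above can be assembled programmatically. The following is only a minimal sketch: `build_api_args` is a hypothetical helper, `ref.wav` is a placeholder file name, and the function returns just the option list, so you still have to prepend the start command that matches your own installation.

```python
def build_api_args(ref_audio=None, ref_text=None, ref_lang=None,
                   device=None, address=None, port=None):
    """Return the option list to append after the API start command.

    Only options that were actually given are emitted, so the API
    falls back to its own defaults for everything else.
    """
    options = [
        ("-dr", ref_audio),   # default reference audio path
        ("-dt", ref_text),    # default reference audio text
        ("-dl", ref_lang),    # default reference audio language
        ("-d", device),       # inference device, "cuda" or "cpu"
        ("-a", address),      # bind address
        ("-p", port),         # bind port
    ]
    args = []
    for flag, value in options:
        if value is not None:
            args.extend([flag, str(value)])
    return args


if __name__ == "__main__":
    # "ref.wav" is a placeholder, not a file shipped with GPT_SoVITS.
    print(build_api_args(ref_audio="ref.wav",
                         ref_text="one, two, three.",
                         ref_lang="zh"))
```

Keeping the options as a list rather than one long string means each value can be passed to `subprocess.run` as a separate argument, which avoids quoting problems when the reference text contains spaces.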

By default the API starts on port 9880. We can use the following requests to check that the API is up:

Using the reference audio specified in the startup parameters:

GET:
    `http://127.0.0.1:9880?text=先帝创业未半而中道崩殂，今天下三分，益州疲弊，此诚危急存亡之秋也。&text_language=zh`
POST:
```json
{
    "text": "先帝创业未半而中道崩殂，今天下三分，益州疲弊，此诚危急存亡之秋也。",
    "text_language": "zh"
}
```
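The same two requests can also be sent from Python. This is a rough sketch using only the standard library; the helper names and the `out.wav` output path are made up for illustration, and it assumes the API is listening on the default `127.0.0.1:9880` and answers with raw audio bytes.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "http://127.0.0.1:9880"  # default address and port of the API


def build_get_url(text, text_language, base=API_BASE):
    """Return the GET URL with the query string percent-encoded."""
    query = urllib.parse.urlencode(
        {"text": text, "text_language": text_language}
    )
    return f"{base}?{query}"


def build_post_request(text, text_language, base=API_BASE):
    """Return a urllib Request carrying the same fields as a JSON body."""
    body = json.dumps(
        {"text": text, "text_language": text_language}
    ).encode("utf-8")
    return urllib.request.Request(
        base,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, text_language, out_path="out.wav"):
    """Send the POST request and write the returned bytes to out_path.

    Assumption: the server is running and returns audio data directly.
    urllib raises an HTTPError if the server answers with an error status.
    """
    request = build_post_request(text, text_language)
    with urllib.request.urlopen(request, timeout=120) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

Calling `synthesize_to_file("先帝创业未半而中道崩殂。", "zh")` with the API running should leave an audio file you can play back. Building the URL with `urlencode` matters for the GET variant, because Chinese text and punctuation must be percent-encoded before they go into a query string.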

Configuring MoneyPrinterPlus

Okay, back to our MoneyPrinterPlus page.

On the Basic Configuration page, under Local Voice TTS, select GPTSoVITS and then enter the address of the GPT_SoVITS service.

image-20240814105434708

On the AI Video or Video Mixing page, in the Video TTS Speech Synthesis area, select Local Services.

image-20240814002245394

Most of the parameters that GPT_SoVITS needs are listed here.

You can set the Reference Audio, the Reference Audio Text, and the Reference Audio Language.

Then you can adjust temperature, top_p, top_k and other parameters.

image-20240814002312805
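To give a feel for how these settings might travel to the GPT_SoVITS API, here is a hedged sketch of a request body that carries the reference audio and the sampling parameters. The field names (`refer_wav_path`, `prompt_text`, `prompt_language`, `top_k`, `top_p`, `temperature`, `speed`) and the default values are assumptions about the GPT_SoVITS API, not something this article specifies, and the range checks are my own sanity checks. Verify both against the API version you are running.

```python
import json


def build_tts_payload(text, text_language,
                      refer_wav_path, prompt_text, prompt_language,
                      top_k=15, top_p=1.0, temperature=1.0, speed=1.0):
    """Build a JSON-serializable request body for one synthesis call.

    Field names and defaults are assumptions; check them against
    the GPT_SoVITS API you are actually running.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if speed <= 0:
        raise ValueError("speed must be positive")
    return {
        "refer_wav_path": refer_wav_path,    # reference audio
        "prompt_text": prompt_text,          # reference audio text
        "prompt_language": prompt_language,  # reference audio language
        "text": text,                        # text to synthesize
        "text_language": text_language,
        "top_k": top_k,
        "top_p": top_p,
        "temperature": temperature,
        "speed": speed,
    }


if __name__ == "__main__":
    # "ref.wav" is a placeholder path used only for illustration.
    payload = build_tts_payload(
        "Hello from MoneyPrinterPlus.", "en",
        "ref.wav", "one, two, three.", "en",
        top_k=5, top_p=0.8, temperature=0.7,
    )
    print(json.dumps(payload, ensure_ascii=False, indent=4))
```

Lower temperature and top_p values generally make sampling more conservative, so it is worth changing one parameter at a time and listening to the result.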

Click Audition Sound. If you hear synthesized speech, congratulations, your configuration was successful.

You can also follow the progress of speech synthesis in the log of the GPT_SoVITS service.

image-20240814001906393

Summary

GPT_SoVITS is a very powerful service, and combined with MoneyPrinterPlus it makes you hard to beat.