July 20, 2020

Ubuntu + TTS

TTS = Text-To-Speech, Speech Synthesizer

I've been using and playing around with TTS since the 8-bit days: an Apple II+ with a SAM card and software, then DOS, Windows, 68K Mac, OS X, AT&T Natural Voices, and online voice services.

I loved the idea of a computer talking back with information.  I wrote some programs using TTS on various OSes for amusement.

Today's computers are more than powerful and big enough to do TTS locally.  I don't like relying on an online TTS service: it adds network and external service dependencies, it doesn't work on offline or low-bandwidth devices, and there are privacy concerns.  So this post is about offline (and free) TTS.

I find that Mac and Windows have the best voice quality for offline TTS.  For my use, I create a simple TTS server and output speech from various programs.

Time has been good -- things got better, cheaper, and easier.  Voice quality is very good, and it's free and easy to use (Python).

After moving to Ubuntu, I've been looking for good software with a high quality voice.

There are a lot of options for Linux, of course.  Since TTS is just another feature I want to use, my requirements are: good enough voice quality, easy to use, Python friendly, and free.

Unfortunately, the voice quality of all the software I tried was pretty bad.  Very robotic, 8-bit era voices.

I missed Mac/Windows TTS, so I thought about using it from Ubuntu:
  • Using Mac -- I have an old MacBook that's no longer in use and could serve as a TTS server, but that's overkill.
  • Using Wine -- I installed Windows TTS on Wine and it works great, with nice voice quality.  But running on Wine is not so great: it uses too many resources, and running a Python server + REST API on it is another big hassle.  I'm not even sure I can run Wine + Python API + TTS as a Linux service.  I've read many posts on this, and it seems like too much effort.

So I decided to go back to looking for a Linux native solution, and finally found one.  The voice quality is still a little worse than Mac/Windows, but it's good enough, easy to use, lightweight, and Python friendly.  Perfect.
 
It's called "flite",
$ sudo apt-get install flite

Get more voices,
$ wget -r --no-parent --no-directories --accept flitevox http://www.festvox.org/flite/packed/flite-2.0/voices/

Test it,
$ flite -t "Hello World!" -voice cmu_us_fem.flitevox
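The same test can be driven from Python without any extra library, just by calling the flite command.  This is only a minimal sketch: the function names (build_flite_command, speak) are made up for illustration, and it assumes flite is on the PATH and the voice file is in the current directory.  The -o option (write a WAV file instead of playing) is flite's own.

```python
import shutil
import subprocess


def build_flite_command(text, voice=None, wav_path=None):
    """Build the argument list for one flite call (list form, no shell quoting)."""
    cmd = ["flite", "-t", text]
    if voice:
        cmd += ["-voice", voice]   # voice name or path to a .flitevox file
    if wav_path:
        cmd += ["-o", wav_path]    # write a WAV file instead of playing
    return cmd


def speak(text, voice=None, wav_path=None):
    """Run flite and wait for it; raises if flite is missing or fails."""
    if shutil.which("flite") is None:
        raise RuntimeError("flite not found; install it with apt-get first")
    subprocess.run(build_flite_command(text, voice, wav_path), check=True)
```

For example, speak("Hello World!", voice="cmu_us_fem.flitevox") should do the same thing as the command line above.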

I like these voices:
  • cmu_us_fem.flitevox
  • cmu_us_ljm.flitevox
  • cmu_us_slt.flitevox
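
For the simple TTS server idea mentioned earlier, here is a rough sketch using only the Python standard library.  It is not a finished service, just one way it could look.  The /say path, the text and voice parameters, port 8080, and all function names are my own assumptions, and it assumes the three voice files above are in the server's working directory.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Assumption: these voice files sit in the server's working directory.
VOICES = {"cmu_us_fem.flitevox", "cmu_us_ljm.flitevox", "cmu_us_slt.flitevox"}
DEFAULT_VOICE = "cmu_us_slt.flitevox"


def parse_say_request(path):
    """Return (text, voice) for /say?text=...[&voice=...], else None."""
    url = urlparse(path)
    params = parse_qs(url.query)
    text = params.get("text", [""])[0].strip()
    voice = params.get("voice", [DEFAULT_VOICE])[0]
    if url.path != "/say" or not text or voice not in VOICES:
        return None
    return text, voice


class SayHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        request = parse_say_request(self.path)
        if request is None:
            self.send_error(400, "use /say?text=...")
            return
        text, voice = request
        try:
            # List form, no shell: the text is passed as a single argument.
            subprocess.run(["flite", "-t", text, "-voice", voice], check=True)
        except (OSError, subprocess.CalledProcessError):
            self.send_error(500, "flite failed")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")


def serve(host="127.0.0.1", port=8080):
    """Blocks forever; binds to localhost only by default."""
    HTTPServer((host, port), SayHandler).serve_forever()
```

After calling serve(), any program on the same machine can make it talk with a plain HTTP GET, for example curl "http://127.0.0.1:8080/say?text=Hello+World".  Only the three voices listed above are accepted, so a client cannot point flite at an arbitrary file.
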
Python lib:
Another happy day finding a good solution.

