htsengine(1) General Commands Manual htsengine(1)


hts_engine — HMM-based speech synthesis engine


hts_engine [options] [infile]


This manual page briefly documents the hts_engine command.

This manual page was written for the Debian distribution because the original program does not have a manual page. Instead, it has documentation in the GNU Info format; see below.

hts_engine is a program that synthesizes speech waveforms from HMMs trained by the HMM-based speech synthesis system (HTS).


A summary of options is included below.

-m htsvoice
HTS voice files
-od s
filename of output label with duration
-om s
filename of output spectrum
-of s
filename of output Log F0
-ol s
filename of output low-pass filter
-or s
filename of output raw audio (generated speech)
-ow s
filename of output wav audio (generated speech)
-ot s
filename of output trace information
use phoneme alignment for duration
-i i f1 .. fi
enable interpolation & specify number(i),coefficient(f)
-s i
sampling frequency [auto][ 1-- ]
-p i
frame period (point) [auto][ 1-- ]
-a f
all-pass constant [auto][0.0--1.0]
-b f
postfiltering coefficient [0.0][0.0--1.0]
-r f
speech speed rate [1.0][0.0-- ]
-fm f
add half-tone [0.0][ -- ]
-u f
voiced/unvoiced threshold [0.5][0.0--1.0]
-jm f
weight of GV for spectrum [1.0][0.0-- ]
-jf f
weight of GV for Log F0 [1.0][0.0-- ]
-z i
audio buffer size (if i==0, turn off) [ 0][0-- ]
infile
label file

Generated spectrum, log F0, and low-pass filter coefficient sequences are saved in natural endian, binary (float) format.
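
Because these files are headerless binary, a quick way to inspect one is with od(1). The following is a minimal sketch, not part of hts_engine: the helper names count_floats and dump_floats are hypothetical, and it assumes that a float occupies 4 bytes and that the file is read on a machine with the same byte order as the one that wrote it.

```shell
# Hypothetical helper: number of float values in a raw output file,
# assuming 4 bytes per float.
count_floats() {
    bytes=$(wc -c < "$1")
    echo $((bytes / 4))
}

# Hypothetical helper: print the first N float values of a file,
# interpreted in the byte order of the host.
# Usage: dump_floats FILE N
dump_floats() {
    od -A n -t fF -v -N $(( $2 * 4 )) "$1"
}
```

For example, after running hts_engine with -of output.lf0 (an illustrative file name), count_floats output.lf0 reports how many values were written and dump_floats output.lf0 10 prints the first ten.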


If the hts-voice-nitech-jp-atr503-m001 voice is installed in the current directory, the following command synthesizes speech from input.lab and writes it to output.wav:

% hts_engine -s 48000 -p 240 -a 0.55 \
-m nitech_jp_atr503_m001.htsvoice \
-ow output.wav \
input.lab
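
The -s and -p values in this example are related: assuming the "point" unit of -p means samples, the frame shift in time is the frame period divided by the sampling frequency. The sketch below computes it in milliseconds; frame_shift_ms is a hypothetical helper, not an hts_engine feature.

```shell
# Hypothetical helper: frame shift in milliseconds for a given
# frame period (-p, assumed to be in samples) and sampling
# frequency (-s, in Hz).
# Usage: frame_shift_ms PERIOD FREQUENCY
frame_shift_ms() {
    awk -v p="$1" -v s="$2" 'BEGIN { printf "%g\n", p * 1000 / s }'
}

# The values used in the example command: 240 samples at 48000 Hz.
frame_shift_ms 240 48000   # prints 5
```

When changing -s, scaling -p by the same factor keeps the frame shift constant.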


