Software: Difference between revisions

From SynSIG
Line 1: Line 1:
== CSLU Toolkit ==
== CSLU Toolkit ==
* The CSLU Toolkit was created to provide the basic framework and tools for people to build, investigate and use interactive language systems. These systems incorporate leading-edge speech recognition, natural language understanding, speech synthesis and facial animation.
* The [http://cslu.cse.ogi.edu/toolkit/ CSLU Toolkit] was created to provide the basic framework and tools for people to build, investigate and use interactive language systems. These systems incorporate leading-edge speech recognition, natural language understanding, speech synthesis and facial animation.
* [http://cslu.cse.ogi.edu/toolkit/ CSLU Toolkit]


== FreeTTS ==
== FreeTTS ==
Line 7: Line 6:


== HMM-Based Speech Synthesis System (HTS) ==
== HMM-Based Speech Synthesis System (HTS) ==
* The basic core system of HTS, available from NITECH, was implemented as a modified version of HTK together with SPTK (see below), and is released as HMM-Based Speech Synthesis System (HTS) in the form of patch code to HTK. HTS version 1.1.1 comes with a small run-time synthesis engine (less than 1 MB including acoustic models), which can run without the HTK library. The current version does not include any text analyzer, but the Festival Speech Synthesis System can be used as a text analyzer.
The basic core system of [http://hts.ics.nitech.ac.jp/ HTS], available from NITECH, was implemented as a modified version of HTK together with SPTK (see below), and is released as HMM-Based Speech Synthesis System (HTS) in the form of patch code to HTK. HTS version 1.1.1 comes with a small run-time synthesis engine (less than 1 MB including acoustic models), which can run without the HTK library. The current version does not include any text analyzer, but the Festival Speech Synthesis System can be used as a text analyzer.
* [http://hts.ics.nitech.ac.jp/]


== KPE ==
== KPE ==
* The KPE80 program provides a graphical interface for the implementation of the Klatt 1980 [[formant synthesiser]]. The interface allows users to display and edit Klatt parameters using a graphical display which includes the time-amplitude waveform of both the original speech and its synthetic copy, and some signal analysis facilities.
[http://www.enhance.phon.ucl.ac.uk/public/examples/copysyn/kpe/kpe.htm KPE] provides a graphical interface for the implementation of the Klatt 1980 [[formant synthesiser]]. The interface allows users to display and edit Klatt parameters using a graphical display which includes the time-amplitude waveform of both the original speech and its synthetic copy, and some signal analysis facilities.
* [http://www.enhance.phon.ucl.ac.uk/public/examples/copysyn/kpe/kpe.htm KPE] and many other [http://www.enhance.phon.ucl.ac.uk/ University College London softwares]
See also the other [http://www.enhance.phon.ucl.ac.uk/ University College London software].


== MBROLA ==
== MBROLA ==
* The aim of the MBROLA project, initiated by the TCTS Lab of the Faculté Polytechnique de Mons (Belgium), is to obtain a set of speech synthesizers for as many languages as possible, and provide them free for non-commercial applications. The ultimate goal is to boost academic research on speech synthesis, and particularly on prosody generation, known as one of the biggest challenges taken up by [[Text-To-Speech synthesizers]] for the years to come.
The aim of the [http://tcts.fpms.ac.be/synthesis/mbrola.html MBROLA] project, initiated by the TCTS Lab of the Faculté Polytechnique de Mons (Belgium), is to obtain a set of diphone-based speech synthesizers for as many languages as possible, and provide them free for non-commercial applications.
* [http://tcts.fpms.ac.be/synthesis/mbrola.html MBROLA]


== MARY ==
== MARY ==
Line 23: Line 20:


== Praat ==
== Praat ==
* A system for doing phonetics by computer. The computer program Praat is a research, publication, and productivity tool for phoneticians. With it, you can analyse, synthesize, and manipulate speech, and create high-quality pictures for your articles and thesis.
[http://fonsg3.let.uva.nl/praat/manual/Praat_program.html Praat] is a system for doing phonetics by computer. The computer program Praat is a research, publication, and productivity tool for phoneticians. With it, you can analyse, synthesize, and manipulate speech, and create high-quality pictures for your articles and thesis.
* [http://fonsg3.let.uva.nl/praat/manual/Praat_program.html Praat]


== Speech Signal Processing Toolkit (SPTK) ==
== Speech Signal Processing Toolkit (SPTK) ==
* The main feature of the Speech Signal Processing Toolkit, available from NITECH, is that not only standard speech analysis and synthesis techniques (e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, and vector quantization techniques) but also speech analysis and synthesis techniques developed at the research group can easily be used.
The main feature of the [http://kt-lab.ics.nitech.ac.jp/~tokuda/SPTK/ Speech Signal Processing Toolkit], available from NITECH, is that not only standard speech analysis and synthesis techniques (e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, and vector quantization techniques) but also speech analysis and synthesis techniques developed at the research group can easily be used.
* http://kt-lab.ics.nitech.ac.jp/~tokuda/SPTK/


== TrackDraw ==
== TrackDraw ==
* TrackDraw is a graphical interface for controlling the parameters of a speech synthesizer.
[http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html TrackDraw] is a graphical interface for controlling the parameters of a speech synthesizer.
* [http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html TrackDraw]


== Wavesurfer ==
== Wavesurfer ==
* WaveSurfer is a tool for doing speech analysis. The analysis features include formant and pitch extraction and real-time spectrograms. The WaveSurfer tool, built on top of the [http://www.speech.kth.se/snack/ Snack] speech visualization module, is highly modular and extensible at several levels.
[http://www.speech.kth.se/wavesurfer/ WaveSurfer] is a tool for doing speech analysis. The analysis features include formant and pitch extraction and real-time spectrograms. The WaveSurfer tool, built on top of the [http://www.speech.kth.se/snack/ Snack] speech visualization module, is highly modular and extensible at several levels.
* [http://www.speech.kth.se/wavesurfer/ WaveSurfer]

Revision as of 12:06, 19 May 2006

CSLU Toolkit

  • The CSLU Toolkit was created to provide the basic framework and tools for people to build, investigate and use interactive language systems. These systems incorporate leading-edge speech recognition, natural language understanding, speech synthesis and facial animation.

FreeTTS

FreeTTS is a speech synthesis system written entirely in the Java™ programming language. It is based upon Flite, a small run-time speech synthesis engine developed at Carnegie Mellon University. Flite is derived from the Festival Speech Synthesis System from the University of Edinburgh and the FestVox project from Carnegie Mellon University.

HMM-Based Speech Synthesis System (HTS)

The basic core system of HTS, available from NITECH, was implemented as a modified version of HTK together with SPTK (see below), and is released as HMM-Based Speech Synthesis System (HTS) in the form of patch code to HTK. HTS version 1.1.1 comes with a small run-time synthesis engine (less than 1 MB including acoustic models), which can run without the HTK library. The current version does not include any text analyzer, but the Festival Speech Synthesis System can be used as a text analyzer.

KPE

KPE provides a graphical interface for the implementation of the Klatt 1980 formant synthesiser. The interface allows users to display and edit Klatt parameters using a graphical display which includes the time-amplitude waveform of both the original speech and its synthetic copy, and some signal analysis facilities. See also the other University College London software.
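
As a rough illustration of what a Klatt-style formant synthesiser computes, the sketch below implements the second-order digital resonator described in Klatt's 1980 paper, which turns a formant frequency and bandwidth into filter coefficients. It is not taken from KPE; the function names and the example values (500 Hz, 60 Hz, 10 kHz) are assumptions chosen for illustration only.

```python
import math


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz):
    """Coefficients of the resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(
        2.0 * math.pi * freq_hz * t
    )
    a = 1.0 - b - c  # normalises the gain to 1 at 0 Hz
    return a, b, c


def resonate(samples, freq_hz, bandwidth_hz, sample_rate_hz):
    """Filter a list of samples through one formant resonator."""
    a, b, c = resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# Hypothetical example: a single formant at 500 Hz, 60 Hz wide, 10 kHz sampling.
impulse = [1.0] + [0.0] * 199
ringing = resonate(impulse, 500.0, 60.0, 10000.0)
```

A full synthesiser chains or sums several such resonators and drives them with a voicing or noise source; the parameter tracks edited in a graphical interface of this kind correspond to the frequency and bandwidth arguments changing over time.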

MBROLA

The aim of the MBROLA project, initiated by the TCTS Lab of the Faculté Polytechnique de Mons (Belgium), is to obtain a set of diphone-based speech synthesizers for as many languages as possible, and provide them free for non-commercial applications.

MARY

MARY is a multilingual (German, English, Tibetan) and multi-platform (Windows, Linux, Mac OS X and Solaris) speech synthesis system. It comes with an easy-to-use installer; no technical expertise should be required for installation. It enables expressive speech synthesis, using both diphone and unit-selection synthesis.

Praat

Praat is a system for doing phonetics by computer. The computer program Praat is a research, publication, and productivity tool for phoneticians. With it, you can analyse, synthesize, and manipulate speech, and create high-quality pictures for your articles and thesis.

Speech Signal Processing Toolkit (SPTK)

The main feature of the Speech Signal Processing Toolkit, available from NITECH, is that it gives easy access not only to standard speech analysis and synthesis techniques (e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, and vector quantization techniques) but also to speech analysis and synthesis techniques developed by the research group.
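
To give a feel for the LPC and PARCOR analysis mentioned above, here is a minimal sketch of the textbook autocorrelation method with the Levinson-Durbin recursion. It does not use SPTK or reproduce its commands; the function names are hypothetical, and the sign convention for the reflection (PARCOR) coefficients is an assumption, since conventions differ between toolkits.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, k, err): the prediction-error filter a[0..order] with
    a[0] == 1, the reflection coefficients k[0..order-1], and the final
    prediction error. Assumes r[0] > 0 and a non-degenerate signal.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + ki * a[i - j]
        updated[i] = ki
        a = updated
        err *= 1.0 - ki * ki
    return a, k, err


# Autocorrelation of an idealised first-order process x[n] = 0.5 * x[n-1] + e[n],
# normalised so that r[0] == 1.
a, k, err = levinson_durbin([1.0, 0.5, 0.25], 2)
```

With this convention the prediction error is e[n] = x[n] + a[1] x[n-1] + ... + a[p] x[n-p], so the example recovers a[1] = -0.5 and a second coefficient of zero, as expected for a first-order process.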

TrackDraw

TrackDraw is a graphical interface for controlling the parameters of a speech synthesizer.

Wavesurfer

WaveSurfer is a tool for doing speech analysis. The analysis features include formant and pitch extraction and real-time spectrograms. The WaveSurfer tool, built on top of the Snack speech visualization module, is highly modular and extensible at several levels.