Blizzard Challenge 2010 Workshop

From SynSIG

Latest revision as of 14:46, 23 August 2011

Call for participation

The Blizzard Challenge 2010 Workshop is the culmination of the Blizzard Challenge 2010, an open speech synthesis evaluation campaign using common data sets and a large listening test. The aims of the workshop are to present the results from the listening tests and for participants in the Challenge to describe their systems.

The workshop is a satellite of SSW7.

Who can attend the workshop?

The workshop is open to all and we encourage participation from anyone interested in speech synthesis.


Who can submit a paper to the workshop?

All participants in the Challenge are expected to submit a paper describing their entry (even if they cannot attend the workshop in person). Papers will be refereed by the Programme Committee.

Programme Committee

  • Simon King, University of Edinburgh, UK
  • Alan Black, Carnegie Mellon University, USA
  • Keiichi Tokuda, Nagoya Institute of Technology, Japan

Paper submission instructions

  • Use the Interspeech 2010 authors' kit, but note that your paper may be up to SIX pages in length.
  • Remember that Blizzard is a scientific investigation - we are all trying to understand why some techniques work better than others.
  • With this in mind, please write a detailed, technical paper aimed at a specialist audience. Focus on analysis and evaluation. Try to explain WHY your system performed the way it did, and what makes it different from other systems. Explain why your system is designed in a particular way. For example, report internal evaluations you have done to select certain methods.
  • Submit your paper by email to blizzard@festvox.org by 30th July 2010.

Location and date

Date:

Saturday 25th September 2010 (all day)

Location:

The location is the same as SSW7 - the ATR building in Kansai Science City, near Kyoto, Japan.

The shuttle bus provided during SSW7 from Nara Washington Hotel Plaza to the venue will also operate on the day of the Blizzard workshop.

Programme

The workshop format is single-track with oral presentations from the participants in the Challenge, the organisers and invited speakers.

Each system presentation should last for a maximum of 15 minutes, including time for questions. Presenters should bear in mind that the audience will mainly comprise speech synthesis experts; therefore presentations do not need to include extensive background material.

We will start at 9am and run until around 6pm. The programme is as follows:

  • 08.30 Registration desk opens
  • 09.00 Simon King, on behalf of the organisers:
    • Welcome, introduction, overview
    • Summary of results
  • 10.00-11.00 System presentations (4)
    • Shifeng Pan: "The WISTON Text to Speech System for Blizzard Challenge 2010", Jianhua Tao, Shifeng Pan, Ya Li, Zhengqi Wen, Yang Wang (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences)
    • Alan Black: "Blizzard 2010: CMU Statistical Synthesis"
    • Minghui Dong: "I2R Text-to-Speech System for Blizzard Challenge 2010", Minghui Dong, Paul Chan, Ling Cen, Bin Ma, Haizhou Li (Human Language Technology Department, Institute for Infocomm Research, A*STAR, Singapore)
    • Aby Louw: "Introducing the Speect speech synthesis platform", Johannes A. Louw, Daniel R. van Niekerk, Georg I. Schlünz (Human Language Technologies Research Group, Meraka Institute, CSIR, Pretoria, South Africa)
  • 11.00-11.30 Coffee break
  • 11.30-12.45 System presentations (4) + instrumental measures
    • Tim Bunnell: "The ModelTalker System", Timothy Bunnell, Jason Lilley, Chris Pennington, Bill Moyers, James Polikoff (Speech Research Lab, Nemours Biomedical Research, Wilmington DE, USA / Department of Linguistics, University of Delaware, USA / AgoraNet Inc., Newark DE, USA)
    • Keiichiro Oura: "Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2010", Keiichiro Oura, Kei Hashimoto, Sayaka Shiota, Keiichi Tokuda (Department of Computer Science and Engineering, Nagoya Institute of Technology, Japan)
    • Torbjørn Svendsen: "The NTNU Concatenative Speech Synthesizer", Dyre Meen, Torbjørn Svendsen (Department of Electronics and Telecommunication, Norwegian University of Science and Technology)
    • Yuan-Fu Liao: "The NTUT Blizzard Challenge 2010 Entry", Yuan-Fu Liao, Shao-He Lyu and Ming-Long Wu (Department of Electronic Engineering, National Taipei University of Technology, Taipei, Taiwan)
    • Sebastian Moeller: "Comparison of Approaches for Instrumentally Predicting the Quality of Text-to-Speech Systems: Data from Blizzard Challenges 2008 and 2009", Florian Hinterleitner, Sebastian Moeller, Tiago H. Falk, Tim Polzehl (Quality and Usability Lab, Deutsche Telekom Laboratories, TU Berlin, Germany / Bloorview Research Institute, Toronto, Canada)
  • 12.45-14.00 Lunch
  • 14.00-16.00 System presentations (8)
    • Lukas Latacz: "The VUB Blizzard Challenge 2010 Entry: Towards Automatic Voice Building", Lukas Latacz, Wesley Mattheyses, Werner Verhelst (Vrije Universiteit Brussel, Department ETRO-DSSP / Interdisciplinary Institute for Broadband Technology – IBBT, Belgium)
    • Yoshinori Shiga: "NICT Blizzard Challenge 2010 Entry", Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Jinfu Ni, Hisashi Kawai, Keiichi Tokuda, Minoru Tsuzaki, Satoshi Nakamura (National Institute of Information and Communications Technology (NICT), Japan / Nara Institute of Science and Technology, Japan / Nagoya Institute of Technology, Japan / Kyoto City University of Arts, Japan)
    • Reiner Wilhelms-Tricarico: "The Lessac Technologies System for Blizzard Challenge 2010", Rattima Nitisaroj, Reiner Wilhelms-Tricarico, Brian Mottershead, John Reichenbach, Gary Marple (Lessac Technologies, Inc., USA)
    • Zhenhua Ling: "The USTC System for Blizzard Challenge 2010", Yuan Jiang, Zhen-Hua Ling, Ming Lei, Cheng-Cheng Wang, Lu Heng, Yu Hu, Li-Rong Dai, Ren-Hua Wang (iFLYTEK Speech Lab, University of Science and Technology of China, Hefei, China)
    • Pirros Tsiakoulis: "The ILSP Text-to-Speech System for the Blizzard Challenge 2010", Spyros Raptis, Aimilios Chalamandaris, Pirros Tsiakoulis, Sotiris Karabetsos (Institute for Language and Speech Processing / Research Center "Athena" / INNOETICS LTD, Athens, Greece)
    • Junichi Yamagishi: "The CSTR/EMIME system for Blizzard Challenge 2010", Junichi Yamagishi, Oliver Watts, et al
    • Antti Suni and Tuomo Raitio: "The GlottHMM Speech Synthesis Entry for Blizzard Challenge 2010", Antti Suni, Tuomo Raitio, Martti Vainio, Paavo Alku (Department of Speech Sciences, University of Helsinki / Department of Signal Processing and Acoustics, Aalto University, Helsinki, Finland)
    • Yao Qian: "An HMM Trajectory Tiling (HTT) Approach to High Quality TTS - Microsoft Entry to Blizzard Challenge 2010", Yao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank K. Soong, Guoliang Zhang, Lijuan Wang (Microsoft Research Asia / Microsoft China, Beijing, China)
  • 16.00-16.30 Coffee break
  • 16.30 Simon King, on behalf of the organisers:
    • wrap-up
    • discussion of future Blizzard Challenges
  • ~18.00 Close



  • Papers not being presented at the workshop
    • "Multilingual TTS System of Nokia Entry for Blizzard 2010", Bufan Zhang, Jari Alhonen, Yong Guan and Jilei Tian (Nokia Research Center, Beijing)

Practical information

Registration

Registration procedure

Presenters

Please mail blizzard@festvox.org by 31st August 2010 to tell us who will present your paper at the workshop.

Attendees

There is no need to register in advance for the Blizzard Challenge Workshop 2010.

If you are attending SSW7, please wear your SSW name badge to gain entry to the Blizzard Workshop.

If you are only attending the Blizzard Workshop, please simply register on the day at the on-site registration desk and receive a name badge.

Cost

There is no cost to attend the Blizzard Workshop this year, due to support from the EMIME project and from NICT.

Accommodation and travel

Please refer to the SSW7 website for suggestions. A shuttle bus will be provided from the Nara Washington Hotel Plaza, with the same schedule as during SSW7.

Published proceedings

The papers are published on festvox.org (http://festvox.org/blizzard/).