Education

From SynSIG
*[http://spcc.csd.uoc.gr/SPCC2014/material/SPCC14-DialogueModeling_Pietquin.pdf Dr Olivier Pietquin, University of Lille 1, France : Statistical Dialogue Modeling]
*[http://spcc.csd.uoc.gr/SPCC2014/material/SPCC14-DSR_Katsamanis.pdf Dr Nassos Katsamanis, NTUA, Athens : Distant Speech Recognition]


== Keynotes & tutorials at International conferences and workshops ==
* [http://www.eusipco2017.org/wp-content/uploads/2017/09/SimonKing_Keynote-talk_EUSIPCO_2017.pdf Simon King, Speech synthesis: where did the signal processing go? @ EUSIPCO2017]
* [http://www.speech.zone/courses/one-off/merlin-interspeech2017/ Simon King, Oliver Watts, Srikanth Ronanki, Zhizheng Wu, Felipe Espic, Deep Learning for Text-to-Speech Synthesis, using the Merlin toolkit @ Interspeech 2017]
* [https://www.superlectures.com/interspeech2016/isca-medalist-for-leadership-and-extensive-contributions-to-speech-and-language-processing John Makhoul: A 50-year retrospective on speech and language processing @ Interspeech 2016]
* [https://www.superlectures.com/odyssey2016/voice-conversion-and-spoofing-countermeasures-for-speaker-verification Haizhou Li, Voice conversion and spoofing countermeasures for speaker verification @ Odyssey 2016]
* [https://www.superlectures.com/odyssey2016/understanding-individual-level-speech-variability-from-novel-speech-production-data-to-robust-speaker-recognition Shri Narayanan, Understanding individual-level speech variability: From novel speech production data to robust speaker recognition @ Odyssey 2016]
* [https://www.superlectures.com/iscslp2014/tutorial-4-deep-learning-for-speech-generation-and-synthesis Yao Qian and Frank K. Soong, Deep Learning for Speech Generation and Synthesis @ ISCSLP 2014]
* [https://www.superlectures.com/odyssey2014/speaking-in-adverse-conditions-from-behavioural-observations-to-intelligibility-enhancing-speech-modifications Martin Cooke, Speaking in adverse conditions: from behavioural observations to intelligibility-enhancing speech modifications @ Odyssey 2014]
* [https://www.superlectures.com/asru2011/speech-synthesis-as-a-statistical-machine-learning-problem Keiichi Tokuda, Speech Synthesis as A Statistical Machine Learning Problem @ ASRU 2011]
* [https://www.sp.nitech.ac.jp/~tokuda/tokuda_interspeech09_tutorial.pdf Keiichi Tokuda, Heiga Zen, Fundamentals and recent advances in HMM-based speech synthesis @ Interspeech 2009]


== Youtube videos ==
=== SynSIG ===
SynSIG has a dedicated channel: https://www.youtube.com/channel/UCiNEMZxIjvlsBKlBdAqT-VQ
=== Others ===
* Kim Silverman - Speech Synthesis https://www.youtube.com/watch?v=7mjh0PSUv0M
* Prof. Simon King - Using Speech Synthesis to give Everyone their own Voice https://www.youtube.com/watch?v=xzL-pxcpo-E
* Zhen-Hua Ling - HMM-based Speech Synthesis: Fundamentals and Its Recent Advances https://www.youtube.com/watch?v=MPdOp72bOCA


== Other kinds of teaching materials ==

* [http://tcts.fpms.ac.be/projects/ttsbox/ TTSBOX, A Matlab tutorial toolbox on corpus-based Text-to-Speech synthesis], by Thierry Dutoit, Faculté Polytechnique de Mons, Belgium.
==Samples==
* Access to some [[speech samples]]
== Educational Software ==
* See our list of [[Software|educational software]]



Latest revision as of 14:23, 14 January 2021

SPCC

The Speech Processing Courses in Crete (SPCC) aim to teach graduate students and researchers the latest advances in speech processing, covering theory and hands-on practice, and to establish contacts between academia and industry. The school gives students and professionals the chance to meet world leaders in speech technology, exchange ideas, and share experiences and vision. The Summer School is organized by the University of Crete, Greece.

2020

  • webpage: http://spcc.csd.uoc.gr/
  • For 2020, the school topic is Neural approaches for speech enhancement, synthesis, and coding. The topic includes:
    • Basic components of neural vocoders: WaveNet, Parallel WaveNet, and WaveRNN
    • Deep generative models for speech compression
    • Neural auto-regressive, source-filter and glottal vocoders for speech and music signals
    • Neural vocoders for coding, synthesis, and enhancement


2019

  • webpage: http://spcc.csd.uoc.gr/2019/
  • For 2019, the school topic is Conversational Speech Synthesis: from design to evaluation. The topic includes:
    • Introduction to modern statistical dialogue systems and requirements for a conversational speech synthesis system
    • Modern Acoustic Modelling Approaches (WaveNet, Tacotron)
    • Advanced flexibility for effective conversational TTS: Style token and Voice Conversion
    • Evaluation of conversational and multimodal TTS


2018

  • webpage: http://spcc.csd.uoc.gr/SPCC2018/
  • For 2018, the school topic is Towards Flexible and Intelligible End-to-End Speech Synthesis Systems. The topic includes:
    • Modern Acoustic Modelling Approaches (WaveNet, Tacotron)
    • Contemporary Unit Selection: The art of creating thousands of voices in real products
    • Advanced Voice Conversion using Deep Learning (WaveNet, GAN, etc.)
    • Intelligibility and Cognitive Effort in Speech Synthesis

2017

  • webpage: http://spcc.csd.uoc.gr/SPCC2017/
  • The school topic is Towards Intelligible and Conversational Speech Synthesis Engines. The topic includes:
    • Modern Acoustic Modelling Approaches (DNN/LSTM, WaveNet)
    • Text Normalization and Linguistic Analysis
    • Prosody
    • Advanced Vocoders and Modifications (Voice Conversion)
    • Intelligibility and Cognitive Effort in Speech Synthesis

2016

  • webpage: http://spcc.csd.uoc.gr/SPCC2016/
  • The school topic is Advancements in Modern Speech Synthesis Engines. The topic includes:
    • Advanced Speech Signal Modelling and Modifications
    • Current Acoustic Modelling Approaches
    • Challenges in Front-End Processing
    • Listening Context Aware Speech Synthesis Systems
    • Text Normalization and Linguistic Analysis

Lecture slides used for SPCC 2016 are available online

2015

  • webpage: http://spcc.csd.uoc.gr/SPCC2015/
  • The school topic is From Diphones to Modern Speech Synthesis Engines. The topic includes:
    • Speech Signal Modelling and Modifications
    • Acoustic Modelling: HMM, LDM, DNN
    • Approaches: Diphones, Unit Selection, Statistical, Hybrid
    • Listening Context Aware speech synthesis systems

2014

Lecture slides used for SPCC 2014 are available online


Podcasts

Simon King, Using speech synthesis to give everyone their own voice, Inaugural lecture, University of Edinburgh https://itunes.apple.com/jp/podcast/prof-simon-king-using-speech-synthesis-to-give-everyone/id738501766?i=1000170300147&mt=2

Keiichi Tokuda, Human-like singing and talking machine, Human Language Technology Lecture Series, MIT https://itunes.apple.com/jp/podcast/human-like-singing-and-talking-machines/id787393959?i=1000344067611&mt=2

Historical images

Take a look at our gallery of historical images.

External Links