Blizzard Challenge 2012 Rules
From SynSIG
Simon.King
Revision as of 17:56, 14 November 2011
THESE RULES ARE CURRENTLY UNDER CONSTRUCTION AND ARE SUBJECT TO CHANGE
DATABASE ACCESS
- After registration and completion of the required licenses, download passwords will be issued, as described on the main Blizzard 2012 page.
REGISTRATION FEE
- A registration fee of 500GBP (approx 800USD) is payable by all participants in task EH2.1 to offset the costs of running the challenge, including paying local assistants and listeners. The fee must be paid by April 2012 (exact date to be confirmed later). You can pay this fee using Edinburgh University's online payments system: (details will be published here later) and register for the event called 'Blizzard Challenge 2012'. After doing this, please also email blizzard@festvox.org to notify us that you have paid. If you are really unable to use the online payments system, please contact blizzard@festvox.org for assistance with other methods of payment. However, we strongly prefer the epay system because it reduces the costs and admin work for us. If you must pay by bank transfer, please contact us in plenty of time (several weeks before the payment deadline); an additional charge of 50GBP will be made for any payments not made using the epay system.
EXPERT LISTENERS
- Each participant is expected to provide at least ten speech experts as listeners for the evaluation tests. Native speakers are preferable, where possible. The organisers would also appreciate assistance in advertising the Challenge as widely as possible (e.g., to your students or colleagues).
BUILDING VOICES
- A single participant may not submit multiple entries for any task, because the listening test would become unmanageable. This rule will only be relaxed in the event of a small number of participants.
- Participants involved in joint projects or consortia who wish to submit multiple systems (e.g., an individual entry and a joint system) should contact the organisers in advance to agree this. We will try to accommodate all reasonable requests, provided the listening test remains manageable.
Phase One
- Task EH1.1: build a voice from the supplied audiobook data, which comprises around 50 hours of speech material, of which around 32 hours have high-confidence transcriptions, with the remainder having transcriptions of lower confidence. This voice should be demonstrated at the Blizzard Challenge Workshop 2011, in Turin, Italy. There is no formal evaluation in Phase One.
Phase Two
- Task EH2.1: build a voice from the supplied audiobook data, which comprises around 50 hours of speech material, of which around 32 hours have high-confidence transcriptions, with the remainder having transcriptions of lower confidence. Sentences synthesised using this voice should be submitted to the Blizzard organisers for formal evaluation, by the date specified in the timeline.
USE OF EXTERNAL DATA
- "External data" is defined as data, of any type, that is not part of the provided database.
- You are allowed to use external data in any way you wish, subject to any exclusions given in these rules.
- Use of external data is entirely optional.
- You may use the provided audio files, or you may obtain the original recordings by John Greenman of the following four books by Mark Twain directly from librivox.org:
  - A Tramp Abroad
  - Life on the Mississippi
  - The Adventures of Tom Sawyer
  - The Man That Corrupted Hadleyburg, and Other Stories
- You must not use any additional data from the same speaker (John Greenman), or recordings of any other material by the same author (Mark Twain), or any text by the same author (Mark Twain).
- You may exclude any parts of the provided databases if you wish.
- Use of the provided segmentations, transcriptions or labels is optional.
- If you are in any doubt about how to apply these rules, please contact the organisers immediately.
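As an illustration only, the exclusions above could be applied to a list of candidate external data along the following lines. This is a minimal sketch and not an official tool: the `Item` record, its fields, and the `is_allowed` helper are hypothetical names introduced here, and the sketch assumes a conservative reading in which anything by Mark Twain or John Greenman is excluded unless it is one of John Greenman's recordings of the four listed books. Any case the sketch does not clearly cover should be referred to the organisers.

```python
from dataclasses import dataclass

# Names taken from the rules above; matching is case-insensitive.
EXCLUDED_SPEAKER = "john greenman"
EXCLUDED_AUTHOR = "mark twain"

# The four books whose original librivox.org recordings are explicitly permitted.
PERMITTED_BOOKS = {
    "a tramp abroad",
    "life on the mississippi",
    "the adventures of tom sawyer",
    "the man that corrupted hadleyburg, and other stories",
}


@dataclass
class Item:
    """Hypothetical manifest entry for one piece of candidate external data."""
    kind: str          # "audio" or "text"
    title: str
    author: str
    speaker: str = ""  # left empty for text items


def is_allowed(item: Item) -> bool:
    """Return True if the item passes the exclusions (conservative reading)."""
    title = item.title.strip().lower()
    author = item.author.strip().lower()
    speaker = item.speaker.strip().lower()

    # Explicitly permitted: Greenman's recordings of the four listed books.
    if (
        item.kind == "audio"
        and speaker == EXCLUDED_SPEAKER
        and author == EXCLUDED_AUTHOR
        and title in PERMITTED_BOOKS
    ):
        return True

    # No additional data from the same speaker.
    if speaker == EXCLUDED_SPEAKER:
        return False

    # No other recordings of, or text by, the same author.
    if author == EXCLUDED_AUTHOR:
        return False

    return True


if __name__ == "__main__":
    candidates = [
        Item("audio", "Life on the Mississippi", "Mark Twain", "John Greenman"),
        Item("text", "Adventures of Huckleberry Finn", "Mark Twain"),
        Item("audio", "Emma", "Jane Austen", "Another Reader"),
    ]
    for c in candidates:
        print(c.title, "->", "allowed" if is_allowed(c) else "excluded")
```

The permitted case is checked first so that the two exclusion checks can remain simple blanket tests on speaker and author.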
SYNTHESISING THE TEST EXAMPLES
- Phase One: a set of test sentences will be distributed before the 2011 workshop, but no formal listening test is planned. The test sentences will be drawn from contiguous (e.g., paragraph-sized) sections of novels and will have similar segmentation, transcriptions and labels to the distributed corpus.
RETENTION OF SUBMITTED SYNTHETIC SPEECH SAMPLES
- Any examples that you submit for evaluation will be retained by the Blizzard organisers for future use.
- You must include in your submission of the test sentences a statement of whether you give the organisers permission to publicly distribute your waveforms and the corresponding listening test results in anonymised form. In the past, all participants have agreed to this and we strongly encourage you to give this consent.
LISTENING TEST
- The listening test design is not yet specified and participants are encouraged to contribute ideas for the evaluation of synthesised audiobooks, or other tasks based on this data.
PAPER
- Each participant will be expected to submit a six-page paper describing their entry for review.
- One of the authors of each accepted paper should present it at the Blizzard 2012 Workshop.
- In addition, each participant will be expected to complete a form giving the general technical specification of their system, to facilitate easy cross-system comparisons (e.g., is it unit selection? does it predict prosody?).
HOW ARE THESE RULES ENFORCED?
- This is a challenge designed to answer scientific questions, not a competition. Therefore, we rely on your honesty in preparing your entry.