Deliverable identification





LE2 - 4001

SPEECHDAT


DELIVERABLE IDENTIFICATION


Identification number: LE2-4001 - SD1.3.1

Type: Technical Report

Supplementary notes:

Key words: Telephone, speech, database, file, signal, label, table, index

Abstract

This document provides a specification of the telephone speech databases to be collected by the partners in the SpeechDat project.


The specification has been developed in consultation with all partners and is intended to provide a common resource for recording all speech databases; it must be applied identically in each language database. This is intended to ensure that the databases are closely comparable in their exchange value and in their potential for speech technology development in each of the represented languages.


This document provides a detailed specification of the database format, taking particular account of the following aspects: the media format, and the database format in terms of the distinction between speech files, label files, database tables and documentation.

Status of the abstract: Public

Received on:

Recipient's catalogue number:

DOCUMENT EVOLUTION


Version   Date       Status         Notes
1.0       24/07/96   first draft    discussed by WP1 partners
4.4                  final report   delivered to CEC on

Contents

Introduction

1 CD-ROM contents

1.1 Data files

1.2 Documentation files

1.3 Test/train subset partitioning

2 Directory structure

3 File nomenclature

4 Speech File Format

5 Label file format

5.1 Label file header

5.2 Label file body

5.3 Example of label file

6 Table files

6.1 Speaker information file

6.2 Recording condition information file

6.3 Session information table

6.4 Pronunciation lexicon file

7 Index files

7.1 Contents index file

7.2 Corpus contents files

7.3 Speaker list files (SDB only)

8 Documentation files

8.1 Root directory

8.2 \DOC directory

8.3 \SOURCE directory

8.4 \PROMPT directory

9 Summary of all the files described

Appendix A   ISO 9660 media format

Appendix B   ISO 639 Two-letters Language Codes

Appendix C   ISO 8859-1 (Latin 1) characters code

Appendix D   SAM format speaker file

Appendix E   SAM format recording condition file

Bibliography