International Journal of Computer Science Issues

Annotating Speech Corpus for Prosody Modeling in Indian Language Text to Speech Systems

Kiruthiga S and Krishnamoorthy K

Building a spoken language system, whether for speech synthesis or speech recognition, starts with building a speech corpus. We give a detailed survey of the issues involved and a methodology for selecting the appropriate speech unit when building a speech corpus for Indian language Text to Speech systems. The paper ultimately aims to improve the intelligibility of the synthesized speech in Text to Speech synthesis systems. To begin with, an appropriate text file is selected for building the speech corpus. A corresponding speech file is then generated and stored. This speech file is the phonetic representation of the selected text file. The speech file is processed at different levels, viz., paragraphs, sentences, phrases, words, syllables and phones. These are called the speech units of the file. Earlier research has taken each of these units in turn as the basic unit for processing. This paper analyses the research done using phones, diphones, triphones, syllables and polysyllables as the basic unit for speech synthesis. The paper also provides a recommended set of combinations for polysyllables. Concatenative speech synthesis involves the concatenation of these basic units to synthesize intelligible, natural-sounding speech. The speech units are annotated with relevant prosodic information about each unit, either manually or automatically by an algorithm. The database consisting of the units along with their annotated information is called the annotated speech corpus. A clustering technique is applied to the annotated speech corpus, which provides a way to select the appropriate unit for concatenation based on the lowest total join cost of the speech units.

Keywords: Speech corpus, Indian languages, Concatenative Speech synthesis, Syllables, Prosody
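
The selection of units by lowest total join cost, as mentioned in the abstract, can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Unit` fields, the pitch-based `join_cost`, and the `select_units` function are all hypothetical names chosen for illustration, and the clustering step is replaced here by a simple lookup of candidates by syllable label followed by a dynamic-programming search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One annotated speech unit (hypothetical annotation fields)."""
    label: str       # syllable label, e.g. "ka"
    start_f0: float  # pitch in Hz at the start of the unit
    end_f0: float    # pitch in Hz at the end of the unit


def join_cost(left: Unit, right: Unit) -> float:
    # Assumed cost: pitch mismatch at the boundary between two units.
    return abs(left.end_f0 - right.start_f0)


def select_units(targets, corpus):
    """Pick one corpus unit per target label, minimizing the summed join cost."""
    if not targets:
        return [], 0.0
    # Candidate units for each target position, matched by label.
    candidates = [[u for u in corpus if u.label == t] for t in targets]
    if any(not c for c in candidates):
        raise ValueError("corpus has no unit for at least one target label")

    # best[i][j] = (lowest cost of a path ending in candidates[i][j],
    #               index of the predecessor in candidates[i - 1])
    best = [[(0.0, None) for _ in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for unit in candidates[i]:
            cost, prev = min(
                (best[i - 1][k][0] + join_cost(p, unit), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost, prev))
        best.append(row)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda idx: best[-1][idx][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total
```

Calling `select_units(["ka", "ma", "la"], corpus)` on a list of `Unit` objects returns the chosen sequence together with its total join cost. A practical system would typically combine several acoustic features in the join cost and add a target cost derived from the prosodic annotations.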

ABOUT THE AUTHORS

Kiruthiga S
Kiruthiga S is a research scholar at Anna University Coimbatore. She obtained her Master of Engineering degree in 2006 from Anna University Chennai and completed her Bachelor of Engineering degree at Bharathiyar University in 2003. She worked as a Lecturer at Sona College of Technology, Salem, for 4 years. Her research interests include Text to Speech synthesis systems and other Natural Language Processing applications.

Krishnamoorthy K
Krishnamoorthy K is a Professor at Sudharsan Engineering College, Pudukkottai. He is a recognized research supervisor at Anna University Coimbatore. He obtained his Ph.D. degree in 2007 and his Master of Engineering degree in 2003, both from Dayananda Sagar University, Bangalore. He has a decade of experience as an academician. His research interests include Natural Language Processing and Digital Image Processing.
