Abstract:
Automatic Speech Recognition (ASR), the conversion of spoken language into written text, has become an important component of modern communication systems. This paper presents an in-depth analysis of speech recognition for Gujarati, a language with distinctive phonetic and tonal characteristics that is often spoken in multilingual settings. The study focuses on a curated Gujarati speech corpus comprising speakers of various ages, genders, and regional backgrounds. The corpus is subjected to
detailed acoustic analysis, exploring the prosodic features and tonal variations inherent in the language. Through the development and evaluation of ASR models, this research investigates the challenges and opportunities posed by the phonemic complexity and tonal nuances of Gujarati. The findings show how corpus characteristics, including speaker diversity and phonemic inventory, affect ASR model performance. This research contributes insights into effective
ASR model design and training strategies for tonal languages, with a focus on the linguistic and acoustic properties of Gujarati. The outcomes point to directions for further work on ASR technology and corpus analysis, in particular on accurately capturing the linguistic features of tonal languages to build robust speech recognition systems.