Communications of International Proceedings

Expanding the Functionality of Speech Corpora for Broader Applications

AI and Advanced Data-Driven Technologies: 44AI 2024

Michal HALON and Andrzej PACUT

NASK National Research Institute, Warsaw, Poland

Volume 2024 (22), Article ID 4441424, AI and Advanced Data-Driven Technologies: 44AI 2024

Abstract

Numerous speech corpora have been developed for a variety of purposes. Although many were designed for a specific topic, they can often be adapted for broader applications, which is especially valuable when suitable datasets for a given domain or language are scarce. This paper introduces a method for adapting existing speech corpora for use in biometric recognition and personalized speech synthesis. The adaptation requires verifying the accuracy of key metadata in the processed dataset, such as gender labeling and speaker attribution. To accomplish this, we propose a method that analyzes the distributions of biometric verification results obtained for voice samples. Potential inaccuracies are flagged for subsequent listening analysis by a human expert. The method was tested on the Clarin-PL Polish speech corpus using Phonexia software, resulting in improved biometric recognition metrics, including the Equal Error Rate (EER) and the False Acceptance and False Rejection Rates (FAR/FRR). Our findings demonstrate that this approach can significantly enhance the reliability and applicability of speech corpora in extended applications. By reducing the need for extensive manual verification, the proposed method facilitates broader utilization of existing speech corpora for advanced biometric and speech synthesis tasks.

Keywords: Speech corpus, voice biometrics, speech synthesis.
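
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how FAR, FRR, and an approximate EER can be computed from verification scores, and how samples might be flagged for expert listening. It is not the authors' implementation and does not use the Phonexia API. The function names, the example scores, and the flagging rule (mean same-speaker score below a threshold) are illustrative assumptions.

```python
# Illustrative sketch only: the scores, names, and flagging rule are assumptions,
# not the method or data described in the paper.

def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a score >= threshold counts as an acceptance."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold) for the threshold where |FAR - FRR| is smallest;
    the EER is reported as the mean of FAR and FRR at that point.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    _, eer, threshold = best
    return eer, threshold


def flag_suspect_samples(scores_by_sample, threshold):
    """Return sorted IDs of samples whose mean same-speaker score is below threshold.

    scores_by_sample maps a (hypothetical) sample ID to the verification scores
    of that sample against other samples attributed to the same speaker.
    """
    return sorted(
        sample_id
        for sample_id, scores in scores_by_sample.items()
        if sum(scores) / len(scores) < threshold
    )


if __name__ == "__main__":
    # Made-up scores: same-speaker (genuine) and different-speaker (impostor) trials.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.1, 0.2, 0.3, 0.6]
    eer, thr = equal_error_rate(genuine, impostor)
    print(f"EER={eer:.2f} at threshold={thr}")

    # Samples flagged here would be passed to a human expert for listening.
    per_sample = {"spk1_a": [0.9, 0.8], "spk1_b": [0.2, 0.3]}
    print(flag_suspect_samples(per_sample, thr))
```

In this sketch, a sample that scores poorly against other recordings attributed to the same speaker is only a candidate for mislabeling. The final decision is left to the human listener, in line with the workflow the abstract describes.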
