Silent pauses play a crucial role in text-to-speech synthesis, where they help make the text reading sound more natural.
In this work, our goal is to predict these silent pauses from texts to improve automatic reading systems. As this task has not been extensively studied for French, it is necessary to build training data dedicated to the prediction of pauses.
We propose a strategy for inferring pauses, based on temporal information from transcribed speech, in order to obtain such a corpus. We then show that with the help of a model based on Transformers and appropriate data, it is possible to obtain promising results for the prediction of pauses produced by a speaker during text reading.
This paper has been accepted for publication in the proceedings of the TALN 2023 conference.
The full article is in French.