Examining the Impact of Transcript Variation on Podcast Search and Re-Ranking

Abstract

Effectively retrieving and ranking spoken audio, such as podcasts, is an important problem in Information Retrieval. A typical approach reduces the problem to text retrieval: the audio files are automatically transcribed to text, which is then segmented, indexed, and retrieved. In this work, we examine how the choice of automatic transcription model affects the effectiveness of podcast retrieval and re-ranking in dense and sparse retrieval configurations, motivated by the wide range of transcription models currently available at different quality-cost trade-offs. Our results demonstrate that the choice of transcription model has a measurable impact on both end-to-end retrieval and late-stage re-ranking pipelines, for both dense and sparse retrievers. Our study highlights the issues and limitations of employing automatic speech recognition (ASR) models in podcast search and motivates future research on this important problem.

Publication
Proceedings of the 47th European Conference on Information Retrieval (ECIR 2025)