Effective retrieval and ranking of spoken audio, such as podcasts, is an important problem in Information Retrieval. A typical approach reduces the problem to text retrieval: the audio files are automatically transcribed into text, which is then segmented, indexed, and retrieved. Motivated by the wide spectrum of transcription models that trade quality against cost, we examine how the choice of automatic speech recognition (ASR) model affects the effectiveness of podcast retrieval and re-ranking in both dense and sparse retrieval configurations. Our results show that the choice of transcription model has a measurable impact on both end-to-end retrieval and late-stage re-ranking pipelines, for dense and sparse retrievers alike. Our study highlights the issues and limitations of employing ASR models in podcast search and motivates future research on this important problem.
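To make the transcribe-then-retrieve pipeline concrete, the sketch below walks through the stages named above (transcription, segmentation, indexing, retrieval) for both a sparse and a dense retriever. The specific components are illustrative assumptions rather than the systems evaluated in this work: openai-whisper stands in for the ASR model, rank_bm25 for the sparse retriever, and a sentence-transformers bi-encoder for the dense retriever; the episode files and window size are likewise hypothetical.

```python
# Minimal sketch of a transcribe-then-retrieve podcast search pipeline.
# The chosen models/libraries are assumptions for illustration, not the
# exact systems studied in the paper.
import whisper
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

# 1) Transcription: turn each podcast episode into text with an ASR model.
#    Cheaper/faster ASR models generally trade transcription accuracy for cost.
asr = whisper.load_model("base")
episodes = {"ep1": "ep1.mp3", "ep2": "ep2.mp3"}  # hypothetical audio files
transcripts = {eid: asr.transcribe(path)["text"] for eid, path in episodes.items()}

# 2) Segmentation: split each transcript into fixed-size word windows ("passages").
def segment(text, window=120):
    words = text.split()
    return [" ".join(words[i:i + window]) for i in range(0, len(words), window)]

passages, passage_ids = [], []
for eid, text in transcripts.items():
    for i, seg in enumerate(segment(text)):
        passages.append(seg)
        passage_ids.append(f"{eid}:{i}")

# 3a) Sparse indexing and retrieval: BM25 over whitespace-tokenized passages.
bm25 = BM25Okapi([p.lower().split() for p in passages])

# 3b) Dense indexing and retrieval: bi-encoder embeddings + cosine similarity.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
passage_emb = encoder.encode(passages, convert_to_tensor=True)

def search(query, k=5):
    """Return the top-k passages under the sparse and dense retrievers."""
    sparse_scores = bm25.get_scores(query.lower().split())
    dense_scores = util.cos_sim(
        encoder.encode(query, convert_to_tensor=True), passage_emb
    )[0]
    top_sparse = sorted(zip(passage_ids, sparse_scores), key=lambda x: -x[1])[:k]
    top_dense = sorted(zip(passage_ids, dense_scores.tolist()), key=lambda x: -x[1])[:k]
    return top_sparse, top_dense

print(search("interview about marathon training"))
```

Because every downstream stage operates on the ASR output, transcription errors propagate into the index and into any re-ranker applied on top of the first-stage results, which is the effect the study measures.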