Examining the Additivity of Top-$k$ Query Processing Innovations


Research activity spanning more than five decades has led to index organizations, compression schemes, and traversal algorithms that allow extremely rapid response to ranked queries against very large text collections. However, little attention has been paid to the interactions between these many components, and the additivity of algorithmic improvements has not been explored. Here we examine the extent to which efficiency improvements add up. We employ four query processing algorithms, four compression codecs, and all possible combinations of four distinct further optimizations, and compare the performance of the 256 resulting systems to determine when and how different optimizations interact. Our results over two test collections show that efficiency enhancements are, for the most part, additive, and that there is little risk of negative interactions. In addition, our detailed profiling across this large pool of systems leads to key insights as to why the various individual enhancements work well, and indicates that optimizing ‘simpler’ implementations can result in higher query throughput than is available from non-optimized versions of the more ‘complex’ techniques, with clear implications for the choices needing to be made by practitioners.

Proceedings of the 29th International Conference on Information and Knowledge Management (CIKM 2020)