ERR is not C/W/L: Exploring the Relationship Between Expected Reciprocal Rank and Other Metrics


We explore the relationship between expected reciprocal rank (ERR) and the metrics that are available under the C/W/L framework. On the surface, it appears that the user browsing model associated with ERR can be directly injected into a C/W/L arrangement to produce system measurements equivalent to those generated from ERR. That assumption is now known to be invalid, and the first part of our work demonstrates that ERR cannot be described via any choice of C/W/L components. Given that ERR cannot be accommodated within the C/W/L framework, we then explore the extent to which practical use of ERR correlates with metrics that do fit within the C/W/L user browsing model. In this part of the investigation we present a range of shallow-evaluation C/W/L variants that correlate very highly with ERR in experiments involving a large number of TREC runs. That is, while ERR itself is not a C/W/L metric, there are other weighted-precision computations that fit the user model assumed by C/W/L and yield system comparisons almost indistinguishable from those generated by ERR.
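To make the two kinds of computation concrete, the following is a minimal sketch and is not taken from the paper's experiments. It assumes the standard cascade definition of ERR, in which each document's gain is read as the probability that the user stops there, and it uses rank-biased precision (RBP) purely as an illustrative weighted-precision metric with a constant continuation probability. The function names, the parameter `p`, and the requirement that gains are already scaled to [0, 1] are assumptions made for this example.

```python
from typing import Sequence


def err(gains: Sequence[float]) -> float:
    """Expected reciprocal rank under the cascade model.

    Assumes each gain is already a stop probability in [0, 1].
    The user reaches rank r only if no earlier document satisfied them,
    and is rewarded 1/r for stopping at rank r.
    """
    score = 0.0
    p_reach = 1.0  # probability the user examines the current rank
    for rank, gain in enumerate(gains, start=1):
        score += p_reach * gain / rank
        p_reach *= 1.0 - gain
    return score


def rbp(gains: Sequence[float], p: float = 0.5) -> float:
    """Rank-biased precision: a weighted sum of gains.

    The weight at rank i (1-based) is (1 - p) * p**(i - 1), which
    corresponds to a constant continuation probability p (illustrative).
    """
    return (1.0 - p) * sum(g * p**i for i, g in enumerate(gains))


if __name__ == "__main__":
    ranking = [0.5, 0.5]  # hypothetical gain vector
    print(err(ranking), rbp(ranking))
```

In this sketch ERR credits the reciprocal of the rank at which the user stops, and the chance of reaching a rank depends on the gains seen above it, whereas RBP multiplies each gain by a weight fixed in advance by the rank alone.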

Proceedings of the 2021 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2021)
Winner of the best paper award