Immediate-Access Indexing Using Space-Efficient Extensible Arrays

Abstract

The array is a fundamental data object in most programs. Its key functionality – storage of and access to a set of same-type elements in $O$(1) time per operation – is also widely employed in other more sophisticated data structures. In an extensible array the number of elements in the set is unknown at the time the program is initiated, and the array might continue to grow right through the program’s execution. In this paper we explore the use of extensible arrays in connection with the task of inverted index construction. We develop and test a space-efficient extensible array arrangement that has been previously described but not to our knowledge employed in practice, and show that it adds considerable flexibility to the index construction process while incurring only modest run-time overheads as a result of access indirections.

Publication
Proceedings of the 26th Australasian Document Computing Symposium (ADCS 2022)
Date
Links

Errata: “12 segments” in Figure 2 should actually say “24 segments” - Thanks to Vivian McKnight who pointed this out, and my sincere apologies to Alistair Moffat who had the figure right in the first place before I re-created it for the paper…