Hierarchical Navigable Small World: a scalable nearest neighbor search

janalsncm 6 hours ago

HNSW was the first approximate nearest neighbor search algorithm I encountered. There are others that are useful in different contexts (i.e. depending on your data size and available memory). And you can combine them sometimes. This video is a really awesome overview of them:

https://youtu.be/SKrHs03i08Q

sakras 8 hours ago

What really made HNSW click for me was when someone described it as a "skiplist graph". It really does seem to share a lot of the same properties and intuition as a simple skiplist, where descending levels corresponds to finer-grained navigation through the space.

A couple of other thoughts with this perspective: - if a drawing of a skiplist is 2D, HNSW seems to be a 3D generalization. It makes me wonder if you could apply an HNSW-like scheme for a traditional relational database index (maybe a multicolumn index?). - if a skiplist is a probabilistic alternative to a B+-tree, what's HNSW a probabilistic alternative to?

One thing that still mystifies me is how in the context of vector indexing, the HNSW seems to "bootstrap" itself by using itself to find the neighbors of a newly-inserted point. It's not clear to me why that doesn't diverge to some garbage results as the index grows.

3abiton 8 hours ago

Interesting read, I worked on graph theory for a while some long time ago and did not even know some of these terminologies.