Neural Networks Engineering

Filterable HNSW - part 2

In a previous article on filtering during nearest-neighbor search, we discussed the theoretical background.
This time I am going to present a C++ implementation with Python bindings.

As the base implementation I took hnswlib, a stand-alone, header-only implementation of HNSW.

With the new implementation, it is now possible to assign an arbitrary number of tags to any point with a single call:

# ids - list of point ids
# tag - tag id
hnsw.add_tags(ids, tag)

The group of points sharing a tag can then be searched separately from the others:

query_vector = ...
tag_to_search_in = 42
# Search only among points that have this tag.
# Each pair is (negated, tag); False means the tag must be present.
condition = [[(False, tag_to_search_in)]]
labels, dist = hnsw.knn_query(query_vector, k=10, conditions=condition)

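These groups can also be combined using boolean expressions. The condition is a list of clauses joined by AND, and each clause is a list of (negated, tag) pairs joined by OR. For example, (A | !B) & C is represented as [[(0, A), (1, B)], [(0, C)]], where A, B and C are true when the respective tag is assigned to a point.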
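
To make the encoding concrete, here is a minimal sketch of how such a condition could be evaluated against the tags of a single point. It is plain Python and independent of the C++ implementation. The function name matches_condition and the tag ids are made up for illustration, and the (negated, tag) reading of each pair is inferred from the example above.

```python
def matches_condition(point_tags, condition):
    """Return True if a point's tags satisfy a condition in the list format above.

    condition: list of clauses combined with AND.
    clause:    list of (negated, tag) pairs combined with OR.
    """
    tags = set(point_tags)
    return all(
        # A pair holds when "tag is present" differs from "negated".
        any((tag in tags) != bool(negated) for negated, tag in clause)
        for clause in condition
    )


# Hypothetical tag ids standing in for A, B and C.
A, B, C = 1, 2, 3
condition = [[(0, A), (1, B)], [(0, C)]]  # (A | !B) & C

print(matches_condition({A, C}, condition))  # True: A holds, C holds
print(matches_condition({B, C}, condition))  # False: neither A nor !B holds
print(matches_condition({C}, condition))     # True: !B holds, C holds
```

In this sketch an empty condition matches every point, which is just how all() behaves on an empty list and is not a statement about the library.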

If the group is large enough (its share of all points is much greater than 1/M, where M is the HNSW connectivity parameter), knn_query should work fine. For smaller groups, it may be necessary to build additional connections in the HNSW graph for that group:

hnsw.index_tagged(tag=42, m=8)
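
As a rough worked example of the 1/M rule of thumb, the sketch below computes how many times larger than 1/M a tagged group is. The helper name group_to_threshold_ratio and the numbers are illustrative assumptions, and M is assumed to be the connectivity parameter of the main index. A ratio far above 1 suggests plain filtered search is enough, while a ratio near or below 1 suggests building the extra connections.

```python
def group_to_threshold_ratio(group_size, total_points, m):
    """Return (group_size / total_points) divided by the 1/M threshold.

    Hypothetical helper: it only restates the rule of thumb as arithmetic.
    """
    if total_points <= 0 or m <= 0:
        raise ValueError("total_points and m must be positive")
    fraction = group_size / total_points
    return fraction * m  # same as fraction / (1 / m)


# Illustrative numbers: one million points, M = 16, so 1/M = 6.25%.
total, m = 1_000_000, 16

print(group_to_threshold_ratio(500_000, total, m))  # 8.0: well above the threshold
print(group_to_threshold_ratio(5_000, total, m))    # 0.08: far below it
```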

Based on HNSW with categorical filtering, it is possible to build a tool that searches only within a specified geo-region.
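
This post does not say how regions are mapped to tags. One possible scheme, sketched here purely as an assumption, is to tag each point with the id of the latitude/longitude grid cell it falls into and to search a region as an OR over the cells it overlaps, using the condition format shown earlier. The names cell_tag and region_condition, the one-degree cell size, and the coordinates are all hypothetical.

```python
import math


def cell_tag(lat, lon, cell_deg=1.0):
    """Map a coordinate to the integer id of its grid cell (hypothetical scheme)."""
    cols = int(math.ceil(360.0 / cell_deg))
    row = int((lat + 90.0) // cell_deg)
    col = int((lon + 180.0) // cell_deg) % cols
    return row * cols + col


def region_condition(lat_min, lat_max, lon_min, lon_max, cell_deg=1.0):
    """Build a single OR clause over all cells overlapping a bounding box.

    Assumes the box does not cross the antimeridian.
    """
    cols = int(math.ceil(360.0 / cell_deg))
    row_lo = int((lat_min + 90.0) // cell_deg)
    row_hi = int((lat_max + 90.0) // cell_deg)
    col_lo = int((lon_min + 180.0) // cell_deg)
    col_hi = int((lon_max + 180.0) // cell_deg)
    clause = [
        (False, row * cols + (col % cols))
        for row in range(row_lo, row_hi + 1)
        for col in range(col_lo, col_hi + 1)
    ]
    return [clause]  # one clause: a point matches if it lies in any listed cell


# Illustrative bounding box; each point would have been tagged with cell_tag(lat, lon).
condition = region_condition(52.0, 53.5, 13.0, 14.5)
print(len(condition[0]))  # number of grid cells covered by the box
```

Cells are coarse, so results near the border of the box may fall slightly outside it; an exact geometric check on the returned points would be needed if that matters.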

Find the full version of this article, with more examples and explanations, on my blog.