Inline filtered search with adaptive L#1131
Conversation
…en/two-queue-adaptive-l
…en/two-queue-adaptive-l
…en/two-queue-adaptive-l
…en/two-queue-adaptive-l
hildebrandmw
left a comment
There was a problem hiding this comment.
Thanks Magdalen! In addition to my inline comments - can you also add some integration tests exercising the functionality here? These go a surprisingly long way towards protecting the algorithm.
Also, can we bikeshed InlineSearch a little? Maybe FilteredSearch? Or InlineFilteredSearch? Not a huge deal, but InlineSearch seems a little opaque to me.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1131 +/- ##
==========================================
+ Coverage 89.45% 90.56% +1.11%
==========================================
Files 484 487 +3
Lines 91407 92443 +1036
==========================================
+ Hits 81765 83722 +1957
+ Misses 9642 8721 -921
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
I've added some integration test. In addition to throwing some of the existing cases in multihop at it, I also designed a test that should return different results depending on whether or not the adaptive L feature is enabled. Regarding the naming, I named it |
There was a problem hiding this comment.
Pull request overview
This PR adds a new “inline filtered search” path (with optional adaptive L scaling) to the diskann graph search API and wires it through the benchmark harness and test suite.
Changes:
- Introduces
InlineSearch+AdaptiveLindiskann/src/graph/search/inline_filter_search.rsand re-exports them fromdiskann::graph::search. - Adds end-to-end and algorithm-focused tests for inline filtered traversal behavior.
- Extends
diskann-benchmark/diskann-benchmark-coreto support a newtopk-inline-filtersearch phase, including an example JSON and integration test.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann/src/graph/test/cases/multihop.rs | Exposes test helpers/filters (pub(super)) for reuse by the new inline test suite. |
| diskann/src/graph/test/cases/mod.rs | Registers the new inline test module. |
| diskann/src/graph/test/cases/inline.rs | Adds end-to-end tests for index.search(InlineSearch { .. }), including adaptive-L behavior. |
| diskann/src/graph/search/multihop_search.rs | Alters scratch initialization (currently introduces a safety/correctness regression if scratch is reused). |
| diskann/src/graph/search/mod.rs | Adds the inline search module and publicly re-exports InlineSearch/AdaptiveL. |
| diskann/src/graph/search/inline_filter_search.rs | Implements inline filtered search and adaptive-L sizing; includes unit tests. |
| diskann-benchmark/src/main.rs | Adds an integration test that runs the new inline-filter example config. |
| diskann-benchmark/src/inputs/graph_index.rs | Adds TopkInlineFilter search phase + adaptive-L config parsing/validation. |
| diskann-benchmark/src/backend/index/spherical.rs | Registers and implements the inline-filter search plugin for spherical backend. |
| diskann-benchmark/src/backend/index/search/plugins.rs | Adds the TopkInlineFilter plugin type/kind mapping. |
| diskann-benchmark/src/backend/index/search/knn.rs | Adds Knn trait implementation for benchmark_core::search::graph::InlineSearch. |
| diskann-benchmark/src/backend/index/benchmarks.rs | Registers and implements the inline-filter search plugin for the main backend. |
| diskann-benchmark/example/graph-index-inline-filter.json | Adds a runnable benchmark example for topk-inline-filter with adaptive_l. |
| diskann-benchmark-core/src/search/graph/mod.rs | Exposes the new benchmark-core inline graph search helper module/type. |
| diskann-benchmark-core/src/search/graph/inline.rs | Adds benchmark-core InlineSearch wrapper + tests for inline filtering behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
hildebrandmw
left a comment
There was a problem hiding this comment.
Thanks Magdalen, my main concern here is getting in decent test coverage using the baseline testing infrastructure. It's more robust than ad-hoc metrics since it can store and validate a lot more information.
Also to bike shed, I do think InlineSearch is a bit vague. Maybe InlineFilteredSearch? Or FilteredKnn?
Refactor integration tests to (1) ensure that the piecewise regime of adaptive-L is properly hit and (2) ensure there is differentiation between the adaptive algorithm and the standard fixed algorithm. This uses a 1D grid with an exactly seeded filter to ensure the target specificities are hit during the search to the query. By putting additional filter points just outside the window that can be reached with a standard search, we can verify that the various regimes of adaptive-L are properly hit. --------- Co-authored-by: Mark Hildebrand <mhildebrand@microsoft.com>
## Breaking changes since 0.53.0 ### Graph search: `DataProvider` contract collapsed into `SearchAccessor` (#1067) `Accessor`, `BuildQueryComputer`, `ExpandBeam`, `SearchExt`, and `AsNeighbor`/`NeighborAccessor` are merged into a single `SearchAccessor` trait. The indexing layer no longer has a notion of element types. - **Upgrade:** Implement `SearchAccessor` instead of the removed traits; use `SearchAccessor::expand_beam` for search. `SearchStrategy`/`InsertStrategy`/`DefaultSearchStrategy`/`DefaultPostProcessor` now carry a lifetime, and the query is passed into `search_accessor` (accessors may now borrow the query). `SearchPostProcess` no longer takes a `QueryComputer` (only requires `HasId`). The blanket `workingset::Fill` impl for `workingset::Map` was removed — implement `Fill` yourself, or use the new synchronous `Map::fill` helper. ### Insert/prune: consolidated into `PruneAccessor` (#1138, follow-up to #1067) Removed `DelegateNeighbor`, `AsNeighbor`, `AsNeighborMut`, `HasElementRef`, `BuildDistanceComputer`, `workingset::Fill`, and `workingset::AsWorkingSet`, folded into a single `PruneAccessor` trait. - **Upgrade:** Implement `PruneAccessor` (provides `neighbors()` for neighbor delegation and `fill()` returning both a `View` and the distance computer). Note `neighbors()` now borrows `&mut self`. ### `VectorId`: removed scalar conversion traits/bounds (#1145, #1133) Dropped `VectorIdTryFrom`, `TryIntoVectorId`, methods `vector_id_try_from`/`try_into_vector_id`, helpers `vecid_from_u32`/`vecid_from_usize`, and `IdConversionError`/`ErrorToVectorId`. Internal IDs no longer need to convert to/from `usize`. - **Upgrade:** Where `usize` conversion is still required (e.g. roaring-treemap keys in `diskann-label-filter`), add an explicit `IntoUsize` bound (now required on `RoaringAttributeStore`, `InlineBetaStrategy`, `QueryBitmapEvaluator`/`BitmapFilter`). `DiskANNIndex::prune_range` now takes `impl IntoIterator<Item = DP::InternalId> + Send` instead of `Range<DP::InternalId>` — construct the iterator for your ID type at the call site. `SimpleNeighborProviderAsync` and `bftree::VectorProvider` are no longer generic over the ID type (fixed to `u32`). ### `DiskIndexReader`: dropped vestigial `VectorType` generic (#1161) - **Upgrade:** Replace `DiskIndexReader::<T>::new(...)` with `DiskIndexReader::new(...)`. ### Filtered search renames (#1149) `MultihopSearch` → `MultihopFilterSearch`; benchmark config phases `MultiHopSearchPhase`/`InlineSearchPhase` → `MultihopFilterSearchPhase`/`InlineFilterSearchPhase`. - **Upgrade:** Update references to the new names. ### `diskann-garnet` FFI: BIN/Q8 quantizers, bumped to 2.0.0 (#1050) Vectors are now stored as `Poly<[u8], AlignOfEight>`; a type-erased `GarnetQuantizer` trait replaces index/provider type parameterization. New FFI: `insert()` returns a success/training-ready flag, plus `build_quant_table()` and `backfill_quant_vectors()` for caller-driven async training/backfill. Accessor renamed to `DynamicAccessor`; the FSM is now lockable and gained `visit_used()`. - **Upgrade:** Garnet consumers must adopt the new 2.0.0 FFI surface (handle the new `insert()` return flag and drive `build_quant_table`/`backfill_quant_vectors`). ## Notable fixes & features (non-breaking) - **Concurrency:** Fixed an ID double-minting race during concurrent insert/delete; delete now removes mapping/attributes before marking the ID free (#1146). Fixed garnet handling of missing quant vectors during `delete()` (#1130). - **Quantization:** Fixed quantizer detection in `train_quantizer()` / `set_element()` (#1140). - **Filtered search:** Added inline filtered search with adaptive `L` (#1131), per the filtered-search RFC (#1128). - **bftree:** Fixed a bug writing the length to the neighbor list (#1150). - **Benchmarks/infra:** Multi-vector MaxSim benchmark with BYOTE factory (#1027); `bf_tree` benchmark infrastructure (#1106); spherical exhaustive benchmark threadpool fix (#1148); right-sized `tiled_reduce` tile buffer (#1123). - `BufferedDistance` accepts `UnalignedSlice`: added a `PreprocessedDistanceFunction<UnalignedSlice<'_, T>>` impl for `BufferedDistance` (#1113).
Introduce a `FilteredAccessor` as a filter aware version of
`SearchAccessor`.
## Motivation
Our current hook for performing filtered search relies on passing a
`QueryLabelProvider` externally to all call sites. This has the
following disadvantages:
* The caller must somehow know exactly the ID space of the provider.
* The `QueryLabelProvider` is hard-coded to use the accessor's internal
ID, which may not always be the best fit depending on context.
* `QueryLabelProvider` evaluations always live behind a trait object and
are just queried one at a time.
* The search accessor has no knowledge of the filter.
* Splitting the label check with distance computations opens a
time-of-check, time-of-use window that can be tricky to resolve.
## How
This PR boils down to the following enums and trait:
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct Accept<T>(T);
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct Reject<T>(T);
#[derive(Debug, Clone, Copy)]
pub enum Decision<T> {
Accept(Accept<T>),
Reject(Reject<T>),
}
pub trait FilteredAccessor: HasId + Send + Sync {
fn start_point_distances<F>(
&mut self,
f: F,
) -> impl std::future::Future<Output = ANNResult<()>> + Send
where
F: FnMut(Decision<Self::Id>, f32) + Send;
fn expand_beam_filtered<Itr, P, F>(
&mut self,
ids: Itr,
pred: P,
on_neighbors: F,
) -> impl std::future::Future<Output = ANNResult<()>> + Send
where
Itr: Iterator<Item = Self::Id> + Send,
P: HybridPredicate<Self::Id> + Send + Sync,
F: FnMut(Decision<Self::Id>, f32) + Send;
fn expand_beam_accept_only<Itr, P, F>(
&mut self,
ids: Itr,
pred: P,
on_neighbors: F,
) -> impl std::future::Future<Output = ANNResult<()>> + Send
where
Itr: Iterator<Item = Self::Id> + Send,
P: HybridPredicate<Accept<Self::Id>> + Send + Sync,
F: FnMut(Accept<Self::Id>, f32) + Send;
fn terminate_early(&mut self) -> bool {
false
}
fn num_starting_points(&self) -> impl std::future::Future<Output = ANNResult<usize>> + Send {
// provided
}
}
```
The existing multi-hop search is rewritten to use `FilteredAccessor` and
#1131 can be easily rewritten this way as well. This makes the
`FilteredAccessor` "filter-aware" and allows the stated problems to be
solved (the TOCTOU potential is really pushed down to the accessor, but
that's the layer that knows the concurrency story so it's in a much
better place to solve it anyways).
To reduce code churn, I added a new module
`diskann::graph::ext::labeled` that is the new home for
`QueryLabelProvider` and introduces a wrapping `FilteredAccessor` and
associated strategy. The functionality in this module is intended as a
compatibility layer for code that is currently relying on
`QueryLabelProvider`.
## Two new `expand_*` methods
Unfortunately, the semantics of filtered search for
`MultiHopFilteredSearch` basically requires the use of two different
`expand_*` methods. This is because the accept-only path *really* needs
only accepted items to be passed to `pred.eval_mut`. This PR models that
by bespoke `Accept`/`Reject` types and because of this nuance, two
methods seemed like the most robust way of expressing this.
## Bug Fixes in MultiHop
With this change, it was clear that multi-hop search was slightly buggy
with respect to start point handling: start points were treated as
always fulfilling the predicate. This PR goes ahead and fixes that bug
by tracking rejected start points explicitly and filtering them out in
the post-process call.
## What's up with `QueryVisitDecision`
This PR gets rid of `QueryVisitDecision`. It is conflating a large
number of unrelated decisions.
* It was almost never used properly: no code outside of the core
`diskann` ever bothered with it, leading to semantic issues with, for
example, beta-filtered search never calling `on_visit`.
* Multi-hop search wasn't even using it right when adding items from
`two_hop_neighbors`. To actually use it right would require a redundant
interrogation of the `QueryLabelProvider`.
* The decision to terminate is weird - that decision belongs with the
accessor.
* Tweaking distance is sketchy and not clear if it's intended to
influence navigation or simply be recorded and passed to post process.
Upcoming ID refactors will help restore this functionality *and* make it
clear what the intent is.
Dropping this extra functionality means we can ditch the multi-hop tests
specifically testing this functionality.
## Suggested Review Order
* Start with the first commit. Since it seemed cleanest to still use
`SearchStrategy`, this commit relaxes the `SearchAccessor` trait bound
on `SearchStrategy::SearchAccessor` to allow the disjoint
`FilteredAccessor` to join the party. This commit is the one responsible
for most of the churn in random parts of the code.
* `diskann/src/graph/glue.rs`: The new `FilteredAccessor` and associated
enums.
* `diskann/src/graph/search/multihop_search.rs`: Refactored multi-hop
search to use the new trait.
* `diskann/src/graph/ext/labeled.rs`: The adaptor layer that allows code
currently written for `QueryLabelProvider` to keep working.
*
`diskann/test/generated/graph/test/cases/multihop/callback_filtering_grid.json`
was deleted since it's corresponding test was deleted. We no longer
support distance tweaking.
This PR implements the recommendation in the filtered search RFC to implement inline filtering with the adaptive L method as an optional addition.