fix(vector/lancedb): return [] from retrieve() for an empty id list #3083
Open
Vasilije1990 wants to merge 1 commit into
retrieve() built an "id IN ()" filter for an empty data_point_ids list, which lance rejects as a SQL parse error. Sibling adapters return [] for the empty case (pgvector via range(0,0), and others guard explicitly), and has_new_chunks() reaches this path with an empty list when data_chunks is empty. Guard the empty case before building the filter.

Re-applies #3077, which was merged to main and reverted by #3082; it now lands on dev (the correct base branch per the repo branching policy).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
LanceDBAdapter.retrieve() builds an id IN (...) filter from data_point_ids. For an empty list that becomes id IN (), which lance rejects as a SQL parse error.

This guards the empty case and returns [] before building the filter, matching the existing len == 1 special case that already works around lance's single-element-tuple SQL quirk.

Why this is the correct behavior (validated against sibling adapters)

- pgvector: range(0, len(ids), BATCH) is range(0, 0) → no iterations → returns [].
- opengauss, singlestore, and others guard explicitly (if not data_point_ids: return []) or loop over ids (empty loop → []).
- has_new_chunks() reaches this path with an empty id list when data_chunks is empty.

lancedb was the outlier that crashed.
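
For reviewers who want to see the failure mode without a database, here is a minimal sketch. It assumes the filter is produced by formatting a Python tuple into the string, which would explain both the empty-list and the single-element quirks; the function names, the run_query and fetch_batch callables, and the BATCH value are hypothetical and not taken from the adapter code.

```python
from typing import Callable, List, Sequence

BATCH = 2  # hypothetical batch size, for illustration only


def naive_filter(ids: Sequence[str]) -> str:
    """Format the ids as a Python tuple directly into the filter (assumed approach)."""
    # naive_filter([])    -> "id IN ()"      (empty list, not valid SQL)
    # naive_filter(["a"]) -> "id IN ('a',)"  (trailing comma from the 1-tuple repr)
    return f"id IN {tuple(ids)}"


def retrieve_sketch(
    ids: Sequence[str], run_query: Callable[[str], List[dict]]
) -> List[dict]:
    """Guarded lookup: never hands an empty or 1-tuple filter to the query engine."""
    if not ids:
        # The guard this PR describes: nothing to look up, so skip the filter.
        return []
    if len(ids) == 1:
        # Single-element special case: use equality instead of IN.
        return run_query(f"id = '{ids[0]}'")
    return run_query(naive_filter(ids))


def batched_retrieve_sketch(
    ids: Sequence[str], fetch_batch: Callable[[Sequence[str]], List[dict]]
) -> List[dict]:
    """Batch-loop style lookup: range(0, 0) yields no iterations for empty input."""
    results: List[dict] = []
    for start in range(0, len(ids), BATCH):
        results.extend(fetch_batch(ids[start:start + BATCH]))
    return results
```

With an empty list, retrieve_sketch returns [] without calling run_query at all, and batched_retrieve_sketch returns [] because its loop body never runs.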
Re-apply note
This re-applies #3077, which was merged to main and then reverted by #3082. It now lands on dev, the correct base branch per the repo branching policy.

Test

Adds test_retrieve_empty_ids_returns_empty asserting retrieve(collection, []) returns [] and short-circuits before the filter path. Full test_lancedb_lifecycle.py suite: 5 passed.

🤖 Generated with Claude Code
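
The shape of such a test can be sketched roughly as follows. This is not the test from the PR: FakeAdapter is a hypothetical stand-in that records every filter it would send, so the assertion can check both the return value and the short-circuit. Only the test name and the retrieve(collection, []) call come from the description above.

```python
import asyncio
from typing import List, Sequence


class FakeAdapter:
    """Hypothetical stand-in for the adapter; records filters instead of querying."""

    def __init__(self) -> None:
        self.filters_seen: List[str] = []

    async def _query(self, where: str) -> List[dict]:
        # Record the filter so the test can tell whether this path was reached.
        self.filters_seen.append(where)
        return []

    async def retrieve(
        self, collection_name: str, data_point_ids: Sequence[str]
    ) -> List[dict]:
        if not data_point_ids:
            return []  # empty-list guard: the filter path is never entered
        if len(data_point_ids) == 1:
            return await self._query(f"id = '{data_point_ids[0]}'")
        return await self._query(f"id IN {tuple(data_point_ids)}")


def test_retrieve_empty_ids_returns_empty() -> None:
    adapter = FakeAdapter()
    result = asyncio.run(adapter.retrieve("collection", []))
    assert result == []
    # No filter was built or sent, so the call short-circuited.
    assert adapter.filters_seen == []
```

Recording the filters rather than mocking a database keeps the check focused on the one behavior the PR changes.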