
fix(vector/lancedb): return [] from retrieve() for an empty id list#3083

Open
Vasilije1990 wants to merge 1 commit into dev from fix/lancedb-retrieve-empty-ids-dev

What

LanceDBAdapter.retrieve() builds an id IN (...) filter from data_point_ids. For an empty list that becomes id IN (), which lance rejects:

RuntimeError: lance error: Invalid user input: Error parsing statement: ... WHERE id IN ()

This guards the empty case and returns [] before building the filter — matching the existing len == 1 special-case that already works around lance's single-element-tuple SQL quirk.
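The three cases described above (empty, single id, several ids) can be sketched as a standalone helper. This is a minimal illustration only: `build_id_filter` is a hypothetical name, not a function in the adapter, and the use of equality for the single-id case is an assumption about how the existing special-case is written.

```python
from typing import Optional, Sequence


def build_id_filter(data_point_ids: Sequence[str]) -> Optional[str]:
    """Hypothetical helper: build a filter string, or None if there is nothing to query."""
    if not data_point_ids:
        # Empty input: tell the caller to return [] instead of emitting "id IN ()".
        return None
    if len(data_point_ids) == 1:
        # Single id: assumed here to be handled with plain equality.
        return f"id = '{data_point_ids[0]}'"
    # Two or more ids: a regular IN list.
    quoted = ", ".join(f"'{point_id}'" for point_id in data_point_ids)
    return f"id IN ({quoted})"
```

A caller would check for `None` first and return `[]` without touching the table, which is the short-circuit this PR adds.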

Why this is the correct behavior (validated against sibling adapters)

  • pgvector: range(0, len(ids), BATCH) is range(0, 0) → no iterations → returns [].
  • Several other adapters guard explicitly (opengauss, singlestore: if not data_point_ids: return []) or loop over ids (empty loop → []).
  • has_new_chunks() reaches this path with an empty id list when data_chunks is empty.

lancedb was the outlier that crashed.
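To illustrate why a batching loop returns [] for an empty id list without any explicit guard, here is a small sketch. `retrieve_in_batches`, `fetch`, and the `BATCH` value are hypothetical stand-ins, not the pgvector adapter's actual code.

```python
from typing import Callable, List, Sequence

BATCH = 2  # hypothetical batch size, chosen only for the example


def retrieve_in_batches(
    ids: Sequence[str],
    fetch: Callable[[Sequence[str]], List[str]],
) -> List[str]:
    """Hypothetical batching loop: fetch ids in slices of BATCH and collect the results."""
    results: List[str] = []
    # For an empty list this is range(0, 0), so the loop body never runs.
    for start in range(0, len(ids), BATCH):
        results.extend(fetch(ids[start:start + BATCH]))
    return results
```

With an empty `ids`, `fetch` is never called and the function returns the initial empty list, so no query is ever built.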

Re-apply note

This re-applies #3077, which was merged to main and then reverted by #3082. It now lands on dev, the correct base branch per the repo branching policy.

Test

Adds test_retrieve_empty_ids_returns_empty asserting retrieve(collection, []) returns [] and short-circuits before the filter path. Full test_lancedb_lifecycle.py suite: 5 passed.
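The shape of such a test can be sketched against a stub. Everything here is an assumption made for illustration: `FakeAdapter` stands in for the real adapter, `retrieve` is assumed to be async, and the `queries` list is a hypothetical way to observe whether the filter path was reached. The actual test in test_lancedb_lifecycle.py may look different.

```python
import asyncio
from typing import List


class FakeAdapter:
    """Hypothetical stand-in for the adapter; records every filter it would run."""

    def __init__(self) -> None:
        self.queries: List[str] = []

    async def retrieve(self, collection_name: str, data_point_ids: List[str]) -> List[str]:
        if not data_point_ids:
            # Short-circuit before any filter is built.
            return []
        quoted = ", ".join(f"'{point_id}'" for point_id in data_point_ids)
        self.queries.append(f"id IN ({quoted})")
        return list(data_point_ids)


def check_retrieve_empty_ids_returns_empty() -> None:
    adapter = FakeAdapter()
    result = asyncio.run(adapter.retrieve("collection", []))
    assert result == []
    # No filter was recorded, so the query path was never reached.
    assert adapter.queries == []
```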

🤖 Generated with Claude Code

retrieve() built an "id IN ()" filter for an empty data_point_ids list,
which lance rejects as a SQL parse error. Sibling adapters return [] for
the empty case (pgvector via range(0,0), and others guard explicitly), and
has_new_chunks() reaches this path with an empty list when data_chunks is
empty. Guard the empty case before building the filter.

Re-applies #3077, which was merged to main and reverted by #3082; it now
lands on dev (the correct base branch per the repo branching policy).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>