Implement TABLETS_ROUTING_V2#913
Conversation
|
Warning Review limit reached
Next review available in: 54 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (15)
📝 WalkthroughWalkthroughThe driver adds tablets routing v2 support across feature negotiation, tablet version block generation, execute message framing, and shard selection. Response handling now parses tablets-routing-v1/v2 payloads into cluster tablet metadata, and schema parsing marks keyspaces as strongly consistent from Scylla system tables. Token-aware routing now uses tablet replicas and prefers the tablet leader for strongly consistent keyspaces. Sequence Diagram(s)sequenceDiagram
participant Client
participant ResponseFuture
participant HostConnection
participant ExecuteMessage
participant ScyllaServer
participant ClusterMetadata
Client->>ResponseFuture: execute(query)
ResponseFuture->>ResponseFuture: compute tablet_version_block
ResponseFuture->>HostConnection: borrow_connection(..., query)
HostConnection-->>ResponseFuture: connection
ResponseFuture->>ExecuteMessage: set tablet_version_block
ResponseFuture->>ScyllaServer: send EXECUTE
ScyllaServer-->>ResponseFuture: custom_payload tablets-routing-v1/v2
ResponseFuture->>ClusterMetadata: cache tablets and keyspace consistency
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Implements Scylla TABLETS_ROUTING_V2 negotiation and request/response handling in the driver, including encoding the per-request tablet_version_block, parsing the new tablets-routing-v2 custom payload (with tablet_version), and adding leader-aware routing behavior for strongly-consistent keyspaces.
Changes:
- Add TABLETS_ROUTING_V2 protocol feature negotiation and ensure V2 subsumes V1 in STARTUP options.
- Attach a per-request
tablet_version_blockto EXECUTE messages on V2-negotiated connections; parse and cache V2 tablet routing payloads includingtablet_version. - Introduce strongly-consistent keyspace detection (Scylla-only) and leader-first routing in
TokenAwarePolicy; add unit and integration coverage.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/test_tablets.py | Adds unit tests for tablet_version_block encoding and for storing tablet_version on Tablet. |
| tests/unit/test_protocol_features.py | Adds negotiation tests ensuring V2 is preferred over V1 in STARTUP options. |
| tests/unit/test_policies.py | Adds unit tests for leader-aware routing behavior (leader-first, fallback, and non-SC cases). |
| tests/integration/standard/test_tablets_routing_v2.py | Adds opt-in end-to-end tests validating negotiation, payload decoding, and wire behavior against a V2-capable Scylla cluster. |
| cassandra/tablets.py | Adds tablet_version_block helpers and extends Tablet to store tablet_version. |
| cassandra/query.py | Adds memoization for routing-key token computation via Statement.routing_token(). |
| cassandra/protocol.py | Extends ExecuteMessage to optionally append tablet_version_block byte when present. |
| cassandra/protocol_features.py | Adds TABLETS_ROUTING_V2 constant and negotiation logic; makes V2 mutually exclusive with V1 in STARTUP. |
| cassandra/pool.py | Adds per-pool V2 capability detection and reuses memoized routing token to avoid repeated hashing. |
| cassandra/policies.py | Adds leader-first routing for strongly-consistent keyspaces when tablet replicas are available. |
| cassandra/metadata.py | Adds KeyspaceMetadata.strongly_consistent and populates it from Scylla system_schema.scylla_keyspaces. |
| cassandra/cluster.py | Computes/attaches tablet_version_block per connection and parses V2 routing payloads based on the serving connection’s negotiated features. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (2)
tests/unit/test_policies.py (1)
1043-1082: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueOptional: test name/docstring don't match what's actually exercised.
This test asserts the "no tablet_version (V1)" path, but the production code (
make_query_plan) never readstablet.tablet_version; leader-first is gated solely onks_meta.strongly_consistent, which is set toFalsehere. So the test passes because of thestrongly_consistent=Falseflag, not becausetablet_version=None. Consider renaming/reframing it (or settingstrongly_consistent=Truewithtablet_version=None) so the test actually guards the behavior its name implies; otherwise it overlaps withtest_no_leader_routing_for_eventually_consistent_keyspace.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/test_policies.py` around lines 1043 - 1082, The test name and docstring in test_no_leader_routing_without_tablet_version do not match the behavior actually being exercised, because make_query_plan only gates leader-first routing on keyspace metadata strong consistency, not Tablet.tablet_version. Update the test so it truly covers the intended case by either renaming/reframing it to reflect strongly_consistent=False, or by setting cluster.metadata.keyspaces['ks'].strongly_consistent to True while keeping tablet_version=None to verify the V1 tablet path; keep the assertions aligned with the TokenAwarePolicy routing behavior.cassandra/policies.py (1)
510-514: 🚀 Performance & Scalability | 🔵 TrivialOptional: Cache the child query plan to prevent double invocation and potential inconsistency.
child.make_query_plan(keyspace, query)is invoked at line 512 to filter replicas and is called again at line 551 to build the final plan. If the child policy is stateful (e.g.,RoundRobinPolicyorDCAwareRoundRobinPolicy), the second invocation may advance internal pointers and return a different host sequence, causing the filtering logic to diverge from the execution order and introducing unnecessary overhead.♻️ Suggested approach
Materialize the child plan once and reuse it:
- if tablet is not None: - replicas_mapped = set(map(lambda r: r[0], tablet.replicas)) - child_plan = child.make_query_plan(keyspace, query) - - replicas = [host for host in child_plan if host.host_id in replicas_mapped] + child_plan = list(child.make_query_plan(keyspace, query)) + if tablet is not None: + replicas_mapped = set(map(lambda r: r[0], tablet.replicas)) + replicas = [host for host in child_plan if host.host_id in replicas_mapped]Update the subsequent loop (line 551) to iterate over the cached
child_planinstead of re-invokingchild.make_query_plan(...).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cassandra/policies.py` around lines 510 - 514, Cache the result of child.make_query_plan(keyspace, query) in the policy logic so it is only evaluated once, then reuse that same child_plan for both the replica filtering in the tablet block and the final plan construction loop. Update the query planning flow in the policy method that handles tablet replicas to iterate over the cached child_plan instead of calling child.make_query_plan(...) a second time, which avoids stateful policy divergence and redundant work.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cassandra/cluster.py`:
- Around line 5055-5075: The ExecuteMessage tablet_version_block is only being
set in the pooled connection path, so control-connection fallback can miss the
mandatory V2 trailing byte. Update _query_control_connection() in
cassandra/cluster.py to use the same tablets_routing_v2 check and
_compute_tablet_version_block(self.query) logic as the ExecuteMessage handling
in the main send path, placing it after self._connection = connection and before
connection.send_msg(...) so V2 control connections always include the block and
non-V2 ones do not.
- Around line 5212-5231: The tablet-routing cache is using only
self.query.keyspace when adding tablets, which can store entries under a None
key and miss cache hits for session-level keyspaces. Update the tablet handling
in ResponseFuture to use the effective keyspace consistently, matching
_compute_tablet_version_block() by falling back to self.keyspace when
self.query.keyspace is unset, and then pass that resolved keyspace into
metadata._tablets.add_tablet().
In `@cassandra/metadata.py`:
- Around line 2658-2660: The fallback in metadata schema refresh is too broad
because the except block in the system_schema.scylla_keyspaces read path catches
every Exception, which masks unexpected failures. Narrow the handler around the
code in metadata.py that logs “Could not read system_schema.scylla_keyspaces” so
it only catches the specific expected driver/server schema-unavailable error(s),
and let other refresh errors propagate. Keep the existing debug log and exc_info
behavior for the expected case, but do not use a blanket Exception catch in this
branch.
In `@cassandra/pool.py`:
- Around line 451-453: The tablets_routing_v2 property in Pool should not
iterate the live _connections view directly, because concurrent
replacement/shutdown can mutate it and closed connections can incorrectly keep
V2 enabled. Snapshot the current connections first, then compute the flag only
from still-open/live connections before checking each connection’s
features.tablets_routing_v2.
In `@cassandra/query.py`:
- Line 278: The cached routing token in Statement is being reused without
considering token_class, which can cause the wrong ring/token to be used across
sessions or clusters. Update the token cache logic in Statement-related routing
code so _routing_token is keyed by token_class as well, and make the
lookup/reuse path in the affected routing methods distinguish tokens by the
partitioner class used. Ensure the changes cover the Statement fields and the
routing/token selection flow that currently reads and writes _routing_token.
---
Nitpick comments:
In `@cassandra/policies.py`:
- Around line 510-514: Cache the result of child.make_query_plan(keyspace,
query) in the policy logic so it is only evaluated once, then reuse that same
child_plan for both the replica filtering in the tablet block and the final plan
construction loop. Update the query planning flow in the policy method that
handles tablet replicas to iterate over the cached child_plan instead of calling
child.make_query_plan(...) a second time, which avoids stateful policy
divergence and redundant work.
In `@tests/unit/test_policies.py`:
- Around line 1043-1082: The test name and docstring in
test_no_leader_routing_without_tablet_version do not match the behavior actually
being exercised, because make_query_plan only gates leader-first routing on
keyspace metadata strong consistency, not Tablet.tablet_version. Update the test
so it truly covers the intended case by either renaming/reframing it to reflect
strongly_consistent=False, or by setting
cluster.metadata.keyspaces['ks'].strongly_consistent to True while keeping
tablet_version=None to verify the V1 tablet path; keep the assertions aligned
with the TokenAwarePolicy routing behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 587d3541-27be-4132-8dda-c90e66cff85c
📒 Files selected for processing (12)
cassandra/cluster.pycassandra/metadata.pycassandra/policies.pycassandra/pool.pycassandra/protocol.pycassandra/protocol_features.pycassandra/query.pycassandra/tablets.pytests/integration/standard/test_tablets_routing_v2.pytests/unit/test_policies.pytests/unit/test_protocol_features.pytests/unit/test_tablets.py
|
v2:
Tested against scylladb/scylladb#30291 (scylladb/scylladb@58ab555). |
Add per-connection negotiation of the TABLETS_ROUTING_V2 extension, the successor to TABLETS_ROUTING_V1. When the server advertises it in the SUPPORTED response, the driver echoes it back during STARTUP to opt in; a driver that negotiates v2 does not negotiate v1. While the feature is experimental the wire name carries the `_EXPERIMENTAL` suffix (TABLETS_ROUTING_V2_EXPERIMENTAL), and the server only advertises it when started with the `strongly-consistent-tables` experimental feature enabled. Also add the optional trailing tablet_version_block byte to the EXECUTE message body, written only when set, so later commits can carry the cached tablet version to the server.
Store the server-provided 64-bit tablet_version on each cached Tablet (None until learned) and add helpers to encode it into the one-byte tablet_version_block exchanged on the wire: * choose_tablet_version_block() packs a randomly chosen block index in the high nibble and that block's value in the low nibble, matching the server's locator::compare_tablet_version_block layout. Blocks are indexed from the least significant bits, so block i covers bits [i*4, i*4 + 4) of the version. A random index avoids any shared mutable counter on the hot path while still probing every nibble often enough to detect a server-side version change quickly. * random_tablet_version_block() returns a random byte for cold start, when no version is cached yet.
Add KeyspaceMetadata.strongly_consistent, derived from the per-keyspace `consistency` option in system_schema.scylla_keyspaces. A keyspace is strongly consistent when its consistency is `local` or `global` (null/`eventual` means eventually consistent). The lookup is cached per schema refresh and degrades to "eventually consistent" on non-Scylla clusters or older Scylla versions that lack the table or column. This flag lets the leader-aware routing logic tell which tablet tables actually have a Raft leader.
With TABLETS_ROUTING_V2 the server returns, on a tablet_version mismatch, the tablet's replica set with the Raft leader first plus the new tablet_version. Use this to route strongly-consistent reads and writes straight to the leader: * Every EXECUTE on a V2 connection carries a tablet_version_block computed from the cached version (or a random byte on cold start and for non-single-partition requests). The block is attached per connection, keyed off the connection that actually serves the request, so a rolling upgrade that mixes v1/v2 connections never desyncs the frame. * On the response, the routing payload is parsed according to what the serving connection negotiated; the v2 tuple additionally carries the tablet_version, which is stored back on the tablet. * TokenAwarePolicy yields the leader (the first replica) first, but only for strongly-consistent keyspaces -- a tablet_version is assigned to eventually-consistent tablet tables too and must not be mistaken for a leader hint. To keep this hot path cheap, the ring token is memoized once per statement (Statement.routing_token) and reused by the load balancing policy, the shard selection in the pool, and the tablet_version_block computation, instead of re-hashing the routing key three times.
Cover the end-to-end behaviour against a live ScyllaDB started with the `STRONGLY_CONSISTENT_TABLES` cluster feature: v2 negotiation, payload-driven cache population, the tablet_version_block matching rules (no payload on a matching block, exactly one matching value per index, precedence over v1), the protocol error on a missing block, and leader-aware routing targeting the Raft leader for a strongly-consistent keyspace.
…routing Extend the "Tablet Awareness" section of the Scylla-specific guide to cover the V2 protocol extension: the per-connection negotiation, the tablet_version_block byte that lets the server skip re-sending routing information the driver already has, and leader-aware routing for strongly-consistent (Raft-backed) tablet tables. Also document the strongly_consistent flag now exposed on keyspace metadata.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cassandra/metadata.py`:
- Around line 805-810: Update the documentation for the strongly_consistent
metadata field so it matches the behavior in _is_strongly_consistent() and the
new tests. In the keyspace metadata docstring near strongly_consistent, change
the description to state that only ScyllaDB keyspaces with consistency set to
global are marked True, while local, eventual, and non-ScyllaDB cases remain
False. Keep the public contract aligned with the implementation by tightening
the wording around strongly_consistent.
In `@tests/integration/standard/test_tablets_routing_v2.py`:
- Around line 279-294: The global monkeypatch on
ProtocolFeatures.add_startup_options in the test setup can leak if Cluster
creation or shutdown fails. Move the restore logic for add_startup_options into
a nested finally that always runs after the patch is applied, and keep the
cleanup tied to the cluster/session lifecycle in the same test helper around
Cluster.connect and cluster.shutdown. Use the existing
ProtocolFeatures.add_startup_options, Cluster, and session setup block to ensure
the original value is restored no matter which step raises.
- Line 41: The module-level skip logic in the test setup is too broad because
the `except Exception as exc` block can mask unrelated startup problems. Narrow
the handler around the cluster start path to only skip on the specific expected
unsupported-feature/startup failure, using the same setup flow where the cluster
is initialized, and re-raise any other exception so real regressions still fail
fast.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 712ffcf1-f346-4fd7-931d-303a5bc26920
📒 Files selected for processing (12)
cassandra/cluster.pycassandra/metadata.pycassandra/policies.pycassandra/pool.pycassandra/query.pycassandra/tablets.pydocs/scylla-specific.rsttests/integration/standard/test_tablets_routing_v2.pytests/unit/test_metadata.pytests/unit/test_policies.pytests/unit/test_response_future.pytests/unit/test_tablets.py
🚧 Files skipped from review as they are similar to previous changes (6)
- tests/unit/test_tablets.py
- cassandra/policies.py
- cassandra/query.py
- tests/unit/test_policies.py
- cassandra/tablets.py
- cassandra/cluster.py
| strongly_consistent = False | ||
| """ | ||
| A boolean indicating whether this keyspace uses strongly-consistent (Raft-based) | ||
| tablets. ``True`` only for ScyllaDB keyspaces whose ``consistency`` option is | ||
| not eventual (i.e. ``local`` or ``global``). ``False`` for eventually-consistent | ||
| keyspaces and for non-ScyllaDB clusters. |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Document strongly_consistent as global-only.
These lines still say local and global map to True, but _is_strongly_consistent() and the new tests only mark global as strongly consistent. Please tighten the docstring so the public metadata contract matches the implementation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cassandra/metadata.py` around lines 805 - 810, Update the documentation for
the strongly_consistent metadata field so it matches the behavior in
_is_strongly_consistent() and the new tests. In the keyspace metadata docstring
near strongly_consistent, change the description to state that only ScyllaDB
keyspaces with consistency set to global are marked True, while local, eventual,
and non-ScyllaDB cases remain False. Keep the public contract aligned with the
implementation by tightening the wording around strongly_consistent.
| # advertisement of TABLETS_ROUTING_V2_EXPERIMENTAL. | ||
| 'experimental_features': ['lwt', 'udf', 'strongly-consistent-tables'], | ||
| }) | ||
| except Exception as exc: |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find existing integration-test patterns and harness exception types to catch specifically.
rg -n -C3 "def use_cluster|pytest.skip\\(|Could not start a Scylla cluster|except .*Exception" tests cassandraRepository: scylladb/python-driver
Length of output: 50378
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== tests/integration/standard/test_tablets_routing_v2.py =="
nl -ba tests/integration/standard/test_tablets_routing_v2.py | sed -n '1,120p'
echo
echo "== tests/integration/__init__.py (use_cluster) =="
nl -ba tests/integration/__init__.py | sed -n '421,560p'
echo
echo "== integration test patterns for skip/start failures =="
rg -n -C2 "pytest\.skip\(|Could not start|except .*Exception|DependencyException|NoHostAvailable|OperationTimedOut|FeatureNotSupported|Unsupported" tests/integration tests | sed -n '1,220p'Repository: scylladb/python-driver
Length of output: 260
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== tests/integration/standard/test_tablets_routing_v2.py =="
sed -n '1,140p' tests/integration/standard/test_tablets_routing_v2.py | cat -n
echo
echo "== tests/integration/__init__.py (use_cluster) =="
sed -n '421,560p' tests/integration/__init__.py | cat -n
echo
echo "== relevant integration skip / exception patterns =="
rg -n -C2 "pytest\.skip\(|Could not start|DependencyException|Unsupported|OperationTimedOut|NoHostAvailable|except .*Exception" tests/integration tests | sed -n '1,220p'Repository: scylladb/python-driver
Length of output: 32197
Limit the module-level skip to the expected startup failure. Catching Exception here can hide setup bugs, port conflicts, and config regressions behind a skip. Narrow this to the cluster-start/unsupported-feature failure you expect, or re-raise everything else.
🧰 Tools
🪛 Ruff (0.15.20)
[warning] 41-41: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/integration/standard/test_tablets_routing_v2.py` at line 41, The
module-level skip logic in the test setup is too broad because the `except
Exception as exc` block can mask unrelated startup problems. Narrow the handler
around the cluster start path to only skip on the specific expected
unsupported-feature/startup failure, using the same setup flow where the cluster
is initialized, and re-raise any other exception so real regressions still fail
fast.
Source: Linters/SAST tools
|
v2.1:
Also tested against scylladb/scylladb#30291 (scylladb/scylladb@58ab555). |
| tablets. ``True`` only for ScyllaDB keyspaces whose ``consistency`` option is | ||
| not eventual (i.e. ``local`` or ``global``). ``False`` for eventually-consistent | ||
| keyspaces and for non-ScyllaDB clusters. |
In dfccfff, we introduced tablet awareness.
Thanks to it, the driver was finally able to learn about the replicas
of a given tablet and successfully route requests to the right nodes
and shards. Unfortunately, the feature isn't sufficient to handle
strongly consistent tables.
Just like an eventually consistent tablet, a strongly consistent one
has a set of replicas that own it. The difference is that the latter
also has a distinguished replica called the leader that coordinates all
of the writes and (almost) all reads to the tablet. That creates a need
for the driver to route its requests to the leader instead of an
arbitrary replica. However, the current tablet awareness doesn't allow
the driver to learn about the identity of the leader.
To solve that problem, we devise a new protocol extension that will
effectively function as a new version of tablet awareness. It affects
both eventually and strongly consistent tablets. This PR implements it
in the driver.
We introduce the notion of a tablet version -- a 64-bit hash that
corresponds to the replicas of a tablet and (in the case of strongly
consistent tablets) its leader.
With every EXECUTE request, the driver provides information about the
tablet version it knows. If the server detects a mismatch, the response
to the request will contain full routing information: the boundary
tokens of the tablet, the list of its replicas, and the tablet version.
The driver then updates its cache and routes subsequent requests to the
right replicas.
Tests have been provided to verify that the implementation is correct.
Fixes: SCYLLADB-291
Refs: SCYLLADB-288
Pre-review checklist
./docs/source/.Fixes:annotations to PR description.