Replace endpoint-based host lookup with host-id-first topology updates #922

Description

@dkropachev

Parent epic: #921

Problem

Several topology paths still use endpoint/address as a way to find or identify an existing host. That conflicts with the target model where host_id is the only stable host identity and endpoint is connectivity metadata.

Known risky areas include:

  • control connection refresh logic that searches by endpoint before falling back to host_id;
  • Cluster.add_host() behavior that short-circuits on endpoint mappings;
  • metadata secondary indexes such as _host_id_by_endpoint being treated as identity rather than lookup state;
  • session pools, load-balancing policies, token maps, and route state that may rely on Host equality/hash semantics.
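
To make the last risk concrete, here is a minimal sketch of how endpoint-based equality/hash can collapse two distinct nodes in a set or dict. `EndpointKeyedHost` and `IdKeyedHost` are hypothetical stand-ins written for this illustration, not the driver's `Host` class, and the addresses and UUIDs are made up.

```python
import uuid


class EndpointKeyedHost:
    """Hypothetical host whose equality and hash derive from the endpoint."""

    def __init__(self, endpoint, host_id):
        self.endpoint = endpoint
        self.host_id = host_id

    def __eq__(self, other):
        return isinstance(other, EndpointKeyedHost) and self.endpoint == other.endpoint

    def __hash__(self):
        return hash(self.endpoint)


class IdKeyedHost:
    """Hypothetical host whose equality and hash derive from host_id."""

    def __init__(self, endpoint, host_id):
        self.endpoint = endpoint
        self.host_id = host_id

    def __eq__(self, other):
        return isinstance(other, IdKeyedHost) and self.host_id == other.host_id

    def __hash__(self):
        return hash(self.host_id)


# Scenario: a node is replaced and the new node reuses the old IP
# but reports a different host_id.
OLD_ID = uuid.UUID(int=1)
NEW_ID = uuid.UUID(int=2)

endpoint_keyed = {
    EndpointKeyedHost("10.0.0.1", OLD_ID),
    EndpointKeyedHost("10.0.0.1", NEW_ID),
}
id_keyed = {
    IdKeyedHost("10.0.0.1", OLD_ID),
    IdKeyedHost("10.0.0.1", NEW_ID),
}

# The endpoint-keyed set treats both nodes as one entry; the id-keyed set keeps both.
print(len(endpoint_keyed), len(id_keyed))
```

Any pool, policy, or token map that stores hosts in a set or uses them as dict keys inherits whichever of these two behaviors the host class implements, which is why those paths are listed for audit.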

Desired behavior

Scope

  • Audit host lookup/update callsites in cassandra/cluster.py, cassandra/metadata.py, cassandra/pool.py, cassandra/policies.py, client routes, and tests.
  • Remove endpoint-first host identity decisions when host_id is available.
  • Keep _host_id_by_endpoint as an index, but make its ownership and update points explicit.
  • Update tests that currently expect endpoint-based Host equality or set/dict behavior.
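
One possible shape for "index, not identity" is sketched below. `HostRegistry`, `upsert`, `remove`, and `find_by_endpoint` are hypothetical names for this illustration only; the real code in cassandra/metadata.py is structured differently, and only the `_host_id_by_endpoint` name is taken from this issue. The sketch assumes host_id is always known at update time.

```python
import uuid


class HostRegistry:
    """Hypothetical registry: host_id is identity, endpoint is an index."""

    def __init__(self):
        self._endpoint_by_host_id = {}   # identity map: host_id -> endpoint
        self._host_id_by_endpoint = {}   # lookup index: endpoint -> host_id

    def upsert(self, host_id, endpoint):
        """Add or update a host keyed by host_id; return what changed."""
        previous = self._endpoint_by_host_id.get(host_id)
        if previous == endpoint:
            return "unchanged"
        # Drop the old index entry only if it still points at this host.
        if previous is not None and self._host_id_by_endpoint.get(previous) == host_id:
            del self._host_id_by_endpoint[previous]
        self._endpoint_by_host_id[host_id] = endpoint
        # The index always follows the most recent claimant of the endpoint.
        self._host_id_by_endpoint[endpoint] = host_id
        return "added" if previous is None else "moved"

    def remove(self, host_id):
        """Remove a host and clean the index entry it owns, if any."""
        endpoint = self._endpoint_by_host_id.pop(host_id, None)
        if endpoint is not None and self._host_id_by_endpoint.get(endpoint) == host_id:
            del self._host_id_by_endpoint[endpoint]

    def find_by_endpoint(self, endpoint):
        """Address-based lookup; returns a host_id or None."""
        return self._host_id_by_endpoint.get(endpoint)


registry = HostRegistry()
host_a = uuid.UUID(int=1)
host_b = uuid.UUID(int=2)

first = registry.upsert(host_a, "10.0.0.1")    # new host_id
second = registry.upsert(host_a, "10.0.0.2")   # same host_id, new endpoint
third = registry.upsert(host_b, "10.0.0.2")    # new host_id on a reused endpoint
```

In this sketch the index is written only inside `upsert` and `remove`, so its ownership is explicit. Whether a host whose endpoint was taken over should be removed immediately or left for the next refresh is a design decision the sketch does not settle.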

Acceptance criteria

  • Lookup of existing hosts during topology refresh is host-id-first.
  • No internal session/pool/policy path depends on endpoint-based Host equality for identity.
  • Endpoint lookup remains available for address-based lookup APIs and event translation.
  • Tests cover same endpoint/different host_id and same host_id/different endpoint detection.
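
The two cases in the last criterion could be covered along these lines. This is a sketch only: `classify_peer` is a hypothetical helper that stands in for whatever the refresh path ends up doing, and the `known` dict (host_id to endpoint) is an assumed simplification of the driver's metadata.

```python
import unittest
import uuid


def classify_peer(known, host_id, endpoint):
    """Classify a reported peer against known hosts, host_id first.

    `known` maps host_id -> endpoint. Hypothetical helper for illustration.
    """
    if host_id in known:
        return "unchanged" if known[host_id] == endpoint else "endpoint_changed"
    # host_id is unknown; check whether its endpoint belonged to another host.
    if endpoint in known.values():
        return "endpoint_reused"
    return "new"


class TopologyDetectionTest(unittest.TestCase):
    def setUp(self):
        self.host_a = uuid.UUID(int=1)
        self.known = {self.host_a: "10.0.0.1"}

    def test_same_endpoint_different_host_id(self):
        # A replacement node reuses the IP but reports a different host_id.
        replacement = uuid.UUID(int=2)
        self.assertEqual(
            classify_peer(self.known, replacement, "10.0.0.1"), "endpoint_reused"
        )

    def test_same_host_id_different_endpoint(self):
        # The same node comes back on a different address.
        self.assertEqual(
            classify_peer(self.known, self.host_a, "10.0.0.9"), "endpoint_changed"
        )
```

Saved as a module, the cases can be run with `python -m unittest`.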

Part of #921. Related to #867 and #382.
