explore: reuse + unify the OGC engine (pager, aggregation, ambient state) — net −64 LOC#4
Merged
Merged
Conversation
2853603 to
e76428d
Compare
e76428d to
9ae73db
Compare
…ient state
Reuse and unify the OGC engine's HTTP pagination / aggregation / error-recovery /
ambient-state plumbing instead of carrying parallel implementations. Net source
reduction of ~66 LOC, behavior-preserving, plus one rate-limit correctness fix.
wateruse reuse:
- wateruse drives the engine's generic `_paginate` (with an injected
`raise_for_status` for the NWDC `{detail}` envelope), `_run_sync` (anyio portal,
Jupyter-safe), and `_combine_chunk_frames` / `_combine_chunk_responses`
aggregators — replacing a hand-rolled pager, thread bridge, and bespoke
aggregation. `_resolve_locations` becomes a `_LOCATION_BUILDERS` table dispatch,
dropping a 3-way if/elif and a duplicated selector enumeration.
Engine unification:
- `planning._merge_response`: one low-level "fold N responses into one" behind both
pagination (`_paginate`) and chunked/fan-out aggregation
(`_combine_chunk_responses`), replacing two near-duplicate implementations; deletes
`engine._aggregate_paginated_response`.
- `utils.Ambient[T]`: a generic ContextVar-with-scope class collapsing each per-call
ambient (`_row_cap`, `_ogc_base_url`, `_dialect`, the chunker's `_chunked_client`)
from a var + hand-written `@contextmanager` setter pair into one declaration.
`with _x(value):` call sites unchanged; readers shorten to `_x.get()`.
- `_paginate`'s verbatim per-page progress block deduped into a `report_page` closure.
- `_combine_chunk_responses`: dropped a dead single-response branch.
- `_QUOTA_HEADER` moved to the base `planning` module — dedups the literal and fixes
a layering inversion (planning had hard-coded it, unable to import from chunking).
- `_cql2_param`: CQL2 filter list built as a comprehension.
- `engine._check_id_format`: inlined into its only caller; dead re-export dropped.
Rate-limit correctness fix:
- `x-ratelimit-remaining` now reports the LOWEST value any concurrent sub-request
saw (the quota actually left after a fan-out), via a shared `_lowest_remaining`,
instead of the last-by-index — fixing a latent inaccuracy in the OGC chunker too.
Behavior-preserving (live-verified); offline OGC/wateruse/utils/progress suites
green; ruff + mypy --strict clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Sjb14HkwuCydKSKMsaXsgd
08a8f9c to
9b74119
Compare
Owner
Author
|
Folded into DOI-USGS#328: the squashed refactor commit |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Exploration, draft. Stacked on the wateruse branch (DOI-USGS#328) so the diff is only the change. Question: can codifying the OGC engine's HTTP pagination / aggregation / error-recovery / ambient-state patterns reduce complexity (branches, duplication, indirection) while staying readable? Yes — by reusing and unifying what the engine already has rather than building parallel cores. Iterated as a logged try-measure-keep-or-revert loop. Source net −64 LOC across 6 refactor commits, plus a rate-limit correctness fix.
Commits (newest first)
7899b60dAmbient[T]class for the per-call ContextVars (_row_cap,_ogc_base_url,_dialect,_chunked_client)ebc3ded6_QUOTA_HEADERto the base layer (planning)35e43dff_resolve_locationsc3018c3c_combine_chunk_responses_lowest_remainingalready handles it)381d82f6_paginate's verbatim per-page progress block →report_pagea5b3d029planning._merge_response9ae73db7_paginate/_run_sync/_combine_chunk_*; rate-limit unified on lowest-remainingTwo headline abstractions
planning._merge_response(a5b3d02) —engine._aggregate_paginated_response(pagination) andplanning._combine_chunk_responses(fan-out) did the same low-level op (copy a base response, rebuild headers from a chosen response, set elapsed, optionally override url). Now one function behind both.utils.Ambient[T](7899b60d) — every ambient ContextVar needed a_x_var = ContextVar(...)plus a hand-written@contextmanagersetter repeating theset/try/finally/resetdance (which leaks the value if aresetis dropped).Ambientbundles the var with.get()and a__call__scope into one object, so each ambient is a single declaration.with _x(value):call sites are unchanged; readers shorten (_x_var.get()→_x.get()). AGeneric[_T]keeps exact value types (incl.Ambient[int | None]). This is the class abstraction that belongs — it bundles state with behaviour.Correctness fix (from
9ae73db)x-ratelimit-remainingreports the lowest value any concurrent sub-request saw (the quota actually left), not last-by-index — one shared_lowest_remaining, fixing a latent inaccuracy in the OGC chunker too.Rejected, with reasons (uncommitted
ITERATION_LOG.md)Several candidates were measured/reasoned and dropped because they added complexity or indirection without removing duplication: extracting
_fan_out(no clean 2nd consumer — the chunker's gather carries retry/resume the lean fan-out omits); DRY-ing the two wateruse validators (per-selector messages must survive; flat funcs read better); a_scopedfunction for the ContextVar dance (superseded by theAmbientclass — it left the var+setter pairing intact, +7 LOC vs −36); aPagerclass (ceremony, no shared state); unifying the two HTTP-status raisers (would push OGC concerns into the generic one; they already shareerror_for_status).Numbers & verification
engine.py−80,wateruse.py−35,chunking.py−31,utils.py+36 for the sharedAmbient/_merge_responsehelpers,planning.py+46 for_merge_response/_lowest_remaining/_QUOTA_HEADER+ docs — logic moved into shared, tested helpers).{detail}errors; works inside a running loop.ruff+mypy --strictclean. (Live tests that hit the real gateway flake on transient 502s — unrelated to the diff.)Recommendation
Viable merge candidate — net reduction, de-duplicated transport (one pager / response-folder / sync bridge / ambient mechanism), a real rate-limit fix, no readability regression. Run it past the live OGC suites, then promote from draft.
🤖 Generated with Claude Code