TA eval: result-shape-constraint shaped-product hook (nonnull-pair) by evaleev · Pull Request #565 · ValeevGroup/SeQuant

evaleev · 2026-06-27T13:15:27Z

SeQuant half of the nonnull-pair result-shape-constraint feature (the mpqc side is ValeevGroup/mpqc4 PR for feature/result-shape-nonnull-pair-constraints).

Stacked on #559 (feature/cost-model-batch-aware). Base is set to that branch so the diff is just the 6 result-shape commits; retarget to master once #559 merges.

What this adds

A method-supplied, opaque result-shape provider reached through the TA backend context, letting a consumer impose a TA::SparseShape on a binary-Product node's result during sequant::evaluate. Generic eval/CacheManager stay TA-free; all TA specifics live in TAEvalContext + the hook closure.

Commits:

eeca28641, dd1706c14 — de-risk spikes: standard-layer ToT product + T×ToT + dot_inner denest honor an imposed SparseShape.
47a3b4607 — thread an optional TA result-shape provider to the binary-product site (type-erased shaped_product_hook on CacheManager; eval.hpp/cache_manager.hpp name only ResultPtr/Result/std::any).
bcbe360d1 — emit the shape-constrained product via the standard expression layer ((l*r).set_shape(s) / dot_inner(...).set_shape(s)); empty hook ⇒ byte-identical default.
dd6847195 — make_hook declines scalar-operand products (no TiledRange) before the trange computation, fixing a segfault the moment a real provider is active.
3709ee68f — recognize the arena-inner-tile ToT kind (DistArray<Tensor<ArenaTensor<NumericT>>>) via an InnerTileT template parameter (default Tensor<NumericT>, so existing behavior is unchanged), so the hook fires on CSV/PNO intermediates instead of declining them; plus a graceful decline (tot_product_needs_inner_reorder) when the ToT general product would need a non-identity inner result permutation TA can't yet emit (falls through to einsum prod() — lossless).

Scope / safety

Default behavior is byte-identical: with no provider the hook returns nullptr immediately.
Sparse-policy / TA-backend only; the generic eval path is untouched when the hook is empty.
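
The consult-then-fall-through behavior described above can be sketched roughly as follows. This is a minimal illustration, not SeQuant's code: `Result`, `ResultPtr`, `prod`, `ShapedProductHook`, and `evaluate_product` are hypothetical stand-ins, and the annotation argument of the real hook is omitted.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-ins for the type-erased result types.
struct Result {
  double value;
};
using ResultPtr = std::shared_ptr<Result>;

// Type-erased hook: receives the node as std::any plus both operands and
// returns a replacement product, or nullptr to decline.
using ShapedProductHook =
    std::function<ResultPtr(std::any const&, Result const&, Result const&)>;

// Stand-in for the existing, unshaped product path.
inline ResultPtr prod(Result const& l, Result const& r) {
  return std::make_shared<Result>(Result{l.value * r.value});
}

// Consult the hook first; an empty hook or a null return falls through to
// prod(), so the default path is unchanged.
inline ResultPtr evaluate_product(ShapedProductHook const& hook,
                                  std::any const& node, Result const& l,
                                  Result const& r) {
  if (hook) {
    if (ResultPtr shaped = hook(node, l, r)) return shaped;
  }
  return prod(l, r);
}
```

In this sketch a default-constructed `ShapedProductHook` and a hook that returns a null pointer both yield the plain `prod()` result; only a non-null return replaces it.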

Tested

Eval-level shape spikes (560 assertions). End-to-end via the mpqc consumer: closed-shell CSV-CCk (PNO-CCSD) is lossless with the feature ON vs OFF, and the targeted (g.C)(g.C) giant shrinks to its surviving-pair support (3× smaller on a water-trimer test, 23.65 MB -> 7.88 MB), with energy preserved to ~1e-14.

🤖 Generated with Claude Code

https://claude.ai/code/session_01Y9QnUcKzvPp5bJSS5hvCyc

Adds TEST_CASE("shape_spike_tot_general_product", "[shape-spike]") that de-risks the core assumption of the result-shape-constraints feature: that a ToT general product can be evaluated through TiledArray's standard expression layer (A(la) * B(ra)) with an imposed SparseShape via .set_shape(s), rather than through TA::einsum.

The test uses the same ToT*ToT->ToT annotation as the existing ToT_times_ToT_to_ToT section (contraction over outer i_3 and inner a_4; Hadamard i_1,i_2; result outer (i_2,i_1)), but with SparsePolicy and a multi-tile outer TiledRange (2 tiles per occ mode) so that 4 outer result tiles exist and tile (0,0) can be masked to zero by the imposed shape.

Outcome: PASS. (A(la)*B(ra)).set_shape(s) evaluates without throwing, honors the imposed mask (tile (0,0) is absent in the result), and the surviving tiles match TA::einsum to floating-point precision. 130 assertions pass.

Adds two more [shape-spike] test cases covering the other two contraction kinds that the SeQuant TA backend's prod() emits as shape-eligible intermediates:

Case A (shape_spike_T_times_ToT_general_product): T x ToT -> ToT mixed operand product (flat DF-integral-like g times PNO-coefficient-like ToT C). (T_op(la) * ToT_op(ra)).set_shape(s) evaluates, honors the imposed SparseShape, and matches the TA::einsum baseline on surviving tiles.

Case B (shape_spike_ToT_inner_contraction_to_flat_T): ToT x ToT with the inner (composite) indices fully contracted and the outer (occ) indices surviving, denesting to a flat tensor-of-scalars result. This is the einsum<DeNest::True> / dot_inner path (result.hpp:581). The standard-layer equivalent is the .dot_inner() expression; DotInnerExpr derives from Expr, so it exposes set_shape(), and the override is honored: C(c) = A(a + inner.a).dot_inner(B(b + inner.b)).set_shape(s); evaluates, honors the shape, and matches the einsum<DeNest::True> baseline.

Both PASS. 280 assertions across 3 shape-spike cases.

…ct site

Add TAEvalContext (SeQuant/core/eval/backends/tiledarray/eval_context.hpp) holding a result_shape_provider callback (node x trange -> optional<SparseShape>). Thread it to the binary-Product site in evaluate() via a new type-erased product_node_visitor field on CacheManager: the visitor is invoked with std::any(std::cref(node)) at each Product node before prod() is called. An empty visitor (the default) is a no-op; existing callers compile and behave identically.

Chosen mechanism: cache-carried visitor (option a from the design brief). Rationale: CacheManager is already threaded to the binary-product site (custom_evaluator follows the same pattern); no evaluate() signature changes are needed; the std::any wrapping keeps TA types out of generic eval headers.

Test [shape-provider]: ToT*ToT->ToT eval with a provider that increments a counter and returns nullopt; asserts counter>=1 (provider reached) and that the result equals the no-visitor reference (nullopt => no behavior change).
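
A rough sketch of the std::any(std::cref(node)) seam, assuming hypothetical `Node`, `Shape`, `ProductNodeVisitor`, `visit_product_node`, and `make_visitor` names (none of these are SeQuant's actual declarations, and the trange argument of the real provider is left out). It illustrates how the generic side can pass a node by reference without naming any backend type, and how the backend side recovers it:

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical node type; the visitor signature never names it.
struct Node {
  std::string label;
};

// Hypothetical stand-in for TA::SparseShape.
struct Shape {};

using ProductNodeVisitor = std::function<void(std::any const&)>;
using ShapeProvider = std::function<std::optional<Shape>(Node const&)>;

// Generic side: wrap a const reference (the node is not copied) and invoke
// the visitor only if one is installed. An empty visitor is a no-op.
inline void visit_product_node(ProductNodeVisitor const& visitor,
                               Node const& node) {
  if (visitor) visitor(std::any(std::cref(node)));
}

// Backend side: the visitor recovers the node reference and consults the
// provider, which is captured by value so it cannot dangle.
inline ProductNodeVisitor make_visitor(ShapeProvider provider) {
  return [provider = std::move(provider)](std::any const& erased) {
    auto const* ref =
        std::any_cast<std::reference_wrapper<Node const>>(&erased);
    if (ref == nullptr) return;  // payload is not a node reference: decline
    (void)provider(ref->get());  // a real hook would act on the returned shape
  };
}
```

The pointer form of `std::any_cast` returns null on a type mismatch instead of throwing, which suits a hook that is meant to decline quietly.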

…ion layer

Grow the Task 1 product-node seam into a shaped-product hook that actually applies a provider-returned TA::SparseShape. The hook (CacheManager::shaped_product_hook_) is type-erased as ResultPtr(any node, Result const& left, Result const& right, ann), so eval.hpp and cache_manager.hpp stay free of TiledArray types; all TA specifics (trange computation, provider call, set_shape) live in the TA backend (eval_context.hpp + result.hpp). At the binary-Product site, eval consults the hook before prod(): a non-null return replaces the product, a null return (empty hook or provider nullopt) falls through to the existing prod(). Default-empty => byte-identical.

result.hpp gains:
- detail::result_outer_trange / outer_annot_labels: build the result outer TiledRange by matching result outer labels to operand TiledRange1's;
- result_outer_trange_from_results: the same from type-erased operands;
- apply_shaped_product<NumericT,PolicyT>: emits both Task-0 forms, selected by operand nesting / de_nest: (lhs(la) * rhs(ra)).set_shape(s) for general products (T*T, T*ToT, ToT*ToT->ToT) and lhs(la).dot_inner(rhs(ra)).set_shape(s) for the DeNest::True ToT*ToT->flat path. It fences before return so the shape outlives the lazy assignment.

TAEvalContext::make_hook<NumericT,PolicyT> builds the hook and captures the provider BY VALUE (Task 1 review minor) so it does not dangle on ctx.

Tests [shape-provider]: a real shape (zero an outer tile) on both a *-form general product and a dot_inner denest-to-flat product: the zeroed tile is_zero, survivors equal the unshaped einsum baseline; plus full-ones no-op, nullopt decline, and no-hook cases, all equal to the unshaped reference.
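
The label-matching step (building the result's outer tiling from the operands' per-mode tilings) might look roughly like the sketch below. It uses plain standard-library types in place of TiledArray's; `TileBounds`, `Operand`, and `result_outer_tiling` are hypothetical names introduced only for illustration, and the first-match lookup order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a TiledRange1: tile boundaries of one mode.
using TileBounds = std::vector<std::size_t>;

struct Operand {
  std::vector<std::string> labels;  // outer index labels, one per mode
  std::vector<TileBounds> modes;    // tiling of each mode, in the same order
};

// For each result label, take the tiling of the first operand mode that
// carries the same label (lhs is searched before rhs). Returns nullopt if a
// result label appears in neither operand.
inline std::optional<std::vector<TileBounds>> result_outer_tiling(
    std::vector<std::string> const& result_labels, Operand const& lhs,
    Operand const& rhs) {
  std::vector<TileBounds> out;
  out.reserve(result_labels.size());
  for (auto const& label : result_labels) {
    TileBounds const* found = nullptr;
    for (Operand const* op : {&lhs, &rhs}) {
      for (std::size_t k = 0; k < op->labels.size() && !found; ++k) {
        if (op->labels[k] == label) found = &op->modes[k];
      }
    }
    if (!found) return std::nullopt;
    out.push_back(*found);
  }
  return out;
}
```

For operands labeled (i_1, i_3) and (i_3, i_2) and a result labeled (i_2, i_1), the sketch returns the tilings of i_2 and i_1 in that order; the contracted label i_3 is simply never requested.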

A product may legitimately have a scalar (ResultScalar) operand, which carries no TiledRange; computing the result outer trange would mis-cast it. Such a product has no outer tensor to shape, so decline early (return null -> normal prod) before the trange computation. Required once a real result_shape_provider is active.

The result-shape-constraint hook (make_hook) and its apply path (result_outer_trange_from_results, apply_shaped_product) hardcoded the nested (ToT) operand kind as DistArray<Tensor<Tensor<NumericT>>>. The CSV/PNO path produces ToT operands with an arena-pinned inner tile (DistArray<Tensor<ArenaTensor<NumericT>>>), a distinct (exact-type-id) Result kind, so the is_tensor_like guard declined every CSV intermediate, including the (g.C)(g.C) giant the feature targets, before the provider was consulted.

Add an InnerTileT template parameter (default TA::Tensor<NumericT>, so existing behavior is unchanged) threaded through make_hook -> result_outer_trange_from_results / apply_shaped_product, so the consumer can instantiate the hook with the arena inner tile and have CSV ToT intermediates recognized and shaped.

Also: TiledArray's expression-layer general ToT product cannot emit a result whose inner annotation needs a non-identity permutation (cont_engine throws). Detect that case (tot_product_needs_inner_reorder) and decline so the eval falls through to the unshaped einsum prod() (which handles the reorder). This is lossless, just not shaped on those nodes.

With this, the cross-pair (g.C)(g.C) giant is shaped to its surviving-pair support (3x smaller in a water-trimer test: 23.65 MB -> 7.88 MB), energy preserved to ~1e-14.
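
Why an exact-type-id guard rejects the arena kind, and how a defaulted template parameter fixes that without touching existing call sites, can be illustrated with a small sketch. `PlainTile`, `ArenaTile`, `NestedArray`, and `is_recognized_nested` are hypothetical stand-ins, not the TiledArray or SeQuant types.

```cpp
#include <any>
#include <cassert>
#include <typeinfo>
#include <vector>

// Hypothetical stand-ins for the two inner tile kinds.
template <typename T>
struct PlainTile {
  std::vector<T> data;
};
template <typename T>
struct ArenaTile {
  std::vector<T> data;
};

// Hypothetical stand-in for a nested (tensor-of-tensors) array.
template <typename Inner>
struct NestedArray {
  std::vector<Inner> tiles;
};

// Recognizes only the exact nested kind built from InnerTileT. The default
// keeps existing instantiations (plain inner tile) behaving as before.
template <typename NumericT, typename InnerTileT = PlainTile<NumericT>>
bool is_recognized_nested(std::any const& operand) {
  return operand.type() == typeid(NestedArray<InnerTileT>);
}
```

With the default, `is_recognized_nested<double>` accepts only the plain-tile kind; instantiating it as `is_recognized_nested<double, ArenaTile<double>>` accepts the arena kind instead.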

…mized builds

The "quadratic bubble" test ran single_term_opt on a 12-leaf network ~29 times, only one of which was ever asserted; the rest were a diagnostic std::wcout sweep. In Debug the DP is ~100x slower (~100 s/call, ~48 min total), exceeding CTest's default timeout and failing CI's Debug builds (and cancelling the Valgrind/Sanitizer jobs).

Collapse the sweep to 5 verified early-K/late-K crossover assertions and gate the test on __OPTIMIZE__. NDEBUG is defined in every SeQuant build type (asserts use SEQUANT_ASSERT_BEHAVIOR_), so it cannot distinguish Debug from Release; __OPTIMIZE__ is set by GCC/Clang only at -O1+. The test now runs in ~4 s in Release and is excluded entirely at -O0.
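
One way to express such a gate is sketched below. How the actual test is excluded (for example, by wrapping the whole test case in the preprocessor conditional) is not shown in this commit message; `kOptimizedBuild` and `expensive_tests_enabled` are hypothetical names.

```cpp
#include <cassert>

// __OPTIMIZE__ is predefined by GCC and Clang when optimization is enabled,
// so it separates -O0 builds from optimized ones even when NDEBUG does not.
#if defined(__OPTIMIZE__)
inline constexpr bool kOptimizedBuild = true;
#else
inline constexpr bool kOptimizedBuild = false;
#endif

// Hypothetical helper: an expensive test body can return early when this is
// false, or the test can be compiled out with the same #if.
constexpr bool expensive_tests_enabled() { return kOptimizedBuild; }
```

Compilers that do not define `__OPTIMIZE__` would always take the false branch in this sketch, so the expensive test would be skipped there regardless of optimization level.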

mode_batches_of_trange1 closed a batch as soon as the accumulated whole-tile size reached or exceeded target_batch_size, so the realized batch could exceed the target by nearly a full tile: any target a hair above the tile size rounded UP to two tiles, doubling the batch. That defeats the memory bound the target is meant to enforce and, for CSV/PNO giant intermediates, doubled the materialized aux (K) slice (e.g. aux_target_size=243 with 236-wide K tiles -> 472-wide batches), exposing a TiledArray SUMMA sparse-broadcast edge case. Close the batch before a tile would push it over the target, so the realized batch never exceeds target_batch_size except for the one-tile floor (a lone tile larger than the target). The batch count now changes only at multiples of the tile size. Docs updated to reflect the upper-bound (<=) semantics at the BatchPolicy interface, the runtime evaluator, and the optimizer cost model.
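
The upper-bound batching rule can be sketched as follows. This is an illustration of the stated semantics only, not the body of mode_batches_of_trange1; `batch_sizes` and its signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Groups consecutive whole tiles into batches and returns each batch's
// realized size. `target` is treated as an upper bound on the batch size.
inline std::vector<std::size_t> batch_sizes(
    std::vector<std::size_t> const& tile_sizes, std::size_t target) {
  std::vector<std::size_t> batches;
  std::size_t current = 0;
  for (std::size_t tile : tile_sizes) {
    // Close the open batch before this tile would push it over the target.
    if (current > 0 && current + tile > target) {
      batches.push_back(current);
      current = 0;
    }
    // One-tile floor: a lone tile larger than the target still forms a batch.
    current += tile;
  }
  if (current > 0) batches.push_back(current);
  return batches;
}
```

With four 236-wide tiles and a target of 243, this sketch yields four batches of 236 rather than two of 472; the batch count drops to two only once the target reaches 472, a multiple of the tile size.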

The result-shape shaped-product path (result.hpp::apply_shaped_product) and its eval test use TiledArray's dot_inner (ToT*ToT->T) expression, added after the previously pinned b8c1d75 -- so the Linux/MacOS Build CI failed to compile test_eval_ta.cpp ("no member named 'dot_inner'"). Bump to f20abfb44 (the tag MPQC tracks) which provides dot_inner. Forward bump (53 commits); MADNESS follows transitively via TiledArray.

evaleev added 7 commits June 25, 2026 22:07

evaleev force-pushed the feature/result-shape-nonnull-pair-constraints branch from 9f4ccb9 to 1cb3052 Compare June 29, 2026 17:53

Base automatically changed from feature/cost-model-batch-aware to master June 29, 2026 19:04

evaleev added 2 commits June 29, 2026 23:32

evaleev wants to merge 9 commits into master from feature/result-shape-nonnull-pair-constraints
