TA eval: result-shape-constraint shaped-product hook (nonnull-pair) #565
Open
evaleev wants to merge 9 commits into
Adds TEST_CASE("shape_spike_tot_general_product", "[shape-spike]") that
de-risks the core assumption of the result-shape-constraints feature: that a
ToT general product can be evaluated through TiledArray's standard expression
layer (A(la) * B(ra)) with an imposed SparseShape via .set_shape(s), rather
than through TA::einsum.
The test uses the same ToT*ToT->ToT annotation as the existing
ToT_times_ToT_to_ToT section (contraction over outer i_3 and inner a_4;
Hadamard i_1,i_2; result outer (i_2,i_1)), but with SparsePolicy and a
multi-tile outer TiledRange (2 tiles per occ mode) so that 4 outer result
tiles exist and tile (0,0) can be masked to zero by the imposed shape.
Outcome: PASS -- (A(la)*B(ra)).set_shape(s) evaluates without throwing,
honors the imposed mask (tile (0,0) is absent in the result), and the
surviving tiles match TA::einsum to floating-point precision.
130 assertions pass.
Adds two more [shape-spike] test cases covering the other two contraction kinds that the SeQuant TA backend's prod() emits as shape-eligible intermediates:

Case A (shape_spike_T_times_ToT_general_product): T x ToT -> ToT mixed operand product (flat DF-integral-like g times PNO-coefficient-like ToT C). (T_op(la) * ToT_op(ra)).set_shape(s) evaluates, honors the imposed SparseShape, and matches the TA::einsum baseline on surviving tiles.

Case B (shape_spike_ToT_inner_contraction_to_flat_T): ToT x ToT with the inner (composite) indices fully contracted and the outer (occ) indices surviving, denesting to a flat tensor-of-scalars result -- the einsum<DeNest::True> / dot_inner path (result.hpp:581). The standard-layer equivalent is the .dot_inner() expression; DotInnerExpr derives from Expr so it exposes set_shape(), and the override is honored: C(c) = A(a + inner.a).dot_inner(B(b + inner.b)).set_shape(s); evaluates, honors the shape, and matches the einsum<DeNest::True> baseline.

Both PASS. 280 assertions across 3 shape-spike cases.
…ct site

Add TAEvalContext (SeQuant/core/eval/backends/tiledarray/eval_context.hpp) holding a result_shape_provider callback (node x trange -> optional<SparseShape>). Thread it to the binary-Product site in evaluate() via a new type-erased product_node_visitor field on CacheManager: the visitor is invoked with std::any(std::cref(node)) at each Product node before prod() is called. An empty visitor (the default) is a no-op; existing callers compile and behave identically.

Chosen mechanism: cache-carried visitor (option a from the design brief). Rationale: CacheManager is already threaded to the binary-product site (custom_evaluator follows the same pattern); no evaluate() signature changes are needed; the std::any wrapping keeps TA types out of generic eval headers.

Test [shape-provider]: ToT*ToT->ToT eval with a provider that increments a counter and returns nullopt; asserts counter>=1 (provider reached) and that the result equals the no-visitor reference (nullopt => no behavior change).
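The cache-carried, type-erased visitor described in this commit message can be illustrated with a minimal standard-library sketch. Everything named Toy* here (ToyNode, ToyCache, toy_product, make_counting_visitor) is a hypothetical stand-in, not SeQuant's actual types; only the std::any(std::cref(node)) wrapping and the "empty visitor is a no-op" behavior are taken from the text above.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for an evaluation-tree node (not SeQuant's type).
struct ToyNode {
  std::string label;
};

// Generic layer: sees only std::any, never a backend-specific type.
struct ToyCache {
  std::function<void(std::any const&)> product_node_visitor;  // empty by default
};

// Generic product site: notify the visitor (if set) before forming the product.
inline int toy_product(ToyCache const& cache, ToyNode const& node, int lhs,
                       int rhs) {
  if (cache.product_node_visitor)
    cache.product_node_visitor(std::any(std::cref(node)));
  return lhs * rhs;  // an empty visitor leaves this path untouched
}

// Backend layer: recovers the concrete node type from the std::any and
// counts each successful visit.
inline std::function<void(std::any const&)> make_counting_visitor(int& counter) {
  return [&counter](std::any const& erased) {
    auto const* ref =
        std::any_cast<std::reference_wrapper<ToyNode const>>(&erased);
    if (ref != nullptr) ++counter;  // wrong payload type: silently ignored
  };
}
```

In this toy, a visitor that only counts plays the role of the provider returning nullopt: the product result is the same with or without it, which mirrors the [shape-provider] test's two assertions.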
…ion layer

Grow the Task 1 product-node seam into a shaped-product hook that actually applies a provider-returned TA::SparseShape. The hook (CacheManager::shaped_product_hook_) is type-erased as ResultPtr(any node, Result const& left, Result const& right, ann), so eval.hpp and cache_manager.hpp stay free of TiledArray types; all TA specifics (trange computation, provider call, set_shape) live in the TA backend (eval_context.hpp + result.hpp).

At the binary-Product site, eval consults the hook before prod(): a non-null return replaces the product, a null return (empty hook or provider nullopt) falls through to the existing prod(). Default-empty => byte-identical.

result.hpp gains:
- detail::result_outer_trange / outer_annot_labels: build the result outer TiledRange by matching result outer labels to operand TiledRange1's;
- result_outer_trange_from_results: the same from type-erased operands;
- apply_shaped_product<NumericT,PolicyT>: emits both Task-0 forms selected by operand nesting / de_nest --- (lhs(la) * rhs(ra)).set_shape(s) for general products (T*T, T*ToT, ToT*ToT->ToT) and lhs(la).dot_inner(rhs(ra)).set_shape(s) for the DeNest::True ToT*ToT->flat path --- fencing before return so the shape outlives the lazy assignment.

TAEvalContext::make_hook<NumericT,PolicyT> builds the hook and captures the provider BY VALUE (Task 1 review minor) so it does not dangle on ctx.

Tests [shape-provider]: a real shape (zero an outer tile) on both a *-form general product and a dot_inner denest-to-flat product -- zeroed tile is_zero, survivors equal the unshaped einsum baseline; plus full-ones no-op, nullopt decline, and no-hook cases all equal to the unshaped reference.
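The control flow at the product site (consult the hook, use a non-null return, otherwise fall through to the ordinary product) can be sketched without TiledArray. This is a toy under stated assumptions: a "result" is a vector with one value per tile, a "shape" is a keep/zero mask, and all Toy* names and make_toy_hook are hypothetical; the real hook signature also carries the node and annotation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

using ToyResult = std::vector<double>;  // one value per "tile"
using ToyResultPtr = std::shared_ptr<ToyResult>;
using ToyMask = std::vector<bool>;      // true = keep tile, false = zero tile
using ToyProvider = std::function<std::optional<ToyMask>(std::size_t ntiles)>;
using ToyHook = std::function<ToyResultPtr(ToyResult const&, ToyResult const&)>;

// The ordinary (unshaped) product; operands are assumed to have equal size.
inline ToyResultPtr toy_prod(ToyResult const& l, ToyResult const& r) {
  auto out = std::make_shared<ToyResult>(l.size());
  for (std::size_t i = 0; i < l.size(); ++i) (*out)[i] = l[i] * r[i];
  return out;
}

// Builds the hook; the provider is captured BY VALUE so the hook does not
// depend on the lifetime of whatever object handed it over.
inline ToyHook make_toy_hook(ToyProvider provider) {
  return [provider = std::move(provider)](ToyResult const& l,
                                          ToyResult const& r) -> ToyResultPtr {
    if (!provider) return nullptr;                  // nothing to consult
    std::optional<ToyMask> mask = provider(l.size());
    if (!mask || mask->size() != l.size()) return nullptr;  // decline
    ToyResultPtr out = toy_prod(l, r);
    for (std::size_t i = 0; i < out->size(); ++i)
      if (!(*mask)[i]) (*out)[i] = 0.0;             // impose the mask
    return out;
  };
}

// Product site: a non-null hook return replaces the product; a null return
// (empty hook or declined) falls through to the ordinary product.
inline ToyResultPtr toy_evaluate_product(ToyHook const& hook,
                                         ToyResult const& l,
                                         ToyResult const& r) {
  if (hook)
    if (ToyResultPtr shaped = hook(l, r)) return shaped;
  return toy_prod(l, r);
}
```

The three cases exercised by the commit's tests map onto this toy directly: no hook, a provider that returns nullopt, and a provider that zeroes one tile while leaving the survivors equal to the unshaped reference.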
A product may legitimately have a scalar (ResultScalar) operand, which carries no TiledRange; computing the result outer trange would mis-cast it. Such a product has no outer tensor to shape, so decline early (return null -> normal prod) before the trange computation. Required once a real result_shape_provider is active.
The result-shape-constraint hook (make_hook) and its apply path (result_outer_trange_from_results, apply_shaped_product) hardcoded the nested (ToT) operand kind as DistArray<Tensor<Tensor<NumericT>>>. The CSV/PNO path produces ToT operands with an arena-pinned inner tile (DistArray<Tensor<ArenaTensor<NumericT>>>), a distinct (exact-type-id) Result kind, so the is_tensor_like guard declined every CSV intermediate -- including the (g.C)(g.C) giant the feature targets -- before the provider was consulted.

Add an InnerTileT template parameter (default TA::Tensor<NumericT>, so existing behavior is unchanged) threaded through make_hook -> result_outer_trange_from_results / apply_shaped_product, so the consumer can instantiate the hook with the arena inner tile and have CSV ToT intermediates recognized and shaped.

Also: TiledArray's expression-layer general ToT product cannot emit a result whose inner annotation needs a non-identity permutation (cont_engine throws). Detect that case (tot_product_needs_inner_reorder) and decline so the eval falls through to the unshaped einsum prod() (which handles the reorder) -- lossless, just not shaped on those nodes.

With this, the cross-pair (g.C)(g.C) giant is shaped to its surviving-pair support (3x smaller in a water-trimer test: 23.65 MB -> 7.88 MB), energy preserved to ~1e-14.
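Why an exact-type-id guard misses the arena-inner-tile kind, and how a defaulted InnerTileT parameter fixes that without changing existing instantiations, can be shown with a small sketch. The Toy* templates and is_nested_operand are hypothetical stand-ins for the TiledArray types and the backend's guard; the real check may be structured differently.

```cpp
#include <any>
#include <cassert>
#include <typeinfo>

// Hypothetical stand-ins for Tensor, ArenaTensor and DistArray.
template <typename T> struct ToyTensor {};
template <typename T> struct ToyArenaTensor {};
template <typename Tile> struct ToyDistArray {};

// Exact-type guard for a nested (tensor-of-tensors) operand. The inner tile
// defaults to ToyTensor<NumericT>, so callers that do not name InnerTileT
// get the same check as before the parameter existed.
template <typename NumericT, typename InnerTileT = ToyTensor<NumericT>>
bool is_nested_operand(std::any const& operand) {
  using Expected = ToyDistArray<ToyTensor<InnerTileT>>;
  return operand.type() == typeid(Expected);  // exact match only
}
```

With the default, an operand whose inner tile is ToyArenaTensor<double> is rejected; instantiating the guard with that inner tile accepts it, which is the toy analogue of the consumer instantiating the hook with the arena inner tile.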
…mized builds

The "quadratic bubble" test ran single_term_opt on a 12-leaf network ~29 times, only one of which was ever asserted; the rest were a diagnostic std::wcout sweep. In Debug the DP is ~100x slower (~100 s/call, ~48 min total), exceeding CTest's default timeout and failing CI's Debug builds (and cancelling the Valgrind/Sanitizer jobs).

Collapse the sweep to 5 verified early-K/late-K crossover assertions and gate the test on __OPTIMIZE__. NDEBUG is defined in every SeQuant build type (asserts use SEQUANT_ASSERT_BEHAVIOR_), so it cannot distinguish Debug from Release; __OPTIMIZE__ is set by GCC/Clang only at -O1+. The test now runs in ~4 s in Release and is excluded entirely at -O0.
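A minimal sketch of gating work on __OPTIMIZE__ follows. The names kOptimizedBuild and expensive_sweep_calls are hypothetical; the actual change gates a whole test case, which would more likely wrap the test definition itself in the preprocessor conditional.

```cpp
#include <cassert>

// __OPTIMIZE__ is predefined by GCC and Clang in optimizing compilations
// (-O1 and above); it is absent at -O0, independent of NDEBUG.
#if defined(__OPTIMIZE__)
inline constexpr bool kOptimizedBuild = true;
#else
inline constexpr bool kOptimizedBuild = false;
#endif

// Returns how many expensive calls would actually run: all of them in an
// optimized build, none otherwise.
inline int expensive_sweep_calls(int requested) {
  return kOptimizedBuild ? requested : 0;
}
```

Compilers that never define __OPTIMIZE__ take the unoptimized branch in this sketch, so the expensive path is skipped there as well.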
9f4ccb9 to 1cb3052
mode_batches_of_trange1 closed a batch as soon as the accumulated whole-tile size reached or exceeded target_batch_size, so the realized batch could exceed the target by nearly a full tile: any target a hair above the tile size rounded UP to two tiles, doubling the batch. That defeats the memory bound the target is meant to enforce and, for CSV/PNO giant intermediates, doubled the materialized aux (K) slice (e.g. aux_target_size=243 with 236-wide K tiles -> 472-wide batches), exposing a TiledArray SUMMA sparse-broadcast edge case.

Close the batch before a tile would push it over the target, so the realized batch never exceeds target_batch_size except for the one-tile floor (a lone tile larger than the target). The batch count now changes only at multiples of the tile size.

Docs updated to reflect the upper-bound (<=) semantics at the BatchPolicy interface, the runtime evaluator, and the optimizer cost model.
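The difference between the two closing rules can be reproduced with a short sketch that works on plain tile sizes. Both functions are hypothetical illustrations (toy_batches_old, toy_batches_new), not the signature of mode_batches_of_trange1; they return only the realized size of each batch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Old rule: add the tile first, close once the batch reaches the target.
// The realized batch can overshoot the target by almost a full tile.
inline std::vector<std::size_t> toy_batches_old(
    std::vector<std::size_t> const& tiles, std::size_t target) {
  std::vector<std::size_t> batches;
  std::size_t current = 0;
  for (std::size_t tile : tiles) {
    current += tile;
    if (current >= target) {
      batches.push_back(current);
      current = 0;
    }
  }
  if (current > 0) batches.push_back(current);
  return batches;
}

// New rule: close BEFORE a tile would push the batch over the target.
// `current > 0` keeps the one-tile floor: a lone tile larger than the
// target still forms a batch by itself.
inline std::vector<std::size_t> toy_batches_new(
    std::vector<std::size_t> const& tiles, std::size_t target) {
  std::vector<std::size_t> batches;
  std::size_t current = 0;
  for (std::size_t tile : tiles) {
    if (current > 0 && current + tile > target) {
      batches.push_back(current);
      current = 0;
    }
    current += tile;
  }
  if (current > 0) batches.push_back(current);
  return batches;
}
```

For four 236-wide tiles and a target of 243, the old rule yields two 472-wide batches while the new rule yields four 236-wide batches; raising the target to 472 (a multiple of the tile size) is what makes the new rule pair tiles up.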
The result-shape shaped-product path (result.hpp::apply_shaped_product) and its
eval test use TiledArray's dot_inner (ToT*ToT->T) expression, added after the
previously pinned b8c1d75 -- so the Linux/MacOS Build CI failed to compile
test_eval_ta.cpp ("no member named 'dot_inner'"). Bump to f20abfb44 (the tag
MPQC tracks) which provides dot_inner. Forward bump (53 commits); MADNESS
follows transitively via TiledArray.
SeQuant half of the nonnull-pair result-shape-constraint feature (the mpqc side is ValeevGroup/mpqc4 PR for feature/result-shape-nonnull-pair-constraints).

Stacked on #559 (feature/cost-model-batch-aware). Base is set to that branch so the diff is just the 6 result-shape commits; retarget to master once #559 merges.

What this adds

A method-supplied, opaque result-shape provider reached through the TA backend context, letting a consumer impose a TA::SparseShape on a binary-Product node's result during sequant::evaluate. Generic eval/CacheManager stay TA-free; all TA specifics live in TAEvalContext + the hook closure.

Commits:
- eeca28641, dd1706c14 -- de-risk spikes: standard-layer ToT product + T×ToT + dot_inner denest honor an imposed SparseShape.
- 47a3b4607 -- thread an optional TA result-shape provider to the binary-product site (type-erased shaped_product_hook on CacheManager; eval.hpp/cache_manager.hpp name only ResultPtr/Result/std::any).
- bcbe360d1 -- emit the shape-constrained product via the standard expression layer ((l*r).set_shape(s) / dot_inner(...).set_shape(s)); empty hook ⇒ byte-identical default.
- dd6847195 -- make_hook declines scalar-operand products (no TiledRange) before the trange computation, fixing a segfault the moment a real provider is active.
- 3709ee68f -- recognize the arena-inner-tile ToT kind (DistArray<Tensor<ArenaTensor<NumericT>>>) via an InnerTileT template parameter (default Tensor<NumericT>, so existing behavior is unchanged), so the hook fires on CSV/PNO intermediates instead of declining them; plus a graceful decline (tot_product_needs_inner_reorder) when the ToT general product would need a non-identity inner result permutation TA can't yet emit (falls through to einsum prod(), lossless).

Scope / safety

… nullptr immediately.

Tested

Eval-level shape spikes (560 assertions). End-to-end via the mpqc consumer: closed-shell CSV-CCk (PNO-CCSD) is lossless ON vs OFF and the targeted (g.C)(g.C) giant shrinks to its surviving-pair support (3× on a water-trimer test), with energy preserved to ~1e-14.

🤖 Generated with Claude Code
https://claude.ai/code/session_01Y9QnUcKzvPp5bJSS5hvCyc