Add executor-backed and MPI batch evaluators#685
Open
hmgaudecker wants to merge 8 commits into
Adds `executor_batch_evaluator(executor)`, which adapts any `concurrent.futures.Executor` (ProcessPoolExecutor, ThreadPoolExecutor, mpi4py.futures.MPIPoolExecutor, …) to optimagic's BatchEvaluator protocol. A module-level `_CloudpickleTask` wrapper serializes the criterion with cloudpickle so closures and locally defined functions survive any executor's plain-pickle transport (e.g. spawn-based process pools). Adds the named `mpi_batch_evaluator`, symmetric to joblib/pathos/threading: it lazily imports mpi4py (optional `optimagic[mpi]` extra), configures cloudpickle as MPI's serializer, caches a single module-level MPIPoolExecutor across batches, and raises a clear error when no worker ranks are available (program not launched via `python -m mpi4py.futures`). It delegates the actual mapping to executor_batch_evaluator. Registers "mpi" in BatchEvaluatorLiteral and process_batch_evaluator. Adds a how-to guide on distributed optimization with MPI and a CHANGES entry. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a pixi `mpi` feature (mpi4py + mpich, Linux-only) and a `tests-mpi-py314` environment, an `mpi` pytest marker, and an integration test that launches `_mpi_helper.py` under `mpiexec -n 3 python -m mpi4py.futures`. The helper fans a locally defined closure out through `mpi_batch_evaluator` and asserts the results come back in input order, exercising cloudpickle-over-MPI transport end to end. The test skips only when mpiexec / mpi4py are absent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a `run-tests-mpi` CI job (ubuntu-latest, `tests-mpi-py314`) that first asserts mpiexec + mpi4py are present so the MPI tests cannot silently skip, then runs them. Modernizes the shared CI tooling: setup-pixi v0.9.4 → v0.9.6, pinned pixi-version v0.65.0 → v0.70.2 (the pixi that wrote the v7 lock), actions/checkout v4 → v5, and codecov-action v4 → v5 across all jobs. Regenerates pixi.lock to v7 with the new MPI environment. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
It is the only entry point for a bring-your-own-executor batch evaluator (the named evaluators are reachable by string), so it belongs in the public namespace next to the BatchEvaluator protocol. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Driver/worker distributed optimization needs the worker ranks to evaluate the driver's broadcast points through the exact same conversion and value post-processing the driver uses — otherwise a worker handed the raw user criterion receives the optimizer's internal parameter vector instead of the external params and fails. `build_internal_fun` exposes that internal `x -> (value, history_entry, log_entry)` callable so a worker can build an interchangeable evaluator on its own resources from the same params/bounds/constraints/algorithm. The per-point batch evaluation is now pure (no logging side effect); the process that owns the history and log database records every point — including those evaluated on a remote worker — exactly once, stamping each with the running optimization step. This also makes the joblib path log from the parent rather than concurrently from each subprocess. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A point evaluated on an unstepped worker arrives with step=None; the recording process must attribute it to the running optimization step. The new test fails if the re-stamp is dropped, locking the behavior the MPI driver/worker split relies on. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stacked on #684 (the `batch_evaluator` kwarg). Review/merge #684 first; this PR's diff is against that branch.

What
Adds two batch evaluators so a single-driver optimizer can fan its batched criterion evaluations across a `concurrent.futures.Executor`:

- `executor_batch_evaluator(executor)`: a generic factory returning a `BatchEvaluator`-protocol callable backed by any executor (`ProcessPoolExecutor`, `ThreadPoolExecutor`, Dask, `mpi4py.futures.MPIPoolExecutor`, …). Order-preserving via `executor.map`; reuses the existing `unpack`/`catch` machinery so `error_handling` ("raise" reraises, "continue" → traceback string) matches the other evaluators.
- `mpi_batch_evaluator`: a named peer to `joblib`/`pathos`/`threading`, registered so `batch_evaluator="mpi"` works (added to `BatchEvaluatorLiteral` and `process_batch_evaluator`). Lazily imports `mpi4py` (clear error → optional `optimagic[mpi]` extra), configures cloudpickle as the MPI serializer, and caches a module-level `MPIPoolExecutor` (one pool per process; a per-batch pool would be catastrophic).
Plus an `optimagic[mpi]` optional extra and a how-to (`how_to_distributed_optimization`) that explains the single-driver model and the `python -m mpi4py.futures` launch precondition.
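To make the adapter idea concrete, here is a minimal sketch of an executor-backed batch evaluator. It is not the code in this PR: `sketch_executor_batch_evaluator`, `square`, and `fragile` are hypothetical names, the signature is a simplified stand-in for the `BatchEvaluator` protocol, and the `unpack` handling is left out. It only illustrates order preservation through `executor.map` and the raise/continue behaviour described above.

```python
import traceback
from concurrent.futures import Executor, ThreadPoolExecutor


def sketch_executor_batch_evaluator(executor: Executor):
    """Hypothetical stand-in for the factory; not the PR's implementation."""

    def batch_evaluator(func, arguments, error_handling="continue"):
        if error_handling not in ("raise", "continue"):
            raise ValueError("error_handling must be 'raise' or 'continue'")

        def guarded(arg):
            try:
                return func(arg)
            except Exception:
                if error_handling == "raise":
                    raise
                # "continue": hand back the traceback text instead of failing.
                return traceback.format_exc()

        # Executor.map yields results in the order of the inputs.
        # Note: `guarded` is a closure, so this sketch only works as-is with
        # thread pools; process pools would need a picklable wrapper.
        return list(executor.map(guarded, arguments))

    return batch_evaluator


def square(x):
    return x * x


def fragile(x):
    if x < 0:
        raise ValueError("negative input")
    return x * x


with ThreadPoolExecutor(max_workers=2) as pool:
    evaluate = sketch_executor_batch_evaluator(pool)
    squares = evaluate(square, [1, 2, 3])
    mixed = evaluate(fragile, [2, -1], error_handling="continue")
```

With `error_handling="continue"` the failing point comes back as a traceback string in its original position, so the caller can still match results to inputs.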
Why
The batched evaluations an algorithm like tranquilo requests were locked to in-process backends. On a cluster you want them spread across nodes. The right MPI pattern is a single optimizer on the driver rank with the worker ranks parked by the `mpi4py.futures` launcher, not running the optimizer on every rank (that diverges under floating-point nondeterminism). An executor's single-submitter model makes that the only expressible pattern, so the trap is structurally avoided.
Pickling (the crux)
stdlib `ProcessPoolExecutor` serializes tasks with plain `pickle` regardless of start method, so closures/locally-defined criteria (optimagic's `partial_func_of_params` output) would be rejected under `spawn`/`forkserver` and under MPI. A module-level `_CloudpickleTask` wrapper carries a `cloudpickle` payload that plain pickle transports intact, giving joblib/loky-level parity for any executor. cloudpickle is already a hard dependency, so this adds nothing new.
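The wrapper idea can be illustrated with a short sketch. This is an assumption about the general shape, not the PR's `_CloudpickleTask`: the class name `CloudpickleTaskSketch` and the helper `make_criterion` are made up for illustration. The closure is serialized by value with cloudpickle, and the resulting bytes travel inside an ordinary module-level object that plain pickle can handle.

```python
import pickle

import cloudpickle


class CloudpickleTaskSketch:
    """Module-level wrapper, so plain pickle can find the class by name."""

    def __init__(self, func):
        # Serialize the callable by value; the payload is just bytes.
        self._payload = cloudpickle.dumps(func)

    def __call__(self, *args, **kwargs):
        # Rebuild the callable on the receiving side, then call it.
        # (This sketch deserializes on every call; caching is omitted.)
        return cloudpickle.loads(self._payload)(*args, **kwargs)


def make_criterion(offset):
    def criterion(x):  # locally defined closure over `offset`
        return (x - offset) ** 2

    return criterion


criterion = make_criterion(offset=3)

# Plain pickle refuses the closure itself.
try:
    pickle.dumps(criterion)
    plain_pickle_ok = True
except (pickle.PicklingError, AttributeError):
    plain_pickle_ok = False

# The wrapped task survives a plain-pickle round trip.
task = CloudpickleTaskSketch(criterion)
restored = pickle.loads(pickle.dumps(task))
result = restored(5)
```

The round trip through `pickle.dumps`/`pickle.loads` stands in for the transport an executor would perform between processes.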
Tests
37 pass. Notably a closure evaluated through `ProcessPoolExecutor(mp_context=spawn)` (the case that fails without the wrapper), the ThreadPool path, raise/continue parity, `process_batch_evaluator("mpi")` resolution, and the no-mpi4py clear-error path. All local/deterministic; no cluster needed.
Points for reviewer attention
- `executor_batch_evaluator` has no string alias (unlike `"mpi"`), so it's only reachable via `from optimagic.batch_evaluators import executor_batch_evaluator`. Should it be re-exported at the top level (`optimagic.executor_batch_evaluator`) for discoverability? Left as-is for now.
- `mpi_batch_evaluator` caches one module-level `MPIPoolExecutor`, never explicitly shut down (dies with the process). No API to reset between independent optimizations in one process: fine for the one-optimization-per-process HPC model, but worth a conscious call.
- The no-worker check is `executor.num_workers == 0`. Best-effort: launched without `-m mpi4py.futures`, mpi4py may fall back to dynamic `MPI_Comm_spawn` rather than reporting 0. The how-to and the error message both state the launcher precondition explicitly. Could use a maintainer's eye with cluster access; I couldn't run real MPI.
Notes
- `CHANGES.md` references ``{gh}`NNN` ``; I'll replace with this PR's number once assigned.
- `pixi.lock` intentionally not regenerated: the `mpi` extra isn't pulled into any pixi environment (so `pixi install --frozen` passes against the current lock), and a `pixi lock` here wanted an unrelated repo-wide v6→v7 format upgrade. Left for the maintainer.
🤖 Generated with Claude Code