Conversation
…, fix for start_lpjml to not close coupled program to early, deprecation warning function
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #17 +/- ##
==========================================
+ Coverage 78.47% 81.43% +2.96%
==========================================
Files 7 7
Lines 1640 1988 +348
==========================================
+ Hits 1287 1619 +332
- Misses 353 369 +16 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR updates pycoupler for compatibility with newer pycopanlpjml/LPJmL coupling expectations, including a new default coupled model name, improved Slurm coupled-job handling, additional port/process cleanup, and expanded test coverage for utilities, run submission, and NetCDF export helpers.
Changes:
- Default coupled model name updated to
copan, with support for overriding via a coupled config. - Slurm coupled-job submission now patches
slurm.jcfto ensure the coupler process is waited on (and submits viasbatchwhen needed). - Adds NetCDF writing + grid transform utilities for
LPJmLData/LPJmLDataSet, plus substantial new tests.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
pycoupler/config.py |
Adjusts coupled config defaults/behavior (model default + coupled config support). |
pycoupler/coupler.py |
Adds port cleanup utilities; fixes historic years handling; improves input sending conversion logic. |
pycoupler/run.py |
Introduces start_lpjml and deprecated alias; patches Slurm submission flow for coupled runs. |
pycoupler/utils.py |
Adds standardized deprecation warning helper. |
pycoupler/data.py |
Implements transform() and NetCDF export helpers for LPJmL data structures. |
tests/test_config.py |
Updates expectations for new default coupled model. |
tests/test_couple.py |
Adjusts time assignment and historic years expectation in coupling test. |
tests/test_run.py |
Extends Slurm submission tests to cover slurm.jcf patching and sbatch flow. |
tests/test_run_additional.py |
Adds additional run-related tests incl. deprecated alias warning behavior. |
tests/test_utils.py |
Extends detect_io_type test coverage for invalid UTF-8 bytes. |
tests/test_utils_additional.py |
Adds coverage for create_subdirs and read_json. |
tests/test_data.py |
Adds comprehensive coverage for grid transforms and NetCDF writing. |
tests/test_coupler_utils.py |
Adds coverage for new port cleanup utilities. |
tests/data/invalid_utf8.bin |
Adds a sample binary file (currently appears redundant with the updated test). |
pycoupler/release.py |
Clarifies release script instructions. |
pycoupler/__init__.py |
Minor formatting cleanup. |
CITATION.cff |
Bumps version to 1.7.0. |
Comments suppressed due to low confidence (1)
pycoupler/run.py:265
slurm_jcf_diris documented/used as the directory whereslurm.jcfshould be written, but thelpjsubmitsubprocess is still executed withoutcwd=.... Iflpjsubmitwritesslurm.jcfto its working directory (typical),_patch_slurm_and_submit()will look inslurm_jcf_dir/slurm.jcfand fail even though the file exists elsewhere. Consider ensuring the directory exists and runninglpjsubmitwithcwd=slurm_jcf_dir(or otherwise directinglpjsubmitoutput) soslurm_jcf_pathmatches where the file is actually created.
# run in coupled mode and pass coupling program/model
needs_coupler_wait = bool(couple_to)
if slurm_jcf_dir is None:
slurm_jcf_dir = os.getcwd()
slurm_jcf_path = Path(slurm_jcf_dir) / "slurm.jcf"
couple_file = None
if couple_to:
python_path = "python3"
if venv_path:
python_path = os.path.join(venv_path, "bin/python")
if not os.path.isfile(python_path):
raise FileNotFoundError(
f"venv path contains no python binary at '{python_path}'."
)
bash_script = f"""#!/bin/bash
# Define the path to the config file
config_file="{config_file}"
# Call the Python script with the config file as an argument
{python_path} {couple_to} $config_file
"""
couple_file = f"{output_path}/copan_lpjml.sh"
with open(couple_file, "w") as file:
file.write(bash_script)
# Change the permissions of the file to make it executable
run(["chmod", "+x", couple_file])
cmd.extend(["-couple", couple_file])
if needs_coupler_wait:
cmd.append("-norun")
cmd.extend([str(ntasks), config_file])
# Intialize submit_status in higher scope
submit_status = None
# set LPJROOT to model_path to be able to call lpjsubmit
try:
os.environ["LPJROOT"] = config.model_path
# call lpjsubmit via subprocess and return status if successfull
submit_status = run(cmd, capture_output=True)
except Exception as e:
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
sieben-gea
left a comment
There was a problem hiding this comment.
This pull requests tries to do too many things at once. Please, in the future, split many unrelated changes like these into multiple PRs and, at the very least, into meaningful commits with proper commit messages.
I found that the code lacks coherence and has many subtle problems. In particular the changes to data.py seem half-done to me. It might be that those changes work in your particular use case, but they are not general enough to be used in production.
Most of the tests you submitted are also less meaningful then they might seems. Many of the only test if a particular function is called, which ignores their side effects and can not be considered a full test.
Please fully read all the changes you made and explain to me what you were trying to do (and also how these changes are related to the changes in pycopanlpjml, because I don’t yet see how).
| for inp in inputs: | ||
| sock_input = getattr(self.input, inp) | ||
| if not hasattr(sock_input, "__dict__"): | ||
| continue # skip scalars (e.g. delta_year in LPJmL v6) |
There was a problem hiding this comment.
I don’t understand this? What changed in LPJmL v6?
| @@ -558,6 +571,8 @@ def _set_input_sockets(self, inputs=[]): | |||
| """Set sockets for inputs and outputs (via corresponding ids)""" | |||
| for inp in inputs: | |||
| sock_input = getattr(self.input, inp) | |||
There was a problem hiding this comment.
Where is self.input set? We depend on it being in the config, but there is no method that sets it.
| def regrid(self, sim_path, model_path=None, country_code="BEL", overwrite=False): | ||
| def regrid( | ||
| self, sim_path, model_path=None, country_code="BEL", overwrite=False | ||
| ): # noqa: E501 |
There was a problem hiding this comment.
We should disable this error globally. It makes no sense, especially since we already enforce formatting through black (and I also am not a fan of enforcing such a short line length, because I like to fit more code on my screen and this is trading off vertical space with horizontal space). PEP 8 allows up to 99 characters in a line if this is consistent within a team.
| try: | ||
| # Find processes using the port | ||
| result = subprocess.run( | ||
| ["lsof", "-ti", f":{port}"], capture_output=True, text=True, timeout=5 |
There was a problem hiding this comment.
This is way to broad. Killing processes of other users is not acceptable, especially on the cluster. Try something more specific, like:
lsof -u $USER -a -i :<PORT NUMBER> -sTCP:LISTEN -t
Another thought: Why is a port even needed? If we communicate via sockets anyway, we should use socket files on linux (as discussed here: https://serverfault.com/questions/195328/unix-socket-vs-tcp-ip-hostport).
| if result.returncode == 0 and result.stdout.strip(): | ||
| pids = result.stdout.strip().split("\n") | ||
| killed_count = 0 | ||
| for pid in pids: | ||
| if pid.strip(): | ||
| try: | ||
| kill_result = subprocess.run( | ||
| ["kill", "-9", pid.strip()], | ||
| timeout=5, | ||
| capture_output=True, | ||
| ) | ||
| if kill_result.returncode == 0: | ||
| killed_count += 1 | ||
| except subprocess.TimeoutExpired: |
There was a problem hiding this comment.
All of this could have been a pipe: lsof … -t | xargs kill -9
| "Cannot write LPJmLData with a 'cell' dimension that lacks " # noqa: E501 | ||
| "'lon' and 'lat' coordinates." | ||
| ) | ||
| lpjml = lpjml.transform("lon_lat") |
There was a problem hiding this comment.
This is transforming the instance itself!
| _suppress_coordinate_fill(dataset) | ||
| if global_attrs: | ||
| dataset.attrs.update(dict(global_attrs)) | ||
| dataset.attrs.setdefault("Conventions", "CF-1.8") |
There was a problem hiding this comment.
Why is this not part of _ensure_cf_metadata() and have you checked this version is actually the one adhered to?
| kwargs = dict(kwargs) | ||
| per_variable = kwargs.pop("split_vars", per_variable) | ||
| kwargs.pop("file_prefix", None) | ||
| kwargs.pop("suffix", None) |
There was a problem hiding this comment.
Why do you remove these?
| return LPJmLMetaData(read_json(file_name)) | ||
|
|
||
|
|
||
| def _netcdf_encoding( |
There was a problem hiding this comment.
If you already created a function for this, please use it in LpjmlData as well.
|
|
||
| aligned = xr.Dataset(data_vars) | ||
| aligned.attrs.update(ds.attrs) | ||
| aligned.attrs.setdefault("Conventions", "CF-1.8") |
There was a problem hiding this comment.
Why would you do this here?
This MR introduces new features for the compatibility with the latest pycopanlpjml version: