En/review eg/losse val grad#39
Conversation
…ction instead of only a grad function. This is necessary for the augmented lagragian modifications in the parallel pull request, since in lbfgsb the val_and_grad option is more efficient and the only viable option.
| args = tuple_unraveler(dofs) | ||
| return {name: value for name, value in zip(self.args_names, args)} | ||
|
|
||
| self._dofs_to_pytree = _named_unraveler |
There was a problem hiding this comment.
This fixes the problem that we had, that the custom_loss did not have a named unraveler, only the composite loss, right?
There was a problem hiding this comment.
I would need to test this to make sure everything works properly. In theory, it makes sense, but I need more time to figure it out properly.
There was a problem hiding this comment.
Yes, The examples to be added after were technically working. I guess we could add more later?
There was a problem hiding this comment.
Added some tests on test_multiobjectives.py
| self._dofs_to_args = None | ||
|
|
||
| def _ensure_unravelers(self): | ||
| if self._starting_dofs is None or self._dofs_to_args is None or self._dofs_to_pytree is None: |
There was a problem hiding this comment.
Maybe use and instead of or
There was a problem hiding this comment.
I am not sure I understand this change. If none of those are defined donn't we always want to define them?
rogeriojorge
left a comment
There was a problem hiding this comment.
Mostly 3 minor stuff:
- losses.py:111, custom_loss.call_pytree():
dofs_to_pytree now returns a named dict, but call_pytree() still does self.fun(*dofs_pytree, **self.kwargs). If dofs_pytree is a dict, this passes the keys as positional args. I think this should either be:
args = tuple(dofs_pytree[name] for name in self.args_names)
return self.fun(*args, **self.kwargs)
or, if the custom loss functions are expected to accept keyword args:
return self.fun(**dofs_pytree, **self.kwargs)
- losses.py:132-136, custom_loss.grad_pytree()
same things. It currently does *dofs_pytree, which will be wrong if dofs_pytree is the new named dict. I would build
args = tuple(dofs_pytree[name] for name in self.args_names)
and differentiate with those positional args, then put the gradients back into buffer.
- losses.py:80, _ensure_unravelers
I think "or" is actually the right condition here. Using "and" would only rebuild when all of them are None, which could leave _dofs_to_args or _dofs_to_pytree unset. But not sure.
A small test for custom_loss.value_and_grad() with two dependencies, and dofs_to_pytree() followed by call_pytree() / grad_pytree() would probably catch the dict-vs-tuple issue above.
|
I think I addressed all the comments for this PR. |
|
Can you just add one assertion for value_and_grad() itself in test_custom_loss_named_unraveler()? |
rogeriojorge
left a comment
There was a problem hiding this comment.
Can you just add one assertion for value_and_grad() itself in test_custom_loss_named_unraveler()?
Otherwise, it's good for me to merge
Modifying losses.py to allow custom_losses to have a val_and_grad function instead of only a grad function. This is necessary for the augmented lagragian modifications in the parallel pull request, since in lbfgsb the val_and_grad option is more efficient and the only viable option. Changes on losses.py.