Description
After #44 I expected there to be a fairly tidy step involving updating downstream consumption of this package's public methods, but I'm finding some surprises. (If your brain works better with examples, the concrete question I'm trying to get at is "where should I add cache-checking/model-downloading logic so that it's actually used?")
- `openff-nagl` never imports `openff.nagl_models`, so there isn't even code there to update to use dynamic fetching and caching
- `openff-toolkit`, for better or worse, uses `validate_nagl_model_path` as the chokepoint through which a model name (as a string that looks a lot like a PyTorch model) is magically transformed into a (full) path on disk that corresponds to a model which `openff.nagl.GNNModel` can gobble up
- some examples (but no released OpenFF packages) use `list_available_nagl_models`, but this is just to show what models are available. There isn't an obvious use of this method aside from just seeing what files are available. In any scientific work that I can think of, and even hastily-made examples, the user actually declares which model they wish to load.
A constraint we are arguably working under, for better or worse, is how the toolkit currently loads models. There are absolutely paths forward for changing that in favor of better plumbing, but the current behavior does imply that we could make `validate_nagl_model_path` the single point through which all of this logic flows and/or the function we point users toward. (We might want to rename this function, but that can be done in a backwards-compatible manner with an alias.) I have a little prototype that explores this idea in a manner that might not require any code changes in other packages (#58).
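
To make the shape of that idea concrete, here is a minimal sketch of a chokepoint function that checks a cache and only fetches on a miss. It is not the prototype in #58 and not the current implementation: the `_sketch` function names, the `cache_dir` and `fetch` parameters, and the example file name are all hypothetical, and the download step is injected as a callable so the sketch makes no claim about where models actually live.

```python
from pathlib import Path
from typing import Callable


def validate_nagl_model_path_sketch(
    model: str,
    cache_dir: Path,
    fetch: Callable[[str, Path], None],
) -> str:
    """Return a full on-disk path for `model`, fetching into the cache on a miss.

    Hypothetical sketch: `cache_dir` and `fetch` are assumptions made for
    illustration, not parameters of the real function.
    """
    # 1. If the caller already passed a path to an existing file, use it as-is.
    candidate = Path(model)
    if candidate.is_file():
        return str(candidate.resolve())

    # 2. Otherwise treat the input as a bare file name and look in the cache.
    cached = Path(cache_dir) / candidate.name
    if not cached.is_file():
        # 3. Cache miss: ask the injected fetcher to write the file there.
        cached.parent.mkdir(parents=True, exist_ok=True)
        fetch(candidate.name, cached)

    # 4. Fail loudly if the fetcher did not produce the file.
    if not cached.is_file():
        raise FileNotFoundError(f"Could not find or fetch model {model!r}")
    return str(cached.resolve())


# A rename can stay backwards compatible by keeping the old name as an alias
# (the new name here is hypothetical).
get_nagl_model_path_sketch = validate_nagl_model_path_sketch
```

Because the only thing a caller sees is "string in, full path out", downstream code that already goes through this function would not need to change if fetching and caching were added behind it.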