Now that we have zero_grad(), a single global optimizer may work better than multiple local ones, so try that as well.
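A minimal sketch of the two setups being compared, in plain Python. Everything here is a hypothetical stand-in (Param, SGD, backward, make_layers, train_step and the toy squared-error loss are assumptions for illustration, not this project's actual classes); the only thing taken from the note is that the optimizer exposes zero_grad().

```python
class Param:
    """Hypothetical scalar parameter: a value plus an accumulated gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class SGD:
    """Hypothetical minimal optimizer: plain gradient descent over some Params."""

    def __init__(self, params, lr=0.1):
        self.params = list(params)
        self.lr = lr

    def zero_grad(self):
        # Reset accumulated gradients for the parameters this optimizer owns.
        for p in self.params:
            p.grad = 0.0

    def step(self):
        # Apply one gradient-descent update to each owned parameter.
        for p in self.params:
            p.value -= self.lr * p.grad


def backward(params, target=0.0):
    # Toy loss sum((p - target)^2); gradients are accumulated, not overwritten,
    # which is why zero_grad() is needed before each step.
    for p in params:
        p.grad += 2.0 * (p.value - target)


def make_layers():
    # Two hypothetical "layers", each just a list of parameters.
    return [Param(1.0), Param(-2.0)], [Param(3.0)]


def train_step(optimizers, params):
    for opt in optimizers:
        opt.zero_grad()
    backward(params)
    for opt in optimizers:
        opt.step()


# Multiple local optimizers: one per layer.
layer_a, layer_b = make_layers()
train_step([SGD(layer_a), SGD(layer_b)], layer_a + layer_b)

# One global optimizer: owns every parameter, so a single zero_grad()/step() pair.
layer_c, layer_d = make_layers()
train_step([SGD(layer_c + layer_d)], layer_c + layer_d)
```

With plain gradient descent like this, both setups apply identical updates, so the global variant mainly simplifies bookkeeping. Any real difference would come from optimizer state or settings shared across all parameters, which this sketch does not model.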