In the code, the gradient is computed as `dx = self._alpha * np.power(1 - pt, self._gamma - 1) * (self._gamma * (-1 * pt * pro_) * np.log(pt) + pro_ * (1 - pt)) * 1.0`. However, by my derivation it should be `dx = self._alpha * np.power(1 - pt, self._gamma - 1) * (self._gamma * (-1 * pt * pro_) * np.log(pt) + pt * (1 - pt)) * 1.0`. The only difference is the last term inside the parentheses: `pro_ * (1 - pt)` in the code versus `pt * (1 - pt)` in my version. Is that right?
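One way to check the two forms is to compare each against a finite-difference gradient. The sketch below is not the repository's code. It assumes the loss is `-alpha * (1 - pt)**gamma * log(pt)` for a single sample, that `pt` is the softmax probability of the target class, and that `pro_` holds the softmax probabilities of the other classes. The helper names (`softmax`, `focal_loss`, `numeric_grad`, `candidate_grads`) and the values of `alpha`, `gamma`, and `x` are made up for illustration. Only the non-target logits are compared, since how `pro_` is set for the target class is not shown in the snippet.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector of logits.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def focal_loss(x, target, alpha=0.25, gamma=2.0):
    # Assumed loss definition: -alpha * (1 - pt)^gamma * log(pt).
    pt = softmax(x)[target]
    return -alpha * (1.0 - pt) ** gamma * np.log(pt)

def numeric_grad(x, target, alpha=0.25, gamma=2.0, eps=1e-6):
    # Central finite differences with respect to each logit.
    g = np.zeros_like(x)
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        g[j] = (focal_loss(x + step, target, alpha, gamma)
                - focal_loss(x - step, target, alpha, gamma)) / (2 * eps)
    return g

def candidate_grads(x, target, alpha=0.25, gamma=2.0):
    # Assumption: pro_ is the softmax output (only non-target entries are used below).
    pro_ = softmax(x)
    pt = pro_[target]
    common = alpha * np.power(1 - pt, gamma - 1)
    # Form found in the code.
    dx_code = common * (gamma * (-1 * pt * pro_) * np.log(pt) + pro_ * (1 - pt))
    # Form proposed in the question.
    dx_proposed = common * (gamma * (-1 * pt * pro_) * np.log(pt) + pt * (1 - pt))
    return dx_code, dx_proposed

x = np.array([0.5, -1.2, 2.0, 0.3])
target = 2
others = np.arange(x.size) != target  # mask selecting the non-target logits

num = numeric_grad(x, target)
dx_code, dx_proposed = candidate_grads(x, target)

err_code = np.max(np.abs(dx_code[others] - num[others]))
err_proposed = np.max(np.abs(dx_proposed[others] - num[others]))
print("max abs error, form in the code: ", err_code)
print("max abs error, proposed form:    ", err_proposed)
```

Whichever form gives an error near the finite-difference noise floor is the one consistent with the assumed loss. If the actual layer defines the loss or `pro_` differently, the check needs to be adapted accordingly.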