Replies: 1 comment 1 reply
Nice summary!! Just a comment:
this is correct, but in my PR it's mainly because of my usage of single precision. With double precision there's no difference anymore. Sorry for the confusion here!

Another aspect to consider: I expect an overhead from these transformations that may be noticeable in runtime. Trigonometric functions (which Minuit uses for the transformation when two-sided bounds are present) are expensive, and I think(?) they are called twice per bounded parameter and per update step (external -> internal and back), so typically a few thousand times. I'm not sure why pyhf enforces two-sided bounds, maybe because one can't sample a free-floating parameter without bounds from a Uniform distribution? But it might be worth investigating whether that requirement could be removed.
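For reference, a minimal sketch of the doubly-bounded transformation described in section 1.3.1 of the MINUIT user's guide (the function names here are illustrative, not part of Minuit's API):

```python
import numpy as np

def external_to_internal(x_ext, lo, hi):
    # map a value bounded in [lo, hi] to an unbounded internal value
    return np.arcsin(2.0 * (x_ext - lo) / (hi - lo) - 1.0)

def internal_to_external(x_int, lo, hi):
    # map an unbounded internal value back into [lo, hi]
    return lo + (hi - lo) / 2.0 * (np.sin(x_int) + 1.0)

# one external -> internal -> external round trip costs one arcsin and one sin
# per bounded parameter, and it is repeated on every minimiser update step
x = 0.999999
print(internal_to_external(external_to_internal(x, 0.0, 1.0), 0.0, 1.0))
```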
During the development of alternative model uncertainty compute methods in `cabinetry` (see scikit-hep/cabinetry#221 and scikit-hep/cabinetry#535), we ran into some interesting behaviour due to `MINUIT`'s handling of parameters with bounds.

The external variables (the ones exposed to the user) corresponding to bounded parameters in `MINUIT` undergo a transformation into internal ones, such that the internal variables are free to take any value while the external ones stay confined within their limits. This non-linear transformation is described in section 1.3.1 of the MINUIT user's guide. As a result of this transformation, a model that is perfectly linear as a function of a bounded parameter becomes non-linear during the minimisation.
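As a small illustration of this point (a sketch, not taken from the linked issues): take an expectation that is perfectly linear in a parameter `mu` bounded to [0, 10] (the `pyhf` default for a normalisation factor) and view it as a function of the internal variable, assuming the doubly-bounded sin transformation from the manual:

```python
import numpy as np

def internal_to_external(t, lo, hi):
    # MINUIT-style transformation for a two-sided bound [lo, hi]
    return lo + (hi - lo) / 2.0 * (np.sin(t) + 1.0)

def yield_linear(mu):
    # a perfectly linear expectation in the external parameter, e.g. 10 events * mu
    return 10.0 * mu

t = np.linspace(-1.0, 1.0, 5)           # internal coordinate
mu = internal_to_external(t, 0.0, 10.0)  # corresponding external values
y = yield_linear(mu)

# finite-difference second derivative with respect to the internal variable:
# it is non-zero, i.e. the function seen by the minimiser is no longer linear
d2 = np.diff(y, 2) / np.diff(t)[0] ** 2
print(d2)
```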
This can introduce three effects, which were reported by @pfackeldey in feat: add parameter transformations pfackeldey/evermore#28 (these differences were due to the choice of single vs. double precision, but differences still exist near the parameter bounds due to the round trip). We focus on no. 2.

The parameter errors from `minuit.errors` are reported by transforming the interval around the parameter from the internal `MINUIT` coordinate space to the external user space, via the Jacobian of the transformation. On the other hand, the covariance matrix reported by `MINUIT` through `minuit.covariance` (as estimated by `HESSE` or `MIGRAD`) is transformed into user space using only the derivative at the best-fit point (https://root-forum.cern.ch/t/fit-errors-from-covariance-matrix/20772).

When the parameter interval is far from the parameter limits, the reported error and the square root of the diagonal covariance elements are almost the same, since the linear approximation in the covariance transformation holds. Once the parameter approaches its limit, the transformation function becomes non-linear, the Jacobian deviates from unity, and the two methods start reporting very different errors. Both methods are approximations, so there is no clear preference for which one is best, and this is why `MINUIT` in general recommends not using limits at all, unless they are absolutely necessary to maintain a physical parameter value.
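A minimal sketch of the kind of comparison described above, using a standalone parabolic cost instead of the `pyhf` model (the minimum, width and bound values are illustrative):

```python
import numpy as np
from iminuit import Minuit

def cost(x):
    # parabolic 2*NLL-like cost with minimum at 0.9 and width 0.1
    return ((x - 0.9) / 0.1) ** 2

cost.errordef = Minuit.LEAST_SQUARES  # cost behaves like a chi-square

m = Minuit(cost, x=0.5)
m.limits["x"] = (0.0, 1.0)  # upper bound sits about one sigma above the minimum
m.migrad()
m.hesse()

# the two error estimates agree far from the limits and can drift apart near them
print("minuit.errors   :", m.errors["x"])
print("sqrt(diag(cov)) :", np.sqrt(m.covariance["x", "x"]))
```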
I wrote a very simple linear model, with 1 sample, 1 channel and 1 normalisation factor, to demonstrate this effect (code shown in Snippet 1 below). The results are shown in this plot:
The plot demonstrates that, for a given fit result, the more we shrink the parameter bounds towards that result, the more we see the strange behaviour due to the parameter transformations.

In Snippet 2, a fit is performed directly with `minuit` using the same `pyhf` model, testing the effect of the transformations when we impose a one-sided parameter bound and when we remove the parameter bounds completely. The results are:

The reason a one-sided (lower) bound gives a different result from two-sided bounds, despite the fitted interval being close to the lower bound, is that `MINUIT` uses a different transformation for one-sided parameter bounds, which is implemented here.
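For completeness, a sketch of the singly-bounded mapping from the MINUIT user's guide (lower limit `a` only; function names again illustrative, and only one sign branch of the internal variable is shown):

```python
import numpy as np

def external_to_internal_lower(x_ext, a):
    # lower-limit-only transformation from the MINUIT manual, valid for x_ext >= a
    return np.sqrt((x_ext - a + 1.0) ** 2 - 1.0)

def internal_to_external_lower(x_int, a):
    # inverse mapping back into the allowed region [a, inf)
    return a - 1.0 + np.sqrt(x_int ** 2 + 1.0)

# the round trip reproduces the external value; no trigonometric functions are involved
print(internal_to_external_lower(external_to_internal_lower(1.5, 0.0), 0.0))
```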
Snippet 1
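(Snippet 1 itself is not reproduced here; the following is only an illustrative sketch of the kind of model described above, with 1 channel, 1 sample and 1 normalisation factor, and with the bounds passed explicitly so they can be shrunk towards the best-fit value. The yields and bound choices are assumptions.)

```python
import pyhf

pyhf.set_backend("numpy", "minuit")

# single channel, single sample, one unconstrained normalisation factor "mu"
spec = {
    "channels": [
        {
            "name": "ch",
            "samples": [
                {
                    "name": "signal",
                    "data": [10.0],
                    "modifiers": [{"name": "mu", "type": "normfactor", "data": None}],
                }
            ],
        }
    ]
}
model = pyhf.Model(spec)
data = [12.0] + model.config.auxdata  # best fit at mu_hat ~ 1.2 for these yields

# progressively tighten the bounds around the best-fit value
for bounds in [(0.0, 10.0), (0.0, 2.0), (0.0, 1.3)]:
    pars, unc = pyhf.infer.mle.fit(
        data, model, par_bounds=[bounds], return_uncertainties=True
    ).T
    print(bounds, pars, unc)
```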
Snippet 2
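(Again not the original snippet; a sketch of fitting the same kind of `pyhf` model directly with `iminuit`, once with two-sided bounds, once with only a lower bound, and once unbounded. It assumes `model` and `data` from the previous sketch are in scope.)

```python
import numpy as np
from iminuit import Minuit
import pyhf

def twice_nll(mu):
    # -2 log L of the pyhf model as a function of the single parameter
    return pyhf.infer.mle.twice_nll([mu], data, model)[0]

twice_nll.errordef = Minuit.LEAST_SQUARES  # 2*NLL behaves like a chi-square

# two-sided, lower-only, and no bounds; starting near the minimum keeps the
# unbounded fit away from unphysical negative Poisson rates
for limits in [(0.0, 10.0), (0.0, None), (None, None)]:
    m = Minuit(twice_nll, mu=1.0)
    m.limits["mu"] = limits
    m.migrad()
    m.hesse()
    print(limits, m.values["mu"], m.errors["mu"], np.sqrt(m.covariance["mu", "mu"]))
```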