Add section about Pathfinder diagnostic and using for inits

avehtari · web-flow · commit e6b900e5356d · 2024-11-15T18:33:55.000+02:00
diff --git a/src/reference-manual/pathfinder.qmd b/src/reference-manual/pathfinder.qmd
@@ -25,3 +25,29 @@ evaluations, with greater reductions for more challenging posteriors.
 While the evaluations in @zhang_pathfinder:2022 found that
 single-path and multi-path Pathfinder outperform ADVI for most of the models in the PosteriorDB evaluation set,
 we recognize the need for further experiments on a wider range of models.
+
+## Diagnosing Pathfinder
+
+Pathfinder diagnoses the accuracy of the approximation by computing the density ratio of the true posterior and
+the approximation and using the Pareto-$\hat{k}$ diagnostic (Vehtari et al., 2024) to assess whether these ratios can
+be used to improve the approximation via resampling. If the estimated Pareto-$\hat{k}$ is smaller than 0.7, the
+normalization for the posterior can be estimated reliably (Section 3, Vehtari et al., 2024), which is the
+first requirement for reliable resampling. Even when the estimated Pareto-$\hat{k}$ for the ratios is smaller than 0.7,
+importance sampling estimates still need to be diagnosed further by also taking into account the expectant
+function (Section 2.2, Vehtari et al., 2024). If the estimated Pareto-$\hat{k}$ is larger than 0.7, then the
+estimate for the normalization is unreliable and any Monte Carlo estimate may have a large error. The resampled draws
+can still contain some useful information about the location and shape of the posterior, which can be used in the early
+parts of a Bayesian workflow (Gelman et al., 2020).
+
+## Using Pathfinder for initializing MCMC
+
+If the estimated Pareto-$\hat{k}$ for the ratios is smaller than 0.7, the resampled posterior draws are almost as
+good for initializing MCMC as independent draws from the posterior would be. If the estimated Pareto-$\hat{k}$ for the
+ratios is larger than 0.7, the Pathfinder draws are not reliable for posterior inference directly, but they are still
+very likely better for initializing MCMC than random draws from an arbitrary pre-defined distribution (e.g., the uniform
+distribution from -2 to 2 that Stan uses by default). If Pareto-$\hat{k}$ is larger than 0.7, it is likely that one of the
+ratios is much bigger than the others, and the default resampling with replacement would then produce copies of a single draw.
+For initializing several Markov chains, it is better to use resampling without replacement to guarantee a unique initialization
+for each chain. At the moment, Stan allows turning off the resampling completely, and the resampling without replacement can
+then be done outside of Stan.
+
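
Note on the last paragraph of the new section: resampling without replacement outside of Stan could look roughly like the sketch below. It is an illustration only, not part of this commit and not Stan's implementation. It assumes the unresampled Pathfinder draws are available together with their log density ratios (for example, the difference between the `lp__` and `lp_approx__` output columns, which is an assumption about how the output was read in). The function name `resample_without_replacement` is hypothetical, and the sketch uses the raw log ratios rather than Pareto-smoothed ones. It relies on the Gumbel-top-k trick, which works on the log scale and therefore still returns distinct draws when one ratio dominates all others.

```python
import numpy as np


def resample_without_replacement(log_ratios, num_inits, rng):
    """Return indices of `num_inits` distinct draws.

    Hypothetical helper: selection follows sequential sampling without
    replacement with probabilities proportional to exp(log_ratios).
    """
    log_ratios = np.asarray(log_ratios, dtype=float)
    if num_inits > log_ratios.size:
        raise ValueError("cannot select more distinct draws than are available")
    # Gumbel-top-k: perturb the log weights with independent Gumbel noise
    # and keep the k largest perturbed values.
    keys = log_ratios + rng.gumbel(size=log_ratios.size)
    return np.argsort(-keys)[:num_inits]


# Made-up log densities for five unresampled draws (illustrative values only).
lp = np.array([-10.0, -12.0, -11.0, -3.0, -15.0])        # target log density
lp_approx = np.array([-9.5, -11.0, -10.5, -9.0, -13.0])  # approximation log density

rng = np.random.default_rng(1)
idx = resample_without_replacement(lp - lp_approx, num_inits=4, rng=rng)
# Each index in `idx` picks a different draw, one per Markov chain.
```

The rows of the draws matrix selected by `idx` can then be passed as per-chain initial values through whichever interface is used to run MCMC.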