

Adrian Quintana edited this page Dec 11, 2017 · 1 revision

MLF convergence behavior

We have observed that the MLF algorithms inside the programs mlf_align2d and mlf_refine3d may yield (much) better results than the somewhat older ML algorithms in the programs MLalign2D and MLrefine3D. However, there are a few important caveats to mention.

Clean your data!

We have observed for several data sets that the MLF algorithms are more sensitive to outliers than the ML algorithms. In the 2D case, the presence of outliers typically results in multiple classes converging to very noisy references to which very low numbers of particles contribute. In the 3D case, outliers are more difficult to detect, but unsatisfactory convergence behavior may indicate that outliers are indeed present.

Therefore, it appears to be crucial to remove outliers from your data. Inspect the power spectral densities and estimated CTFs for each micrograph, and discard micrographs with astigmatism, drift or very weak signal. Also avoid particles with neighboring particles in the same image (as these cannot be modeled correctly), and avoid images with dust or bubbles in them. You may use the SortByStatistics program to detect outliers in the data. Don't be too greedy: it is often better to have fewer good particles than a lot of bad ones!
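The idea behind statistics-based outlier screening can be sketched as follows. This is a hypothetical illustration, not the actual SortByStatistics implementation: it simply flags particle images whose mean or standard deviation is many standard deviations away from the rest of the data set.

```python
import numpy as np

def flag_outliers(images, z_cut=5.0):
    """Flag particle images whose mean or standard deviation deviates
    strongly from the rest of the data set (simple z-score screen)."""
    means = np.array([img.mean() for img in images])
    stds = np.array([img.std() for img in images])
    flags = np.zeros(len(images), dtype=bool)
    for stat in (means, stds):
        z = (stat - stat.mean()) / (stat.std() + 1e-12)
        flags |= np.abs(z) > z_cut
    return flags

# Toy data: 99 "normal" particles plus one with an extreme mean intensity.
rng = np.random.default_rng(0)
images = [rng.normal(0.0, 1.0, (16, 16)) for _ in range(99)]
images.append(rng.normal(50.0, 1.0, (16, 16)))  # simulated outlier
flags = flag_outliers(images)
```

Flagged images should then be inspected by eye before discarding, since a strict automatic cut may also remove rare but valid views.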

Consider early stopping

In several cases we have observed that the MLF algorithms have difficulties converging: the class averages keep changing, even after a hundred or more iterations. This behavior may again be caused by the presence of outliers or other factors that fall outside the scope of the statistical data model (as beautiful or simulated data sets tend to converge much more quickly). A possibility to consider in these cases is to stop the algorithm before reaching convergence (early stopping). We have observed in several cases that the alignment after 20 iterations was considerably better than after 100 iterations.
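An early-stopping rule like the one described above can be sketched generically. The `update` function stands in for one iteration of the refinement (in practice, one expectation-maximization step of mlf_align2d or mlf_refine3d); the names and tolerances here are illustrative assumptions, not Xmipp parameters.

```python
import numpy as np

def run_with_early_stopping(update, x0, max_iter=100, tol=1e-3, patience=3):
    """Iterate `update` and stop once the relative change of the estimate
    stays below `tol` for `patience` consecutive iterations."""
    x = x0
    calm = 0
    for it in range(1, max_iter + 1):
        x_new = update(x)
        change = np.linalg.norm(x_new - x) / (np.linalg.norm(x) + 1e-12)
        x = x_new
        calm = calm + 1 if change < tol else 0
        if calm >= patience:
            return x, it  # settled: stop early
    return x, max_iter      # hit the iteration cap without settling

# Toy update that contracts halfway toward a fixed point each iteration.
target = np.ones(4)
x, n_iter = run_with_early_stopping(lambda v: 0.5 * (v + target), np.zeros(4))
```

Note that with noisy real data the change may never fall below a strict tolerance, which is exactly the situation where simply capping the number of iterations (e.g. at 20) is the pragmatic choice.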

Adjust the estimated signal-to-noise ratios

For very noisy data sets with relatively few particles, the estimation of the spectral signal-to-noise ratios (SSNRs) at every iteration may become unreliable, yielding overestimated SSNRs. This results in the inclusion of too high frequencies in the refinement, yielding very noisy class averages (in 2D) or reconstructions (in 3D). If this happens, re-run the calculation with the -reduce_noise parameter set to a value below 1 (e.g. -reduce_noise 0.5) to lower the estimated SSNRs. Alternatively, one may limit the highest frequencies taken into account in the refinement using the -high parameter (e.g. -high 20 means that only frequencies up to 20 Angstrom will be taken into account, regardless of the estimated SSNRs). The latter has the additional advantage that it usually results in a speed-up and reduced memory requirements.
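The effect of the two options described above can be sketched numerically. The flag names -reduce_noise and -high come from the text; the function below is a hypothetical illustration of what they do to an SSNR curve (uniform damping, and zeroing beyond a resolution limit), not Xmipp's actual implementation.

```python
import numpy as np

def adjust_ssnr(ssnr, resolutions_A, reduce_factor=1.0, high_A=None):
    """Damp estimated spectral SNRs by `reduce_factor` (mimicking
    -reduce_noise < 1) and zero out all frequencies beyond the `high_A`
    resolution limit in Angstrom (mimicking -high)."""
    ssnr = np.asarray(ssnr, dtype=float) * reduce_factor
    if high_A is not None:
        # Keep only frequencies up to the resolution limit (larger Angstrom
        # values correspond to lower frequencies).
        ssnr = np.where(np.asarray(resolutions_A) >= high_A, ssnr, 0.0)
    return ssnr

# Hypothetical SSNR curve sampled at decreasing resolutions (Angstrom).
res = np.array([80.0, 40.0, 30.0, 20.0, 15.0, 10.0])
ssnr = np.array([12.0, 6.0, 3.0, 1.0, 0.4, 0.1])
damped = adjust_ssnr(ssnr, res, reduce_factor=0.5, high_A=20.0)
```

Halving the SSNRs downweights the noisy high-frequency terms everywhere, while the 20 Angstrom cutoff removes them outright, which is why the latter also saves computation and memory.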

--Main.SjorsScheres - 28 Nov 2007
