[phenixbb] geometry_minimization makes molprobity score worse

Pavel Afonine pafonine at lbl.gov
Wed Jul 7 23:27:45 PDT 2021

Hi James,

thanks for the email and for sharing your observations!

> Greetings all, and I hope this little observation helps improve things 
> somehow.
> I did not expect this result, but there it is. My MolProbity score 
> goes from 0.7 to 1.9 after a run of phenix.geometry_minimization
> I started with an AMBER-minimized model (based on 1aho), and that got 
> me my best MolProbity score so far (0.7). But, even with hydrogens and 
> waters removed the geometry_minimization run increases the clashscore 
> from 0 to 3.1 and Ramachandran favored drops from 98% to 88% with one 
> residue reaching the outlier level.

It is no secret that the 'standard geometry restraints' used in Phenix 
and similar programs (Refmac, etc.) are very simplistic. They are not 
aware of preferred main-chain conformations (Ramachandran plot) or 
favorable side-chain rotamers. They don't even have any electrostatic 
or attraction terms, only anti-bumping repulsion! Standard geometry 
restraints won't like any NCI (non-covalent interaction) and will 
likely push interacting atoms apart rather than let them stay close 
together.
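
To illustrate the point about repulsion-only nonbonded terms, here is a 
minimal Python sketch. It is not the functional form Phenix or Refmac 
actually use: the quartic penalty, the Lennard-Jones 12-6 comparison, 
and all parameter values (r_vdw, r_min, depth, weight) are assumptions 
chosen only for illustration.

```python
def repulsion_only(r, r_vdw=3.0, weight=1.0):
    """Generic quartic anti-bumping penalty (hypothetical form).

    Zero for r >= r_vdw, positive below it; it never becomes negative,
    so it can push atoms apart but never hold them together.
    """
    overlap = max(0.0, r_vdw - r)
    return weight * overlap ** 4


def lennard_jones(r, r_min=3.0, depth=0.2):
    """12-6 Lennard-Jones energy with a minimum of -depth at r = r_min."""
    x = r_min / r
    return depth * (x ** 12 - 2.0 * x ** 6)


# Compare the two terms over a range of interatomic distances (in A).
for r in (2.6, 2.8, 3.0, 3.5, 4.0):
    print(f"r = {r:.1f}  repulsion-only = {repulsion_only(r):8.4f}  "
          f"LJ = {lennard_jones(r):8.4f}")
```

With these assumed parameters, a contact shorter than r_vdw is 
penalized by the repulsion-only term, and nothing rewards the atoms for 
staying near each other once they separate. The Lennard-Jones term, by 
contrast, has an energy minimum that resists separation.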

With this in mind, any high-quality (high-resolution) atomic model, or 
one optimized using sufficiently high-level QM, is going to have more 
realistic geometry than the result of geometry regularization against a 
very simplistic restraints target. An example:


and previous papers on the topic.

> Just for comparison, with refmac5 in "refi type ideal" mode I see the 
> MolProbity rise to 1.13, but Clashscore remains zero, some Ramas go 
> from favored to allowed, but none rise to the level of outliers.

I believe this is because of the nature of the minimizer used. Refmac 
uses a second-derivative-based minimizer, which in a nutshell means it 
moves the model much less (just a bit, in the vicinity of a local 
minimum) than any program that uses gradients only (like Phenix).

> Files and logs here:
> https://bl831.als.lbl.gov/~jamesh/bugreports/phenixmin_070721.tgz
> I suspect this might have something to do with library values for 
> main-chain bonds and angles?  They do seem to vary between programs. 
> Phenix having the shortest CA-CA distance by up to 0.08 A. After 
> running thorough minimization on a poly-A peptide I get:
> bond    amber   refmac  phenix  shelxl  Stryer
> C-N     1.330   1.339   1.331   1.325   1.32
> N-CA    1.462   1.482   1.455   1.454   1.47
> CA-C    1.542   1.534   1.521   1.546   1.53
> CA-CA   3.862   3.874   3.794   3.854
> So, which one is "right" ?

I'd say they are all the same within their 'sigmas', which from memory 
are about 0.02 A. To locate the restraint file for a residue (here PHE) 
and check:

elbow.where_is_that_cif_file phe
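
A rough check of this statement on the numbers in the quoted table can 
be sketched in Python. The uniform sigma of 0.02 A is an assumption 
(the from-memory figure above, not a value read from the restraint 
library), and the function name is made up for illustration. CA-CA is 
left out because it is a distance between consecutive residues rather 
than a covalent bond.

```python
SIGMA = 0.02  # A; assumed uniform, the from-memory figure quoted above

# Bond lengths (A) from the quoted table, in the order:
# amber, refmac, phenix, shelxl, Stryer
BONDS = {
    "C-N":  [1.330, 1.339, 1.331, 1.325, 1.32],
    "N-CA": [1.462, 1.482, 1.455, 1.454, 1.47],
    "CA-C": [1.542, 1.534, 1.521, 1.546, 1.53],
}


def spread_in_sigmas(values, sigma=SIGMA):
    """Largest difference between any two values, in units of sigma."""
    return (max(values) - min(values)) / sigma


for name, values in BONDS.items():
    spread = max(values) - min(values)
    print(f"{name:5s} spread = {spread:.3f} A "
          f"= {spread_in_sigmas(values):.2f} sigma")
```

Under that assumption, the largest disagreement between any two sources 
is roughly 1 to 1.4 sigma for each of the three bonds.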

All the best!
