[phenixbb] Picking Rfree in thin resolution shells using command line

Pavel Afonine pafonine at lbl.gov
Tue Jan 31 11:01:45 PST 2012

Hi Felix,

NCS: given state-of-the-art NCS restraints there is (probably) no 
single clear-cut answer; there are three: "definitely yes", 
"definitely no", and "try to find out". Obviously, at low enough 
resolution (say ~2A and lower) NCS should always be used, simply because 
it provides the luxury of additional a priori information to alleviate 
the poor data-to-parameters ratio. Obviously, at high enough 
resolution (~1.5-1.7A or so) NCS should not be used, since the amount of 
data may be enough to see actual differences between NCS copies, and 
using NCS would probably wipe out these differences (or at least there is 
such a risk). In the grey area, ~1.7-2.0A, one should try refining with 
and without NCS to know for sure.

Also, it may be good to mention that if the NCS groups are selected 
perfectly (which includes, for example, making sure not to apply NCS to 
atoms that do not obey it) then most likely NCS could be used at any 
resolution.

Rfree: at very high resolution, setting aside 5-10% of the data probably 
wouldn't hurt too much (provided that the data are complete). Having free-R 
may be handy even at subatomic resolution: for example, to 
illustrate/prove that using IAS (Interatomic Scatterers model) or 
Multipoles actually improves your model rather than overfitting the data. 
Note, when a multipolar model is used, there are 32 (or 28 - I forgot?) 
refinable parameters per atom, so the data-to-parameters ratio for a 
macromolecule may not be that great even at ~0.9-0.7A resolution! Computing 
less biased maps is another reason to keep the free set of reflections: note, 
the m and D in 2mFo-DFc and mFo-DFc maps have to be computed using free 
reflections.
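
To put rough numbers on the data-to-parameters point, here is a minimal 
Python sketch. It is only an illustration: it borrows the counts quoted 
further down in this thread (750,000 observations, 18,000 + 9,000 atoms), 
assumes 4 parameters per atom for isotropic refinement (x, y, z, B) and 9 
for anisotropic, takes the 32 per atom mentioned above for a multipolar 
model, and ignores restraints, occupancies and the reflections set aside 
for Rfree. The function name is made up for this example.

```python
def data_to_parameter_ratio(n_reflections, n_atoms, params_per_atom):
    """Observations per refinable parameter (restraints are not counted)."""
    return n_reflections / (n_atoms * params_per_atom)


# Counts taken from the quoted message below; parameter counts are assumptions.
n_reflections = 750_000
n_atoms = 18_000 + 9_000

models = [
    ("isotropic (x, y, z, B)", 4),
    ("anisotropic (x, y, z, 6 Uij)", 9),
    ("multipolar (32 per atom, as quoted above)", 32),
]

for label, params_per_atom in models:
    ratio = data_to_parameter_ratio(n_reflections, n_atoms, params_per_atom)
    print(f"{label}: {ratio:.2f} observations per parameter")
```

With these assumed counts the ratio drops from about 7 (isotropic) to 
about 3 (anisotropic) and below 1 for the multipolar case, which is the 
reason a free set can still be informative at high resolution.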


On 1/31/12 10:07 AM, Felix Frolow wrote:
> I have a question of general significance: at what resolution do NCS 
> restraints and Rfree become IRRELEVANT?
> Axel Brunger invented Rfree to save our necks from refining garbage 
> into the structure distantly looking like protein.
> Since then Rfree has been idolized. However, there is a big difference 
> between structures at  4.1 Angstrom and 1.4 Angstrom.
> In small molecule crystallography we can easily achieve 10 or 20 
> observations per refined parameter (depends on presence or absence of 
> inversion center), therefore, no one cares about Rfree in the 
> small-molecule community.
> In the well ordered protein structures, the bulk water region is 
> working against us lowering diffraction strength contributing to 
> 1/Volume, but it is also on our side minimizing a volume occupied
> by protein molecules (fewer atoms, fewer parameters). I have a 
> structure (not yet published) where for 18000 protein atoms and about 
> 9000 other atoms  (water molecules, sulfate ions, sugars from 
> cryo-protection etc)
> there are  750,000 independent observations. It makes about 28 
> observations per atom and together with the chemical observations such 
> as bonds and angles which rarely differ from their classical values 
> defined by small structures, if we keep anomalous data properly scaled 
> and separated (there will be differences in good data sets that 
> depends on S atoms and some other ions in solute, or even oxygen atoms) -
> we have quite good ratio of observations per refined parameter.
> So my question is: do WE need to mess with Rfree in structures of 
> relatively/very high resolution, and WHAT FOR?
> Dr Felix Frolow
> Professor of Structural Biology and Biotechnology
> Department of Molecular Microbiology
> and Biotechnology
> Tel Aviv University 69978, Israel
> Acta Crystallographica F, co-editor
> e-mail: mbfrolow at post.tau.ac.il
> Tel:  ++972-3640-8723
> Fax: ++972-3640-9407
> Cellular: 0547 459 608
> On Jan 31, 2012, at 17:35 , Pavel Afonine wrote:
>> Hi Simon,
>> the difference is well illustrated in
>> F. Fabiola, A. Korostelev and M. S. Chapman
>> Acta Cryst. (2006). D62, 227-238
>> Bias in cross-validated free R factors: mitigation of the effects of 
>> non-crystallographic symmetry
>> The question is whether we can reproduce it in the exact same set of 
>> test structures.
>> Pavel
>> On 1/31/12 1:46 AM, Simon Kolstoe wrote:
>>> Thanks for the interesting comments.
>>> I was just wondering what sort of "difference" we are expecting to 
>>> see? Is it just a case of preventing an artificially lowered Rfree 
>>> or is there an expectation to see a difference in the quality of the 
>>> electron density?
>>> Simon
>>> ---------------------------------------------------------------
>>> Dr Simon Kolstoe
>>> Laboratory for Protein Crystallography
>>> Wolfson Drug Discovery Unit
>>> University College London
>>> Rowland Hill Street, London NW3 2PF
>>> Tel: 020 7433 2765
>>> http://www.ucl.ac.uk/~rmhasek
>>> ---------------------------------------------------------------
>>> On 31 Jan 2012, at 08:47, A Leslie wrote:
>>>> Hi Randy,
>>>>                   I can't remember if I ever mentioned this to you, 
>>>> but when I was working on the HepB capsid structure (30-fold NCS if 
>>>> I remember correctly) I tried using a "thin shell within a thick 
>>>> shell" method of selecting Rfree, to avoid the issue that within a 
>>>> thin shell there are still relationships between those reflections 
>>>> within the shell and those just outside it. I forget the details, 
>>>> but I think I used a thin shell of 1-2 rlps wide for the 
>>>> reflections to be used for Rfree, but I also excluded from the 
>>>> refinement reflections within a thick shell 4-5 rlps wide (the thin 
>>>> shell was in the middle of the thick shell). Because this excluded 
>>>> so many reflections I could only have 3 thick/thin shells 
>>>> altogether, so I chose them at low, middle and highish resolution.
>>>> The upshot of all this was that it was no help at all. Almost 
>>>> regardless of, say, the relative weight I put on the Xray terms, or 
>>>> anything else I did, I could never get the Rfree to go up ! The 
>>>> strict NCS restraints were so strong that the refinement 
>>>> essentially always "behaved".
>>>> This for me destroyed all my faith in this thin shell idea !
>>>> So this is definitely NOT an example where it worked.
>>>> I have not sent this to the bulletin board because my memory of 
>>>> exactly what I did is a bit hazy, but the message was clear enough.
>>>> Cheers
>>>> Andrew
>>>> On 30 Jan 2012, at 17:06, Randy Read wrote:
>>>>> I'd been meaning to contribute to this debate, and now that I see my 
>>>>> name mentioned...
>>>>> I used to be a very strong believer in selecting the 
>>>>> cross-validation data in thin shells, when you have NCS.  I even 
>>>>> had a recollection (a case of false memory syndrome, it seems) 
>>>>> that we did this for our own case of 20-fold NCS, i.e. four copies 
>>>>> of the Shiga-like toxin B-subunit pentamer cocrystallized with the 
>>>>> Gb3 trisaccharide (Ling et al, 1998).
>>>>> As a believer in thin shells, I was trying to convince Pavel to 
>>>>> put an option for this in Phenix (like the one in sftools).  He 
>>>>> said that he'd never seen any evidence that it was necessary or 
>>>>> made any difference.  So I went back to the Shiga-like toxin 
>>>>> structure and started parallel refinements from the MR solution, 
>>>>> either choosing the cross-validation data randomly or in thin 
>>>>> shells.  And, guess what, I couldn't see any significant 
>>>>> difference in how well the refinement went, even though I was 
>>>>> pretty certain before doing that experiment that it would make a 
>>>>> big difference.  In fact, both refinements went pretty well.
>>>>> So if thin shells aren't necessary even in an extreme case of NCS, 
>>>>> then I suspect that they're not that useful in the more usual case 
>>>>> of lower-order NCS.
>>>>> In any case, there is a problem even with the thin shells (which 
>>>>> Bart Hazes pointed out even as he implemented it in sftools).  The 
>>>>> theory suggests that reflections within some distance in 
>>>>> reciprocal space of some reflection or a point related to it by an 
>>>>> NCS rotation should be correlated to the original reflection.  All 
>>>>> the points related by rotation will fall into the same resolution 
>>>>> shell but, since the reciprocal-space distance is related to the 
>>>>> inverse of the diameter of the molecule, the shell would have to 
>>>>> have some thickness, and the reflections at the edge of the shell 
>>>>> would still be correlated to reflections not in the shell.  So 
>>>>> even thin-shell cross-validation doesn't get around all the 
>>>>> theoretical problems.
>>>>> I'd be interested if someone has an example where it really does 
>>>>> make a difference, but in the meantime it's hard to argue with 
>>>>> Pavel's point of view!
>>>>> Regards,
>>>>> Randy
>>>>> On 30 Jan 2012, at 15:26, Nathaniel Echols wrote:
>>>>>> On Mon, Jan 30, 2012 at 3:43 AM, Simon 
>>>>>> Kolstoe <s.kolstoe at ucl.ac.uk> wrote:
>>>>>>> I see from a quick google that it is possible to pick my Rfree's 
>>>>>>> using thin resolution shells (coz I've got 20 fold NCS), however 
>>>>>>> as I am someone who tries to avoid the GUI where at all possible,
>>>>>> Why?  Some things are simply easier to do in the GUI, or at least 
>>>>>> more
>>>>>> obvious - otherwise we wouldn't bother writing one.
>>>>>>> could someone let me know what the command line way of doing 
>>>>>>> this is?
>>>>>> In phenix.refine, you probably want something like this (some
>>>>>> parameters optional, but the defaults are probably not what most
>>>>>> people expect):
>>>>>> xray_data.r_free_flags.generate=True
>>>>>> xray_data.r_free_flags.fraction=0.05
>>>>>> xray_data.r_free_flags.max_free=None
>>>>>> xray_data.r_free_flags.use_dataman_shells=True
>>>>>> xray_data.r_free_flags.n_shells=20
>>>>>> Randy and Paul claim that this doesn't help very much with the NCS
>>>>>> issue, however.
>>>>>> -Nat
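
For anyone wondering what "thin shells" amounts to in practice, the 
following is a minimal, generic Python sketch of the idea. It is NOT the 
Phenix or sftools implementation: the function name and the equal-count 
binning are assumptions made for illustration, and the arguments 
`fraction` and `n_shells` only mirror the names of the parameters quoted 
above. Reflections are sorted by resolution, cut into `n_shells` 
equal-count bins, and a contiguous thin slab in the middle of each bin is 
flagged as free.

```python
def thin_shell_free_flags(d_spacings, fraction=0.05, n_shells=20):
    """Return one bool per reflection: True means "in the free set".

    d_spacings: resolution (d, in Angstrom) of each reflection.
    Hypothetical helper for illustration only.
    """
    n = len(d_spacings)
    # Reflection indices ordered from low resolution (large d) to high.
    order = sorted(range(n), key=lambda i: -d_spacings[i])
    flags = [False] * n
    for k in range(n_shells):
        # Equal-count bin k covers sorted positions [start, stop).
        start = k * n // n_shells
        stop = (k + 1) * n // n_shells
        n_free = round(fraction * (stop - start))
        # Centre the thin free shell inside the bin.
        first = start + (stop - start - n_free) // 2
        for pos in range(first, first + n_free):
            flags[order[pos]] = True
    return flags


# Toy data: 2000 reflections with distinct, made-up d-spacings.
toy_d = [1.5 + 0.01 * i for i in range(2000)]
toy_flags = thin_shell_free_flags(toy_d)
print(sum(toy_flags), "free reflections out of", len(toy_flags))
```

As Randy points out in the thread, reflections at the edges of each thin 
shell are still correlated with working-set reflections just outside it, 
so a selection like this does not remove the theoretical problem.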
